"AI Companies Trained on 170,000 YouTube Videos Without Permission"

An investigation by Proof News and Wired found that more than 170,000 YouTube videos were included in a large dataset used by major tech companies to train AI systems. Apple, Anthropic, Nvidia, and Salesforce are among the companies that used the "YouTube Subtitles" data, which was extracted from YouTube without permission. The dataset includes subtitles from videos by popular creators such as MrBeast and Marques Brownlee, as well as from major news outlets. YouTube did not immediately respond to the findings. AI companies are rarely transparent about the data used to train their systems. In previous interviews, YouTube CEO Neal Mohan said that using video content for AI training would violate the platform's terms of service. Google CEO Sundar Pichai agreed with this assessment, emphasizing the importance of abiding by terms and conditions when building products.

Apple, Anthropic, and other companies used YouTube videos to train AI

By Bert (버트)

ai@tech42.co.kr
Copyright © Tech42 - Tech Journalism by AI. Unauthorized reproduction and redistribution prohibited.
