OpenAI and Google Reportedly Used YouTube Transcripts to Train Their AI Models

YouTube on iPhone — Get ready for a brand new YouTube experience.

Maria Diaz/

Training artificial intelligence models requires a lot of data to help them better understand the context of queries and ultimately provide better responses. In the constant search for more data, both OpenAI and Google have turned to using YouTube videos, created by others, to train their large language models (LLMs), The New York Times reported over the weekend, citing people who claim to have knowledge of the companies' activities.

In 2021, OpenAI developed Whisper, a speech recognition tool that would help the company scrape YouTube, take audio from more than 1 million YouTube videos, and use that to inform GPT-4, according to the Times' sources.

Google, meanwhile, also transcribed YouTube videos, according to the report. What's more, the search giant changed its terms of service in 2023 to make it easier to sweep up public Google Docs, Google Maps restaurant reviews, and other publicly available content for use in its AI models, according to the Times.

It's no secret that AI models require significant troves of data to operate efficiently. More data, including text, audio, and videos, gives models the ability to understand human context, human interaction, and other critical communication details that make them more effective.

However, there's increasing tension between the companies developing those models and the content creators. What content, if any, should be permissible to use in training AI models? In a growing number of cases, news outlets, websites, and content creators themselves are calling on OpenAI, Google, Meta, and other tech companies to pay for access to their content before it can be used to train LLMs.

In some cases, model makers have complied and signed agreements with companies, including Reddit and Stack Overflow, to get access to user data. In other cases, not so much.

According to The New York Times' report, for instance, OpenAI's alleged transcription of more than 1 million YouTube videos may run afoul of Google's own terms of service, which prohibit third-party applications from using YouTube videos for "independent" means. Additionally, the companies' alleged decisions to transcribe videos may violate copyright laws, since creators who upload videos to YouTube retain the copyright to the content they create.

To be clear, the Times report cannot be independently verified. Also, neither Google nor OpenAI acknowledged that they scraped data illegally. We do know, however, that the companies are running out of ways to access more content. What's worse, a Times source said that it's possible tech companies will run out of content to ingest into their models by 2026.

What then? It's entirely possible, and perhaps likely, that the tech companies will move to sign licensing agreements with content creators, media outlets, and even musical artists to access their creations. It's also possible they will further change their terms of service, or worse, find ways to skirt privacy laws, to access the data they currently can't.

It's clear that the amount of data companies like Meta, Google, and OpenAI will need in the coming years will only increase. It's critical that as they access that data, they do so in a way that doesn't harm the people who created the content in the first place.

