Meta will reportedly soon use people's Facebook and Instagram posts, including posts dating back to 2007, as training data for the tech giant's artificial intelligence tools. Users in some regions can opt out of the new policy, but others are forced to comply.
Beginning on June 26, Meta will use user data dating back to 2007 to train and enhance its AI technologies as part of the new privacy policy. Private messages are excluded, but the policy covers user posts, images, captions, and messages to Meta's AI chatbot.
Only users in the European Union and the US state of Illinois are currently able to opt out, thanks to data protection rules in effect in those regions, such as the General Data Protection Regulation (GDPR). Australia, by contrast, is one of the countries whose users cannot opt out of the policy.
In response, the advocacy group NOYB (None of Your Business) has filed 11 complaints in the EU, alleging that Meta intends to use years' worth of private photos, posts, and online tracking data for an unidentified "AI technology" that can gather personal information from any source and share it with any number of unidentified third parties. The group is urging authorities to intervene and revoke the policy.
According to NOYB, the policy would affect about 400 million users throughout Europe. The group said it was also troubled that users would have to actively opt out to keep their data from being used in the future.
Hollywood as AI Training Data
Meta has gone so far as to offer Hollywood film studios AI licensing deals worth millions of dollars as it searches for ways to feed its AI models high-quality data.
Along with Google, Meta waved tens of millions of dollars at Hollywood studios, hoping to collaborate with them on advancing AI technology.
According to sources, Google and Meta have not yet released an official statement about these new proposals for studios and businesses, which are focused on enhancing their AI video creation.
Concerns regarding AI's potential to influence video editing and production are enormous, with the SAG-AFTRA and WGA strikes being two of the most significant examples of resistance.
AI Training on AI Data
Training data remains a significant concern for Big Tech, as previous reports have indicated that AI companies are running out of internet data on which to train their models.
In contrast to Meta's most recent strategy, other AI companies are looking into alternative ways to get training data as traditional internet data sources become scarcer.
Some are using AI algorithms to create synthetic data, while others are turning to transcripts of publicly accessible videos. However, because synthetic data is artificially generated, relying on it has its own set of drawbacks, such as an increased chance of AI model hallucinations.
Experts are worried about the possible consequences of AI models becoming dependent on synthetic training data. Concerns have been raised about a process known as "digital inbreeding," in which AI models trained on data created by other AI systems may become unstable, resulting in degraded performance or outright failure.
AI behemoths like OpenAI are taking their own approach to model training in response to the data shortage. OpenAI, the company that created ChatGPT, is reportedly considering training its GPT-5 model on transcriptions of publicly accessible YouTube videos. Such strategies have been criticized, though, and video content creators may even file legal challenges in response.