Too much of a good thing can be bad, and that is what is happening at Bluesky, which now faces criticism over its open API, known as the Firehose, because almost anyone can pull data from it. That makes it open season for data scrapers who want to use Bluesky's content, which in turn leaves users' data up for grabs.
There are several reasons this open API is risky for users, chiefly that scrapers may use it to train artificial intelligence models without facing any penalty from the company.
Bluesky's Open API, Firehose, Allows Data Scrapers to Access User Data
Bluesky may have been commended for promising that it will not use user data for AI training, but the company is now embroiled in a new controversy centered on its open API, the Firehose. The Firehose gives developers open access to the platform's public data, which means that data is available to whoever wants to obtain it.
Reports noted that while this open API makes many kinds of development possible, it also allows data scrapers, particularly AI companies, to use the content however they want, including for AI training.
In a thread addressing the issue, Bluesky said it would not be able to "enforce" consent outside of the platform and its systems, and that outside developers are left to decide whether to "respect" these consent settings.
Hugging Face Dataset Shows Bluesky Data Can Be Used for AI Training
This leaves Bluesky users in a grey area: they cannot prevent scrapers from collecting their data through the Firehose and adding it to AI training datasets.
Hugging Face, the AI community platform for training and development, recently shared that it was able to obtain 1 million public posts via Bluesky's Firehose API, publishing the dataset to a public repository.
Bluesky's Stellar Growth and Recent Developments
While Bluesky was already a popular platform thanks to Jack Dorsey's earlier promise of a decentralized social media experience, the company had not seen its big break until the recent US presidential election. After Trump won, many users who disapprove of Elon Musk made the difficult decision to abandon X in favor of alternatives, with Bluesky gaining a whopping 700,000 users in the days after the polls.
Bluesky presents itself as a platform that listens to its users and prioritizes their online experience, with significant features available for all to access. Earlier this year, the company introduced the "Ozone" tool, which gives developers and users a way to create, adopt, and join their preferred third-party moderation services to tailor their social media experience.
Most recently, Bluesky promised its global user base that the company will not use their data for AI development and training, around the same time that X updated its Terms of Service to inform users that their data will be used to train Grok. Bluesky may not be using user data for AI training, but its open API essentially makes that data available for scrapers to access, and the company is leaving it to chance whether outside developers respect users' consent.