Tumblr's Parent Company Nears Agreements with OpenAI, Midjourney for Training Data

Automattic is negotiating to supply AI firms Midjourney and OpenAI with training data extracted from user posts.

Automattic is reportedly in negotiations with AI firms Midjourney and OpenAI to supply them with training data extracted from user posts. Although details remain scarce, reports suggest that deals between Automattic and the two AI companies are close to being finalized.

(Photo : MARTIN BUREAU/AFP via Getty Images)
This illustration picture taken on July 24, 2019 in Paris shows the logo of the US social network application Tumblr on the screen of a tablet.

Negotiating with OpenAI, Midjourney

The parent company of Tumblr and WordPress.com, Automattic, is currently negotiating with AI firms Midjourney and OpenAI to supply training data extracted from user-generated posts.

404 Media reports that agreements between Automattic and the two AI companies are close to completion. The news comes amid speculation circulating on Tumblr about a potential deal with Midjourney that could give the platform a new revenue stream.

Automattic intends to introduce a new feature enabling users to opt out of sharing their data with third-party entities, including AI firms.

However, internal communications imply that the company inadvertently collected an "initial data dump" comprising all public post content on Tumblr from 2014 to 2023, potentially including content from blogs that is not publicly visible.

The fate of this data remains uncertain, raising questions about whether any information was transmitted to Midjourney and OpenAI.

In response, Automattic pointed to a statement titled "Protecting User Choice." In the statement, Automattic said it is working with undisclosed AI companies. It noted that crawlers from major AI platforms are blocked by default and that the block list is continuously updated as new crawlers appear.

The company also stated that it would only share public content from sites that have not opted out, and emphasized that it is working with AI firms whose objectives align with community interests.

Companies Entering AI Agreements

Several companies have entered into agreements to supply AI tool developers with training data, which those developers have traditionally obtained by scraping publicly available online data. Recent legal developments have made scraping increasingly risky.

For instance, Reddit has a $60 million annual deal with Google, while Shutterstock has partnered with OpenAI, allowing its photo library to be used for training purposes.

However, this approach has faced criticism from artists and writers, particularly those within the creative community that Tumblr caters to, who oppose the use of their work for training AI models.

Balancing user satisfaction and the exploration of new AI technologies has proven challenging for companies, resulting in a backlash against online platforms like DeviantArt that have experimented with such technologies.

Currently, little information is available about the specifics of any deal or how Automattic might benefit from it.

The company is well-established in the web hosting sector with WordPress.com and WordPress VIP, both of which are powered by open-source WordPress software.

However, monetizing Tumblr has posed challenges since Automattic acquired it from Verizon in 2019. Last year, Automattic announced its decision to scale back its ambitions for the platform.

ⓒ 2024 TECHTIMES.com All rights reserved. Do not reproduce without permission.