Reddit Inks Deal To Rake In Millions Selling Your Data To Train AI


In the age of AI, companies are looking to get their hands on large amounts of diverse data to train their models so that they become more usable and proficient. In the case of OpenAI, this means scraping the internet for content to train ChatGPT, at least up through 2021. However, there is also money to be made here, as companies can sell the data they hold or control to the AI organizations that need it. This is evidently what Reddit is now doing with its user-generated content, as it has reportedly penned a $60 million deal with an AI startup.

Earlier this week, reports surfaced that Reddit had purportedly signed an agreement with an unnamed AI company, allowing that company to train its models on the platform's content. In exchange, Reddit will receive roughly $60 million annually for access to the content, which likely includes posts and comments on the site. This is also believed to be one of the first deals of its kind and could serve as a real business model for this sort of arrangement in the future.

The Bloomberg report, which originally broke this story, explained that Reddit is also approaching its initial public offering (IPO), but both the AI deal and the IPO deliberations are “ongoing,” and details could change as time goes on. We reached out to Reddit for additional details ahead of publication, but it declined to comment.

As alluded to, Reddit has had quite the year as it approaches its long-rumored IPO. Back in June, the social media platform faced a “blackout” protest after it changed its API fees, leading to the demise of several popular Reddit apps. This was followed by a security breach by the threat actor group ALPHV, which claimed to have made off with 80GB of internal Reddit data and threatened to release it if the API changes were not rolled back. However, it would seem the group has held onto this data for a more opportune time, which may or may not be fast approaching, depending on the group’s motives.

Despite the rocky year, Reddit is still drawing in tens of millions of active users and posts, on top of an estimated $800 million in revenue last year. Between this and the goldmine of data that could effectively be “leased” for AI model training, investors could be bullish when the IPO finally rolls around. However, users might just have something to say about their content being used to train AI models in the long run. We will have to see what happens, though, so stay tuned to HotHardware for the latest on the Reddit data selloff.