Reddit FTC Investigation Over Generative AI Training Data
The tech platform Reddit says that the Federal Trade Commission is investigating its deals to license data to artificial intelligence companies seeking to train their models.
The company disclosed the investigation in an SEC filing Friday afternoon, adding that it received word of the investigation on Thursday.
“On March 14, 2024, we received a letter from the Federal Trade Commission (the ‘FTC’) advising us that the FTC’s staff is conducting a non-public inquiry focused on our sale, licensing or sharing of user-generated content with third parties to train AI models,†the SEC filing says. “Given the novel nature of these technologies and commercial arrangements, we are not surprised that the FTC has expressed interest in this area. We do not believe that we have engaged in any unfair or deceptive trade practice. The letter indicated that the FTC staff was interested in meeting with us to learn more about our plans and that the FTC intended to request information and documents from us as its inquiry continues.â€
Reddit filed its form S-1 with the SEC last month, a key step before launching a widely expected initial public offering (IPO). The filing revealed a slew of details about the company, including a large stake (about 9 percent) owned by OpenAI CEO Sam Altman, and a pitch to investors that leaned heavily on the potential revenue from generative AI training, noting a $203 million agreement believed to be with Google.
“Our content is particularly important for artificial intelligence (‘AI’) — it is a foundational part of how many of the leading large language models (‘LLMs’) have been trained,†the company wrote in the S-1. “Reddit data constantly grows and regenerates as users converse. As the world becomes increasingly data-driven, we offer solutions that are human- and experience-focused. We expect our data advantage and intellectual property to continue to be a key element in the training of future LLMs.â€
The company said Friday that it was adjusting its risk factors to further account for the FTC investigation:
“Regulatory engagements can be lengthy and unpredictable,†the filing said. “Any regulatory engagement may cause us to incur substantial costs, and it is possible for any regulatory engagement to result in reputational harm or fines, cause us to discontinue or modify our products, services, features, or functionalities, require us to change our policies or practices, divert management and other resources from our business, or otherwise adversely impact our business, results of operations, financial condition, and prospects.â€
Data licensing for generative AI models has become a hot topic in Hollywood, as film and photos from Hollywood could be valuable as video and photo models evolve. Similarly, data from news outlets is seen as helpful for LLMs which want to have real-time information.
Read the original article here