RIAA Sues Suno and Udio for Up to $150,000 Per Song Over Data Scraping
Sony, Universal, and Warner Music Group seek massive damages for unlicensed training on copyrighted audio datasets.
The Recording Industry Association of America (RIAA), representing industry titans Sony Music Entertainment, Universal Music Group, and Warner Records, has filed two landmark lawsuits seeking statutory damages of up to $150,000 per infringed work (https://www.reuters.com/legal/music-labels-sue-ai-startups-suno-udio-over-copyright-infringement-2024-06-24/) against AI music startups Suno and Udio. The litigation, filed on June 24, 2024 in federal courts in Massachusetts (against Suno) and New York (against Udio), marks a critical escalation in the global battle over the valuation and legal protection of proprietary data assets used to train generative AI models.
The High Cost of Unlicensed Training
The lawsuits allege that Suno and Udio engaged in "massive" copyright infringement by scraping decades of recorded music to train their generative models. According to the filings, Suno allegedly infringed 662 copyrighted songs (https://www.theverge.com/2024/6/24/24184792/riaa-suno-udio-ai-music-copyright-lawsuit), while Udio is accused of misappropriating 1,670 recordings (https://www.billboard.com/business/legal/suno-udio-sued-major-labels-copyright-infringement-1235716123/). At the statutory maximum of $150,000 per work, the potential liability would reach roughly $99 million for Suno and $250 million for Udio, about $350 million combined, creating a significant financial overhang for the generative audio sector.
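The upper-bound arithmetic can be sketched in a few lines of Python. This is an illustration only: it assumes every work listed in the filings is found to be infringed and that the court awards the full statutory maximum for each, which is a ceiling rather than a forecast. The names `works_at_issue` and `max_exposure` are hypothetical and introduced here for the example.

```python
# Illustrative upper bound only: assumes the statutory maximum is awarded
# for every work named in the filings. Actual awards could be far lower.
STATUTORY_MAX_PER_WORK = 150_000  # USD per infringed work, as cited in the article

# Number of works at issue per defendant, as reported from the filings.
works_at_issue = {"Suno": 662, "Udio": 1_670}


def max_exposure(works: int, per_work: int = STATUTORY_MAX_PER_WORK) -> int:
    """Return the maximum statutory exposure in USD for a given number of works."""
    return works * per_work


exposure = {name: max_exposure(count) for name, count in works_at_issue.items()}
total_exposure = sum(exposure.values())

for name, amount in exposure.items():
    print(f"{name}: ${amount:,}")
print(f"Combined: ${total_exposure:,}")
```

Under those assumptions the script reports $99,300,000 for Suno, $250,500,000 for Udio, and $349,800,000 combined.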
The plaintiffs argue that these AI companies are not merely creating new tools but are "stealing" the expressive value of human artists to create competing commercial products. This case strikes at the heart of the "fair use" defense currently cited by many AI developers, who claim that training on public or scraped data is transformative and therefore legally permissible without a license.
A Shift Toward Enforced Licensing
The RIAA action arrives as the market for high-quality training data shifts from open scraping to structured licensing. While Suno and Udio face litigation, other players are opting to strike deals. For context, OpenAI recently secured a multi-year licensing agreement with News Corp valued at an estimated $250 million (https://www.nytimes.com/2024/05/22/business/media/openai-news-corp-deal.html) to access its vast archive of journalistic content. The contrast highlights a growing divide in the AI ecosystem: those who pay for data assets and those who risk existential litigation by bypassing the licensing market.
Furthermore, the demand for specialized data is driving massive capital inflows. Formation Bio recently disclosed a $160 million Series D round (https://www.bloomberg.com/news/articles/2024-06-26/openai-sanofi-back-formation-bio-s-160-million-funding-round) backed by Sanofi and OpenAI, specifically to build AI-driven drug development pipelines. The move underscores the premium placed on high-integrity, vertically specific datasets.
Infrastructure and Data Interoperability
The legal risks surrounding data acquisition are also influencing M&A activity in the data infrastructure layer. Databricks recently agreed to acquire Tabular for over $1 billion (https://www.bloomberg.com/news/articles/2024-06-04/databricks-to-buy-data-management-startup-tabular-for-over-1-billion), a deal designed to unify data lakehouse formats and provide enterprises with cleaner, more compliant data pipelines for AI training. As regulators and rights holders tighten the net, the ability to trace and verify the provenance of training data becomes a core competitive advantage.
In Europe, regulatory pressure is mounting as well. The European Commission recently charged Apple with breaching the Digital Markets Act (DMA) (https://www.cnbc.com/2024/06/24/eu-charges-apple-with-breaching-digital-markets-act.html), focusing on App Store rules that restrict how developers can steer customers to offers outside Apple's ecosystem. This regulatory scrutiny, combined with the RIAA's aggressive litigation, suggests that the era of "unregulated data harvesting" is rapidly closing.
Why it matters for data owners
For owners of high-value datasets, whether in music, journalism, or healthcare, the RIAA lawsuit is a bullish signal. It reinforces the principle that proprietary data has a specific, high-dollar market value that cannot be bypassed under the guise of technological progress. The $150,000-per-work figure is a statutory ceiling rather than something these cases will establish, but if courts award damages anywhere near it, the floor for licensing negotiations is likely to rise. Data owners now have a clear mandate: monetize your assets through structured partnerships or prepare to defend their value in court, where the potential returns from litigation may soon rival those of traditional licensing deals.