News Corp and Meta Strike $50M Annual AI Data Licensing Deal
The five-year agreement grants Meta access to premium content from The Wall Street Journal and Barron’s for AI training.
News Corp has secured an estimated $50 million annually (https://www.journalismpakistan.com/news-details.php?id=32491) in a multi-year licensing pact with Meta Platforms to supply high-quality journalistic data for the tech giant’s generative AI ecosystem. The agreement, disclosed on June 26, 2026, grants Meta access to current and archived content from mastheads including The Wall Street Journal, Barron’s, and The New York Post (https://apnews.com/article/news-corp-openai-deal-250-million-5-years), as well as major British and Australian titles. The deal signals a strategic shift for Meta, which has historically relied on public web scraping and is now moving to secure "frontier data" to power its Llama and Nova model suites.
The New Floor for Premium Data Assets
The deal establishes a significant price floor for enterprise-grade data licensing in the news sector. Analysts note that it follows a similar $250 million, five-year deal (https://timesofindia.indiatimes.com/technology/tech-news/openais-250-million-deal-with-news-corp-gets-it-over-a-dozen-news-publications-to-train-its-ai-model/articleshow/110360492.cms) previously signed between News Corp and OpenAI. The market for high-fidelity training data is tightening as publishers split into two camps: those pursuing litigation, such as The New York Times, and those opting for commercial monetization. The trend toward licensing is accelerating globally. The Brazilian newspaper Folha (https://www.journalismpakistan.com/news-details.php?id=32491), for example, settled its legal dispute with OpenAI this week by signing a commercial agreement, shortly after partnering with Google.
Infrastructure and Sovereign Data Foundations
As licensing deals scale, the infrastructure required to manage these massive datasets is expanding. On June 27, 2026, VAST Data announced an expanded partnership with Sharon AI to build a 600-petabyte sovereign AI data foundation (https://www.tipranks.com/news/vast-data-weekly-recap) in Australia. The project aims to provide the secure, high-performance data layer needed for large-scale training and inference workloads, simplifying the path from pilot projects to real-time AI applications for enterprises. Meanwhile, the financial sector is seeing the rise of "agentic" data monetization. Visa and Alchemy reported that their new AgentCard for AI agents secured 78,000 sign-ups (https://www.americanbanker.com/news/visas-agentic-ai-push-includes-a-card-for-bots) in its first 48 hours, highlighting the rapid emergence of a machine-to-machine economy powered by real-time data tokens.
Regulatory Compliance and Transparency
The surge in licensing activity is also a response to the final implementation phases of the EU AI Act (https://www.consilium.europa.eu/en/press/press-releases/2024/05/21/artificial-intelligence-ai-act-council-gives-final-green-light-to-the-first-worldwide-rules-on-ai/), which mandates greater transparency about the datasets used to train general-purpose AI (GPAI) models. Google, for its part, is expanding its News AI pilot program (https://www.mediapost.com/publications/article/396266/google-news-seeks-broader-publisher-permissions-fo.html) to include The Washington Post and The Guardian, offering publishers a path to monetization through "AI-powered article overviews" rather than traditional search referral traffic. The regulatory pressure is forcing AI developers to clean up their data supply chains, making licensed, human-verified data one of the most valuable assets in the AI stack.
Why it matters for data owners
For data owners, the News Corp–Meta deal suggests that premium archives are no longer just historical records; they are high-yield liquid assets. As AI labs face increasing legal and regulatory scrutiny over data provenance, the "take rate" for proprietary datasets is rising. Owners of specialized, high-integrity data now have unprecedented leverage to negotiate multi-year, multimillion-dollar recurring revenue streams that offset the decline in traditional digital advertising and traffic models.