biotech ai | funding round | data licensing | infrastructure | June 17, 2026

EvolutionaryScale Secures $142M for Biological Data AI Models

Former Meta researchers raise a seed round to license and scale ESM-3 protein-design models for drug discovery.

EvolutionaryScale has secured $142 million (https://www.reuters.com/technology/ai-startup-evolutionaryscale-raises-142-million-with-backing-amazon-nvidia-2024-06-17/) in seed funding to accelerate the development of generative AI models for biology, a notable marker for the monetization of specialized scientific data assets. Founded and led by former Meta AI researchers, the startup is launching ESM-3, a frontier language model trained on a proprietary dataset comprising 2.78 billion proteins (https://techcrunch.com/2024/06/17/evolutionaryscale-seed-biological-ai/). Participants in the round included NVentures (Nvidia's venture arm) and Amazon Web Services (AWS), signaling investor interest in high-fidelity, domain-specific data as a driver of AI valuation.

The Biological Data Frontier: ESM-3 and the $142M Seed

The core of EvolutionaryScale’s value proposition lies in its ability to simulate billions of years of biological evolution through data. The disclosed $142 million investment (https://www.bloomberg.com/news/articles/2024-06-17/ex-meta-scientists-raise-142-million-for-biological-ai-startup) values the company at an estimated $1 billion (https://www.reuters.com/technology/ai-startup-evolutionaryscale-raises-142-million-with-backing-amazon-nvidia-2024-06-17/), a premium that reflects investor confidence in the 98-billion-parameter ESM-3 model. Unlike general-purpose LLMs, ESM-3 is trained on structured biological sequences, allowing it to generate entirely new proteins that do not exist in nature. This "programmable biology" capability is built upon the acquisition and processing of massive genomic and proteomic datasets, which the company intends to license to pharmaceutical giants for drug discovery and environmental engineering.

Infrastructure as the Vessel: KKR’s $50B Power Play

While EvolutionaryScale focuses on the data-intelligence layer, the physical infrastructure required to process such assets is seeing unprecedented capital inflows. KKR and Energy Capital Partners (ECP) have announced a $50 billion strategic partnership (https://www.reuters.com/business/energy/kkr-energy-capital-partners-form-50-bln-strategic-partnership-ai-2024-06-17/) to accelerate the development of data centers and power infrastructure. This multi-year commitment targets a bottleneck of the data economy: the massive energy demands of AI training clusters. For data asset owners, the build-out should help keep processing capacity for large-scale datasets available, even as models grow more complex.

European Sovereignty and the Mistral Data Moat

The global market for data is also being shaped by regional champions aiming for "data sovereignty." Paris-based Mistral AI recently closed a €600 million ($640 million) Series B (https://techcrunch.com/2024/06/11/mistral-ai-raises-600-million-at-a-5-8-billion-valuation/) at a disclosed €5.8 billion valuation (https://www.ft.com/content/88d68994-633b-419b-9c71-f76e736a617c). Mistral’s strategy relies heavily on curated, multilingual datasets that offer a competitive alternative to US-centric models. With funding from investors including General Catalyst and Lightspeed, Mistral is positioning itself to lead the European market in enterprise data licensing, where local regulation and data privacy are paramount.

Regulatory Chokepoints: Meta’s EU Data Impasse

However, the acquisition of data for AI training faces intensifying regulatory scrutiny. Meta Platforms has paused its plans to use data from European Facebook and Instagram users to train its AI models following a request from the Irish Data Protection Commission. The pause, which followed complaints from the advocacy group NOYB, highlights a growing divide in data availability. While firms in the US and Asia may continue to scrape vast public datasets, firms training on European user data must navigate a "consent-first" landscape, potentially driving up the market price for legally compliant, licensed datasets.

Why it matters for data owners

The EvolutionaryScale and KKR deals underscore a fundamental shift in the AI value chain: the transition from algorithmic supremacy to data and energy supremacy. For owners of proprietary datasets, whether in biology, finance, or law, the ESM-3 launch suggests that specialized data can command billion-dollar valuations independent of general-purpose LLMs. As compute infrastructure expands through $50 billion pacts and regulatory walls rise in Europe, the scarcity of high-quality, "clean" data will likely drive a new wave of high-value licensing agreements. Data is no longer just an input; it is the primary capital asset of the 2026 intelligence economy.
