bio data | funding | ai licensing | sovereign data | June 18, 2026

EvolutionaryScale Secures $142M to Scale Biological Data-for-AI Models

The seed round, led by Nat Friedman and Daniel Gross, targets the monetization of massive biological datasets.

EvolutionaryScale has closed a $142 million (https://www.bloomberg.com/news/articles/2024-06-18/ai-startup-evolutionaryscale-raises-142-million-to-design-proteins) seed funding round to commercialize its ESM3 model, a transformer-based AI trained on a proprietary dataset of 2.78 billion protein sequences. The round, led by Nat Friedman, Daniel Gross, and Lux Capital (https://techcrunch.com/2024/06/18/evolutionaryscale-raises-142m-from-nat-friedman-daniel-gross-and-lux-capital-to-pioneer-generative-ai-for-biology/), positions the startup as a primary challenger in the race to apply large language model (LLM) architectures to biological data assets, a sector previously dominated by DeepMind’s AlphaFold.

The High Stakes of Biological Data Assets

The core value of EvolutionaryScale lies in its ESM3 model, which was trained with roughly 1 trillion teraFLOPs of compute, or about 10^24 floating-point operations in total (https://www.evolutionaryscale.ai/blog/esm3-release), a massive investment in processing biological sequence data. Unlike general-purpose LLMs trained on web-scraped text, EvolutionaryScale’s assets are built on high-fidelity genomic and proteomic data. The round signals a broader market trend in which the highest valuations are shifting toward companies that own or curate specialized, non-public datasets. By treating the genetic code as a language, EvolutionaryScale is effectively creating a new marketplace for synthetic protein design, where the "data" is the instruction set for life itself.
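To put the "1 trillion teraflops" figure in concrete terms, the short Python sketch below converts it to raw floating-point operations and estimates how long that much work would take at a sustained rate. The sustained rate is a purely hypothetical assumption chosen for illustration. It is not a figure reported by EvolutionaryScale, and the helper names are likewise invented for this example.

```python
# Unit check for the "1 trillion teraFLOPs" training-compute figure.
TERA = 10**12       # one teraFLOP = 10^12 floating-point operations
TRILLION = 10**12   # short-scale trillion


def teraflops_to_flops(teraflop_count: int) -> int:
    """Convert a count of teraFLOP units into raw floating-point operations."""
    return teraflop_count * TERA


def days_at_rate(total_flops: int, flops_per_second: float) -> float:
    """Days needed to perform total_flops at a constant sustained rate."""
    seconds = total_flops / flops_per_second
    return seconds / 86_400  # seconds per day


total_flops = teraflops_to_flops(TRILLION)

# ASSUMPTION: a hypothetical cluster sustaining 10^18 FLOP/s.
# This rate is illustrative only and does not describe any real training run.
ASSUMED_SUSTAINED_FLOPS_PER_SECOND = 1e18
days = days_at_rate(total_flops, ASSUMED_SUSTAINED_FLOPS_PER_SECOND)

print(f"total compute: {total_flops:.2e} FLOPs")
print(f"days at assumed rate: {days:.1f}")
```

The point of the conversion is that the figure describes a total amount of work (operations), not a rate of work (operations per second). How long the training takes depends entirely on the hardware assumed.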

Enterprise Data Sovereignty: The HPE-Nvidia Pact

While EvolutionaryScale focuses on biological data, the broader enterprise market is pivoting toward localized data control. Hewlett Packard Enterprise (HPE) and Nvidia recently launched "NVIDIA AI Computing by HPE" (https://www.hpe.com/us/en/newsroom/press-release/2024/06/hpe-and-nvidia-announce-nvidia-ai-computing-by-hpe-to-accelerate-the-generative-ai-industrial-revolution.html), a co-developed solution designed to allow corporations to train AI models on their own private data silos. This partnership addresses the growing demand for "Private AI," where data assets never leave the corporate firewall. As Nvidia’s market capitalization hit $3.34 trillion (https://www.reuters.com/technology/nvidia-set-overtake-microsoft-worlds-most-valuable-company-2024-06-18/) this week, the focus has shifted from mere hardware sales to the underlying data infrastructure that powers these industrial-scale models.

Legal Precedents in Data Scraping and Licensing

The valuation of data assets is also being shaped by new legal boundaries. In a landmark ruling, a U.S. judge recently sided with Bright Data in its legal battle against Meta (https://www.reuters.com/legal/meta-loses-bid-block-bright-data-scraping-its-sites-2024-06-18/), concluding that scraping publicly accessible data does not violate Meta’s terms of service. The decision is a significant win for data marketplaces and aggregators, strengthening the legal footing for collecting public-facing data for AI training. At the same time, the Middle East is emerging as a hub for sovereign data investment, as shown by the $2 billion (https://www.reuters.com/technology/etisalat-aws-invest-2-bln-uae-cloud-expansion-2024-06-17/) agreement between Etisalat (e&) and AWS to expand cloud and AI data capabilities in the UAE over the next decade.

Strategic Acquisitions in Cloud Data Intelligence

Consolidation in the data sector continues as major players move to acquire specialized intelligence layers. Nvidia is reportedly acquiring Shoreline.io for an estimated $100 million (https://www.bloomberg.com/news/articles/2024-06-17/nvidia-is-said-to-acquire-software-startup-shoreline). Shoreline’s technology focuses on automating incident response in cloud environments, providing Nvidia with a rich stream of operational data to optimize AI-driven data center management. This follows a pattern of "acqui-hiring" and data-asset acquisitions intended to bolster the reliability of the massive clusters required for next-generation model training.

Why it matters for data owners

For data owners, the EvolutionaryScale round and the Bright Data ruling underscore two diverging but lucrative paths: the monetization of hyper-specialized, proprietary datasets and the continued viability of public data aggregation. As enterprise demand for "Private AI" grows via the HPE-Nvidia alliance, the premium on clean, labeled, and legally compliant data assets has never been higher. Owners of unique datasets in biology, finance, and industrial operations are now positioned as the ultimate kingmakers in an AI economy that has moved past the "compute-first" era into a "data-first" reality.

Data Academy