What is a Dataset Worth? 4 Valuation Methods
Cost, market, buyer value, future cash flows: depending on the method, the value of the same file can vary by a factor of 25. Learn how to cross-reference methods.
The Challenge
Value is Not Shown on the Balance Sheet
~90% of S&P 500 value is intangible (vs 17% in 1975) — but data rarely appears on the balance sheet.
Source: Ocean Tomo, 2020 · Laney, Infonomics, 2017
Method 1
The Cost of Re-creation
How much would it cost to recreate this data? Useful as a safeguard. Limitation: measures expenditure, not value.
Source: OECD, Measuring the Value of Data, 2022
Method 2
The Market / Comparables
At what price is similar data sold? Limitation: rare and opaque comparables → mainly a consistency check.
Method 3
Value to the Buyer (Uplift)
What gain does the data create for the buyer? (royalties avoided, additional margin). Limitation: difficult to isolate the data's own contribution.
Method 4
Discounted Future Cash Flows (DCF)
Present value of future revenues attributable to the data. The quantitative form of the 'buyer value' approach.
Source: Cheong et al., JRFM/MDPI 2023
Premium or Discount?
What Drives Prices Up (or Down)
- Freshness, exclusivity, volume, granularity
- GDPR Compliance: without a legal basis, value ≈ 0
- Supply/demand take precedence over intrinsic value
Source: Laney/Gartner (IVI) · DAMA-DMBOK
Price Benchmarks
Orders of Magnitude (≠ Contracts)
- Marketplace median ~ $1,400/month or ~ $2,200 one-time
- B2B contact ~ $0.01–$1.50 (base that depreciates ~30%/year)
- AI text licenses = lump sums (Reddit $60M/year)
Source: Azcoitia et al., arXiv 2021
The Proof (in numbers)
The Method Changes Everything: ×25
A B2B customer file ($1M/year attributable): cost ≈ $150k, royalties avoided ≈ $133k, excess earnings ≈ $3.8M. → The value varies by a factor of ~25 depending on the method.
Source: Educational example (Eton VS / Deloitte)
Key Takeaway
Cross-Reference, Don't Choose
This is precisely what the d-nvest valuation report does.
- No single method yields 'the' price
- We cross-reference methods + real comparables
- A confidence index frames the estimation
The full guide
How much is a dataset worth? The question is tricky, as the value of data is not reflected on the balance sheet: approximately 90% of the S&P 500's value is intangible today (compared to 17% in 1975, Ocean Tomo), but data is almost never listed there (Laney, Infonomics). Four methods can be used to estimate it.
The cost method measures how much it would take to recreate the data: simple, useful as a safeguard, but it measures expenditure, not value (OECD, 2022). The market method compares the dataset to similar data sold elsewhere; as comparables are rare and opaque, it primarily serves as a consistency check. The value-to-buyer method (uplift, relief-from-royalty, with-and-without) quantifies the gain the data provides to the acquirer; its difficulty lies in isolating the portion of value truly attributable to the data. Finally, the discounted future cash flows (DCF) method calculates the present value of future attributable revenues; it is the quantitative form of the buyer value approach (Cheong et al., 2023). Recognized valuation frameworks usually list only three approaches (cost, market, income), with DCF being a variant of the third.
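The DCF logic can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model of Cheong et al.: the function name, the 12% discount rate, the 5-year horizon and the 30% annual decay of attributable revenue are all hypothetical values chosen for the example. It is not meant to reproduce any valuation figure quoted in this guide.

```python
def discounted_cash_flows(annual_cash_flow, discount_rate, years, annual_decay=0.0):
    """Present value of a stream of data-attributable cash flows.

    Illustrative assumptions: cash flows arrive at the end of each year and
    shrink by `annual_decay` every year after the first (data going stale).
    """
    present_value = 0.0
    for year in range(1, years + 1):
        # Cash flow expected in this year, after decay
        cash_flow = annual_cash_flow * (1 - annual_decay) ** (year - 1)
        # Discount it back to today
        present_value += cash_flow / (1 + discount_rate) ** year
    return present_value


# Hypothetical inputs: $1M/year attributable, 12% discount rate, 5 years
flat = discounted_cash_flows(1_000_000, 0.12, 5)
decaying = discounted_cash_flows(1_000_000, 0.12, 5, annual_decay=0.30)
print(f"No decay:  ${flat:,.0f}")
print(f"30% decay: ${decaying:,.0f}")
```

The result is highly sensitive to the discount rate and the decay assumption, which is one reason a DCF estimate is best read alongside the other methods.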
Several factors then justify a premium or a discount: freshness, exclusivity, volume, granularity, accuracy, rights/licenses, and especially GDPR compliance (without a legal basis, the value drops to almost zero). The dominant rule: supply and demand take precedence over intrinsic value. As for benchmarks, the marketplace median is around $1,400/month (or ~$2,200 for a one-time purchase; Azcoitia et al., arXiv 2021), a B2B contact is worth a few cents to $1.50, and AI text licenses take the form of lump sums (Reddit, ~$60M/year).
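To make the freshness effect concrete, here is a small sketch of how a B2B contact base loses value if it depreciates by about 30% per year, the order of magnitude cited in the price benchmarks. The base size (10,000 contacts), the unit price ($0.50) and the choice of constant geometric depreciation are hypothetical assumptions; only the ~30% rate and the price range come from the benchmarks.

```python
def depreciated_value(initial_value, annual_rate, years):
    """Value after `years` of constant geometric depreciation (an assumption)."""
    return initial_value * (1 - annual_rate) ** years


contacts = 10_000    # hypothetical base size
unit_price = 0.50    # hypothetical, within the quoted range for a B2B contact
initial = contacts * unit_price

# Value of the base if it is never refreshed
for year in range(4):
    value = depreciated_value(initial, 0.30, year)
    print(f"Year {year}: ${value:,.0f}")
```

Under these assumptions the base loses roughly two thirds of its value in three years, which is why freshness weighs so heavily on price.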
The key lesson: the same B2B customer file generating $1M/year in attributable revenue can be valued at ~$150k by cost, ~$133k by royalties avoided, or ~$3.8M by excess earnings, a factor of ~25 between the cost and excess-earnings estimates. Hence the conclusion: we do not choose one method; we cross-reference them, check them against real comparables, and frame the result with a confidence index. This is precisely what the d-nvest valuation report produces.
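The spread between methods can be checked with simple arithmetic. The sketch below uses only the three estimates from the educational example; the variable names are illustrative, and it does not attempt to model the confidence index, whose construction is not described here.

```python
# Estimates from the educational example, in dollars
estimates = {
    "cost": 150_000,
    "royalties_avoided": 133_000,
    "excess_earnings": 3_800_000,
}

low = min(estimates.values())
high = max(estimates.values())

# Ratio quoted in the text: excess earnings vs. cost
spread_vs_cost = estimates["excess_earnings"] / estimates["cost"]
# Ratio between the highest and the lowest estimate
spread_vs_lowest = high / low

print(f"Range: ${low:,} to ${high:,}")
print(f"Excess earnings / cost:   x{spread_vs_cost:.1f}")
print(f"Highest / lowest estimate: x{spread_vs_lowest:.1f}")
```

The factor of ~25 corresponds to excess earnings against cost; measured against the lowest estimate (royalties avoided), the gap is closer to 29. Either way, a single-method figure says little without the others for context.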
Sources
- Deloitte — Valuing Data Assets (2025)
- OECD — Measuring the Value of Data (2022)
- Azcoitia et al. — Data marketplace prices (arXiv, 2021)
- Cheong et al. — DCF for data (JRFM/MDPI, 2023)
Educational content — not legal or financial advice. Figures carry their source and year.