Dataset opportunity
Ghy: Regulatory Records Dataset Opportunity
Moderate-volume regulatory records dataset held by Ghy, usable for Regulatory RAG and Compliance Copilots.
Score
68.5
Score (0–100) blends weighted dimensions: dataset rarity, training value, buyer demand, evidence strength and right-to-license. 70+ is deal-ready. See the scored dimensions below for the breakdown.
Confidence
49%
Action
Acquire
The recommended deal structure for this dataset, one of: Acquire (full buyout), License (paid usage rights), Data Sharing Agreement (controlled access, no transfer of ownership), Partnership (co-development) or Annotation Program (labeling). The choice is based on data ownership, licensing complexity and accessibility.
Market
Global RegTech market valued at USD 19.06 billion in 2025, projected to reach USD 105.23 billion by 2034, exhibiting a CAGR of 20.00%. [8]
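As a quick sanity check on the market figures, the sketch below recomputes the compound annual growth rate implied by the two quoted values. It assumes the 2025 value is the base year and 2034 is nine compounding periods later; the source does not state the forecast window, so that is an assumption. Under it, the implied rate comes out slightly above the quoted 20.00%, which may simply mean the published CAGR uses a different base year.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` periods apart."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value after compounding `rate` for `years` periods."""
    return start_value * (1 + rate) ** years


# Figures quoted in the report, in USD billions.
START_2025 = 19.06
END_2034 = 105.23
PERIODS = 2034 - 2025  # ASSUMPTION: 2025 is the base year of the forecast

implied = cagr(START_2025, END_2034, PERIODS)
projected_2034 = project(START_2025, 0.20, PERIODS)

print(f"Implied CAGR 2025-2034: {implied:.1%}")
print(f"2034 value at 20% from the 2025 base: {projected_2034:.2f}")
```

If the two numbers disagree, the forecast window behind citation [8] is worth confirming before the figure is reused in a deal memo.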
Recent dated external facts that triggered this opportunity (auditable provenance).
- press, 2026-06-15: Air freight spot rates spike 41% YoY in May, but relief expected soon (supplychaindive.com)
- press, 2026-06-15: Routing guides are crumbling: "It is different this time" (freightwaves.com)
- press, 2026-06-15: U.S.-Iran peace deal reopens Strait of Hormuz (freightwaves.com)
- press, 2026-06-15: How hackers allegedly stole $1.7 million worth of condoms (freightwaves.com)
- press, 2026-06-15: Shippers say renewed tax on Chinese ships could put some U.S. ag producers out of business (freightwaves.com)
Lineage
How this lead was derived
The signal-first chain, end to end: recent external signals → qualified niche → resolved data-holder → site verification → scored opportunity. Every lead is explainable.
Concrete evidence that this company actively cares about data, and why it's ripe for the deal room.
- Press / announcement: Acquisition of W.G. McKay to expand trade service footprint
- Data product: GHY eBiz & Tariff Tracker
Profile
Dataset profile
Type
Regulatory Records Dataset
Modality
Text
Sector
mobility
Volume
Moderate
Freshness
Real-time
Rarity
High (proprietary)
Accessibility
Restricted
Legal
Mixed ownership (licensing rights to clarify) · PII/regulated
Buyer persona
RegTech & compliance-AI vendors
Ghy possesses a Regulatory Records Dataset in Text modality, comprising event_streams, regulatory filings (CBSA/CBP), and transaction_data. This collection of structured and unstructured data provides a rich, proprietary foundation for a Regulatory RAG system, enabling it to answer complex customs and trade compliance questions with high accuracy by drawing directly from real-world operational evidence.
The business value of this data is underscored by the RegTech market, which was valued at approximately USD 19.06 billion in 2025 and is projected to grow at a 20.00% CAGR. [8] This growth signals strong market demand for AI-driven compliance solutions. Despite access complexities, such as sensitive client information and data extraction from legacy systems, the rarity and depth of this data for building advanced customs compliance tools make it a compelling opportunity for AI buyers.
Diligence (valuable data, access to negotiate):
- Data involves sensitive client trade information and regulatory filings (CBSA/CBP).
- Ownership may be shared with clients for specific shipment records.
- Historical data likely resides in legacy brokerage systems requiring extraction.
Corporate: independent.
Scoring
Scored dimensions
Explainable, evidence-based dimensions (0–100). The radar shows the investment axes.
Taken together, this evidence indicates that Ghy holds a rare, proprietary dataset rooted in over a century of customs brokerage and trade compliance operations. This data is a critical asset for RegTech and compliance-AI vendors seeking to build advanced Regulatory RAG systems. In a global RegTech market projected to exceed $100 billion by 2034, this historical and operational data offers a competitive advantage for developing more accurate and comprehensive AI solutions.
- Dataset Specificity: 90
  dominant 'regulatory', sector mobility, 3 specific types
  How sharply the data targets a specific, hard-to-substitute domain or task. Niche, well-defined data scores higher than generic.
- Dataset Rarity: 82
  proprietary domain data
  How scarce and proprietary the data is. Unique domain data scores high; openly available data lowers it.
- Dataset Volume: 52
  3 evidence hits
  Apparent scale of the data, inferred from the number of evidence hits and any explicit volume mentions.
- Dataset Freshness: 82
  real-time/streaming
  How current the data stays: real-time/streaming scores highest, periodic dumps lower.
- Training Value: 84
  fit for Regulatory RAG
  How useful the data is for the target AI use-case, that is, its fit for model training or fine-tuning.
- Buyer Demand: 88
  Demand is driven by the convergence of the AI in Mobility market (projected 44.6% CAGR) and the Retrieval-Augmented Generation (RAG) market, which is projected to grow at a CAGR of 49.1% as enterprises adopt it for use cases including regul
  How strongly AI builders and companies are likely to want this data, based on market signals.
- Legal Accessibility: 0
  PII/regulated
  How legally easy the data is to obtain and use: open/API access scores high; PII or regulated data scores low.
- Acquisition Feasibility: 0
  medium difficulty, independent
  How realistic it is to actually obtain the data, given access difficulty and the holder's corporate structure.
- Evidence Strength: 62
  3 evidence types, 3 hits
  How solid the proof is that the company holds this data, based on diversity of evidence types and number of hits.
- Right to License: 36
  ownership=mixed, licensing=rights_unclear
  Whether the company can legally license the data out, based on ownership and licensing complexity.
- Corporate Independence: 90
  independent
  Whether the holder can decide alone: an independent company scores higher than a subsidiary of a large group.
- Data Orientation: 56
  2 data-appetite signals (2 types)
  How actively the company invests in data, measured by its data-appetite signals (hires, products, APIs…).
- Dormant Data Surplus: 92
  surplus=high, 5 recent external signals; proprietary data beyond what's already monetised
  Volume and value of proprietary data this company holds BEYOND what it already monetises: the dormant surplus we can unlock. A company can sell some insights AND still sit on a far larger dormant asset.
- ICP Audit: 75
  Review: GHY International is a customs brokerage whose core business is service-based, but they are already productizing their data expertise through software and analytics tools, making them a poor fit. Issues: Company's core business is providing trade services, not selling raw data, which is a good sign. [3, 5]; However, they offer several software/tech solutions that are derived from their data and expertise, such as an AI-powered classification tool, a Business Intell; This indicates the
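The headline score is described as a weighted blend of five of these dimensions, with 70+ treated as deal-ready. The report does not publish the weights, so the sketch below uses equal weights purely as a placeholder assumption; the names `blended_score`, `is_deal_ready` and the weight values are hypothetical. With equal weights the blend is 70.4 rather than the reported 68.5, so the real weighting evidently differs.

```python
DEAL_READY_THRESHOLD = 70.0  # "70+ is deal-ready", as stated in the report

# Dimension scores copied from the list above (0-100 scale).
scores = {
    "dataset_rarity": 82,
    "training_value": 84,
    "buyer_demand": 88,
    "evidence_strength": 62,
    "right_to_license": 36,
}

# ASSUMPTION: equal weights. The actual weights are not given in the report.
weights = {name: 1.0 for name in scores}


def blended_score(scores: dict, weights: dict) -> float:
    """Weighted average of the dimension scores, normalised by total weight."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def is_deal_ready(score: float) -> bool:
    """Apply the report's stated deal-ready threshold."""
    return score >= DEAL_READY_THRESHOLD


equal_weight_score = blended_score(scores, weights)
print(f"Equal-weight blend: {equal_weight_score:.1f}")
print(f"Reported score 68.5 deal-ready? {is_deal_ready(68.5)}")
```

Swapping in different weights shows how sensitive the result is to the low Right to License score, which pulls any blend down the more heavily it is weighted.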
Evidence
Dataset evidence & lineage
What the typed evidence proves the company holds, reframed for clarity and set against the market.
Transaction data
The company's claim of over 100 years in customs brokerage points to a uniquely deep and consistent tabular dataset, invaluable for modeling long-tail, cross-border trade scenarios.
Regulatory records
Ghy provides extensive Global Trade Services, generating proprietary text data on trade compliance that directly feeds the development of sophisticated AI models for regulatory intelligence.
Event streams
The firm operates real-time shipment trackers, creating valuable time-series data streams that provide operational context and verification for cross-border compliance events.
Coverage
Scanned sources
Deliverable
Premium dataset report
Ghy Regulatory Records: a moderate-volume regulatory records dataset (Text modality) in the mobility domain. Primary AI use-case: Regulatory RAG. Market signal: Global RegTech market valued at USD 19.06 billion in 2025, projected to reach USD 105.23 billion by 2034, exhibiting a CAGR of 20.00% [8]. Investment score 68.5/100 (confidence 0.49). Recommended action: Acquire.