AI Research Division · Retrieval Engineering
Retrieval Surface Engineering: The Asset Layer AI Systems Actually Ingest
PDFs, transcripts, changelogs, and FAQs are weighted unequally by retrieval-augmented LLMs. This report quantifies the mention-rate lift from publishing each asset class, measured against a controlled prompt bank.

By Shayne Beavan
Founder, Deep AI Solutions · Inventor of record, 5 USPTO filings
Not all assets are equal
We tested seven asset classes (Organization JSON-LD, FAQ JSON-LD, blog markdown, transcripts, changelogs, FAQ pages without markup, and unstructured PDFs) by publishing matched versions across a controlled set of Houston businesses, then running the same 100-prompt scan before and after publication.
Lift by asset class
| Asset class | Mean mention-rate lift (percentage points) |
|---|---|
| Organization JSON-LD | +18.4pp |
| FAQ JSON-LD | +14.1pp |
| Blog markdown w/ structured headings | +8.2pp |
| Transcript HTML | +6.0pp |
| Changelog | +4.7pp |
| FAQ page, no markup | +2.3pp |
| Unstructured PDF | +0.8pp |
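
For readers who want to reproduce a lift figure like those in the table, here is a minimal sketch of the arithmetic. It assumes that "mention rate" means the share of scan responses that name the business, and that lift is the after-minus-before difference in percentage points; the report does not spell out its exact scoring rule. The function names, the business name, and the response strings are all hypothetical.

```python
def mention_rate(responses, business_name):
    """Percentage of responses that mention the business (case-insensitive)."""
    if not responses:
        raise ValueError("need at least one response")
    needle = business_name.casefold()
    hits = sum(1 for text in responses if needle in text.casefold())
    return 100.0 * hits / len(responses)


def lift_pp(before, after, business_name):
    """Percentage-point change in mention rate between two scans."""
    return mention_rate(after, business_name) - mention_rate(before, business_name)


# Hypothetical scan results: one response per prompt, 100 prompts per scan.
before = ["Try Acme Plumbing."] * 5 + ["No local match found."] * 95
after = ["Try Acme Plumbing."] * 20 + ["No local match found."] * 80

print(f"{lift_pp(before, after, 'Acme Plumbing'):+.1f}pp")  # prints "+15.0pp"
```

A plain substring match is the simplest possible detector. A real scan would likely need to handle name variants and false positives, which this sketch does not attempt.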
Read
The lift is not about content quality alone. It is about retrievability: whether the retrieval system can chunk the asset, embed it, and cite it with confidence. JSON-LD wins because it removes ambiguity, not because it is more "valuable."
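
To make "removes ambiguity" concrete, the sketch below builds an Organization JSON-LD block with typed fields from the schema.org vocabulary (`Organization`, `PostalAddress`, `sameAs`) and wraps it in the script tag that carries it on a page. It is an illustration only, not the markup used in the study. The helper names, the business, and the URLs are hypothetical.

```python
import json


def organization_jsonld(name, url, locality, region, same_as=()):
    """Build a schema.org Organization object as a plain dict."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "address": {
            "@type": "PostalAddress",
            "addressLocality": locality,
            "addressRegion": region,
        },
    }
    if same_as:
        # Other profiles that refer to the same entity.
        data["sameAs"] = list(same_as)
    return data


def as_script_tag(data):
    """Serialize the dict into an embeddable JSON-LD script element."""
    # "\/" is a valid JSON escape for "/", so this keeps the payload valid
    # while preventing a literal "</script>" from closing the tag early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical business used purely for illustration.
org = organization_jsonld(
    name="Acme Plumbing",
    url="https://example.com",
    locality="Houston",
    region="TX",
    same_as=["https://example.com/profiles/acme-plumbing"],
)
print(as_script_tag(org))
```

Each fact sits under an explicit key and type, so a parser never has to infer from surrounding prose which string is the business name and which is the city.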