Early AI Project Machine-Readability Report 2026
A bounded Seed 100 snapshot of crawler-readable metadata, structured data, and public profile readiness.
Executive Summary
The Seed 100 machine-readability dataset includes 100 early AI projects. The bounded audit found 88 official websites with usable HTML responses and 12 website-level failures. JSON-LD appeared on 38 pages, while 21 pages exposed SoftwareApplication schema. This report describes observed metadata readiness only. It does not predict external discovery outcomes or business performance.
Key Findings
100 seed projects in the committed dataset.
88 official sites with an ok bounded fetch outcome.
12 website-level failures, recorded as facts, not discarded.
38 pages with at least one JSON-LD block.
21 pages with SoftwareApplication type detected.
Pages with canonical marker present.
Same-origin robots check available.
Same-origin sitemap check available.
Top Issue Codes
| Issue | Count | Meaning |
|---|---|---|
| missing jsonld | 50 | No JSON-LD block was detected in the bounded HTML sample. |
| missing canonical | 34 | No canonical metadata marker was detected. |
| sitemap unavailable | 20 | Same-origin sitemap check did not return an available response. |
| missing open graph | 14 | Open Graph metadata was not detected. |
| low readable text | 10 | Readable text bucket was below the report threshold. |
| http 4xx | 9 | Official website fetch returned a client-side status class. |
| robots unavailable | 8 | Same-origin robots check did not return an available response. |
| missing meta description | 6 | Meta description marker was not detected. |
| redirect limit | 2 | Redirect chain exceeded the bounded redirect policy. |
| dns error | 1 | Host lookup did not resolve during the bounded fetch. |
| missing title | 1 | HTML title marker was not detected. |
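As an illustration of how the marker-level issue codes above could be derived from a static HTML sample, here is a minimal Python sketch. It is not the audit's actual implementation: the class name `MarkerScanner`, the function `issue_codes`, and the exact matching rules (for example, treating any `og:` property as Open Graph metadata) are assumptions made for this example.

```python
from html.parser import HTMLParser


class MarkerScanner(HTMLParser):
    """Hypothetical scanner that records which head markers are present."""

    def __init__(self):
        super().__init__()
        self.found = set()
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = {k: (v or "") for k, v in attrs}
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Assumption: a description only counts if its content is non-empty.
            if a.get("name", "").lower() == "description" and a.get("content", "").strip():
                self.found.add("meta description")
            # Assumption: any og:* property counts as Open Graph metadata.
            if a.get("property", "").lower().startswith("og:"):
                self.found.add("open graph")
        elif tag == "link" and "canonical" in a.get("rel", "").lower().split():
            self.found.add("canonical")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # A title marker is recorded only when the element has visible text.
        if self._in_title and data.strip():
            self.found.add("title")


def issue_codes(html):
    """Return 'missing ...' codes for markers not seen in the HTML sample."""
    scanner = MarkerScanner()
    scanner.feed(html)
    scanner.close()
    expected = ["title", "meta description", "canonical", "open graph"]
    return [f"missing {marker}" for marker in expected if marker not in scanner.found]
```

Called on a page that has a title and a canonical link but nothing else, `issue_codes` would report the meta description and Open Graph codes only.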
Machine-Readability Failure Modes
| Issue | Count | Meaning |
|---|---|---|
| missing JSON-LD | 50 | Structured machine-readable metadata was absent from the captured page. |
| missing canonical | 34 | Canonical metadata was not detected in the bounded sample. |
| unreachable homepage | 10 | The official landing URL did not produce a usable HTML response. |
| timeout | 0 | The request exceeded the fixed five-second limit. |
| redirect limit | 2 | Redirect handling stopped at the configured limit. |
| empty HTML | 0 | The response did not contain usable HTML text. |
| missing sitemap | 20 | Same-origin sitemap discovery was unavailable. |
| missing robots | 8 | Same-origin robots discovery was unavailable. |
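The website-level failure modes in the table can be thought of as the result of a small classification step applied to each fetch outcome. The following Python sketch shows one way this might look. The function `classify_fetch`, its parameters, the check ordering, and the `MAX_REDIRECTS` value are all assumptions; the report states only the five-second timeout, not the redirect limit or the mapping of DNS errors to a failure mode.

```python
from typing import Optional

TIMEOUT_SECONDS = 5.0  # the fixed five-second limit named in the table
MAX_REDIRECTS = 5      # assumption: the report does not state the actual limit


def classify_fetch(status: Optional[int], redirects: int, elapsed: float,
                   body: str, resolved: bool = True) -> str:
    """Map one recorded fetch outcome to a failure-mode label (hypothetical)."""
    if not resolved:
        # Assumption: a failed host lookup counts as an unreachable homepage.
        return "unreachable homepage"
    if elapsed > TIMEOUT_SECONDS:
        return "timeout"
    if redirects > MAX_REDIRECTS:
        return "redirect limit"
    if status is None or status >= 400:
        return "unreachable homepage"
    if not body.strip():
        return "empty HTML"
    return "ok"
```

Keeping the classifier a pure function of recorded facts (status, redirect count, elapsed time, body) means a failure is stored as a labeled row rather than dropped, in line with the report's practice of retaining failures.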
Dataset Methodology
Seed 100 input
The report reads the committed PR33 Seed 100 audit dataset. It does not modify the source data repository.
Bounded audit
For each project, the audit fetches the official website URL, the same-origin robots file, and the same-origin sitemap file, all under fixed limits.
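To make "same-origin" concrete, the sketch below derives the robots and sitemap URLs from an official website URL by keeping only its scheme and host. The function name `same_origin_targets` is hypothetical, and the `/sitemap.xml` path is an assumed convention; the report does not say which sitemap path the audit checks.

```python
from urllib.parse import urlsplit, urlunsplit


def same_origin_targets(official_url):
    """Return the three URLs a bounded audit might fetch for one project."""
    parts = urlsplit(official_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {official_url!r}")
    origin = (parts.scheme, parts.netloc)
    return {
        "homepage": official_url,
        # robots.txt lives at the root of the origin.
        "robots": urlunsplit((*origin, "/robots.txt", "", "")),
        # Assumption: the sitemap is looked up at the conventional root path.
        "sitemap": urlunsplit((*origin, "/sitemap.xml", "", "")),
    }
```

For example, `same_origin_targets("https://example.com/product?ref=1")` yields `https://example.com/robots.txt` and `https://example.com/sitemap.xml`, dropping the path and query of the landing page.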
No JavaScript execution
The audit uses static HTML fetches only. It does not execute scripts or run a browser.
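Because no scripts are executed, only JSON-LD that is already present in the served HTML can be detected. A minimal sketch of such a static check, using only the Python standard library, might look like the following. The names `JsonLdCollector` and `jsonld_summary` are hypothetical, and the choice to skip blocks that fail to parse is an assumption about how malformed data could be handled.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buf = None

    def handle_starttag(self, tag, attrs):
        type_attr = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and type_attr == "application/ld+json":
            self._buf = []

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buf is not None:
            self.blocks.append("".join(self._buf))
            self._buf = None


def jsonld_summary(html):
    """Return (number of JSON-LD blocks, set of @type values found in them)."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    types = set()
    for raw in collector.blocks:
        try:
            stack = [json.loads(raw)]
        except json.JSONDecodeError:
            continue  # assumption: unparseable blocks contribute no types
        while stack:
            node = stack.pop()
            if isinstance(node, list):
                stack.extend(node)
            elif isinstance(node, dict):
                declared = node.get("@type")
                if isinstance(declared, str):
                    types.add(declared)
                elif isinstance(declared, list):
                    types.update(t for t in declared if isinstance(t, str))
                stack.extend(node.values())
    return len(collector.blocks), types
```

A page would count toward the JSON-LD total when the first value is at least 1, and toward the SoftwareApplication total when `"SoftwareApplication"` is in the returned set. A page that injects its JSON-LD with client-side JavaScript would register as missing under this approach.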
No LLM or paid APIs
The dataset is deterministic and does not call model providers, search providers, or paid external services.
Failures retained
Website-level failures remain in the aggregate dataset so the report can show coverage limits honestly.
What Founders Can Do
Run the AI Search Readiness Checker
Use the public checker to inspect crawler-readable metadata on a project landing page.
Submit a structured public profile
Share official project details for editorial review using the public submission flow.
Review observed profiles
Compare published project pages with official public sources and request corrections through the founder path.
Add structured metadata to your own site
Keep title, meta description, canonical, JSON-LD, robots, and sitemap signals easy for crawlers to read.
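As one possible starting point, the Python sketch below builds a JSON-LD script block with the schema.org SoftwareApplication type for embedding in a page's static HTML. The helper name `software_application_jsonld` and all example values are placeholders, and the selection of properties is an assumption rather than a requirement stated in this report.

```python
import json


def software_application_jsonld(name, url, description, category):
    """Render a SoftwareApplication JSON-LD block as an HTML script element."""
    doc = {
        "@context": "https://schema.org",
        "@type": "SoftwareApplication",
        "name": name,
        "url": url,
        "description": description,
        "applicationCategory": category,
    }
    # Escape "</" so a value can never close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(doc, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(software_application_jsonld(
        "Example Project",
        "https://example.com/",
        "A placeholder description of the project.",
        "DeveloperApplication",
    ))
```

Placing the resulting block in the served HTML, rather than injecting it with client-side JavaScript, keeps it visible to crawlers and audits that do not execute scripts.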
88CN Boundary Statement
88CN reports observed public machine-readability signals. It does not promise placement in external systems, third-party inclusion, or downstream discovery outcomes.
88CN does not sell machine-signal placement. If visual placement formats are introduced later, they will remain separate from sitemap, API, MCP, Signal Score, and Source Confidence.