88CN
Founder Intent Signal Report

Early AI Project Machine-Readability Report 2026

A restrained Seed 100 snapshot of crawler-readable metadata, structured data, and public profile readiness.

Dataset date: Jun 18, 2026
Corpus: 100 projects
Audit records: 100

Executive Summary

The Seed 100 machine-readability dataset includes 100 early AI projects. The bounded audit found 88 official websites with usable HTML responses and 12 website-level failures. JSON-LD appeared on 38 pages, while 21 pages exposed SoftwareApplication schema. This report describes observed metadata readiness only. It does not predict external discovery outcomes or business performance.

Key Findings

Metric | Count | Description
--- | --- | ---
Corpus | 100 | Seed projects in the committed dataset.
Usable HTML | 88 | Official sites with an ok bounded fetch outcome.
Website-level failures | 12 | Recorded as facts, not discarded.
JSON-LD present | 38 | Pages with at least one JSON-LD block.
SoftwareApplication schema | 21 | Pages with SoftwareApplication type detected.
Canonical metadata | 54 | Pages with canonical marker present.
Robots available | 87 | Same-origin robots check available.
Sitemap available | 71 | Same-origin sitemap check available.

Top Issue Codes

Issue | Count | Meaning
--- | --- | ---
missing jsonld | 50 | No JSON-LD block was detected in the bounded HTML sample.
missing canonical | 34 | No canonical metadata marker was detected.
sitemap unavailable | 20 | Same-origin sitemap check did not return an available response.
missing open graph | 14 | Open Graph metadata was not detected.
low readable text | 10 | Readable text bucket was below the report threshold.
http 4xx | 9 | Official website fetch returned a client-side status class.
robots unavailable | 8 | Same-origin robots check did not return an available response.
missing meta description | 6 | Meta description marker was not detected.
redirect limit | 2 | Redirect chain exceeded the bounded redirect policy.
dns error | 1 | Host lookup did not resolve during the bounded fetch.
missing title | 1 | HTML title marker was not detected.
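
As a quick consistency check, the figures in Key Findings and Top Issue Codes can be cross-checked with simple arithmetic. The sketch below only uses counts printed in this report; the variable names and the `share` helper are illustrative, not part of the 88CN pipeline. The robots and sitemap pairs are left out because the report does not state which denominator those checks use.

```python
# Counts copied from the Key Findings section.
corpus = 100
usable_html = 88
website_failures = 12
jsonld_present = 38
software_application = 21
canonical_present = 54

# Counts copied from the Top Issue Codes table.
missing_jsonld = 50
missing_canonical = 34
http_4xx, redirect_limit, dns_error = 9, 2, 1

# Every project is either a usable page or a website-level failure.
assert usable_html + website_failures == corpus
# Present/missing pairs add up to the usable pages.
assert jsonld_present + missing_jsonld == usable_html
assert canonical_present + missing_canonical == usable_html
# The fetch-level issue codes add up to the website-level failures.
assert http_4xx + redirect_limit + dns_error == website_failures


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


jsonld_share = share(jsonld_present, usable_html)
schema_share = share(software_application, usable_html)
canonical_share = share(canonical_present, usable_html)
```

Expressed against the 88 usable pages, JSON-LD appears on about 43.2%, SoftwareApplication schema on about 23.9%, and canonical metadata on about 61.4%.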

Machine-Readability Failure Modes

Issue | Count | Meaning
--- | --- | ---
missing JSON-LD | 50 | Structured machine-readable metadata was absent from the captured page.
missing canonical | 34 | Canonical metadata was not detected in the bounded sample.
unreachable homepage | 10 | The official landing URL did not produce a usable HTML response.
timeout | 0 | The request exceeded the fixed five-second limit.
redirect limit | 2 | Redirect handling stopped at the configured limit.
empty HTML | 0 | The response did not contain usable HTML text.
missing sitemap | 20 | Same-origin sitemap discovery was unavailable.
missing robots | 8 | Same-origin robots discovery was unavailable.
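
The failure modes read like a roll-up of the issue codes in the Top Issue Codes table. The sketch below shows one grouping that reproduces the printed counts. The mapping itself is an assumption inferred from the numbers (for example, 9 http 4xx plus 1 dns error gives the 10 unreachable homepages); the report does not document how 88CN actually groups the codes, and `FAILURE_MODE_CODES` and `roll_up` are hypothetical names.

```python
# Issue-code counts copied from the Top Issue Codes table.
issue_counts = {
    "missing jsonld": 50,
    "missing canonical": 34,
    "sitemap unavailable": 20,
    "http 4xx": 9,
    "robots unavailable": 8,
    "redirect limit": 2,
    "dns error": 1,
}

# Hypothetical grouping, inferred from the printed counts only.
FAILURE_MODE_CODES = {
    "missing JSON-LD": ["missing jsonld"],
    "missing canonical": ["missing canonical"],
    "unreachable homepage": ["http 4xx", "dns error"],
    "redirect limit": ["redirect limit"],
    "missing sitemap": ["sitemap unavailable"],
    "missing robots": ["robots unavailable"],
}


def roll_up(counts: dict, grouping: dict) -> dict:
    """Sum issue-code counts into failure modes.

    Summing is only valid when a record carries at most one code per
    group, which holds for mutually exclusive fetch outcomes.
    """
    return {
        mode: sum(counts.get(code, 0) for code in codes)
        for mode, codes in grouping.items()
    }


failure_modes = roll_up(issue_counts, FAILURE_MODE_CODES)
```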

Dataset Methodology

Seed 100 input

The report reads the committed PR33 Seed 100 audit dataset. It does not modify the source data repository.

Bounded audit

Each project uses the official website URL, same-origin robots file, and same-origin sitemap file with fixed limits.
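
A minimal sketch of how the three same-origin targets could be derived from one official website URL. The `/robots.txt` path is the standard location; treating `/sitemap.xml` as the sitemap location is an assumption, since the report does not say how the sitemap is discovered. The function name is hypothetical, and the five-second constant comes from the timeout row in the failure-mode table.

```python
from urllib.parse import urlsplit, urlunsplit

# Fixed request limit mentioned in the report's timeout failure mode.
TIMEOUT_SECONDS = 5


def same_origin_targets(website_url: str) -> dict[str, str]:
    """Return the homepage plus robots and sitemap URLs on the same origin."""
    parts = urlsplit(website_url)
    # Scheme plus host (and port, if any) define the origin.
    origin = (parts.scheme, parts.netloc)
    return {
        "homepage": website_url,
        "robots": urlunsplit((*origin, "/robots.txt", "", "")),
        # Assumed conventional location; not confirmed by the report.
        "sitemap": urlunsplit((*origin, "/sitemap.xml", "", "")),
    }


targets = same_origin_targets("https://example.com/app/landing?ref=1")
```

Keeping the derivation on the same scheme and host means the audit never follows a project to a third-party domain for its robots or sitemap check.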

No JavaScript execution

The audit uses static HTML fetches only. It does not execute scripts or run a browser.
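
To illustrate what a static, script-free check can see, here is a hedged sketch that detects the signals named in this report using only Python's standard-library HTML parser. `SignalParser`, `detect_signals`, and the signal keys are hypothetical names; the actual 88CN detection rules are not published in this report and may differ.

```python
import json
from html.parser import HTMLParser


def _has_type(raw: str, wanted: str) -> bool:
    """Return True if any node in a JSON-LD payload declares the wanted @type."""
    try:
        stack = [json.loads(raw)]
    except json.JSONDecodeError:
        return False
    while stack:
        item = stack.pop()
        if isinstance(item, dict):
            types = item.get("@type")
            if types == wanted or (isinstance(types, list) and wanted in types):
                return True
            stack.extend(item.values())
        elif isinstance(item, list):
            stack.extend(item)
    return False


class SignalParser(HTMLParser):
    """Collect metadata signals from raw HTML without executing scripts."""

    def __init__(self) -> None:
        super().__init__()
        self.signals = {
            "title": False, "meta_description": False, "canonical": False,
            "open_graph": False, "jsonld_blocks": 0,
            "software_application": False,
        }
        self._in_title = False
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        a = {k: (v or "") for k, v in attrs}
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            if a.get("name", "").lower() == "description" and a.get("content", "").strip():
                self.signals["meta_description"] = True
            if a.get("property", "").lower().startswith("og:"):
                self.signals["open_graph"] = True
        elif tag == "link" and "canonical" in a.get("rel", "").lower().split():
            self.signals["canonical"] = True
        elif tag == "script" and a.get("type", "").lower() == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.signals["title"] = True
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.signals["jsonld_blocks"] += 1
            if _has_type("".join(self._buffer), "SoftwareApplication"):
                self.signals["software_application"] = True


def detect_signals(html: str) -> dict:
    parser = SignalParser()
    parser.feed(html)
    parser.close()
    return parser.signals
```

Because nothing is rendered, metadata that a site injects only through client-side JavaScript would be reported as missing by a check like this.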

No LLM or paid APIs

The dataset is deterministic and does not call model providers, search providers, or paid external services.

Failures retained

Website-level failures remain in the aggregate dataset so the report can show coverage limits honestly.

What Founders Can Do

88CN Boundary Statement

88CN reports observed public machine-readability signals. It does not promise placement in external systems, third-party inclusion, or downstream discovery outcomes.

88CN does not sell machine-signal placement. If visual placement formats are introduced later, they will remain separate from sitemap, API, MCP, Signal Score, and Source Confidence.
