[Image: watercolor painting of a formal resume document with sections glowing in warm amber light]
groundwork·4 min read

Your Page's Resume

You let the crawlers in. What they found looks nothing like your website.


The Brief

This article explains how schema markup, FAQ structured data, and llms.txt files help AI search engines find, understand, and cite your website. It covers the practical steps for making your pages readable to crawlers and quotable in AI-generated responses.


What is schema markup?
Schema markup is structured data in your page's HTML that labels key information like author, date, and topic for search engines and AI crawlers. It acts as credentials for your content, helping machines understand what the page is about without reading the full text.
Why does FAQ schema matter for AI citation?
Pages with FAQ schema get cited 41% of the time by AI systems, compared to 15% without it. The question-and-answer format matches how users query AI, making it easy for the system to pull a clean, self-contained response directly from your page.
What is llms.txt?
A Markdown file at your site's root that tells AI crawlers what's worth reading. Unlike robots.txt, which tells crawlers where they can't go, llms.txt provides a structured summary of who you are and what your key pages cover. Most AI platforms don't check for it yet, but it gives you a say in how AI summarizes your site.
What makes a paragraph quotable by AI?
Quotable paragraphs are short, sit right after a heading, and make one clear point with specific numbers or named sources. If a paragraph can be lifted out of your page and dropped into an AI response while still making sense on its own, it's likely to get cited.

I pulled up my own site in a text-only browser last week. Lynx, the kind of thing nobody uses anymore. I wanted to see what a crawler sees.

No watercolors. No warm cream palette. No carefully chosen fonts. Just text, headings, and a lot of silence where the design used to be. That's the version of my site that decides whether AI cites me or ignores me.

In "The Toggle" I wrote about letting the crawlers in. I set my Cloudflare dashboard to Allow. But allowing access and being worth citing are two different things. The bots arrived. Now I had to think about what they found.

The Three-Second Introduction

I looked at my own page source. The content was all there, but it had no introduction. Nothing telling the crawler who wrote this and when, or why it should be trusted. That's the job of schema markup: a block of structured data, usually placed in your page's head, that labels the basics of author, date, topic, and organization. Think of it as the credentials section of the resume.
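To make that concrete, here's a minimal sketch of what those credentials look like: a schema.org Article block in JSON-LD, the format most sites embed inside a `<script type="application/ld+json">` tag. The name, date, and organization below are placeholders, not anything from a real page.

```python
import json

# A minimal Article block using the schema.org vocabulary.
# Every value here is a placeholder -- swap in your page's real details.
article_schema = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Your Page's Resume",
    "author": {"@type": "Person", "name": "Jane Example"},      # placeholder
    "datePublished": "2026-01-15",                              # placeholder
    "publisher": {"@type": "Organization", "name": "Example Site"},
}

# This JSON is what goes inside a <script type="application/ld+json"> tag.
print(json.dumps(article_schema, indent=2))
```

That's the whole trick: the crawler doesn't have to infer who wrote the page, because the page says so in a vocabulary machines already know.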

The format that caught my attention was FAQ schema. You write a question and a clean answer, and the markup packages it so an AI can pull it straight into a response. Pages with FAQ schema get cited 41% of the time. Without it, 15%.1 The shape of the answer is doing the work.
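The FAQ variant follows the same pattern. A sketch, again with placeholder text, of how question-and-answer pairs get packaged as a schema.org FAQPage block:

```python
import json

def faq_schema(pairs):
    """Package (question, answer) pairs as a schema.org FAQPage block."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }

# Placeholder Q&A -- use the questions your page actually answers.
block = faq_schema([
    ("What is llms.txt?",
     "A Markdown file at your site's root that tells AI crawlers "
     "what's worth reading."),
])
print(json.dumps(block, indent=2))
```

Each question becomes a self-contained unit an AI can lift whole, which is exactly the shape the citation numbers reward.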

[Image: a neatly organized portfolio with labeled sections glowing warmly while scattered loose papers fade into soft shadow.] The structure is what the crawler reads first.

What Gets Quoted

The citation doesn't pull your whole page. It pulls a paragraph. Maybe two. The ones that get picked tend to be short, sit right after a heading, and make one clear point. I started noticing the pattern once I knew what to look for.

Think of it as the quotable paragraph. If an AI lifts it out of your page and drops it into a response, does it still make sense on its own? If nothing on your page reads that cleanly, the AI quotes someone else.

What's inside the paragraph matters as much as how it's shaped.2 "Our service is significantly faster" gets passed over. "Reduces processing time by 47%" gets cited. Specific numbers. Named sources. The kind of claims a journalist would trust. Turns out AI has the same instinct.
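The pattern is mechanical enough to sketch as a rough self-check. This is a hypothetical heuristic, not any real ranking signal: flag paragraphs that stay short and contain at least one concrete figure.

```python
import re

def looks_quotable(paragraph: str, max_words: int = 60) -> bool:
    """Rough heuristic: short and specific.

    Flags a paragraph as quotable if it stays under a word budget and
    contains at least one concrete figure (any digit). An illustrative
    sketch of the pattern, not how any AI system actually ranks text.
    """
    short_enough = len(paragraph.split()) <= max_words
    has_number = bool(re.search(r"\d", paragraph))
    return short_enough and has_number

vague = "Our service is significantly faster than the competition."
specific = "Our pipeline reduces processing time by 47% on typical workloads."

print(looks_quotable(vague))     # no concrete number -> False
print(looks_quotable(specific))  # short and specific -> True
```

Run your own paragraphs through something like this and the vague ones stand out quickly.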

[Image: a single paragraph on a page illuminated by warm golden light from above, surrounding text in soft watercolor focus.] The passage the AI decides to quote.

The Cover Letter

There's one more piece, still early but worth knowing about. A file called llms.txt that sits at your site's root next to robots.txt. Where robots.txt tells crawlers where they can't go, llms.txt tells them what's worth reading.3 It's a plain Markdown file. Who you are, your key pages, a short description of each. The cover letter that arrives with the resume.
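For reference, the proposed format (from the llms.txt project) really is just Markdown: an H1 title, a one-line blockquote summary, then sections of links with short descriptions. The site name and URLs below are placeholders.

```markdown
# Example Site

> Essays on making a personal website readable to AI crawlers.

## Key pages

- [The Toggle](https://example.com/the-toggle): Deciding whether to let
  AI crawlers in at all.
- [Your Page's Resume](https://example.com/pages-resume): Schema markup,
  FAQ schema, and quotable paragraphs.
```

A few minutes of writing, and the summary an AI builds of your site starts from your words instead of its guesses.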

Most AI platforms don't check for it yet.4 But the idea makes sense. If you can shape how an AI summarizes you in a sentence, why leave it up to the crawler?

I went back to Lynx after writing this. Same bare page. But now I can see what's missing. The credentials aren't labeled. The quotable paragraphs aren't clean. The cover letter doesn't exist yet.

If any of this feels unfamiliar, try asking an AI to help. Paste your URL into ChatGPT, Claude, or Perplexity and ask:

"Review this page. What schema markup is present? What's missing? If you were deciding whether to cite this page, what would make it easier?"

If you want AI to surface your website, let it tell you what it needs to find there. Your page's resume is what they read first. Might as well let them help you write it.


Footnotes

  1. Safri, A. (2026). "The 2026 Guide to AI Citations: How to Get Cited in ChatGPT, Perplexity, and Claude." LinkedIn.

  2. Dataslayer. (2025). "Generative Engine Optimization: The AI Search Guide." Dataslayer.

  3. Coyle, A. (2025). "GEO and the LLMs.TXT File." Andrew Coyle.

  4. Goodie. (2026). "LLMs.txt & Robots.txt: Optimizing for AI Bots." Goodie.

