

Nadine Wolff
Structured data for AI search
The essentials in brief
Today, structured data plays a key role in deciding whether AI systems like ChatGPT, Perplexity, and Google AI Overviews recognize and cite your brand as a source.
The real competitive edge doesn't come from FAQPage and Product, but from the rarely used types and properties, first and foremost DefinedTerm and sameAs (Wikidata/Wikipedia).
Schema is an amplifier, not a magic switch: The markup must match the visible content.
For years, structured data was almost exclusively a Google topic. Under the umbrella term "markup for rich snippets," Google still maintains its own rules for how structured data on a website is handled. With the rise of ChatGPT, Perplexity, Google AI Overviews, Google AI Mode, and others, structured data has taken on an additional role: it is the infrastructure through which AI systems recognize, categorize, and cite your brand as a source.
The interesting part for anyone working with structured data today: the biggest leverage no longer lies in the classic implementations of FAQPage and Product (which nearly everyone has adopted by now), but in the schema.org types that almost no one uses. That is precisely where a head start can be gained.
From Rich Snippets to Entity Infrastructure
Anyone who does SEO knows structured data as a means to an end: integrate markup to get star ratings and FAQ accordions into Google search. This job still exists and remains important. But the actual shift is happening one level deeper.
AI search engines synthesize answers from multiple sources instead of displaying ten blue links. For a brand to appear in this answer at all, the system must understand: What is this? Which entity? Which facts belong to it? Is the source trustworthy? A cleanly implemented schema markup answers exactly these questions.
The turning point came in March 2025. Within a few days, both major players commented on the role of structured data for their AI systems. Fabrice Canel (Principal Product Manager at Microsoft Bing) confirmed on stage at SMX Munich that schema markup helps Microsoft's LLMs understand web content (source: LinkedIn). Shortly after, Google emphasized at Search Central Live in New York (March 20, 2025) that structured data is valuable for its AI systems (source: Search Engine Roundtable).
With that, the years-long debate over whether AI systems "even use" schema was officially answered, at least for search-driven systems (Bing Copilot, Google AI Overviews, and AI Mode).
The well-known schema.org types: the mandatory basics
Before diving into the exciting types, let's briefly look at the foundation. These belong on every serious page. One could even go so far as to say that these mandatory types are no longer a competitive advantage, as they have become industry standard.
Organization/LocalBusiness: anchors the brand as an entity
Article: with author, publisher, and date as credibility signals
FAQPage: question-answer pairs that LLMs love to use directly as answers
Product/Offer: for e-commerce areas
HowTo and BreadcrumbList: process content and page hierarchy
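As a rough illustration of the Article entry in the list above, the following JSON-LD sketch shows how author, publisher, and date might be declared. Every value here is a placeholder: the example.com URLs, the names, and the date are assumptions for illustration and do not come from a real page.

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Example article headline",
  "author": {
    "@type": "Person",
    "name": "Jane Doe",
    "url": "https://www.example.com/team/jane-doe"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Example GmbH",
    "url": "https://www.example.com"
  },
  "datePublished": "2025-01-01",
  "mainEntityOfPage": "https://www.example.com/blog/example-post"
}
```

The author and publisher are written as nested objects rather than plain strings so that each can carry its own URL and be recognized as an entity.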
The underestimated types. This is where you get the head start
DefinedTerm and DefinedTermSet
This is by far the most underrated markup. If you take away only one type from this article, let it be this one. Hardly any site uses it, but it is incredibly valuable for AI systems. The effort is usually minimal because the glossary content is already on the page anyway.
DefinedTerm turns your glossary into a structured key-value resource: term, synonyms, definition, URL. Instead of parsing flowing text, the AI system gets a clean "this term means exactly this." For any brand with specialized vocabulary (e.g., in B2B, SaaS, niche products), this is a direct lever for definition queries.
An example of usage in JSON-LD
{
  "@context": "https://schema.org",
  "@type": "DefinedTermSet",
  "name": "GEO Glossary",
  "url": "https://www.internetwarriors.de/glossar",
  "hasDefinedTerm": [
    {
      "@type": "DefinedTerm",
      "name": "Generative Engine Optimization",
      "alternateName": "GEO",
      "description": "The optimization of content for visibility in AI search engines such as ChatGPT, Perplexity, and Google AI Overviews.",
      "url": "https://www.internetwarriors.de/glossar/geo",
      "inDefinedTermSet": "https://www.internetwarriors.de/glossar"
    }
  ]
}
The structure has two levels, a container and its entries:
The outer level = the glossary itself (DefinedTermSet)
@context: tells every parser "the vocabulary here is schema.org". It almost always sits right at the top.
@type: "DefinedTermSet": the declaration "This is a collection of technical terms," i.e., a glossary.
name / url: the name and address of this glossary collection; your glossary overview page goes here.
The inner level = the individual entries (hasDefinedTerm)
hasDefinedTerm: the square brackets […] make this a list. All the individual terms live in here: in the example above only one, but you can chain as many as you like (separated by commas).
Each entry in this list is a DefinedTerm with:
@type: "DefinedTerm": "This is a single defined term."
name: the term itself: "Generative Engine Optimization".
alternateName: synonyms or abbreviations, in this case "GEO". This is extremely practical because it covers different search queries.
description: the actual definition of the term. AI often pulls its information directly from this content.
url: the specific detail page or subpage (or an anchor) for this exact term.
inDefinedTermSet: the backlink to the parent glossary (the same URL as the set above). This clearly assigns the entry to the glossary, closing the loop between the two levels.
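To make the chaining of several entries concrete, here is a minimal sketch of a set with two terms. It assumes a hypothetical glossary at example.com where each term is an anchor on a single page rather than a subpage; the terms and definitions are illustrative only.

```json
{
  "@context": "https://schema.org",
  "@type": "DefinedTermSet",
  "name": "Example Glossary",
  "url": "https://www.example.com/glossary",
  "hasDefinedTerm": [
    {
      "@type": "DefinedTerm",
      "name": "Structured data",
      "alternateName": ["Schema markup", "Semantic markup"],
      "description": "Machine-readable annotations that describe the content of a page using a shared vocabulary such as schema.org.",
      "url": "https://www.example.com/glossary#structured-data",
      "inDefinedTermSet": "https://www.example.com/glossary"
    },
    {
      "@type": "DefinedTerm",
      "name": "JSON-LD",
      "alternateName": "JavaScript Object Notation for Linked Data",
      "description": "A JSON-based format for expressing linked data, commonly embedded in a page inside a script element.",
      "url": "https://www.example.com/glossary#json-ld",
      "inDefinedTermSet": "https://www.example.com/glossary"
    }
  ]
}
```

Note that alternateName accepts either a single string or a list, which is useful when a term has more than one synonym.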
sameAs – the inconspicuous property with the biggest impact
Technically speaking, sameAs is not a schema.org type in its own right but a property, and one that is neglected almost everywhere. Most implementations just link to LinkedIn, for example, and call it a day. The real added value lies elsewhere: Wikidata and Wikipedia.
Wikidata serves as a canonical knowledge registry that systems like Google, ChatGPT, Claude, and Perplexity draw on. If you anchor your entity there, you tap directly into a source from which these systems get their knowledge of the world. It is also the most verifiable step you can take, because it connects your entity directly to the Knowledge Graph instead of leaving the link to vague LLM assumptions.
An example of usage in JSON-LD
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "internetwarriors GmbH",
  "url": "https://www.internetwarriors.de",
  "sameAs": [
    "https://www.wikidata.org/wiki/Q...",
    "https://de.wikipedia.org/wiki/...",
    "https://www.linkedin.com/company/internetwarriors",
    "https://www.crunchbase.com/organization/..."
  ]
}
Dataset - If you have your own data, show it as data
Do you have your own studies, benchmarks, market figures, or evaluations? Then signal with Dataset that this is original data, not recycled facts. AI systems prefer primary sources because they lower the risk of hallucination. This is exactly how you stand out from the crowd of secondary content sites.
Info and implementation examples at: https://schema.org/Dataset
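For orientation, a Dataset declaration might look roughly like the sketch below. All names, URLs, and the license choice are hypothetical placeholders, and the description should be replaced with a real summary of what was measured and how.

```json
{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "name": "Example Benchmark Study",
  "description": "Placeholder description of an original survey, including what was measured, how, and over which period.",
  "url": "https://www.example.com/studies/example-benchmark",
  "creator": {
    "@type": "Organization",
    "name": "Example GmbH",
    "url": "https://www.example.com"
  },
  "license": "https://creativecommons.org/licenses/by/4.0/",
  "distribution": [
    {
      "@type": "DataDownload",
      "encodingFormat": "text/csv",
      "contentUrl": "https://www.example.com/studies/example-benchmark.csv"
    }
  ]
}
```

The creator block is what ties the data back to your brand as the primary source; the distribution entry is optional and only makes sense if the raw data is actually downloadable.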
ItemList and ClaimReview – structure for unambiguous statements
With ItemList, you make rankings, comparisons, and enumerations machine-readable, for instance for "best X for Y" articles that users search for before making a purchasing decision. Instead of having to extract a list from continuous text, the search engine gets the exact order served on a silver platter.
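A ranking of this kind could be marked up roughly as follows. This is a minimal sketch with placeholder product names and example.com URLs; the position values carry the order that would otherwise have to be inferred from the text.

```json
{
  "@context": "https://schema.org",
  "@type": "ItemList",
  "name": "Example ranking: best X for Y",
  "numberOfItems": 3,
  "itemListElement": [
    {
      "@type": "ListItem",
      "position": 1,
      "name": "Example Product A",
      "url": "https://www.example.com/best-x-for-y#product-a"
    },
    {
      "@type": "ListItem",
      "position": 2,
      "name": "Example Product B",
      "url": "https://www.example.com/best-x-for-y#product-b"
    },
    {
      "@type": "ListItem",
      "position": 3,
      "name": "Example Product C",
      "url": "https://www.example.com/best-x-for-y#product-c"
    }
  ]
}
```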
ClaimReview identifies individual, verified statements, originally intended for fact-checking. Google has scaled back its functionality by now, so don't expect miracles. But if you want to clearly indicate what a statement is based on, this is still a solid choice.
Info and implementation examples at: https://schema.org/ItemList and https://schema.org/ClaimReview
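A ClaimReview entry might be sketched as shown here. The reviewed claim is chosen purely for illustration, the URLs and organization are placeholders, and the 1-to-5 rating scale is an assumption; the textual verdict lives in alternateName.

```json
{
  "@context": "https://schema.org",
  "@type": "ClaimReview",
  "url": "https://www.example.com/fact-checks/schema-guarantee",
  "claimReviewed": "Schema markup alone guarantees inclusion in AI Overviews.",
  "itemReviewed": {
    "@type": "Claim",
    "appearance": {
      "@type": "CreativeWork",
      "url": "https://www.example.com/source-of-the-claim"
    }
  },
  "author": {
    "@type": "Organization",
    "name": "Example GmbH",
    "url": "https://www.example.com"
  },
  "reviewRating": {
    "@type": "Rating",
    "ratingValue": 1,
    "bestRating": 5,
    "worstRating": 1,
    "alternateName": "False"
  }
}
```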
Achieving the greatest impact: combine types instead of using them individually
The biggest mistake is relying on a single "magical" type. Analyses consistently point in one direction: it's the combination that works. In practice, a stacked approach of Article + FAQPage + BreadcrumbList + DefinedTerm + HowTo clearly beats pages with only one schema type. But even here, we must remain realistic: more isn't always better.
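One possible way to stack several types on a single page is a shared @graph in which the nodes reference each other by @id. The sketch below is an assumption about how such a combination could look, not a tested recipe: all names and URLs are placeholders, HowTo is left out for brevity, and an Organization node is added so the Article has a publisher to point to.

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://www.example.com/#organization",
      "name": "Example GmbH",
      "url": "https://www.example.com"
    },
    {
      "@type": "Article",
      "@id": "https://www.example.com/blog/example-post#article",
      "headline": "Example article headline",
      "publisher": { "@id": "https://www.example.com/#organization" },
      "about": { "@id": "https://www.example.com/glossary#example-term" }
    },
    {
      "@type": "BreadcrumbList",
      "itemListElement": [
        { "@type": "ListItem", "position": 1, "name": "Blog", "item": "https://www.example.com/blog" },
        { "@type": "ListItem", "position": 2, "name": "Example article headline", "item": "https://www.example.com/blog/example-post" }
      ]
    },
    {
      "@type": "FAQPage",
      "mainEntity": [
        {
          "@type": "Question",
          "name": "Example question?",
          "acceptedAnswer": {
            "@type": "Answer",
            "text": "Example answer that also appears in the visible page content."
          }
        }
      ]
    },
    {
      "@type": "DefinedTerm",
      "@id": "https://www.example.com/glossary#example-term",
      "name": "Example term",
      "description": "Example definition that matches the visible glossary text."
    }
  ]
}
```

The @id references are the design choice that matters here: they let the Article name its publisher and its topic as entities instead of repeating them as loose strings.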
Let's be honest: schema is an amplifier, not a magic switch
A word of perspective, as the market is currently overflowing with golden promises. Much of what circulates as "340% more AI citations" statistics is not independently verified and often comes from sources that sell exactly this service. Google itself clarifies that schema alone does not guarantee inclusion in AI Overviews.
And there is an important technical caveat: tests show that LLMs sometimes simply read JSON-LD as additional text on the page rather than necessarily as a parsed structure.
In plain English, this means a good part of the effect does not come from the schema label, but from the fact that structured data forces you to organize your facts cleanly, unambiguously, and in a machine-readable way. The label helps search-based systems like Bing or Google, while clean content helps everyone.
This is not a weakness of the strategy; quite the opposite. It just means that markup without clean, matching page content is useless. The two must fit together.
Unsure if your structured data is ready for AI search, or if your brand even appears in ChatGPT, Perplexity, and Google AI Overviews? This is exactly where we come in. The internetwarriors will audit your existing schema markup, anchor your brand as an entity (think Wikidata), and show you the levers that will make the biggest difference for you. Schedule your free initial consultation now.
FAQ
Which schema type brings the most benefit for GEO?
The most underestimated and at the same time most verifiable lever is sameAs with links to Wikidata and Wikipedia, closely followed by DefinedTerm for specialized vocabulary. The greatest overall effect comes from combining several types.
Is JSON-LD enough, or do I need Microdata?
JSON-LD is the format Google recommends and the one most major platforms work with. Microdata and RDFa are still supported, but JSON-LD is the better choice for new implementations.
Does schema guarantee visibility in AI answers?
No. Schema is an amplifier, not a switch. It makes your brand and facts unambiguous and clear. Inclusion also depends on content quality, authority, and the alignment of your markup with the actual page content.
How do I check if my markup is correct?
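Use Google's Rich Results Test and the Schema.org Validator. Both will show you errors and warnings. Keep in mind that the Rich Results Test only evaluates types Google supports for rich results, so types such as DefinedTerm are best checked with the Schema.org Validator. Invalid markup is of no use, so this check should be part of your routine before every launch.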

As a long-time expert in SEO (and web analytics), Nadine Wolff has been working with internetwarriors since 2015. She leads the SEO & Web Analytics team and is passionate about all the (sometimes quirky) innovations from Google and the other major search engines. In the SEO field, Nadine has published articles in Website Boosting and looks forward to professional workshops and sustainable organic exchanges.
