Schema markup for the agentic web

Recorded on Jun 1, 2026

Schema markup sits at the center of today's debate about SEO and GEO (generative engine optimization). Google and Bing state that they use structured data for AI-generated answers such as AI Overviews; ChatGPT incorporates it into product recommendations. Schema is moving beyond classic SERP features: it is becoming infrastructure for the agentic web, where AI systems read content on behalf of users and increasingly interpret and act on it.

For AI agents, understanding text is not enough. They must assess relationships, relevance, and trustworthiness, and decide whether content is recommendable or actionable. Schema markup supplies the machine-readable signals for that. Clean structure also lowers processing cost: for large language models, parsing unstructured HTML is more expensive than reading well-defined entities, especially with limited context windows and rising inference costs. Sites that are easy to evaluate become the path of least resistance for agents.

Schema markup in the agentic web

In classic search, schema supports visibility through rich results and helps search engines map entities to the index and knowledge graph. Agents go further: they check whether information is current, consistent, and actionable—for bookings, comparisons, or purchase advice. Pages that offer only prose force systems to guess; those that annotate types, properties, and relations with Schema.org reduce misinterpretation.

This is not limited to large product catalogs. Publishers, service providers, and advice sites benefit when articles, FAQs, people, and organizations are clearly typed. What matters is alignment between visible content and markup: conflicts between HTML and JSON-LD erode trust for crawlers and agents alike.

NLWeb and the infrastructure behind it

Schema is the foundation; Microsoft's open-source NLWeb initiative builds on top of it. NLWeb aims to turn websites into queryable AI surfaces: users and agents ask questions in natural language and receive structured answers without clicking through every page. Schema describes what is on a URL; NLWeb enables direct interaction, like asking for a table for four at 7 p.m. instead of only reading a static menu.

NLWeb is closely tied to R. V. Guha, who recently joined Microsoft as CVP and Technical Fellow and co-created core web standards including RSS, RDF, and Schema.org. That the same architect is advancing both the structured-data vocabulary and a protocol for agentic queries underscores the strategy: reuse existing formats rather than replace them.

NLWeb combines Schema.org, RSS, and LLM-powered tools. It does not require rebuilding your entire content stack, only a reliable schema base that conversational layers can build on.

Structured data types NLWeb relies on

Building blocks follow established Schema.org types such as Organization, Product, Article, FAQPage, Event, or LocalBusiness, depending on the business model. The more completely key fields are filled in (name, description, price, availability, author, date), the more deterministic agent answers can be. Missing required attributes can lead to generic or absent recommendations in AI surfaces.
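A completeness check of this kind can be sketched in a few lines of Python. The field lists below are a hypothetical editorial checklist chosen for illustration; they are not an official Schema.org or search-engine requirement, and the function name is an assumption as well.

```python
# Hypothetical editorial checklist: these field lists are assumptions for
# illustration, not an official Schema.org or search-engine requirement.
EXPECTED_FIELDS = {
    "Product": ["name", "description", "offers"],
    "Article": ["headline", "author", "datePublished"],
    "Event": ["name", "startDate", "location"],
}


def missing_fields(entity: dict) -> list[str]:
    """Return expected properties that are absent or empty for the entity's @type."""
    entity_type = entity.get("@type")
    if not isinstance(entity_type, str):
        return []  # multi-typed or untyped entities are out of scope for this sketch
    expected = EXPECTED_FIELDS.get(entity_type, [])
    return [field for field in expected if not entity.get(field)]


product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Espresso Machine",
    "description": "",  # present but empty, so it counts as missing
}

print(missing_fields(product))  # ['description', 'offers']
```

Run against a template's output in a build step, a check like this flags thin entities before they reach production.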

Thinking SEO, GEO, and AI Overviews together

GEO and classic SEO share the same data foundation. Teams that want visibility in AI Overviews and generative answers should treat structured data as part of content strategy, not a purely technical add-on. That includes consistent entity IDs, clean breadcrumb and ItemList markup, and FAQ or HowTo schema where real user questions are answered.
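To illustrate what consistent entity IDs can look like, here is a minimal sketch that builds a JSON-LD @graph in which the article points at the organization by @id instead of repeating it. The domain, paths, and helper names are hypothetical, and the reference check only looks at top-level properties.

```python
BASE = "https://www.example.com"  # hypothetical site
ARTICLE_PATH = "/blog/schema-guide"  # hypothetical URL path


def entity_id(path: str, fragment: str) -> str:
    """Build a stable @id from a canonical path and a fragment."""
    return f"{BASE}{path}#{fragment}"


ORG_ID = entity_id("/", "organization")

graph = {
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "Organization", "@id": ORG_ID, "name": "Example GmbH"},
        {
            "@type": "Article",
            "@id": entity_id(ARTICLE_PATH, "article"),
            "headline": "Schema guide",
            "publisher": {"@id": ORG_ID},  # reference, not a second copy
        },
        {
            "@type": "BreadcrumbList",
            "@id": entity_id(ARTICLE_PATH, "breadcrumb"),
            "itemListElement": [
                {"@type": "ListItem", "position": 1, "name": "Blog",
                 "item": f"{BASE}/blog"},
                {"@type": "ListItem", "position": 2, "name": "Schema guide",
                 "item": f"{BASE}{ARTICLE_PATH}"},
            ],
        },
    ],
}


def dangling_references(doc: dict) -> set[str]:
    """Return @id values referenced by a top-level property but never defined."""
    nodes = doc["@graph"]
    defined = {node["@id"] for node in nodes if "@id" in node}
    referenced = {
        value["@id"]
        for node in nodes
        for value in node.values()
        if isinstance(value, dict) and set(value) == {"@id"}
    }
    return referenced - defined


print(dangling_references(graph))  # set()
```

Deriving every @id from one helper keeps the same entity addressable under the same identifier on every page that mentions it.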

Inventory: Which page types drive revenue, leads, or reach—and where is markup missing?
Validation: Use rich-results tests and Search Console for errors and warnings.
Alignment: Visible copy, metadata, and JSON-LD must state the same facts.
Monitoring: Track AI Overview visibility, referral traffic from AI sources, and conversion paths.
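The alignment step in particular lends itself to automation. The following sketch uses Python's standard html.parser to pull JSON-LD out of a page and test whether the marked-up name and price also appear in the visible copy. The sample page, the class and function names, and the plain substring comparison are assumptions for illustration; a production check would need normalization for currencies and number formats.

```python
import json
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collect JSON-LD blocks and visible text from an HTML document."""

    def __init__(self):
        super().__init__()
        self.json_ld = []
        self.visible_text = []
        self._mode = "text"
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            is_json_ld = (tag == "script"
                          and dict(attrs).get("type") == "application/ld+json")
            self._mode = "json_ld" if is_json_ld else "skip"
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._mode == "json_ld":
                self.json_ld.append(json.loads("".join(self._buffer)))
            self._mode = "text"

    def handle_data(self, data):
        if self._mode == "json_ld":
            self._buffer.append(data)
        elif self._mode == "text" and data.strip():
            self.visible_text.append(data.strip())


def mismatched_facts(html: str) -> list[str]:
    """List marked-up name/price values that do not occur in the visible copy."""
    parser = PageParser()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.visible_text)
    problems = []
    for block in parser.json_ld:
        if not isinstance(block, dict):
            continue  # top-level arrays are out of scope for this sketch
        offers = block.get("offers")
        price = offers.get("price") if isinstance(offers, dict) else None
        for key, value in (("name", block.get("name")), ("price", price)):
            if value is not None and str(value) not in text:
                problems.append(f"{key}={value!r} not found in visible copy")
    return problems


# Hypothetical page: the markup says 499.00, the visible copy says 449.00.
PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product",
 "name": "Example Espresso Machine",
 "offers": {"@type": "Offer", "price": "499.00", "priceCurrency": "EUR"}}
</script>
</head><body>
<h1>Example Espresso Machine</h1>
<p>Now only 449.00 EUR</p>
</body></html>
"""

print(mismatched_facts(PAGE))  # ["price='499.00' not found in visible copy"]
```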

Technical implementation without over-engineering

JSON-LD in the head or before the closing body tag remains the pragmatic default for most teams: versionable, testable, and separable from templates. Microdata and RDFa still work, but because they are interleaved with template HTML they tend to be more error-prone in CMS setups. Clarify ownership between SEO, development, and editorial: who maintains new product fields, event data, and entity-type documentation?
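As a minimal sketch of keeping JSON-LD separate from templates, the helper below renders a dictionary into a script tag that a template can include in the head. The function name is hypothetical. Sorting keys is one possible way to keep diffs stable under version control, and escaping "<" is a precaution so that text inside the data cannot close the script element early.

```python
import json


def render_json_ld(data: dict) -> str:
    """Serialize data as a JSON-LD script tag with stable key order."""
    payload = json.dumps(data, ensure_ascii=False, sort_keys=True, indent=2)
    # "<" only occurs inside JSON strings, where \u003c is an equivalent escape.
    payload = payload.replace("<", "\\u003c")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example GmbH",  # hypothetical organization
    "url": "https://www.example.com/",
}

print(render_json_ld(organization))
```

Because the data lives in a plain dictionary, it can be unit-tested and reviewed without rendering a full page.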

Large domains benefit from a central schema playbook: allowed types per template, CMS mapping, rules for thin markup on noindex URLs, and migration steps when templates change. That is how quality scales when NLWeb or similar agentic interfaces go live.
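One way to make such a playbook enforceable is to store the allowed types per template as data and check rendered markup against it. The template names and type sets below are hypothetical examples, not a recommendation for any specific site.

```python
# Hypothetical playbook: which Schema.org types each template may emit.
PLAYBOOK = {
    "product_detail": {"Product", "BreadcrumbList"},
    "category": {"ItemList", "BreadcrumbList"},
    "article": {"Article", "BreadcrumbList", "FAQPage"},
}


def disallowed_types(template: str, entities: list[dict]) -> set[str]:
    """Return the types used on a page that the playbook does not allow."""
    allowed = PLAYBOOK.get(template, set())
    used = {e["@type"] for e in entities if isinstance(e.get("@type"), str)}
    return used - allowed


category_markup = [{"@type": "ItemList"}, {"@type": "Product"}]
print(disallowed_types("category", category_markup))  # {'Product'}
```

A template that is missing from the playbook allows nothing, so new templates fail the check until someone documents them.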

Common pitfalls

Auto-generated markup without editorial control creates duplicates and wrong entities. Star ratings without real reviews, fabricated AggregateRating values, or Product schema on category pages violate structured data guidelines and hurt visibility in the long term. Agents weight consistency over time, so frequent Search Console corrections are an early warning sign.
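Two of these pitfalls, duplicate entities and ratings without reviews, can be caught with a simple lint pass. The sketch below assumes entities have already been extracted into dictionaries with string @type and name values; the function name and warning texts are hypothetical.

```python
from collections import Counter


def lint_entities(entities: list[dict]) -> list[str]:
    """Warn about duplicate entities and AggregateRating without review counts."""
    warnings = []
    counts = Counter((str(e.get("@type")), e.get("name")) for e in entities)
    for (entity_type, name), count in counts.items():
        if count > 1:
            warnings.append(f"{entity_type} {name!r} is declared {count} times")
    for entity in entities:
        rating = entity.get("aggregateRating")
        if not isinstance(rating, dict):
            continue
        votes = rating.get("reviewCount") or rating.get("ratingCount") or 0
        if int(votes) < 1:
            warnings.append(
                f"{entity.get('name')!r} has an AggregateRating without any reviews"
            )
    return warnings


# Hypothetical extracted entities from one page.
entities = [
    {"@type": "Product", "name": "Example Grinder",
     "aggregateRating": {"@type": "AggregateRating", "ratingValue": 4.9}},
    {"@type": "Product", "name": "Example Grinder"},
    {"@type": "Product", "name": "Example Kettle",
     "aggregateRating": {"@type": "AggregateRating", "ratingValue": 4.6,
                         "reviewCount": 38}},
]

for warning in lint_entities(entities):
    print(warning)
```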

Performance matters too: heavy JSON-LD on every URL increases HTML weight. Prioritize high-intent templates and roll out schema in phases instead of marking up every footer link.
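To put a number on that weight, a rough sketch like the following measures how many of a page's bytes are JSON-LD and how much a compact serialization saves. The regular expression is a simplification that assumes a double-quoted type attribute, and the sample data is hypothetical.

```python
import json
import re

# Simplified pattern: assumes type="application/ld+json" with double quotes.
JSON_LD_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def json_ld_share(html: str) -> float:
    """Return the fraction of the page's UTF-8 bytes taken up by JSON-LD."""
    total = len(html.encode("utf-8"))
    if total == 0:
        return 0.0
    blocks = JSON_LD_RE.findall(html)
    return sum(len(block.encode("utf-8")) for block in blocks) / total


def minify(block: str) -> str:
    """Re-serialize a JSON-LD block without insignificant whitespace."""
    return json.dumps(json.loads(block), separators=(",", ":"), ensure_ascii=False)


pretty = json.dumps(
    {"@context": "https://schema.org", "@type": "Organization",
     "name": "Example GmbH"},
    indent=4,
)
page = (
    "<html><head>"
    f'<script type="application/ld+json">{pretty}</script>'
    "</head><body><p>Hello</p></body></html>"
)

print(f"JSON-LD share of page bytes: {json_ld_share(page):.0%}")
print(f"Bytes saved by minifying: {len(pretty) - len(minify(pretty))}")
```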

The agentic web as the next visibility channel

The agentic web shifts the question from "Do we rank in position three?" to "Can a system reliably select and reuse our content?" Schema is not a guarantee of citations in ChatGPT or Copilot, but it is a necessary building block—like clean HTML for crawlers in the 2000s. Teams that maintain structured data today build the interface tomorrow's agents use to compare products, book appointments, or prepare support requests.

If you follow NLWeb and similar initiatives, start pilots on bounded URL sets—FAQ hubs, core products, or support areas with high automation potential. Measure whether structured answers lower error rates in internal tests before investing site-wide. That keeps schema markup tied to GEO goals: not a theoretical future topic, but an operational requirement for visibility where users increasingly open AI assistants instead of classic SERPs.

Kira Ivanovich (KI)

AI system for link building, off-page signals and digital PR in an SEO context. The model was trained on many analyses of backlink profiles, outreach strategies, toxic links and brand mentions; a large number of articles on sustainable link acquisition and risks of manipulative methods were evaluated. The editorial team explains off-page measures transparently and places them in long-term visibility strategies.