BiModal Design

A design framework for building interfaces that work across the full AI agent capability spectrum — from HTTP retrievers to protocol-native agents.

Overview

BiModal Design is a design framework for building interfaces that remain functional and discoverable across the full spectrum of AI agent capabilities — from simple HTTP crawlers to vision agents to protocol-native AI systems.

The framework centers on two concepts:

Agent Capability Spectrum: A six-level taxonomy of agent types, replacing the outdated binary "human vs. agent" model
Defense in Depth: Five architectural layers ensuring graceful degradation across every agent type

As AI agents become primary users of web interfaces — for search, automation, commerce, and discovery — designing for agent accessibility is no longer optional. BiModal Design provides the principles, patterns, and tools to get there.

The Problem

Interfaces today face a spectrum of AI consumers, not a single one:

Agent Level	Example	What They See
Level 0 — HTTP Retrievers	curl, web scrapers	Raw HTML only
Level 1 — LLM & Agentic Browsers	Perplexity Comet	Parsed HTML, no JS
Level 2 — Browser Automation	Playwright, Agentic Chrome	Full rendered DOM
Level 3 — Vision & Computer-Use Agents	Claude Computer Use	Screenshots & AOM
Level 4 — Tool-Use Agents	OpenAI function calling	API responses
Level 5 — Protocol-Native	MCP-connected agents	Protocol data

A CSR-only app with <div id="root"></div> is invisible to Levels 0-1, fragile for Levels 2-3, and unreachable for Levels 4-5 without an API. Most interfaces fail at multiple levels simultaneously.

The Agent Capability Spectrum

BiModal Design v3.0 replaces the binary "human vs. agent" model with a graduated spectrum:

Level 0: HTTP Retrievers               → See only raw HTML (FR-1 critical)
Level 1: LLM & Agentic Browsers        → Parse HTML, navigate on user's behalf
Level 2: Browser Automation            → Execute JS, agentic commerce workflows
Level 3: Vision & Computer-Use Agents  → See rendered pages, query OS AOM, click UI elements
Level 4: Tool-Use Agents               → Call APIs directly via function calling
Level 5: Protocol-Native               → MCP, A2A, NLWeb (rich agent protocols)

A single product page might be crawled by an HTTP retriever (Level 0), read by Perplexity (Level 1), automated by Playwright (Level 2), navigated by Claude Computer Use (Level 3), queried via API (Level 4), and accessed through MCP (Level 5) — all simultaneously.

Defense in Depth

Five architectural layers ensure every agent type is served:

Layer 5: Agent Protocols       (MCP, A2A, NLWeb)           → Level 5
Layer 4: API Surface           (REST, GraphQL, OpenAPI)    → Level 4-5
Layer 3: Structured Data       (schema.org, JSON-LD)       → Level 1-3
Layer 2: Semantic Structure    (HTML5, ARIA, headings)     → Level 1-3
Layer 1: Content Accessibility (FR-1: SSR/SSG)             → Level 0-1

Each layer serves a different segment of the spectrum. Together, they ensure graceful degradation — if an agent can't use Layer 5, it falls back to Layer 4, then Layer 3, and so on.
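The fallback order can be sketched in a few lines of JavaScript. This is an illustrative reading of the diagram above, not code from the repository: `ENTRY_LAYER` and `fallbackChain` are hypothetical names, and the sketch assumes an agent starts at the highest layer listed for its level and then steps down through whatever lower layers the site implements.

```javascript
// Highest layer each agent level can use, read off the layer diagram.
// Assumption: lower layers remain usable as fallbacks.
const ENTRY_LAYER = { 0: 1, 1: 3, 2: 3, 3: 3, 4: 4, 5: 5 };

// Hypothetical helper: the layers an agent would try, richest first,
// limited to the layers the site has actually implemented.
function fallbackChain(agentLevel, implementedLayers) {
  const entry = ENTRY_LAYER[agentLevel];
  if (entry === undefined) {
    throw new RangeError(`Unknown agent level: ${agentLevel}`);
  }
  return [...new Set(implementedLayers)]
    .filter((layer) => layer <= entry)
    .sort((a, b) => b - a);
}

console.log(fallbackChain(5, [1, 2, 3, 4, 5])); // → [5, 4, 3, 2, 1]
console.log(fallbackChain(5, [1, 2, 3]));       // → [3, 2, 1]
console.log(fallbackChain(0, [1, 2, 3, 4, 5])); // → [1]
```

A Level 0 retriever only ever reaches Layer 1, which is why FR-1 is treated as the floor.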

Key Research Findings

Research and benchmarks indicate that agent task success improves as BiModal Design layers are added:

HTTP Retrievers: 12-20% baseline success on conventional CSR sites, rising to 42-65% with Layer 1 compliance and 60-75% with full Layer 1-3 implementation.
Browser Automation agents: 35-50% baseline success on conventional UIs, rising to 55-72% with semantic structure (Layer 2) and 75-88% with structured data (Layer 3).
BrowseComp and VisualWebArena results indicate that pure visual reasoning is brittle; well-structured Layer 2 and Layer 3 markup significantly improves agent reliability.

Quick Start

1. Test FR-1 Compliance (Layer 1)

# Check if your site exposes content in the initial HTML response
curl -s https://your-site.com | grep -E '<(main|nav|h1|article)'

If this prints lines containing semantic HTML with content, you pass Layer 1. If it prints nothing (for example, because the body is only <div id="root"></div>), your content is invisible to Level 0-1 agents.
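The same idea can be expressed in Node.js when a shell pipeline is not convenient. The sketch below is not the logic of fr1-validator.js; `checkInitialHtml` and `checkUrl` are hypothetical names, and the regex-based parsing is a deliberate simplification that assumes reasonably well-formed HTML.

```javascript
// Hypothetical FR-1 style check on a raw HTML string (no JS execution).
function checkInitialHtml(html) {
  // Only look inside <body>; fall back to the whole document if absent.
  const bodyMatch = /<body\b[^>]*>([\s\S]*?)<\/body>/i.exec(html);
  const body = bodyMatch ? bodyMatch[1] : html;

  // Drop scripts, styles, and comments so only delivered markup remains.
  const markup = body
    .replace(/<script\b[\s\S]*?<\/script>/gi, '')
    .replace(/<style\b[\s\S]*?<\/style>/gi, '')
    .replace(/<!--[\s\S]*?-->/g, '');

  // Same tags the curl | grep check looks for.
  const semanticTags = ['main', 'nav', 'h1', 'article'].filter((tag) =>
    new RegExp(`<${tag}[\\s>]`, 'i').test(markup));

  // Text a Level 0 agent could read: strip tags, collapse whitespace.
  const text = markup.replace(/<[^>]+>/g, ' ').replace(/\s+/g, ' ').trim();

  return {
    semanticTags,
    textLength: text.length,
    pass: semanticTags.length > 0 && text.length > 0,
  };
}

// Fetch a page without rendering it (global fetch requires Node 18+).
async function checkUrl(url) {
  const response = await fetch(url);
  return checkInitialHtml(await response.text());
}
```

For example, `checkUrl('https://your-site.com').then(console.log)` reports which semantic tags were found and whether any readable text arrived in the initial response.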

2. Run BiModal Design Validation

# Quick pass/fail FR-1 check
node tools/validators/fr1-validator.js https://your-site.com

# Comprehensive audit (structure, semantics, navigation, forms, content meaning, agent features)
node tools/validators/fr1-checker.js https://your-site.com --verbose

3. Implement Core Patterns

<!-- Layer 1: Content in initial HTML (SSR/SSG) -->
<!-- Layer 2: Semantic structure with ARIA -->
<nav aria-label="Main navigation">
  <a href="/products" aria-label="Browse all products">Products</a>
</nav>
<main aria-label="Product catalog">
  <h1>Wireless Headphones</h1>
</main>

<!-- Layer 3: Structured data with schema.org -->
<script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Wireless Headphones",
    "offers": {
      "@type": "Offer",
      "price": "99.99",
      "priceCurrency": "USD",
      "availability": "https://schema.org/InStock"
    }
  }
</script>

<!-- Layer 4: API documented via OpenAPI -->
<!-- Layer 5: MCP server discovery for protocol-native agents -->
<link rel="alternate" type="application/mcp+json" href="/mcp-server" />
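When pages are server-rendered, the Layer 3 block is usually generated from product data rather than written by hand. The following is a minimal sketch under that assumption; `productJsonLdTag` and its parameter names are hypothetical and not part of the BiModal Design tools.

```javascript
// Hypothetical helper: render a schema.org Product as a JSON-LD script tag.
function productJsonLdTag({ name, price, priceCurrency, inStock }) {
  const data = {
    '@context': 'https://schema.org',
    '@type': 'Product',
    name,
    offers: {
      '@type': 'Offer',
      price: String(price),
      priceCurrency,
      availability: inStock
        ? 'https://schema.org/InStock'
        : 'https://schema.org/OutOfStock',
    },
  };
  // Escape "<" so a value such as "</script>" cannot close the element early.
  // JSON parsers decode \u003c back to "<".
  const json = JSON.stringify(data, null, 2).replace(/</g, '\\u003c');
  return `<script type="application/ld+json">\n${json}\n</script>`;
}

console.log(productJsonLdTag({
  name: 'Wireless Headphones',
  price: '99.99',
  priceCurrency: 'USD',
  inStock: true,
}));
```

Emitting the tag during SSR/SSG keeps Layer 3 data inside the initial payload, so it also satisfies FR-1.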

Key Concepts

FR-1: Initial Payload Accessibility

The foundational requirement: critical content must exist in the initial HTTP response. This is Layer 1 of defense in depth — the floor, not the ceiling.

Two validation tools are included:

fr1-validator.js — lightweight pass/fail check for FR-1 compliance (text content, semantic structure, SPA shell detection). Ideal for CI gates and quick checks.
fr1-checker.js — comprehensive audit covering semantic content, navigation accessibility, form labels, heading hierarchy, ARIA landmarks, image alt text, and agent-specific features. Use --verbose for detailed scoring across six categories.

Standards-First with Agent Attributes

Strategic Update (AOM Integration): For Web Components, BiModal Design strongly recommends using the ElementInternals API to define default roles and ARIA states on the custom element itself, so they are exposed through the Accessibility Object Model (AOM) without extra markup. This removes the need for custom attributes on Custom Elements and improves discoverability for Level 2 (Browser Automation) and Level 3 (Vision & Computer-Use) agents, in line with VisualWebArena findings. See our AOM Integration Proposal.
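A minimal sketch of that pattern follows. The element name `add-to-cart-button` and the `defineAddToCartButton` wrapper are hypothetical; the wrapper takes the global object as a parameter only so the class can be exercised outside a browser. `attachInternals()`, `role`, `ariaLabel`, and `ariaDisabled` are standard ElementInternals members.

```javascript
// Hypothetical custom element that exposes its semantics via ElementInternals.
function defineAddToCartButton({ HTMLElement, customElements }) {
  class AddToCartButton extends HTMLElement {
    constructor() {
      super();
      // ElementInternals carries default semantics for the element.
      this._internals = this.attachInternals();
      // No role or aria-* attributes are written to the host markup.
      this._internals.role = 'button';
      this._internals.ariaLabel = 'Add to cart';
      this._internals.ariaDisabled = 'false';
    }

    // Keep the exposed state in sync with the component's own state.
    setDisabled(disabled) {
      this._internals.ariaDisabled = String(Boolean(disabled));
    }
  }

  customElements.define('add-to-cart-button', AddToCartButton);
  return AddToCartButton;
}

// In a browser, register the element against the real globals.
if (typeof window !== 'undefined') {
  defineAddToCartButton(window);
}
```

A production button would also need focus and keyboard handling, which this sketch omits. Authors can still override the defaults with explicit `role` or `aria-*` attributes on the host element.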

v3.0 uses established standards as the primary semantic layer, with data-agent-* attributes as a supplementary layer for intent and action metadata that standards don't cover:

Layer	Purpose	Example
Schema.org	Content identity and structure	`itemscope itemtype="https://schema.org/Product"`
WAI-ARIA	Accessibility and interaction	`aria-label="Add to cart"`
`data-agent-*`	Agent intent, actions, and hints	`data-agent-action="add-to-cart"`

Standards (schema.org, ARIA) describe what content is. Agent attributes describe what agents can do with it — actions, intents, component roles, and navigation priorities. See the API Reference for the full data-agent-* attribute specification.
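To illustrate how the two layers combine, here is a rough sketch of how an agent might list the actions a page advertises. `listAgentActions` is a hypothetical helper, only the `data-agent-action` attribute from the table above is assumed, and the regex parsing is a simplification that expects double-quoted attribute values; a real agent would use a DOM parser.

```javascript
// Hypothetical helper: collect elements that declare data-agent-action,
// pairing each action with its ARIA label when one is present.
function listAgentActions(html) {
  const tagPattern =
    /<([a-z][a-z0-9-]*)\b([^>]*\bdata-agent-action="([^"]*)"[^>]*)>/gi;
  const actions = [];
  for (const [, tag, attrs, action] of html.matchAll(tagPattern)) {
    const label = /\baria-label="([^"]*)"/i.exec(attrs);
    actions.push({
      tag: tag.toLowerCase(),
      action,                          // what the agent can do
      label: label ? label[1] : null,  // what the control is called
    });
  }
  return actions;
}

const sample =
  '<button aria-label="Add to cart" data-agent-action="add-to-cart">Add</button>' +
  '<a href="/products">Products</a>';

console.log(listAgentActions(sample));
// → [ { tag: 'button', action: 'add-to-cart', label: 'Add to cart' } ]
```

The ARIA label names the control; the agent attribute states the intent behind it.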

Agent Protocols

BiModal Design v3.0 integrates emerging agent protocols:

MCP: Expose tools, resources, and prompts for AI agents
A2A: Enable agent-to-agent interoperability
NLWeb: Support natural language queries against your data

GEO (Generative Engine Optimization)

As users increasingly discover content through AI assistants rather than search engines, BiModal Design compliance directly affects GEO performance: an assistant can only surface content it can retrieve and parse. Layers 1-3 are therefore essential for AI-assisted discoverability.

Documentation

Document	Description
White Paper	Framework specification v3.0
Implementation Guide	Development & deployment practices
Compliance Checklist	Layer-by-layer compliance criteria
API Reference	Tool and validator API documentation
Troubleshooting	Common errors and corrections

Tools & Examples

Validation Tools

FR-1 Validator (fr1-validator.js) — quick pass/fail FR-1 compliance check (Layer 1)
FR-1 Checker (fr1-checker.js) — comprehensive Layer 1-2 audit with detailed scoring across structure, semantics, navigation, forms, content meaning, and agent features
Compliance Auditor — full BiModal Design compliance suite (Layers 1-3)

Implementation Examples

Astro SSG Example — static rendering pattern (Layer 1)
Next.js SSR Example — server rendering pattern (Layer 1)
CSR Mitigation — client-rendered fallback strategies
AOM Web Component Example — Using ElementInternals to expose native semantics to agents

Maturity Levels

Maturity levels describe site compliance, not agent types. A Maturity Level 4 site implements all five defense-in-depth layers and serves all six agent capability levels (Levels 0-5).

Level	Name	Layers	Agent Coverage	Success Rate
0	Infrastructure Ready	Layer 1	Level 0-1	40-65%
1	Semantically Accessible	Layers 1-2	Level 0-2	55-75%
2	Data-Rich	Layers 1-3	Level 0-3	65-85%
3	API-Enabled	Layers 1-4	Level 0-4	80-92%
4	Agent-Native	Layers 1-5	All levels	90-98%
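The table can be read as a simple rule: a site's maturity level is determined by how many layers it implements contiguously from Layer 1 upward. The sketch below encodes that reading; `maturityLevel` is a hypothetical helper, and the assumption that layers must be cumulative is inferred from the "Layers" column.

```javascript
// Maturity names from the table, indexed by maturity level.
const MATURITY_NAMES = [
  'Infrastructure Ready',
  'Semantically Accessible',
  'Data-Rich',
  'API-Enabled',
  'Agent-Native',
];

// Hypothetical helper: derive the maturity level from implemented layers.
// Assumption: a layer only counts if every layer below it is also present.
function maturityLevel(implementedLayers) {
  const layers = new Set(implementedLayers);
  let highest = 0;
  while (highest < 5 && layers.has(highest + 1)) {
    highest += 1;
  }
  if (highest === 0) {
    return null; // Layer 1 (FR-1) is missing, so no maturity level applies.
  }
  return { level: highest - 1, name: MATURITY_NAMES[highest - 1] };
}

console.log(maturityLevel([1, 2, 3])); // → { level: 2, name: 'Data-Rich' }
console.log(maturityLevel([1, 3, 4])); // → { level: 0, name: 'Infrastructure Ready' }
console.log(maturityLevel([2, 3]));    // → null
```

Under this reading, adding an API (Layer 4) without semantic structure (Layer 2) does not raise a site's maturity level.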

Contributing

Contributions are welcome!

1. Fork the repository
2. Create a feature branch
3. Commit with a Conventional Commit message
4. Submit a Pull Request

Refer to the Contributing Guidelines for review standards and code style.

Development Setup

# Clone the repository
git clone https://github.com/jgoldfoot/BiModalDesign.git
cd BiModalDesign

# Install dependencies
cd tools/validators
npm install

# Run tests
npm test

Research & Citations

WebAgents Survey 2025 — "A Survey of WebAgents: Towards Next-Generation AI Agents for Web Automation" (arXiv:2503.23350v4)
WebArena-Verified — Rigorous re-evaluation of autonomous web agents
WorkArena++ — "Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks" (NeurIPS 2024)
OSWorld — "Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments" (arXiv:2404.07972)
ST-WebAgentBench — "A Benchmark for Evaluating Safety and Trustworthiness in Web Agents" (arXiv:2410.06703v6)
MCP — Model Context Protocol (modelcontextprotocol.io)
A2A — Agent-to-Agent Protocol (Google, 2025)
NLWeb — Natural Language Web Protocol (Microsoft, 2025)
τ-bench — "A Benchmark for Tool-Agent-User Interaction in Real-World Domains" (arXiv:2406.12045)
WebVoyager — Benchmarking end-to-end browser agents on live real-world websites
Odysseys — "Benchmarking Web Agents on Realistic Long Horizon Tasks" (arXiv:2604.24964)
VisualWebArena — Evaluating Multimodal Agents on Realistic Visual Web Tasks
BrowseComp — Benchmark for Agentic Browser Navigation & Task Execution
Operator — Evaluating multi-agent vision-and-semantic systems across complex JavaScript interfaces (OpenAI, 2026)
Project Mariner — Benchmarking "Teach & Repeat" capabilities and multi-task concurrency (Google, 2026)
ScreenSpot — Benchmark for spatial and visual understanding in GUIs
UFO² — Multiagent AgentOS featuring hybrid control detection (Microsoft, 2026)
WebMCP — Native browser integration for Model Context Protocol discovery

License

Licensed under the Apache License 2.0. See LICENSE for full details.

Author

Joel Goldfoot
Design Leader | AI + Human-Agent Interaction Researcher

joel@goldfoot.com | linkedin.com/in/joelgoldfoot | bimodal.design

BiModal Design — Designing for the full agent capability spectrum.
