AIPREF Use Cases

Real-world scenarios and practical examples for implementing AI preferences

Open Access

Public Documentation

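Open source projects and public documentation often welcome AI training because it improves developer tools and code assistants. This configuration explicitly permits crawling, AI training, and search indexing, which maximizes discoverability and utility. Attribution terms are not expressed by these preferences and remain governed by the project's license.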

Recommended Configuration

Content-Usage: bots=y, train-ai=y, train-genai=y, search=y
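When the same preference string has to be emitted from several places, generating it from a typed object can help avoid typos in category names. The following is a minimal sketch, not part of any AIPREF library: `UsagePreferences`, `CATEGORY_ORDER`, and `formatContentUsage` are hypothetical names, and only the four categories used in this guide are assumed.

```typescript
// Hypothetical helper types; the category names are the ones used in this guide.
type Category = 'bots' | 'train-ai' | 'train-genai' | 'search';
type Preference = 'y' | 'n';
type UsagePreferences = Partial<Record<Category, Preference>>;

// Fixed output order so the generated value is stable across calls.
const CATEGORY_ORDER: readonly Category[] = ['bots', 'train-ai', 'train-genai', 'search'];

// Serialize only the categories that were explicitly stated.
function formatContentUsage(prefs: UsagePreferences): string {
  return CATEGORY_ORDER
    .filter((category) => prefs[category] !== undefined)
    .map((category) => `${category}=${prefs[category]}`)
    .join(', ');
}

const openAccess: UsagePreferences = {
  bots: 'y',
  'train-ai': 'y',
  'train-genai': 'y',
  search: 'y',
};

console.log(`Content-Usage: ${formatContentUsage(openAccess)}`);
// prints "Content-Usage: bots=y, train-ai=y, train-genai=y, search=y"
```

Categories that are omitted from the object are simply left out of the value, which keeps them unstated rather than forcing a `y` or `n`.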

Example Scenarios

Open source project documentation (React, Python, Rust docs)
Public API reference guides
Educational tutorials and learning resources
Community wikis and knowledge bases
Public code repositories

Protected Content

Premium Content and Paywalls

Subscription-based content, paywalled articles, and premium resources need protection from AI training while remaining discoverable through search engines to attract subscribers.

Recommended Configuration

Content-Usage: train-ai=n, train-genai=n, search=y

This blocks AI training while allowing search indexing. The bots preference is unstated, allowing general crawling for non-AI purposes.
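The difference between a stated `n` and an unstated preference can be made concrete from the consumer's side. The sketch below is a simplified reader for the comma-separated `key=value` form shown in this guide; `parseContentUsage` and `preferenceFor` are hypothetical names, and a production parser should follow the header's formal grammar rather than this string splitting.

```typescript
type PreferenceState = 'y' | 'n' | 'unstated';

// Simplified parser: assumes "key=value" items separated by commas.
// Items that are malformed or use other values are ignored.
function parseContentUsage(value: string): Map<string, 'y' | 'n'> {
  const prefs = new Map<string, 'y' | 'n'>();
  for (const item of value.split(',')) {
    const parts = item.split('=');
    if (parts.length !== 2) continue;
    const key = (parts[0] ?? '').trim().toLowerCase();
    const val = (parts[1] ?? '').trim().toLowerCase();
    if (key !== '' && (val === 'y' || val === 'n')) {
      prefs.set(key, val);
    }
  }
  return prefs;
}

// A category that does not appear in the value is reported as 'unstated'.
function preferenceFor(prefs: Map<string, 'y' | 'n'>, category: string): PreferenceState {
  return prefs.get(category) ?? 'unstated';
}

const premium = parseContentUsage('train-ai=n, train-genai=n, search=y');
console.log(preferenceFor(premium, 'train-ai')); // prints "n"
console.log(preferenceFor(premium, 'search'));   // prints "y"
console.log(preferenceFor(premium, 'bots'));     // prints "unstated"
```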

Example Scenarios

News articles behind paywalls (NYT, WSJ, The Atlantic)
Online course content and premium tutorials
Subscription-based research databases
Members-only community content
SaaS product documentation for paying customers

Implementation Example (Next.js)

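// middleware.ts
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

// Attach a Content-Usage header to responses for premium paths.
export function middleware(request: NextRequest) {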
  const response = NextResponse.next();

  // Premium articles - block AI training
  if (request.nextUrl.pathname.startsWith('/premium')) {
    response.headers.set(
      'Content-Usage',
      'train-ai=n, train-genai=n, search=y'
    );
  }

  return response;
}

Creative Protection

Creative Works and Art

Artists, photographers, writers, and creators want their work discoverable but protected from generative AI that could create derivative works. This configuration specifically targets generative AI while allowing other uses.

Recommended Configuration

Content-Usage: train-genai=n, search=y

This specifically blocks generative AI training (image generators, text synthesis) while leaving other AI uses unstated. Search indexing remains enabled for discoverability.

Example Scenarios

Photography portfolios and stock photo sites
Digital art galleries and artist websites
Original fiction and creative writing
Music composition and lyrics
Video content and cinematography

Academic Content

Research Publications

Academic institutions and researchers often have nuanced needs depending on publication status, licensing, and institutional policies. Different sections may require different preferences.

Published Open Access Research

Content-Usage: bots=y, train-ai=y, train-genai=y, search=y

Open access publications can allow all AI training to advance scientific discovery and research tools.

Pre-print or Unpublished Research

Content-Usage: train-ai=n, train-genai=n, search=y

Protect unpublished work while maintaining discoverability through academic search engines.

Proprietary Research Data

Content-Usage: bots=n, train-ai=n, train-genai=n, search=n

Block all automated access to proprietary research data and datasets.
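A repository that hosts all three kinds of material could select the preference from each record's publication status. This is a minimal sketch under that assumption: `PublicationStatus`, `Paper`, `USAGE_BY_STATUS`, and `contentUsageHeader` are hypothetical names, and the strings are the three configurations listed above.

```typescript
type PublicationStatus = 'open-access' | 'preprint' | 'proprietary';

// The three configurations from this section, keyed by status.
const USAGE_BY_STATUS: Record<PublicationStatus, string> = {
  'open-access': 'bots=y, train-ai=y, train-genai=y, search=y',
  preprint: 'train-ai=n, train-genai=n, search=y',
  proprietary: 'bots=n, train-ai=n, train-genai=n, search=n',
};

// Hypothetical record shape for an item in the repository.
interface Paper {
  title: string;
  status: PublicationStatus;
}

// Return the header name and value to attach to the response for this paper.
function contentUsageHeader(paper: Paper): [string, string] {
  return ['Content-Usage', USAGE_BY_STATUS[paper.status]];
}

const [name, value] = contentUsageHeader({ title: 'Draft results', status: 'preprint' });
console.log(`${name}: ${value}`);
// prints "Content-Usage: train-ai=n, train-genai=n, search=y"
```

Keying the table with `Record<PublicationStatus, string>` means the compiler reports an error if a new status is added without a matching preference.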

E-commerce

E-commerce and Retail

Online stores need product pages discoverable through search while protecting proprietary product descriptions, pricing strategies, and customer reviews from AI scraping.

Product Pages

Content-Usage: train-ai=n, train-genai=n, search=y

Allow search indexing for product discovery while protecting unique descriptions and reviews.

Pricing and Inventory APIs

Content-Usage: bots=n, train-ai=n, train-genai=n, search=n

Completely block automated access to sensitive business data like real-time pricing.

robots.txt Example

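# Groups that share a User-Agent are merged, so this example uses a single
# group and scopes each Content-Usage rule to a path prefix.
User-Agent: *
Allow: /products/
Disallow: /api/
Disallow: /checkout/
Content-Usage: /products/ train-ai=n, train-genai=n, search=y
Content-Usage: /api/ bots=n, train-ai=n, train-genai=n, search=n
Content-Usage: /checkout/ bots=n, train-ai=n, train-genai=n, search=n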
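The same per-path split can also be applied when the server sets the `Content-Usage` response header. The sketch below picks the preference by longest matching path prefix; `PathRule`, `RULES`, and `resolveUsage` are hypothetical names, and longest-prefix matching is an assumption made for this illustration rather than something the source prescribes.

```typescript
// Hypothetical rule shape: a path prefix and the preference string to send.
interface PathRule {
  prefix: string;
  usage: string;
}

const RULES: PathRule[] = [
  { prefix: '/products/', usage: 'train-ai=n, train-genai=n, search=y' },
  { prefix: '/api/', usage: 'bots=n, train-ai=n, train-genai=n, search=n' },
  { prefix: '/checkout/', usage: 'bots=n, train-ai=n, train-genai=n, search=n' },
];

// Return the usage string of the longest prefix that matches the path,
// or undefined when no rule applies (the header is then left unset).
function resolveUsage(pathname: string, rules: PathRule[] = RULES): string | undefined {
  let best: PathRule | undefined;
  for (const rule of rules) {
    const matches = pathname.startsWith(rule.prefix);
    if (matches && (best === undefined || rule.prefix.length > best.prefix.length)) {
      best = rule;
    }
  }
  return best?.usage;
}

console.log(resolveUsage('/products/shoes')); // prints "train-ai=n, train-genai=n, search=y"
console.log(resolveUsage('/api/v1/prices'));  // prints "bots=n, train-ai=n, train-genai=n, search=n"
console.log(resolveUsage('/about'));          // prints "undefined"
```

Choosing the longest prefix lets a more specific rule, such as a restricted subfolder under `/products/`, override the general one without depending on rule order.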

News Media

News and Journalism

News organizations want articles discoverable through search and news aggregators while protecting original reporting from AI summarization that could reduce direct readership.

Current News Articles

Content-Usage: train-genai=n, search=y

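Opt out of generative AI training while allowing search indexing. The train-ai preference is left unstated, so non-generative AI training (for example, fact-checking models) is neither permitted nor refused by this configuration.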

Example Scenarios

Breaking news and investigative journalism
Opinion columns and editorial content
Photo journalism and multimedia stories
Local news coverage
News aggregator feeds

Private

Internal Documentation and Tools

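Internal tools, admin panels, and private documentation should opt out of all automated use, including search indexing. Because Content-Usage only expresses a preference, it complements access controls such as authentication rather than replacing them.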

Recommended Configuration

Content-Usage: bots=n, train-ai=n, train-genai=n, search=n

Example Scenarios

Admin dashboards and control panels
Internal wikis and knowledge bases
Employee directories and contact information
Development and staging environments
Private API endpoints

Quick Configuration Reference

Allow Everything: bots=y, train-ai=y, train-genai=y, search=y
Block AI Training Only: train-ai=n, train-genai=n, search=y
Block Generative AI Only: train-genai=n, search=y
Block Everything: bots=n, train-ai=n, train-genai=n, search=n

Learn More

How AIPREF Works
AIPREF Vocabulary
Implementation Guide