AIPREF Generator
Control AI usage of your content

AIPREF Use Cases

Real-world scenarios and practical examples for implementing AI preferences

Open Access

Public Documentation

Open source projects and public documentation often welcome AI training to improve developer tools and code assistants. This configuration maximizes discoverability and utility; attribution is still governed by the content's license rather than by these preferences.

Recommended Configuration

Content-Usage: bots=y, train-ai=y, train-genai=y, search=y
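One way to avoid typos when emitting this value is to build it from a typed object. The sketch below is illustrative only: `formatContentUsage`, `UsageKey`, and `Preferences` are hypothetical names rather than part of any AIPREF library, and it covers only the four preference keys used on this page.

```typescript
// Hypothetical helper: names and types are illustrative, not from an AIPREF library.
type UsageKey = "bots" | "train-ai" | "train-genai" | "search";
type Preferences = Partial<Record<UsageKey, "y" | "n">>;

// Fixed output order so generated values match the examples on this page.
const KEY_ORDER: UsageKey[] = ["bots", "train-ai", "train-genai", "search"];

function formatContentUsage(prefs: Preferences): string {
  return KEY_ORDER
    .filter((key) => prefs[key] !== undefined) // unstated keys are omitted
    .map((key) => `${key}=${prefs[key]}`)
    .join(", ");
}

const openAccess = formatContentUsage({
  bots: "y",
  "train-ai": "y",
  "train-genai": "y",
  search: "y",
});

console.log(`Content-Usage: ${openAccess}`);
```

Leaving a key out of the object leaves that preference unstated in the output, which is how the more selective configurations later on this page are expressed.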

Example Scenarios

  • Open source project documentation (React, Python, Rust docs)
  • Public API reference guides
  • Educational tutorials and learning resources
  • Community wikis and knowledge bases
  • Public code repositories

Protected Content

Premium Content and Paywalls

Subscription-based content, paywalled articles, and premium resources need protection from AI training while remaining discoverable through search engines to attract subscribers.

Recommended Configuration

Content-Usage: train-ai=n, train-genai=n, search=y

This opts out of AI training while allowing search indexing. The bots preference is left unstated, so no preference is expressed about general crawling for non-AI purposes.
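To make the difference between a stated and an unstated preference concrete, here is a minimal sketch of how a consumer might read this value. `parseContentUsage` and `lookup` are hypothetical helpers written for this page; the parser handles only the simple `key=value, key=value` form shown here and does not implement full HTTP structured-field parsing.

```typescript
// Hypothetical parser for the simple "key=value, key=value" form used on this page.
function parseContentUsage(value: string): Map<string, string> {
  const prefs = new Map<string, string>();
  for (const item of value.split(",")) {
    const [key, val] = item.split("=").map((part) => part.trim());
    if (key && val) prefs.set(key, val);
  }
  return prefs;
}

// A missing key means "no preference stated", which is distinct from "y" or "n".
function lookup(prefs: Map<string, string>, key: string): string {
  const value = prefs.get(key);
  return value === undefined ? "unstated" : value;
}

const premium = parseContentUsage("train-ai=n, train-genai=n, search=y");

console.log(lookup(premium, "train-ai")); // prints "n"
console.log(lookup(premium, "search"));   // prints "y"
console.log(lookup(premium, "bots"));     // prints "unstated"
```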

Example Scenarios

  • News articles behind paywalls (NYT, WSJ, The Atlantic)
  • Online course content and premium tutorials
  • Subscription-based research databases
  • Members-only community content
  • SaaS product documentation for paying customers

Implementation Example (Next.js)

// middleware.ts
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

export function middleware(request: NextRequest) {
  const response = NextResponse.next();

  // Premium articles: opt out of AI training, keep search indexing
  if (request.nextUrl.pathname.startsWith('/premium')) {
    response.headers.set(
      'Content-Usage',
      'train-ai=n, train-genai=n, search=y'
    );
  }

  return response;
}

Creative Protection

Creative Works and Art

Artists, photographers, writers, and creators want their work discoverable but protected from generative AI that could create derivative works. This configuration specifically targets generative AI while allowing other uses.

Recommended Configuration

Content-Usage: train-genai=n, search=y

This specifically blocks generative AI training (image generators, text synthesis) while leaving other AI uses unstated. Search indexing remains enabled for discoverability.

Example Scenarios

  • Photography portfolios and stock photo sites
  • Digital art galleries and artist websites
  • Original fiction and creative writing
  • Music composition and lyrics
  • Video content and cinematography

Academic Content

Research Publications

Academic institutions and researchers often have nuanced needs depending on publication status, licensing, and institutional policies. Different sections may require different preferences.

Published Open Access Research

Content-Usage: bots=y, train-ai=y, train-genai=y, search=y

Open access publications can allow all AI training to advance scientific discovery and research tools.

Pre-print or Unpublished Research

Content-Usage: train-ai=n, train-genai=n, search=y

Protect unpublished work while maintaining discoverability through academic search engines.

Proprietary Research Data

Content-Usage: bots=n, train-ai=n, train-genai=n, search=n

Block all automated access to proprietary research data and datasets.
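Because a single site may host all three kinds of material, the preference usually has to be chosen per path. The following is a rough sketch of a longest-prefix lookup; the path prefixes (`/publications/`, `/preprints/`, `/data/`) and the names `RULES` and `contentUsageFor` are assumptions made for illustration, not part of any AIPREF tooling.

```typescript
// Hypothetical path prefixes; substitute the real layout of the site.
type Rule = { prefix: string; usage: string };

const RULES: Rule[] = [
  { prefix: "/publications/", usage: "bots=y, train-ai=y, train-genai=y, search=y" },
  { prefix: "/preprints/", usage: "train-ai=n, train-genai=n, search=y" },
  { prefix: "/data/", usage: "bots=n, train-ai=n, train-genai=n, search=n" },
];

// Returns the Content-Usage value for the longest matching prefix,
// or undefined when no rule applies (no preference is stated).
function contentUsageFor(pathname: string): string | undefined {
  let best: Rule | undefined;
  for (const rule of RULES) {
    const matches = pathname.startsWith(rule.prefix);
    if (matches && (best === undefined || rule.prefix.length > best.prefix.length)) {
      best = rule;
    }
  }
  return best === undefined ? undefined : best.usage;
}

console.log(contentUsageFor("/preprints/2024/draft.pdf"));
console.log(contentUsageFor("/about"));
```

The returned string can be set as the `Content-Usage` response header by whatever server or middleware the site already uses; paths with no matching rule simply send no header.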

E-commerce

E-commerce and Retail

Online stores need product pages discoverable through search while protecting proprietary product descriptions, pricing strategies, and customer reviews from AI scraping.

Product Pages

Content-Usage: train-ai=n, train-genai=n, search=y

Allow search indexing for product discovery while protecting unique descriptions and reviews.

Pricing and Inventory APIs

Content-Usage: bots=n, train-ai=n, train-genai=n, search=n

Completely block automated access to sensitive business data like real-time pricing.

robots.txt Example

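# Groups that share a User-Agent are merged, so a single group is used here.
# Path-scoped Content-Usage rules follow the form in the AIPREF attachment draft.
User-Agent: *
Allow: /products/
Disallow: /api/
Disallow: /checkout/
Content-Usage: /products/ train-ai=n, train-genai=n, search=y
Content-Usage: /api/ bots=n, train-ai=n, train-genai=n, search=n
Content-Usage: /checkout/ bots=n, train-ai=n, train-genai=n, search=n

News Media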

News and Journalism

News organizations want articles discoverable through search and news aggregators while protecting original reporting from AI summarization that could reduce direct readership.

Current News Articles

Content-Usage: train-genai=n, search=y

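This opts out of generative AI training, the use most closely tied to AI-generated summaries, while allowing search indexing. The train-ai preference is left unstated, so no preference is expressed about non-generative training such as fact-checking models.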

Example Scenarios

  • Breaking news and investigative journalism
  • Opinion columns and editorial content
  • Photo journalism and multimedia stories
  • Local news coverage
  • News aggregator feeds

Private

Internal Documentation and Tools

Internal tools, admin panels, and private documentation should block all automated access including search indexing.

Recommended Configuration

Content-Usage: bots=n, train-ai=n, train-genai=n, search=n

Example Scenarios

  • Admin dashboards and control panels
  • Internal wikis and knowledge bases
  • Employee directories and contact information
  • Development and staging environments
  • Private API endpoints

Quick Configuration Reference

Allow Everything: bots=y, train-ai=y, train-genai=y, search=y
Block AI Training Only: train-ai=n, train-genai=n, search=y
Block Generative AI Only: train-genai=n, search=y
Block Everything: bots=n, train-ai=n, train-genai=n, search=n
