AIPREF Generator: Free AI Preferences Tool

Quick Start

Configure Your Preferences

Automated Processing

General automated content processing

AI Training

Training machine learning models

Generative AI

Training generative AI models

Search engine indexing

What is AIPREF?

AIPREF (AI Preferences) is an internet standard being developed by the IETF AIPREF Working Group to provide a standardized mechanism for declaring how digital content may be used by automated systems and AI models.

The specification defines a vocabulary for expressing preferences and methods for associating these preferences with content through HTTP headers and robots.txt directives. This enables content owners to declare whether their content may be used for purposes such as AI model training, automated processing, or search indexing.

The AIPREF standard is built on existing web standards including RFC 9651 (Structured Fields) and RFC 9309 (Robots Exclusion Protocol).

How AIPREF Works

1. Define Your Preferences

Choose from four preference categories: automated processing (bots), AI training (train-ai), generative AI training (train-genai), and search indexing (search). Each can be set to allow or disallow, or left unstated.
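As a rough illustration of this step, the sketch below turns a set of choices into a Content-Usage value. The function name build_content_usage and the use of True/False/None for allow/disallow/unstated are assumptions made for this example; the y and n tokens follow the examples shown elsewhere on this page.

```python
from typing import Optional

# The four categories named on this page, in a fixed output order.
CATEGORIES = ("bots", "train-ai", "train-genai", "search")


def build_content_usage(prefs: dict[str, Optional[bool]]) -> str:
    """Serialize preferences as a Content-Usage value.

    True means allow ("y"), False means disallow ("n"), and None (or a
    missing key) means the preference is left unstated and is omitted.
    """
    unknown = set(prefs) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    parts = []
    for name in CATEGORIES:
        value = prefs.get(name)
        if value is None:
            continue  # unstated: emit nothing for this category
        parts.append(f"{name}={'y' if value else 'n'}")
    return ", ".join(parts)


print(build_content_usage({"train-ai": False, "train-genai": False}))
```

Leaving a category out of the dictionary is treated the same as setting it to None, so an empty dictionary produces an empty string.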

2. Implement Using HTTP Headers

Add the Content-Usage header to your HTTP responses. The header value is a Structured Fields dictionary (RFC 9651) that communicates your preferences to compliant AI systems and automated crawlers.

3. Configure robots.txt

Alternatively, declare preferences in your robots.txt file using Content-Usage directives. This method supports path-specific preferences for different sections of your website.

Understanding Preference Categories

bots - Automated Processing

This is the broadest category, covering all automated processing and analysis of content. When set, it serves as the baseline for every subcategory that does not state its own preference.

train-ai - AI Model Training

Controls whether content may be used to train machine learning and AI models. This includes both supervised and unsupervised learning systems.

train-genai - Generative AI Training

A more specific category for training models that generate synthetic content such as text, images, or code. This is a subset of train-ai.

search - Search Engine Indexing

Controls indexing by search engines that direct users back to your content. Use it to state whether your content may appear in search results.
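One way to picture how the categories relate is a fallback chain: if a category is unstated, look at the broader category it belongs to. The sketch below assumes a hierarchy read off the descriptions above (train-genai falls back to train-ai, and train-ai and search fall back to bots). The PARENT table and the effective_preference function are hypothetical names for illustration, and the draft specifications remain the authority on the exact rules.

```python
from typing import Optional

# Assumed hierarchy: each category maps to the broader one it falls back to.
PARENT = {
    "train-genai": "train-ai",
    "train-ai": "bots",
    "search": "bots",
    "bots": None,
}


def effective_preference(prefs: dict[str, str], category: str) -> Optional[str]:
    """Return "y" or "n" for a category, or None if nothing applies.

    The most specific stated preference wins; unstated categories inherit
    from the nearest broader category that has a value.
    """
    if category not in PARENT:
        raise ValueError(f"unknown category: {category}")
    current: Optional[str] = category
    while current is not None:
        if current in prefs:
            return prefs[current]
        current = PARENT[current]
    return None


stated = {"bots": "y", "train-ai": "n"}
for name in PARENT:
    print(name, effective_preference(stated, name))
```

With bots=y and train-ai=n stated, train-genai inherits the disallow from train-ai while search inherits the allow from bots.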

Implementation Methods

HTTP Header Method

The Content-Usage HTTP header communicates your preferences with each response. Because the header travels with the content itself, crawlers always see your current preferences, and the method works with any web server or CDN that can add response headers.

Example configurations:

Nginx:

add_header Content-Usage "train-ai=n, train-genai=n";

Apache:

Header set Content-Usage "train-ai=n, train-genai=n"

Next.js Middleware:

response.headers.set('Content-Usage', 'train-ai=n');
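For applications served from Python, the same header can be attached in a thin WSGI wrapper. This is a minimal sketch using only the standard WSGI calling convention; content_usage_middleware and hello_app are hypothetical names, and the default value simply mirrors the Nginx and Apache examples above.

```python
def content_usage_middleware(app, value="train-ai=n, train-genai=n"):
    """Wrap a WSGI app so every response carries a Content-Usage header."""

    def wrapped(environ, start_response):
        def start_with_header(status, headers, exc_info=None):
            # Drop any existing Content-Usage header so only one is sent.
            headers = [(k, v) for k, v in headers if k.lower() != "content-usage"]
            headers.append(("Content-Usage", value))
            return start_response(status, headers, exc_info)

        return app(environ, start_with_header)

    return wrapped


def hello_app(environ, start_response):
    """Tiny demo application used only to exercise the wrapper."""
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"hello\n"]


application = content_usage_middleware(hello_app)
```

To try it locally, the wrapped application can be passed to make_server from the standard library's wsgiref.simple_server module.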

robots.txt Method

The robots.txt approach uses the Content-Usage directive to declare preferences. This method supports path-specific rules and integrates with existing crawler directives.

Example with path scoping:

User-Agent: *
Allow: /
Content-Usage: train-ai=n
Content-Usage: /public/ train-ai=y
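To show how a consumer might read these directives, here is a simplified sketch that extracts Content-Usage rules from robots.txt text. It assumes the syntax used in the example above (an optional path starting with "/" followed by comma-separated preferences), treats a rule without a path as applying to "/", and ignores User-Agent grouping entirely. The function name parse_content_usage_rules is hypothetical, and this is not a full RFC 9309 parser.

```python
def parse_content_usage_rules(robots_txt: str) -> list[tuple[str, dict[str, str]]]:
    """Return (path, preferences) pairs for each Content-Usage line."""
    rules = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "content-usage":
            continue  # not a Content-Usage directive
        value = value.strip()
        path = "/"  # assumption: an unscoped rule covers the whole site
        if value.startswith("/"):
            path, _, value = value.partition(" ")
        prefs = {}
        for item in value.split(","):
            key, eq, val = item.strip().partition("=")
            if eq and key.strip() and val.strip():
                prefs[key.strip()] = val.strip()
        rules.append((path, prefs))
    return rules


example = """\
User-Agent: *
Allow: /
Content-Usage: train-ai=n
Content-Usage: /public/ train-ai=y
"""
for path, prefs in parse_content_usage_rules(example):
    print(path, prefs)
```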

Preference Conflict Resolution

When multiple preference declarations apply to the same content, the AIPREF specification defines clear rules for resolving conflicts:

Most Restrictive Wins: When preferences conflict, the most restrictive preference takes precedence. A disallow preference always overrides an allow preference.

Specific Over General: More specific preference categories override broader ones. For example, train-genai preferences override train-ai preferences for generative AI use cases.

Path Priority: In robots.txt, preferences associated with longer path prefixes take precedence over shorter prefixes.
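Two of these rules can be sketched in a few lines. The code below is only an illustration of the behavior described on this page: longest_prefix_rule picks the robots.txt rule with the longest matching path prefix, and most_restrictive merges several declarations so that "n" beats "y" for each category. Both function names are hypothetical, and exactly which declarations get combined (for example, a header with a robots.txt rule) is an assumption here rather than something taken from the drafts.

```python
def longest_prefix_rule(rules, url_path):
    """Return the preferences attached to the longest matching path prefix.

    `rules` is a list of (path_prefix, preferences) pairs; an empty dict is
    returned when no prefix matches.
    """
    matches = [(path, prefs) for path, prefs in rules if url_path.startswith(path)]
    if not matches:
        return {}
    return max(matches, key=lambda rule: len(rule[0]))[1]


def most_restrictive(*declarations):
    """Merge declarations per category; a disallow ("n") always wins."""
    merged = {}
    for prefs in declarations:
        for category, value in prefs.items():
            if merged.get(category) == "n":
                continue  # already disallowed, nothing can loosen it
            merged[category] = value
    return merged


rules = [("/", {"train-ai": "n"}), ("/public/", {"train-ai": "y"})]
from_robots = longest_prefix_rule(rules, "/public/guide.html")
from_header = {"train-ai": "n", "search": "y"}
print(from_robots)
print(most_restrictive(from_robots, from_header))
```

In this run the /public/ rule is selected for the path, and merging it with a header that disallows train-ai leaves train-ai disallowed while keeping search allowed.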

Common Use Cases

Public Documentation

Allow search indexing and general AI training, but restrict generative AI to prevent direct content reproduction.

Content-Usage: train-ai=y, train-genai=n, search=y

Premium Content

Allow search visibility to drive traffic, but block all AI training to protect proprietary content.

Content-Usage: bots=y, train-ai=n, search=y

Open Research

Permit all uses including AI training to maximize research impact and knowledge sharing.

Content-Usage: bots=y, train-ai=y, train-genai=y, search=y
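To check a configuration like the ones above before deploying it, a small reader for the header value can help. The sketch below handles only the simple key=y / key=n members used on this page; it is not a complete RFC 9651 Structured Fields parser, and parse_content_usage is a hypothetical name chosen for this example.

```python
def parse_content_usage(value: str) -> dict[str, str]:
    """Parse a Content-Usage value of the form "key=y, key=n" into a dict.

    Raises ValueError for members that are not key=y or key=n. If a key is
    repeated, the last occurrence is kept.
    """
    prefs = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas and empty input
        key, eq, val = item.partition("=")
        key, val = key.strip(), val.strip()
        if not eq or not key or val not in ("y", "n"):
            raise ValueError(f"unsupported member: {item!r}")
        prefs[key] = val
    return prefs


use_cases = {
    "Public Documentation": "train-ai=y, train-genai=n, search=y",
    "Premium Content": "bots=y, train-ai=n, search=y",
    "Open Research": "bots=y, train-ai=y, train-genai=y, search=y",
}
for label, header_value in use_cases.items():
    print(label, parse_content_usage(header_value))
```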

Technical Specifications

The AIPREF standard consists of two primary specifications currently in development:

AIPREF Vocabulary Specification

Defines the vocabulary for expressing AI usage preferences, including the semantics of each preference category and the rules for combining and interpreting preferences.

AIPREF Attachment Specification

Specifies how preferences are associated with content through HTTP headers and robots.txt directives, including syntax and processing requirements.

Both specifications are works in progress. Implementation details may change as the standards evolve through the IETF process.

About This Generator

This AIPREF generator is a free, open-source community tool designed to help website owners implement the IETF AIPREF standard. It generates configurations that follow the current draft specifications.

The tool provides real-time generation of Content-Usage headers, robots.txt directives, and JSON configurations. All generated code follows the latest AIPREF specification syntax and can be copied or downloaded for immediate use.

This is an independent community tool and not an official IETF website. All configurations are based on published IETF drafts from the AIPREF Working Group.