Code examples and configuration guides for implementing AIPREF across different platforms
AIPREF can be implemented using two standard methods: the Content-Usage HTTP response header and Content-Usage directives in robots.txt.
HTTP headers are preferred because they apply per resource and take precedence over robots.txt. Both methods can be used together for maximum compatibility.
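To make the precedence rule concrete, here is a minimal sketch of how a consumer might combine the two sources. The helper names `parseContentUsage` and `effectivePreferences` are hypothetical, the `key=y|n` list syntax is taken from the examples in this guide, and the per-key merge is an assumption about what "take precedence" means in practice.

```typescript
// Hypothetical helpers, not part of any AIPREF library.
type Preferences = Record<string, "y" | "n">;

// Parse a value such as "train-ai=n, search=y" into a key/value map.
// Items without "=" or with a value other than y/n are skipped.
function parseContentUsage(value: string): Preferences {
  const prefs: Preferences = {};
  for (const item of value.split(",")) {
    const eq = item.indexOf("=");
    if (eq === -1) continue;
    const key = item.slice(0, eq).trim().toLowerCase();
    const val = item.slice(eq + 1).trim().toLowerCase();
    if (key !== "" && (val === "y" || val === "n")) {
      prefs[key] = val;
    }
  }
  return prefs;
}

// Assumption: preferences are merged per key, and the HTTP header wins
// whenever both sources mention the same key.
function effectivePreferences(
  headerValue: string | null,
  robotsValue: string | null
): Preferences {
  const fromRobots: Preferences = robotsValue ? parseContentUsage(robotsValue) : {};
  const fromHeader: Preferences = headerValue ? parseContentUsage(headerValue) : {};
  return { ...fromRobots, ...fromHeader };
}
```

For example, `effectivePreferences("train-ai=n", "train-ai=y, search=y")` yields `train-ai` set to `n` (from the header) and `search` set to `y` (from robots.txt).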
Add the Content-Usage header to all responses or specific locations in your Nginx configuration.
    # Site-wide: same header on every response
    server {
        listen 80;
        server_name example.com;

        # Add Content-Usage header to all responses
        add_header Content-Usage "train-ai=n, train-genai=n" always;

        location / {
            root /var/www/html;
            index index.html;
        }
    }

    # Path-specific: different header per location
    server {
        listen 80;
        server_name example.com;

        # Public documentation - allow all
        location /docs/ {
            add_header Content-Usage "bots=y, train-ai=y, train-genai=y, search=y" always;
            root /var/www/html;
        }

        # Premium content - block AI training
        location /premium/ {
            add_header Content-Usage "train-ai=n, train-genai=n, search=y" always;
            root /var/www/html;
        }

        # Private API - block everything except authenticated bots
        location /api/ {
            add_header Content-Usage "bots=n, train-ai=n, train-genai=n, search=n" always;
            proxy_pass http://backend;
        }
    }
The always parameter ensures the header is added even for error responses (4xx, 5xx).
Configure Content-Usage headers using the mod_headers module in Apache.
    # Enable mod_headers if not already enabled
    # LoadModule headers_module modules/mod_headers.so

    # Add Content-Usage header to all responses
    Header set Content-Usage "train-ai=n, train-genai=n"

    # Path-specific configuration
    <VirtualHost *:80>
        ServerName example.com
        DocumentRoot /var/www/html

        # Default: block AI training
        Header set Content-Usage "train-ai=n, train-genai=n, search=y"

        # Public docs: allow all
        <Directory "/var/www/html/docs">
            Header set Content-Usage "bots=y, train-ai=y, train-genai=y, search=y"
        </Directory>

        # Premium content: strict controls
        <Directory "/var/www/html/premium">
            Header set Content-Usage "bots=n, train-ai=n, train-genai=n, search=y"
        </Directory>
    </VirtualHost>
Implement Content-Usage headers in Next.js using middleware or custom headers in next.config.js.
Create middleware.ts in your project root:
    // middleware.ts
    import { NextResponse } from 'next/server';
    import type { NextRequest } from 'next/server';

    export function middleware(request: NextRequest) {
      const response = NextResponse.next();

      // Site-wide preference
      response.headers.set(
        'Content-Usage',
        'train-ai=n, train-genai=n, search=y'
      );

      // Path-specific preferences
      if (request.nextUrl.pathname.startsWith('/docs')) {
        response.headers.set(
          'Content-Usage',
          'bots=y, train-ai=y, train-genai=y, search=y'
        );
      } else if (request.nextUrl.pathname.startsWith('/api')) {
        response.headers.set(
          'Content-Usage',
          'bots=n, train-ai=n, train-genai=n, search=n'
        );
      }

      return response;
    }

    export const config = {
      matcher: [
        '/((?!_next/static|_next/image|favicon.ico).*)',
      ],
    };

    // next.config.js
    /** @type {import('next').NextConfig} */
    const nextConfig = {
      async headers() {
        return [
          {
            source: '/:path*',
            headers: [
              {
                key: 'Content-Usage',
                value: 'train-ai=n, train-genai=n, search=y',
              },
            ],
          },
          {
            source: '/docs/:path*',
            headers: [
              {
                key: 'Content-Usage',
                value: 'bots=y, train-ai=y, train-genai=y, search=y',
              },
            ],
          },
        ];
      },
    };

    module.exports = nextConfig;
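As the number of path-specific preferences grows, the if/else chain in the middleware can become hard to maintain. One possible alternative, sketched here under the assumption that a simple prefix table is enough for your site, is to keep the rules as data and pick the longest matching prefix. The names `UsageRule`, `RULES`, `DEFAULT_USAGE`, and `contentUsageFor` are illustrative and not part of Next.js.

```typescript
// Hypothetical rule table; the values reuse the preferences shown in this guide.
interface UsageRule {
  prefix: string; // path prefix, matched with startsWith like the middleware
  value: string;  // Content-Usage header value for that prefix
}

const DEFAULT_USAGE = "train-ai=n, train-genai=n, search=y";

const RULES: UsageRule[] = [
  { prefix: "/docs", value: "bots=y, train-ai=y, train-genai=y, search=y" },
  { prefix: "/api", value: "bots=n, train-ai=n, train-genai=n, search=n" },
];

// Return the value of the longest matching prefix, or the site-wide default.
function contentUsageFor(pathname: string): string {
  let best: UsageRule | undefined;
  for (const rule of RULES) {
    const matches = pathname.startsWith(rule.prefix);
    if (matches && (best === undefined || rule.prefix.length > best.prefix.length)) {
      best = rule;
    }
  }
  return best ? best.value : DEFAULT_USAGE;
}
```

Inside the middleware, the two conditional branches could then collapse to a single `response.headers.set('Content-Usage', contentUsageFor(request.nextUrl.pathname))` call. Note that plain `startsWith` also matches paths such as `/docsify`, just as the original middleware does.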
Add Content-Usage headers using Express middleware.
    // Global preference for every route
    const express = require('express');
    const app = express();

    // Global AIPREF middleware
    app.use((req, res, next) => {
      res.setHeader('Content-Usage', 'train-ai=n, train-genai=n, search=y');
      next();
    });

    // Your routes
    app.get('/', (req, res) => {
      res.send('Hello World');
    });

    app.listen(3000);

    // Path-specific preferences
    const express = require('express');
    const app = express();

    // Middleware factory for AIPREF
    function aipref(preferences) {
      return (req, res, next) => {
        res.setHeader('Content-Usage', preferences);
        next();
      };
    }

    // Public docs - allow all
    app.use('/docs',
      aipref('bots=y, train-ai=y, train-genai=y, search=y'),
      express.static('public/docs')
    );

    // Premium content - block AI training
    app.use('/premium',
      aipref('train-ai=n, train-genai=n, search=y'),
      express.static('public/premium')
    );

    // API - block all automated access
    app.use('/api',
      aipref('bots=n, train-ai=n, train-genai=n, search=n')
    );

    app.listen(3000);
Add Content-Usage directives to your robots.txt file at the root of your domain.
    User-Agent: *
    Allow: /
    Content-Usage: train-ai=n, train-genai=n, search=y

    # Default preference for most content
    User-Agent: *
    Allow: /
    Content-Usage: train-ai=n, train-genai=n, search=y

    # Public documentation - allow AI training
    User-Agent: *
    Allow: /docs/
    Content-Usage: bots=y, train-ai=y, train-genai=y, search=y

    # Premium content - strict controls
    User-Agent: *
    Allow: /premium/
    Content-Usage: bots=n, train-ai=n, train-genai=n, search=y

    # Private sections - block all
    User-Agent: *
    Disallow: /private/
    Content-Usage: bots=n, train-ai=n, train-genai=n, search=n
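If your robots.txt is generated at build time rather than edited by hand, the groups above can be produced from data. The following is a minimal sketch, assuming the group layout used in the examples in this guide; `RobotsGroup` and `renderRobotsTxt` are hypothetical names, and the function does no validation of paths or preference values.

```typescript
// Hypothetical shape for one robots.txt group, mirroring the examples above.
interface RobotsGroup {
  userAgent: string;
  allow?: string[];
  disallow?: string[];
  contentUsage: string;
}

// Render groups as robots.txt text, one blank line between groups.
function renderRobotsTxt(groups: RobotsGroup[]): string {
  const blocks = groups.map((group) => {
    const lines = [`User-Agent: ${group.userAgent}`];
    for (const path of group.allow ?? []) {
      lines.push(`Allow: ${path}`);
    }
    for (const path of group.disallow ?? []) {
      lines.push(`Disallow: ${path}`);
    }
    lines.push(`Content-Usage: ${group.contentUsage}`);
    return lines.join("\n");
  });
  return blocks.join("\n\n") + "\n";
}

// Example input reproducing the default and private groups.
const robotsTxt = renderRobotsTxt([
  { userAgent: "*", allow: ["/"], contentUsage: "train-ai=n, train-genai=n, search=y" },
  { userAgent: "*", disallow: ["/private/"], contentUsage: "bots=n, train-ai=n, train-genai=n, search=n" },
]);
```

Keeping the preferences in one data structure also makes it easier to feed the same values to your HTTP header configuration, so the two methods do not drift apart.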
Use gatsby-plugin-netlify or gatsby-ssr.js:
    // gatsby-ssr.js
    // Note: these SSR hooks run at build time and shape the generated HTML;
    // they do not set HTTP response headers on their own.
    export const onPreRenderHTML = ({ getHeadComponents }) => {
      if (typeof window !== 'undefined') {
        return;
      }
    };

    export const onRenderBody = ({ setHeadComponents }) => {
      setHeadComponents([]);
    };

    // Use gatsby-plugin-netlify for headers
    // In gatsby-config.js:
    module.exports = {
      plugins: [
        {
          resolve: 'gatsby-plugin-netlify',
          options: {
            headers: {
              '/*': [
                'Content-Usage: train-ai=n, train-genai=n, search=y',
              ],
            },
          },
        },
      ],
    };
For Hugo, configure headers in your deployment platform (Netlify, Vercel) or use a _headers file:
    # static/_headers (for Netlify)
    /*
      Content-Usage: train-ai=n, train-genai=n, search=y

    /docs/*
      Content-Usage: bots=y, train-ai=y, train-genai=y, search=y
After implementing AIPREF, verify that your headers are being sent correctly:
    curl -I https://example.com

    # Look for:
    # Content-Usage: train-ai=n, train-genai=n, search=y
1. Open your website in a browser
2. Open DevTools (F12)
3. Go to the Network tab
4. Reload the page
5. Click on the main document request
6. Check Response Headers for Content-Usage
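The same check can be automated, for instance in a deployment smoke test. The sketch below assumes an order-insensitive comparison of the comma-separated items is what you want; `HeaderLookup`, `normalizeUsage`, and `checkContentUsage` are hypothetical names. The function accepts any object with a `get` method, so the `headers` object of a `fetch` response can be passed in directly.

```typescript
// Anything with a get(name) method, such as the headers of a fetch response.
type HeaderLookup = { get(name: string): string | null | undefined };

// Split a header value into trimmed, lower-cased, sorted items.
function normalizeUsage(value: string): string[] {
  return value
    .split(",")
    .map((item) => item.trim().toLowerCase())
    .filter((item) => item !== "")
    .sort();
}

// Return null when the header matches the expected preferences,
// otherwise a short description of the problem.
function checkContentUsage(headers: HeaderLookup, expected: string): string | null {
  const actual = headers.get("Content-Usage");
  if (actual === null || actual === undefined) {
    return "Content-Usage header is missing";
  }
  const got = normalizeUsage(actual);
  const want = normalizeUsage(expected);
  const same = got.length === want.length && got.every((item, i) => item === want[i]);
  return same ? null : `expected "${expected}", got "${actual}"`;
}
```

With Node 18 or later, a call might look like `checkContentUsage((await fetch("https://example.com", { method: "HEAD" })).headers, "train-ai=n, train-genai=n, search=y")`, where a `null` result means the header matches.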