# Autheo.com: robots.txt
# https://www.autheo.com

# ── Content Signals (IETF draft-romm-aipref-contentsignals) ──────────────────
# Autheo actively wants to appear in AI answers and search results.
# Training on our content is not permitted.
#   search=yes    → index us for search results
#   ai-input=yes  → use our content for RAG / grounded AI answers
#   ai-train=no   → do not use our content to train models

# ── i18n locale gate ────────────────────────────────────────────────────────
# Per-locale SEO readiness is gated by readyLocales in src/i18n/config.ts.
# Ready locales (currently en, zh-CN, es, ko, fr, de, vi) are indexable: no
# Disallow lines are emitted for them and they self-canonicalize in their
# HTML head. If a locale is flipped back to not-ready, this route emits
# Disallow rules for that locale prefix so crawlers stop indexing it during
# rollout.

# ── Default: allow all well-behaved bots ─────────────────────────────────────
User-Agent: *
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /
Disallow: /studio
Disallow: /studio/
Disallow: /admin
Disallow: /admin/
Disallow: /api/

# ── OpenAI ────────────────────────────────────────────────────────────────────
User-Agent: GPTBot
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

User-Agent: OAI-SearchBot
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

User-Agent: ChatGPT-User
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

# ── Google ────────────────────────────────────────────────────────────────────
User-Agent: Google-Extended
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

User-Agent: Googlebot
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

# ── Anthropic ─────────────────────────────────────────────────────────────────
User-Agent: anthropic-ai
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

User-Agent: ClaudeBot
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

User-Agent: Claude-Web
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

# ── Perplexity ────────────────────────────────────────────────────────────────
User-Agent: PerplexityBot
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

# ── Meta ─────────────────────────────────────────────────────────────────────
User-Agent: Meta-ExternalAgent
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

User-Agent: FacebookBot
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

# ── Cohere ───────────────────────────────────────────────────────────────────
User-Agent: cohere-ai
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

# ── Mistral ──────────────────────────────────────────────────────────────────
User-Agent: MistralBot
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

# ── Amazon / AWS ─────────────────────────────────────────────────────────────
User-Agent: Amazonbot
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

# ── Apple ────────────────────────────────────────────────────────────────────
User-Agent: Applebot
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

User-Agent: Applebot-Extended
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

# ── Common Crawl ─────────────────────────────────────────────────────────────
User-Agent: CCBot
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

# ── Firecrawl (trusted data partner) ─────────────────────────────────────────
User-Agent: firecrawl
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

# ── Microsoft / Bing ─────────────────────────────────────────────────────────
User-Agent: Bingbot
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

User-Agent: BingPreview
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

# ── Ahrefs Site Audit (owner-initiated crawls) ───────────────────────────────
# AhrefsSiteAudit is a separate UA from AhrefsBot (the indexing crawler).
# Explicitly allow it so the site owner can run audits without hitting WAF
# blocks or rate limits that would inflate error counts.
User-Agent: AhrefsSiteAudit
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

# ── ByteDance ────────────────────────────────────────────────────────────────
# AI crawler, explicitly allowed for AEO/GEO.
User-Agent: Bytespider
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

# ── Blocked crawlers ─────────────────────────────────────────────────────────
User-Agent: SemrushBot
Disallow: /

User-Agent: AhrefsBot
Disallow: /

User-Agent: DotBot
Disallow: /

User-Agent: MJ12bot
Disallow: /

# ── Sitemaps ──────────────────────────────────────────────────────────────────
Sitemap: https://www.autheo.com/sitemap.xml

# ── AI Agent context ──────────────────────────────────────────────────────────
# Machine-readable site overview (llms.txt standard)
# https://www.autheo.com/llms.txt