User-agent: *
Allow: /
# Block sensitive auth-protected areas from being indexed
Disallow: /dashboard
Disallow: /profile
Disallow: /profile/
Disallow: /archive
Disallow: /scraper
Disallow: /scraper-control
Disallow: /api/
# Block thin / dynamic per-request result pages
Disallow: /tracking-result
# Block legacy / removed surfaces - these now return 410 Gone or are auth-walled.
# Disallowing them stops Google from re-crawling and clogging GSC reports.
Disallow: /subscription/
Disallow: /subscription/pay/
Disallow: /masterclass
Disallow: /masterclass-orders
Disallow: /ssi
Disallow: /ssi-orders
Disallow: /author/
Disallow: /set-locale/

# ---------------------------------------------------------------
# AI / LLM crawlers - explicitly ALLOWED so Ohmyfin appears in
# Google AI Overviews, ChatGPT search, Perplexity, Bing Copilot,
# Claude search, Gemini and the broader AI answer ecosystem.
# ---------------------------------------------------------------
User-agent: GPTBot
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

User-agent: OAI-SearchBot
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

User-agent: Claude-Web
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: PerplexityBot
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

User-agent: Perplexity-User
Allow: /

User-agent: Google-Extended
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

User-agent: GoogleOther
Allow: /

User-agent: Bingbot
Allow: /

User-agent: DuckAssistBot
Allow: /

User-agent: Applebot
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: Bytespider
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

User-agent: Amazonbot
Allow: /

User-agent: Meta-ExternalAgent
Allow: /

User-agent: Meta-ExternalFetcher
Allow: /

User-agent: cohere-ai
Allow: /

User-agent: cohere-training-data-crawler
Allow: /

User-agent: YouBot
Allow: /

User-agent: Diffbot
Allow: /

User-agent: MistralAI-User
Allow: /

User-agent: Kagibot
Allow: /

User-agent: Timpibot
Allow: /

# DeepSeek AI (major Chinese LLM - growing fast in global AI search)
User-agent: DeepSeekBot
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

User-agent: deepseek-ai
Allow: /

# xAI / Grok (Elon Musk's AI - powers Grok search in X/Twitter)
User-agent: xAI-Bot
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

User-agent: Grok
Allow: /

# Phind (AI search engine for developers - SWIFT/fintech audience)
User-agent: PhindBot
Allow: /
Disallow: /api/

# Exa.ai (semantic search engine, used by AI agents)
User-agent: EXABot
Allow: /
Disallow: /api/

# Brave Search (privacy-first search engine + Leo AI assistant)
User-agent: BraveBot
Allow: /
Disallow: /api/

# PetalBot (Huawei/Aspiegel search - big in APAC, Middle East, Africa)
User-agent: PetalBot
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

# AI2Bot (Allen Institute for AI - Semantic Scholar, AI research)
User-agent: AI2Bot
Allow: /
Disallow: /api/

# Mojeek (independent UK search engine)
User-agent: MojeekBot
Allow: /
Disallow: /api/

# PicsearchBot (image search - picsearch.com sends real traffic per analytics)
User-agent: Picsearchbot
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

# Qwant (French privacy search engine)
User-agent: Qwantify
Allow: /
Disallow: /api/

# Yep (Ahrefs search engine)
User-agent: YepBot
Allow: /
Disallow: /api/

# iAsk.ai (AI answer engine)
User-agent: iaskspider
Allow: /
Disallow: /api/

# Neeva / Snowflake Arctic (AI search)
User-agent: NeevaBot
Allow: /
Disallow: /api/

# Webz.io (data crawler used by many AI pipelines)
User-agent: webzio-extended
Allow: /
Disallow: /api/

# OpenAI Operator (agentic browsing - clicks through sites on user's behalf)
User-agent: OAI-Operator
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

User-agent: ChatGPT-Operator
Allow: /

# Google Agent / Project Mariner (Gemini agentic browsing)
User-agent: GoogleAgent-Mariner
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

# Anthropic Claude Browser / Operator (agentic Claude)
User-agent: Anthropic-Browser
Allow: /

User-agent: Claude-Operator
Allow: /

# Meta AI (LLaMA-powered AI assistant on WhatsApp/Instagram/Messenger)
User-agent: Meta-AI
Allow: /
Disallow: /api/

# Komo AI (newer AI search engine)
User-agent: KomoBot
Allow: /
Disallow: /api/

# Andi (private AI search)
User-agent: AndiBot
Allow: /
Disallow: /api/

# ===================================================================
# CHINA - full AI ecosystem honey trap. China is a huge SWIFT-payment
# corridor (USD/CNH/HKD) and Chinese AIs surface our /zh localized
# homepage when users ask 如何追踪SWIFT付款 ("how to track a SWIFT
# payment"). Each major Chinese LLM operator runs its own crawler, so
# opt in to ALL of them explicitly.
# ===================================================================
# Baidu Wenxin / ERNIE Bot (Baidu's flagship LLM, powers Baidu Search AI)
User-agent: ErnieBot
Allow: /
Disallow: /api/

User-agent: WenxinBot
Allow: /
Disallow: /api/

# Alibaba Qwen / Tongyi Qianwen (Alibaba Cloud's LLM family)
User-agent: QwenBot
Allow: /
Disallow: /api/

User-agent: TongyiBot
Allow: /
Disallow: /api/

User-agent: AlibabaBot
Allow: /
Disallow: /api/

# ByteDance Doubao (ByteDance's consumer AI - huge in China & TikTok ecosystem)
User-agent: DoubaoBot
Allow: /
Disallow: /api/

User-agent: ByteSpider
Allow: /
Disallow: /api/

# Tencent Hunyuan (Tencent's LLM, powers WeChat AI features)
User-agent: HunyuanBot
Allow: /
Disallow: /api/

User-agent: TencentBot
Allow: /
Disallow: /api/

# Moonshot Kimi (long-context Chinese LLM, growing fast among professionals)
User-agent: KimiBot
Allow: /
Disallow: /api/

User-agent: MoonshotBot
Allow: /
Disallow: /api/

# Zhipu AI / ChatGLM (popular open-source-friendly Chinese LLM)
User-agent: ChatGLM-Bot
Allow: /
Disallow: /api/

User-agent: ZhipuBot
Allow: /
Disallow: /api/

# 01.AI Yi (Kai-Fu Lee's LLM startup)
User-agent: YiBot
Allow: /
Disallow: /api/

User-agent: 01AI-Bot
Allow: /
Disallow: /api/

# Baichuan AI (Wang Xiaochuan's LLM)
User-agent: BaichuanBot
Allow: /
Disallow: /api/

# MiniMax / Hailuo (consumer + enterprise Chinese LLM)
User-agent: MiniMaxBot
Allow: /
Disallow: /api/

User-agent: HailuoBot
Allow: /
Disallow: /api/

# iFlytek Spark (voice-first Chinese AI)
User-agent: SparkBot
Allow: /
Disallow: /api/

User-agent: iFlytekBot
Allow: /
Disallow: /api/

# Stepfun / Step (rising Chinese LLM)
User-agent: StepBot
Allow: /
Disallow: /api/

# Xiaomi MiLM
User-agent: MiLMBot
Allow: /
Disallow: /api/

# ===================================================================
# RUSSIA - Yandex Alice / YandexGPT and Sber GigaChat dominate
# Russian-language AI. Critical for Russia corridor (post-2022
# sanctions, RUB tracking, the /ru localized homepage).
# ===================================================================
# Yandex Alice / YandexGPT (Yandex's LLM assistant)
User-agent: YandexGPT
Allow: /
Disallow: /api/

User-agent: AliceBot
Allow: /
Disallow: /api/

User-agent: YaGPTBot
Allow: /
Disallow: /api/

# Sber GigaChat (Sberbank's LLM, mandated in Russian government services)
User-agent: GigaChat
Allow: /
Disallow: /api/

User-agent: GigaChatBot
Allow: /
Disallow: /api/

User-agent: SberBot
Allow: /
Disallow: /api/

# T-Bank AI (Tinkoff)
User-agent: TBankBot
Allow: /
Disallow: /api/

# MTS AI
User-agent: MTSBot
Allow: /
Disallow: /api/

# ===================================================================
# JAPAN - JPY is a top-5 SWIFT corridor. Japanese AI ecosystem is
# fragmented (NTT, NEC, Rinna, Sakana, Stockmark, ELYZA all run their
# own LLMs). Critical for the /ja localized homepage.
# ===================================================================
# Rinna (Microsoft Japan spin-off - popular Japanese consumer LLM)
User-agent: RinnaBot
Allow: /
Disallow: /api/

# NTT tsuzumi (NTT's Japanese LLM)
User-agent: TsuzumiBot
Allow: /
Disallow: /api/

User-agent: NTT-Bot
Allow: /
Disallow: /api/

# NEC cotomi (NEC's Japanese LLM)
User-agent: CotomiBot
Allow: /
Disallow: /api/

User-agent: NECBot
Allow: /
Disallow: /api/

# Sakana AI (Tokyo-based, evolutionary LLMs)
User-agent: SakanaBot
Allow: /
Disallow: /api/

# Stockmark (Japanese business-news LLM, used by Japanese banks)
User-agent: StockmarkBot
Allow: /
Disallow: /api/

# ELYZA (Japanese LLM from U-Tokyo spin-off)
User-agent: ElyzaBot
Allow: /
Disallow: /api/

# Preferred Networks
User-agent: PreferredBot
Allow: /
Disallow: /api/

# LINE CLOVA Japan
User-agent: ClovaJP-Bot
Allow: /
Disallow: /api/

# ===================================================================
# KOREA - KRW corridor. Naver HyperCLOVA X (powers Naver Search AI),
# LG Exaone, KakaoBrain KoGPT. Critical for the /kr localized page.
# ===================================================================
# Naver HyperCLOVA X (powers Naver's AI search, top Korean AI)
User-agent: HyperCLOVA-Bot
Allow: /
Disallow: /api/

User-agent: ClovaX-Bot
Allow: /
Disallow: /api/

# LG AI Research Exaone
User-agent: ExaoneBot
Allow: /
Disallow: /api/

User-agent: LGAI-Bot
Allow: /
Disallow: /api/

# KakaoBrain (KoGPT, Kanana)
User-agent: KakaoBrain-Bot
Allow: /
Disallow: /api/

User-agent: KananaBot
Allow: /
Disallow: /api/

# Upstage Solar LLM (Korean enterprise LLM)
User-agent: SolarBot
Allow: /
Disallow: /api/

User-agent: UpstageBot
Allow: /
Disallow: /api/

# ===================================================================
# INDIA - INR corridor, /hi localized page. Sovereign Indian AI push
# is real (Krutrim, Sarvam, BharatGPT all launched 2024-25).
# ===================================================================
# Ola Krutrim (Bhavish Aggarwal's Indian sovereign LLM)
User-agent: KrutrimBot
Allow: /
Disallow: /api/

# Sarvam AI (Indic-languages LLM)
User-agent: SarvamBot
Allow: /
Disallow: /api/

# BharatGPT (CoRover.ai, Indian government-facing)
User-agent: BharatGPT-Bot
Allow: /
Disallow: /api/

# CoRover BharatBot
User-agent: BharatBot
Allow: /
Disallow: /api/

# ===================================================================
# MIDDLE EAST - AED/SAR corridors, /ar localized page.
# ===================================================================
# TII Falcon (UAE - top-tier open-source Arabic-strong LLM)
User-agent: FalconBot
Allow: /
Disallow: /api/

User-agent: TII-Bot
Allow: /
Disallow: /api/

# MBZUAI Jais (UAE - flagship Arabic LLM)
User-agent: JaisBot
Allow: /
Disallow: /api/

User-agent: MBZUAI-Bot
Allow: /
Disallow: /api/

# Inception ARX / Cohere Arabic
User-agent: ARXBot
Allow: /
Disallow: /api/

# G42 (UAE state AI champion)
User-agent: G42Bot
Allow: /
Disallow: /api/

# ===================================================================
# ISRAEL - AI21 Labs (Jurassic, Jamba) - top global LLM lab.
# ===================================================================
User-agent: AI21Bot
Allow: /
Disallow: /api/

User-agent: JurassicBot
Allow: /
Disallow: /api/

User-agent: JambaBot
Allow: /
Disallow: /api/

# ===================================================================
# EUROPE - beyond Mistral/Cohere (already covered above).
# ===================================================================
# Aleph Alpha Pharia (Germany - sovereign European LLM)
User-agent: AlephAlpha-Bot
Allow: /
Disallow: /api/

User-agent: PhariaBot
Allow: /
Disallow: /api/

# Mistral Le Chat
User-agent: MistralBot
Allow: /
Disallow: /api/

User-agent: LeChatBot
Allow: /
Disallow: /api/

# Silo AI (Finland - Poro / Viking LLM, EU sovereign)
User-agent: SiloAI-Bot
Allow: /
Disallow: /api/

# Pleias (France - open European LLM)
User-agent: PleiasBot
Allow: /
Disallow: /api/

# ===================================================================
# LATAM / BRAZIL - BRL corridor, /pt localized page.
# ===================================================================
# Maritaca Sabiá (Brazilian Portuguese LLM)
User-agent: MaritacaBot
Allow: /
Disallow: /api/

User-agent: SabiaBot
Allow: /
Disallow: /api/

# ===================================================================
# OTHER AI search / RAG infrastructure crawlers that surface us
# in agentic answers across all regions.
# ===================================================================
# Genspark (rising AI search engine - agentic browsing)
User-agent: GensparkBot
Allow: /
Disallow: /api/

# Liner (AI research/citation tool)
User-agent: LinerBot
Allow: /
Disallow: /api/

# Tako (visual AI search)
User-agent: TakoBot
Allow: /
Disallow: /api/

# Lepton AI Search
User-agent: LeptonBot
Allow: /
Disallow: /api/

# Arcsearch (Browser Company's AI search)
User-agent: ArcBot
Allow: /
Disallow: /api/

# Inflection Pi
User-agent: PiBot
Allow: /
Disallow: /api/

User-agent: InflectionBot
Allow: /
Disallow: /api/

# Reka (multimodal LLM from DeepMind alumni)
User-agent: RekaBot
Allow: /
Disallow: /api/

# Poe (Quora's multi-LLM aggregator)
User-agent: PoeBot
Allow: /
Disallow: /api/

# Character.AI (massive consumer install base)
User-agent: CharacterAI-Bot
Allow: /
Disallow: /api/

# Hugging Face Smol Agent / HF Inference crawlers
User-agent: HuggingFaceBot
Allow: /
Disallow: /api/

# Common Crawl (foundational training-data corpus for most LLMs)
User-agent: CCBot
Allow: /

# FacebookBot / Meta-LLaMA training crawler
User-agent: FacebookBot
Allow: /
Disallow: /api/

# OpenAI training data / GPTBot variants
User-agent: OpenAI-SearchBot
Allow: /
Disallow: /api/

User-agent: ChatGPT-User-Agent
Allow: /

# ---------------------------------------------------------------
# Russian-language search - Yandex sends real organic traffic
# (analytics shows yandex.ru is a top-4 referrer). Explicitly
# allow YandexBot + Yandex Metrika to maximise crawl coverage.
# ---------------------------------------------------------------
User-agent: Yandex
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

User-agent: YandexBot
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/
Crawl-delay: 1
# Yandex-only directives are kept inside this Yandex* user-agent block so
# they sit next to the crawler they target. Yandex documents Clean-param
# as an intersectional directive (read wherever it appears in the file),
# so this placement is for readability rather than a requirement.
# - Host: declared the canonical mirror so Yandex would stop indexing the
#   .replit.app alias as a duplicate. Yandex announced in 2018 that it no
#   longer reads Host and recommends a 301 redirect from the alias
#   instead, so treat this line as legacy.
# - Clean-param: drops tracking/session params so Yandex consolidates
#   /uetr-tracker?utm_source=...&fbclid=... into one canonical URL
#   instead of treating each variant as a duplicate page.
Host: ohmyfin.org
Clean-param: utm_source&utm_medium&utm_campaign&utm_term&utm_content&fbclid&gclid&yclid&_openstat&from&ref&source /

User-agent: YandexImages
Allow: /

User-agent: YandexMetrika
Allow: /

# ---------------------------------------------------------------
# Yahoo - Search results are Bing-powered (so Bingbot covers most
# of it) but Yahoo still runs its own Slurp crawler for Yahoo
# Japan and a few legacy product feeds. Allow explicitly so it
# can't fall back to the generic User-agent: * group.
# ---------------------------------------------------------------
User-agent: Slurp
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

User-agent: Yahoo! Slurp
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

User-agent: yahoo-mmcrawler
Allow: /
Disallow: /api/

# ---------------------------------------------------------------
# Chinese search ecosystem - Baidu (~60% share), Sogou (Tencent),
# 360 Search (Qihoo), Shenma (UC/Alibaba), Yisou (Alibaba),
# Bytespider (ByteDance / Toutiao / Douyin search). Critical for
# the /zh localized homepage to be discovered.
# ---------------------------------------------------------------
User-agent: Baiduspider
Allow: /
Disallow: /dashboard
Disallow: /profile
Disallow: /archive
Disallow: /api/

User-agent: Baiduspider-render
Allow: /
Disallow: /api/

User-agent: Baiduspider-image
Allow: /
Disallow: /api/

User-agent: Baiduspider-news
Allow: /
Disallow: /api/

User-agent: Baiduspider-video
Allow: /
Disallow: /api/

User-agent: Sogou web spider
Allow: /
Disallow: /api/

User-agent: Sogou inst spider
Allow: /
Disallow: /api/

User-agent: Sogou
Allow: /
Disallow: /api/

User-agent: 360Spider
Allow: /
Disallow: /api/

User-agent: 360Spider-Image
Allow: /
Disallow: /api/

User-agent: HaoSouSpider
Allow: /
Disallow: /api/

User-agent: YisouSpider
Allow: /
Disallow: /api/

User-agent: Yisouspider
Allow: /
Disallow: /api/

User-agent: Bytespider
Allow: /
Disallow: /api/

User-agent: Shenma
Allow: /
Disallow: /api/

# ---------------------------------------------------------------
# Other regional search engines worth covering
# ---------------------------------------------------------------
User-agent: Naverbot
Allow: /
Disallow: /api/

User-agent: SeznamBot
Allow: /
Disallow: /api/

# Sitemap index - points to all sub-sitemaps
Sitemap: https://ohmyfin.org/sitemap.xml

# Explicit sub-sitemap declarations - some crawlers (older Bingbot
# revisions, Naver, Seznam, Sogou) don't recursively follow sitemap
# index entries, so listing each shard here guarantees discovery.
Sitemap: https://ohmyfin.org/sitemaps/core.xml
Sitemap: https://ohmyfin.org/sitemaps/articles.xml
Sitemap: https://ohmyfin.org/sitemaps/seo.xml
Sitemap: https://ohmyfin.org/sitemaps/countries.xml
Sitemap: https://ohmyfin.org/sitemaps/images.xml

# RSS feed - knowledge-center articles + glossary updates, for feed
# aggregators and AI assistants that subscribe to changes.
# https://ohmyfin.org/feed.xml

# IndexNow key declaration
# Key file: https://ohmyfin.org/4b9c7e2a1f8d3650e7b62c9d4f0a813e.txt

# LLM-friendly site index
# https://ohmyfin.org/llms.txt
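
# ---------------------------------------------------------------
# Illustrative note on group matching (comments only, no live rule).
# A sketch of how an RFC 9309 style parser is expected to read this
# file: a crawler obeys the group whose User-agent token matches it
# and falls back to "User-agent: *" only when no named group does.
# A named group therefore does not inherit the "*" Disallow lines,
# and has to repeat any private paths it should still stay out of.
# "ExampleBot" below is a hypothetical token used only to show the
# pattern; it is not a crawler this site has seen.
#
#   User-agent: ExampleBot
#   Allow: /
#   Disallow: /dashboard
#   Disallow: /profile
#   Disallow: /archive
#   Disallow: /api/
#
# By the same reading, a named group that contains only "Allow: /"
# leaves every path open to that crawler, including paths that are
# disallowed for "*". Individual crawlers may deviate from the RFC,
# so check access logs before relying on this.
# ---------------------------------------------------------------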