Robots.txt Generator
Create robots.txt files for your website. Control how search engines and other bots crawl your content (note that robots.txt governs crawling, not indexing - a blocked page can still appear in search results if other sites link to it).
Rules
Rule #1
Use * to target all bots, or specify a bot name such as Googlebot or GPTBot
Enter paths that should be accessible to crawlers (one per line)
Enter paths that should be blocked from crawlers (one per line)
Optional: Time in seconds between crawler requests (not all crawlers honor Crawl-delay; Googlebot ignores it)
Optional: Add your sitemap URL to help search engines find your content
Generated robots.txt
User-agent: *
Allow: /
How to Use
- Configure user agent (use * for all bots)
- Add paths to allow or disallow for each rule
- Optionally add sitemap URL and crawl delay
- Copy the generated robots.txt content
- Upload to your website root directory as robots.txt
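The assembly steps above can be sketched in code. This is a minimal illustration, not the generator's actual implementation; the `build_robots_txt` helper and its rule-dictionary shape are invented for the example.

```python
def build_robots_txt(rules, sitemap=None):
    """Assemble robots.txt text from a list of rule dicts.

    Each rule dict uses the keys 'user_agent', 'allow' (list of paths),
    'disallow' (list of paths), and optionally 'crawl_delay' (seconds).
    These names are illustrative, not part of any standard API.
    """
    blocks = []
    for rule in rules:
        lines = [f"User-agent: {rule['user_agent']}"]
        lines += [f"Allow: {p}" for p in rule.get("allow", [])]
        lines += [f"Disallow: {p}" for p in rule.get("disallow", [])]
        if "crawl_delay" in rule:
            lines.append(f"Crawl-delay: {rule['crawl_delay']}")
        blocks.append("\n".join(lines))
    if sitemap:
        blocks.append(f"Sitemap: {sitemap}")
    # Groups are separated by blank lines, per robots.txt convention
    return "\n\n".join(blocks) + "\n"

print(build_robots_txt(
    [{"user_agent": "*", "allow": ["/"], "disallow": ["/admin/"]}],
    sitemap="https://example.com/sitemap.xml",
))
```

The result is plain text, so the only remaining step is saving it as robots.txt in the site's root directory.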
What is robots.txt?
A robots.txt file is a text file placed in the root directory of a website that tells web crawlers and bots which pages or sections of the site they can or cannot access. It is part of the Robots Exclusion Protocol (REP), a group of web standards that regulate how robots crawl the web.
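Crawlers that honor the protocol read this file before fetching pages. Python's standard-library `urllib.robotparser` shows how the rules are interpreted; the file contents and URLs below are a made-up example.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt, parsed from memory so no network is needed
rules = """\
User-agent: *
Disallow: /admin/
Allow: /
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Public pages are crawlable; anything under /admin/ is not
print(parser.can_fetch("Googlebot", "https://example.com/blog/post"))    # True
print(parser.can_fetch("Googlebot", "https://example.com/admin/login"))  # False
```

A compliant crawler performs essentially this check against your live robots.txt before requesting any URL on your site.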
Why Do You Need robots.txt?
- Control Crawling: Prevent crawlers from accessing private, duplicate, or unimportant pages
- Save Bandwidth: Reduce server load by limiting crawler access to essential content
- Protect Sensitive Areas: Block access to admin pages, staging environments, and internal tools
- Manage AI Crawlers: Control whether AI training bots like GPTBot can access your content
- Improve SEO: Help search engines focus on your most important pages
Common User Agents
Search Engine Bots
- Googlebot (Google)
- Bingbot (Bing)
- Slurp (Yahoo)
- DuckDuckBot (DuckDuckGo)
- Baiduspider (Baidu)
AI Crawlers
- GPTBot (OpenAI)
- CCBot (Common Crawl)
- Google-Extended (Google AI)
- anthropic-ai (Anthropic)
- ClaudeBot (Anthropic)
Best Practices
- Always include a sitemap directive to help crawlers discover your content
- Test your robots.txt before deploying using Google Search Console
- Do not use robots.txt to hide sensitive information - the file is publicly readable and compliance is voluntary, so use authentication instead
- Keep your robots.txt file simple and well-organized
- Review and update your robots.txt regularly as your site changes
Common Examples
Allow All Crawlers
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
Block AI Crawlers
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
Block Specific Directories
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /private/
Disallow: /tmp/

Sitemap: https://example.com/sitemap.xml