Server Robots Header (X-Robots-Tag)
What is the X-Robots-Tag?
The X-Robots-Tag is an HTTP response header that lets website operators give search engine crawlers precise indexing and crawling instructions. Unlike robots meta tags, which are placed in the HTML head of a page, the X-Robots-Tag is transmitted at the server level as part of the HTTP response.
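In a raw response it looks like this (illustrative excerpt; the directive has the same effect as a <meta name="robots" content="noindex"> tag, but never touches the HTML):
HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
X-Robots-Tag: noindex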
Advantages of the X-Robots-Tag
The X-Robots-Tag offers several clear advantages over conventional robots meta tags:
- Server-Side Control: Works for non-HTML files such as PDFs, images, and videos (see the example after this list)
- Early Processing: Crawlers receive the instructions with the HTTP response itself, before any HTML is parsed
- Flexibility: Can be set dynamically based on various conditions
- Reliability: Not affected by HTML parsing errors
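A robots meta tag cannot be placed inside a PDF, but the header travels with any file type. An illustrative response for a PDF:
HTTP/1.1 200 OK
Content-Type: application/pdf
X-Robots-Tag: noindex, nofollow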
Syntax and Implementation
Basic Syntax
X-Robots-Tag: [directive1], [directive2], [directive3]
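For example, to exclude a page from the index, keep it out of the cache, and suppress snippets:
X-Robots-Tag: noindex, noarchive, nosnippet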
Common Directives
| Directive | Description | Use Case |
| --- | --- | --- |
| noindex | Prevents page indexing | Test pages, internal areas |
| nofollow | Prevents link following | User-generated content |
| noarchive | Prevents display of a cached copy | Dynamic content |
| nosnippet | Prevents snippet display in search results | Confidential content |
| noodp | Prevented use of Open Directory Project descriptions (obsolete; the ODP has shut down) | Controlled meta descriptions |
| notranslate | Prevents offering a translation in search results | Language-specific content |
Implementation Examples
Apache (.htaccess and server configuration)
# Single page
<Files "test.html">
    Header set X-Robots-Tag "noindex, nofollow"
</Files>
# Directory-wide (<Directory> only works in the main server
# configuration, not in .htaccess; inside .htaccess, put the
# Header line in that directory's own .htaccess file instead)
<Directory "/admin">
    Header set X-Robots-Tag "noindex, nofollow"
</Directory>
# File type specific
<FilesMatch "\.(pdf|doc)$">
    Header set X-Robots-Tag "noindex"
</FilesMatch>
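These examples rely on Apache's mod_headers module; without it, the Header directive has no effect. On Debian/Ubuntu systems it can typically be enabled with:
sudo a2enmod headers
sudo systemctl restart apache2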
Nginx
# Single location
location /admin/ {
    add_header X-Robots-Tag "noindex, nofollow";
}

# File type specific
location ~* \.(pdf|doc)$ {
    add_header X-Robots-Tag "noindex";
}
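By default, add_header only attaches headers to a limited set of response codes (200, 204, 301, 302, 304, and a few others). To send the tag on error responses as well, append the always parameter, available since nginx 1.7.5 (the /drafts/ path is illustrative):
location /drafts/ {
    add_header X-Robots-Tag "noindex" always;
}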
PHP (Dynamic)
<?php
// header() must be called before any output is sent.
// $user is assumed to be your application's own user object.
if ($user->isLoggedIn() && $user->isAdmin()) {
    header('X-Robots-Tag: noindex, nofollow');
}

// For specific paths
if (strpos($_SERVER['REQUEST_URI'], '/test/') !== false) {
    header('X-Robots-Tag: noindex');
}
?>
Combinations and Best Practices
Common Combinations
| Combination | Purpose | Application Area |
| --- | --- | --- |
| noindex, nofollow | Complete exclusion | Admin areas, test pages |
| noindex, noarchive | No indexing, no cached copy | Dynamic, time-critical content |
| nofollow, noarchive | Page may be indexed, but links are not followed and no cached copy is shown | Pages with many untrusted external links |
| noindex, nosnippet | No indexing, no snippets | Confidential documents |
Best Practices
- Maintain Consistency: Avoid conflicting directives between the X-Robots-Tag and robots meta tags; when both are present, Google follows the most restrictive directive
- Test: Verify the implementation, for example with the URL Inspection tool in Google Search Console
- Documentation: Keep track of all X-Robots-Tag implementations
- Monitoring: Watch the impact on crawling and indexing behavior
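One way to enforce consistency is to route every robots header through a single helper that deduplicates directives before sending; a minimal PHP sketch (the function name is illustrative, not a standard API):
<?php
// Hypothetical helper: send one deduplicated X-Robots-Tag header.
function send_robots_header(array $directives): void {
    $unique = array_unique(array_map('trim', $directives));
    header('X-Robots-Tag: ' . implode(', ', $unique));
}

// Usage: duplicates collapse to "noindex, nofollow".
send_robots_header(['noindex', 'noindex', 'nofollow']);
?>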
Crawler-Specific Directives
Google-Specific Directives
X-Robots-Tag: googlebot: noindex, nofollow
X-Robots-Tag: googlebot-image: noindex
Bing-Specific Directives
X-Robots-Tag: bingbot: noindex
X-Robots-Tag: msnbot: nofollow
(msnbot is Bing's legacy crawler name; new rules should target bingbot.)
General Crawler Directives
A header value without a user-agent token applies to all crawlers. To address several crawlers differently, send one header per user agent rather than combining them in a single value:
X-Robots-Tag: noindex, nofollow
X-Robots-Tag: googlebot: noindex
X-Robots-Tag: bingbot: nofollow
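In PHP, several such headers can be sent for one response by passing false as header()'s second argument, which keeps earlier headers of the same name instead of replacing them:
<?php
header('X-Robots-Tag: googlebot: noindex');
// false prevents the previous X-Robots-Tag header from being replaced
header('X-Robots-Tag: bingbot: nofollow', false);
?>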
Common Errors and Solutions
Error 1: Duplicate Directives
Problem:
X-Robots-Tag: noindex, noindex, nofollow
Solution:
X-Robots-Tag: noindex, nofollow
Error 2: Wrong Syntax
Problem:
X-Robots-Tag: "noindex, nofollow"
Solution:
X-Robots-Tag: noindex, nofollow
Quotes belong around the value in Apache's Header directive, but they must not appear in the header value that is actually sent.
Error 3: Spaces in Directives
Problem:
X-Robots-Tag: no index, no follow
Solution:
X-Robots-Tag: noindex, nofollow
Testing and Checking
Tools for Verification
- Google Search Console: Monitor indexing status
- HTTP Header Checker: Online tools for header verification
- Browser Developer Tools: Network tab for header inspection
- cURL Commands: Command line tests
cURL Test Example
# Fetch only the response headers (-I) and filter for the tag
curl -sI https://example.com/admin/ | grep -i x-robots-tag
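To spot-check several protected paths at once, a small shell loop works well (the paths are illustrative):
for path in /admin/ /test/ /internal.pdf; do
  echo "$path: $(curl -sI "https://example.com$path" | grep -i x-robots-tag)"
done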
Monitoring and Analytics
Important Metrics
- Crawl Rate: How often are protected areas crawled?
- Indexing Status: Are X-Robots-Tag directives being followed?
- Error Rate: Are there parsing errors in headers?
Google Search Console Monitoring
- Page Indexing Report (formerly Coverage): Check that protected pages are not indexed
- Crawl Errors: Monitor crawling problems
- Sitemaps: Ensure protected URLs are not in sitemaps
Advanced Use Cases
Dynamic X-Robots-Tags
<?php
// Based on User-Agent (the header may be missing, so default to '';
// note that user-agent strings can be spoofed)
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
if (strpos($userAgent, 'Googlebot') !== false) {
    header('X-Robots-Tag: noindex');
}

// Based on IP address ($blockedIPs holds illustrative addresses)
$blockedIPs = ['203.0.113.10', '198.51.100.23'];
$clientIP = $_SERVER['REMOTE_ADDR'];
if (in_array($clientIP, $blockedIPs, true)) {
    header('X-Robots-Tag: noindex, nofollow');
}
?>
Content Management System Integration
WordPress:
// In the theme's functions.php. The send_headers hook fires on
// front-end requests, so an is_admin() check is not needed here.
add_action('send_headers', function() {
    if (is_user_logged_in()) {
        header('X-Robots-Tag: noindex, nofollow');
    }
});
Drupal 7:
// In the theme's template.php (arg() and drupal_add_http_header()
// are Drupal 7 APIs). Replace MYTHEME with the theme's machine name.
function MYTHEME_preprocess_html(&$variables) {
    if (arg(0) == 'admin') {
        drupal_add_http_header('X-Robots-Tag', 'noindex, nofollow');
    }
}
X-Robots-Tag Implementation Checklist
- [ ] Goal Defined: Which pages should be protected?
- [ ] Directives Chosen: Which X-Robots-Tag directives are needed?
- [ ] Server Configuration: Apache/Nginx configured correctly?
- [ ] Testing Done: Does the implementation work?
- [ ] Monitoring Set Up: Google Search Console monitored?
- [ ] Documentation Created: All implementations documented?
- [ ] Team Informed: All stakeholders know about the changes?