Many have heard of the meta robots tag used in HTML pages to tell search engines how they should crawl and index them. But did you know there's also a way for non-HTML files – like images, PDFs, and text documents – to communicate the same instructions? It's called the X-Robots-Tag.
The X-Robots-Tag is an optional HTTP response header designed with one goal: to help search engine crawlers better understand how to handle your website content.
Here's what an X-Robots-Tag header response looks like:
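For example, a response that blocks indexing might include a header like this (the date is illustrative):

```
HTTP/1.1 200 OK
Date: Tue, 25 May 2021 21:42:43 GMT
(…)
X-Robots-Tag: noindex
(…)
```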
The X-Robots-Tag can also use a comma-separated list of directives and can specify a user agent (crawler). For example:
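A response along these lines illustrates the syntax – `noarchive` applies to all crawlers, while the `googlebot:` and `otherbot:` prefixes target specific user agents (`otherbot` is a placeholder name):

```
HTTP/1.1 200 OK
Date: Tue, 25 May 2021 21:42:43 GMT
(…)
X-Robots-Tag: noarchive
X-Robots-Tag: googlebot: nofollow
X-Robots-Tag: otherbot: noindex, nofollow
(…)
```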
The X-Robots-Tag is an invaluable tool for those who want more control over how search engines crawl and index their sites. Unlike the robots meta tag, it offers greater flexibility: it can apply crawler directives to non-HTML files, and at the site level rather than page by page.
It can be trickier to work with than other tags, but it offers real benefits, such as:
To maximize your visibility and minimize potential risks caused by search engine algorithms, it's important to understand how they work. And while no one knows exactly how Google's algorithm works, we do know that SEO is constantly changing and evolving. That's why staying up-to-date on the latest trends is so important.
Luckily, SEOLeverage™️ provides a Deep Dive SEO audit, which reviews your current performance, implements essential tags such as meta robots and X-Robots-Tag, checks your content and links, and more. Our team of experts is always on top of the latest SEO news and changes.
If you're an advanced user, X-Robots-Tags can help improve your web pages. But they should be used with caution: taking a backup beforehand is always wise!
It's worth employing them in two specific cases:
According to Google Search Central, any rule used in a meta robots tag can also be specified as an X-Robots-Tag. If you have access to your site's header.php, .htaccess, or server configuration file, you can set X-Robots-Tag headers. If you don't, meta robots tags remain available as an alternative for guiding search engine crawlers in the right direction.
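As a sketch of that equivalence, the following meta tag in an HTML page and the HTTP header below it express the same instruction:

```html
<!-- In the <head> of an HTML page -->
<meta name="robots" content="noindex, nofollow">
```

```
X-Robots-Tag: noindex, nofollow
```

The header form works for any file type, which is what makes it useful for PDFs and images.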
Google supports the following robots directives:
| Directive | What it does |
|---|---|
| `noindex` | Don’t show this media, page, or resource in search results. Without this directive, it may be indexed and shown in search results. |
| `nofollow` | Don’t follow the links on this page. Without this directive, crawlers may follow the links found in the document or page. |
| `noarchive` | Don’t show a cached link in search results. Without this directive, the search engine may show a “cached page” link in search results. |
| `nosnippet` | Don’t show a text snippet or video preview for this page. An image thumbnail may still be visible, if available. This applies to all search results at Google, such as Google Images, Web Search, and Discover. Without this directive, the search engine may show a snippet or preview of the page in search results. |
| `noimageindex` | Don’t index images on this page. Without this directive, images on the web page may be indexed and shown in search results. |
| `none` | Equivalent to `noindex, nofollow`. |
Take note: some search engines might interpret the same rules differently. For the complete set of directives, Google provides a full list here.
If you're looking to find the X-Robots-Tag on your page, here are step-by-step instructions for Chrome users:

1. Open the page and launch DevTools (F12, or right-click and choose Inspect).
2. Go to the Network tab and reload the page.
3. Click the resource you want to check (the page itself, a PDF, an image, etc.).
4. Under Headers, look in the Response Headers section for an X-Robots-Tag entry.
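You can also verify the header programmatically. Below is a minimal, self-contained Python sketch: it spins up a throwaway local server that attaches an X-Robots-Tag header to a dummy PDF, then reads the header back with the standard library – the same information Chrome shows under Response Headers. The file name, port, and header value are all local to this example.

```python
# Sketch: serve a dummy PDF locally with an X-Robots-Tag header,
# then read that header back -- mirroring what you'd see in
# Chrome DevTools under "Response Headers".
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        # The directive we want crawlers to see for this file
        self.send_header("X-Robots-Tag", "noindex, nofollow")
        self.send_header("Content-Type", "application/pdf")
        self.end_headers()
        self.wfile.write(b"%PDF-1.4 dummy")

    def log_message(self, *args):
        pass  # keep test output quiet

# Port 0 lets the OS pick a free port for this throwaway server
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/report.pdf")
resp = conn.getresponse()
print(resp.getheader("X-Robots-Tag"))  # -> noindex, nofollow
conn.close()
server.shutdown()
```

On a real site you would point an HTTP client (or simply DevTools) at the live URL instead of a local server.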
You can easily control the way content is crawled and indexed by adding an X-Robots-Tag to HTTP responses. A header set in your server configuration can apply across an entire site, directing search engines on what they are allowed to do with specific types of content.
For example, if you want all PDFs in your domain not to be indexed or have their links followed (noindex, nofollow), simply add the relevant snippet to the site’s root httpd.conf file or .htaccess file on an Apache web server.
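One such snippet, assuming an Apache server with mod_headers enabled, might look like:

```apache
# Apply noindex, nofollow to every PDF served from this site
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>
```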
On Nginx, you can add the equivalent rule to the site's .conf file.
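A comparable Nginx rule, assuming you can edit the relevant server block, might be:

```nginx
# Apply noindex, nofollow to every PDF served from this site
location ~* \.pdf$ {
  add_header X-Robots-Tag "noindex, nofollow";
}
```

Note that `add_header` only applies to successful responses by default, which is usually what you want here.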
Additionally, use the tag for non-HTML files, where meta robots tags can't be embedded. Here's an example of adding a noindex directive across an entire site:
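On Apache (again assuming mod_headers), a single line in httpd.conf or .htaccess sends the directive with every response:

```apache
# Ask search engines not to index anything on this site
Header set X-Robots-Tag "noindex"
```

The Nginx equivalent would be `add_header X-Robots-Tag "noindex";` inside the server block. Be careful with site-wide directives like this – this is exactly the case where taking a backup first pays off.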
Reputable web crawlers can only interpret X-Robots-Tag headers when they can access the URL. If you've blocked a particular page from being crawled in your robots.txt file, any directives in its response headers will be disregarded entirely, because the crawler never makes the request. That’s why it pays to keep an eye on that!
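To illustrate, a robots.txt rule like the following (the directory name is illustrative) prevents crawling entirely, so any X-Robots-Tag set on URLs under it is never seen:

```
# robots.txt -- /private/ is never fetched,
# so an X-Robots-Tag on /private/report.pdf goes unread
User-agent: *
Disallow: /private/
```

If your goal is to keep a URL out of the index, let it be crawled and serve the noindex header instead of blocking it in robots.txt.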
Do you use the X-Robots-Tag on your website? If you're running one, it's important to understand how the header works and how it can be used to your advantage.
The X-Robots-Tag is an HTTP response header that tells search engines what they can and cannot index on a website. By using it, you can ensure that your pages are properly indexed and ranked in search results. Additionally, it can help you prevent your pages from being displayed in unwanted ways, such as in cached versions of search results.