Even those who’ve been in the SEO business for a while can get confused about whether to use noindex meta tags or robots.txt files to control how search engines “see” web pages and whether those pages appear in search results.
We wrote in this post about some of the reasons to use robots.txt files on certain pages, and those reasons apply to noindex tags as well. That’s about where the similarities between robots.txt and noindex tags end, though, as you’ll see.
What’s the difference?
In the very simplest terms:
- A robots.txt file controls crawling. It instructs robots (a.k.a. spiders) that are looking for pages to crawl to “keep out” of certain places. You place this file in your website’s root directory.
- A noindex tag controls indexing. It tells spiders that the page should not be indexed. You place this tag in the code of the relevant web page. Here is an example of the tag:
<meta name="robots" content="noindex,follow"/>
When to use robots.txt.
Not all content on your site needs to be or should be found. There are instances in which you may not want sections of your site to appear in search results, such as pages meant only for employees, shopping carts, or thank-you pages.
Use the robots.txt file when you want control at the directory level or across your site. However, keep in mind that robots are not required to follow these directives. Most reputable crawlers, such as Googlebot, will comply, but it is safer to keep any highly sensitive information out of publicly accessible areas of your site altogether.
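For example, a bare-bones robots.txt file that asks well-behaved spiders to stay out of a few directories might look like this (the /cart/, /thank-you/, and /staff-only/ paths are just placeholders; substitute the directories that actually exist on your site):

User-agent: *          # applies to all spiders
Disallow: /cart/       # shopping carts
Disallow: /thank-you/  # thank-you pages
Disallow: /staff-only/ # employee-only information

The file lives at the root of your site (e.g., https://www.example.com/robots.txt), and each Disallow line blocks crawling of everything under that path.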
When to use noindex meta tags.
Like robots.txt files, noindex tags keep a page out of search results, but they work differently: the page can still be crawled, it just won’t be added to the index. Use these tags when you want control at the individual page level. (Note that a spider has to be able to crawl the page to see the noindex tag, so don’t also block that page in robots.txt.)
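To illustrate where the tag goes, it belongs inside the <head> of the page you want kept out of the index. Here is a minimal sketch of a hypothetical thank-you page:

<!DOCTYPE html>
<html>
<head>
  <title>Thanks for your order!</title>
  <!-- Tells spiders not to index this page, but to follow its links -->
  <meta name="robots" content="noindex,follow"/>
</head>
<body>
  <p>Your order is on its way.</p>
</body>
</html>

The "follow" part of the directive lets spiders continue on to any links on the page; use "noindex,nofollow" if you don’t want the links followed either.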
An aside on the difference between crawling and indexing: Crawling is how a search engine’s spiders discover and read your pages; what they find is then stored in the search engine’s index. Keeping this information in an index speeds up the return of relevant search results—instead of scanning every page on the web for each query, the engine searches its own, much smaller database. If there were no index, the search engine would have to examine every bit of data in existence related to the search term, and we’d all have time to make and eat a couple of sandwiches while waiting for results to display. The search engine runs its spiders continually to keep the index up to date.
Let’s be careful out there!
As we warned in our post on robots.txt files, there’s always the danger that you may end up making your entire website uncrawlable, so pay close attention when using these directives.