Noindex Explained – All You Need To Know

The noindex tag is used to prevent search engines from indexing a specific page.

You might think that all pages on your website should be indexed, but that's not the case. In fact, preventing some pages from appearing in search results is an integral part of your indexing strategy.

What is a noindex tag?

The noindex tag is an HTML tag that is used to control how robots interact with a specific page or file on your site and prevent them from indexing that page or file.

You can tell search engines not to index a page by adding a noindex directive to the robots meta tag. Just add the following code to the <head> section of the page's HTML:

<meta name="robots" content="noindex">

Alternatively, the noindex directive can be sent as the x-robots-tag in the HTTP response header:

x-robots-tag: noindex

When a search engine bot like Googlebot crawls a page with a noindex tag, it won’t index it. If the page was previously indexed and the tag was added later, Google will drop it from the search results, even if other sites link to it.

Generally speaking, search engine crawlers are not required to obey robots directives – they act as suggestions rather than enforceable rules, and some crawlers may interpret robots meta values differently.

However, most major search engine crawlers, including Googlebot, respect the noindex directive.

Noindex vs nofollow

There are other robots meta directives supported by Google – the most popular include nofollow and follow. However, follow is the default behavior when no robots meta tag is present, so Google considers specifying it unnecessary.

The nofollow tag prevents search engines from crawling the links on the page. As a result, the ranking signals for that page will not be passed on to the pages you link to.

The noindex directive can be used alone, but it can also be combined with other directives. For example, you can add both noindex and nofollow if you don't want search engine robots to index the page or follow the links on it.
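
For example, a robots meta tag combining the two directives looks like this:

<meta name="robots" content="noindex, nofollow">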

If you implement a noindex tag but your page still appears in search results, it's possible that Google hasn't crawled the page since the tag was added. To ask Google to recrawl a page, you can use the URL Inspection Tool in Google Search Console.

When should the noindex tag be used?

The noindex tag should be used whenever you want to prevent pages from being indexed by Google.

Making less important pages not indexable is crucial because Google doesn’t have enough resources to crawl and index every page it finds on the web. At the same time, you need to decide which of your valuable pages should be indexed and prioritize their optimization.

Let's look at the kinds of pages you should apply the noindex tag to.

Apply noindex to:

  • Pages for products that are out of stock and will not be available again.
  • Pages that should not be accessed in search results, such as staging environments or password-protected pages.
  • Pages that are valuable to search engine bots but not to users – such as pages with links that help robots discover other pages.
  • Pages with duplicate content, which are especially prevalent on e-commerce websites. In this case, it's also recommended to use canonical tags to point search engines to the canonical versions of your pages and prevent duplicate content issues from recurring.

Making pages non-indexable should be done as part of a well-established indexing strategy.

You should never include noindex on valuable pages, such as:

  • your most popular product pages,
  • blog articles (unless outdated),
  • about and contact pages,
  • pages that describe the services you provide.

Generally, never place a noindex tag on pages that you expect to bring in high organic traffic.

Is your page ‘excluded by noindex’ in Google Search Console?

Read our article on how to fix this problem to unlock your indexing capabilities.

How to implement the noindex tag

The noindex tag can be placed in the site’s HTML code or HTTP response headers.

Some CMS plugins, such as Yoast SEO, can add noindex tags automatically to selected types of pages.

Let’s go through the two basic implementation methods step by step and analyze their advantages and disadvantages.

Insert a noindex tag into the HTML code of the page

The noindex tag can be implemented as a robots meta tag in the HTML of the page.

Robots meta tags are pieces of code used to control how a website is crawled and indexed. Users can't see them, but bots find them while crawling the page.

Let’s explain how the robots meta tag is constructed.

Inside the meta tag, there are pairs of attributes and values:

<meta name="robots" content="noindex">

The robots meta tag has two attributes:

  • name – specifies which bots the directive is addressed to,
  • content – contains the directives for the bots to follow.

Both attributes take different values depending on what you want the bots to do. Note that both the name and content attributes are case-insensitive.

The name attribute usually takes the value "robots", indicating that the directive targets all bots.

It's also possible to use the name of a specific bot instead, such as "googlebot", although you will encounter this much less often. If you want to address different bots differently, you'll need to create a separate meta tag for each of them.

Keep in mind that search engines use different crawlers for different purposes – check Google's list of crawlers.

Meanwhile, the content attribute contains the directive the bots should follow – in our case, "noindex". You can put more than one value there, separating the values with commas.
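
For instance, a meta tag that targets Google's crawler specifically and carries two directives looks like this:

<meta name="googlebot" content="noindex, nofollow">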

Pros and cons of robots meta tags

The HTML method is easier to implement and modify than the HTTP header method. It also does not require you to access your own server.

However, implementing the noindex tag in HTML can take a long time – you’ll need to manually add it to every page you want to noindex.

Add the noindex tag to the HTTP headers

Another solution is to specify the noindex directive in the x-robots-tag.

This is an element of the HTTP response header. HTTP headers are used for communication between the server and the client (a browser or a search engine bot).

You can configure it on your HTTP web server. The code will look a little different depending on the server you’re using – like Apache, Nginx, or others.

Here is an example of what an HTTP response with the x-robots-tag might look like:

HTTP/1.1 200 OK
(…)
x-robots-tag: noindex
(…)

Apache server

If you have an Apache-based server and want to noindex all files ending in ".pdf", add the directive to the .htaccess file.

This is the sample code:

<FilesMatch "\.pdf$">
  Header set x-robots-tag "noindex"
</FilesMatch>

Nginx server

If you have an Nginx-based server, add the directive to the .conf file:

location ~* \.pdf$ {
  add_header x-robots-tag "noindex";
}

The pros and cons of using HTTP headers

One important advantage of using noindex in the HTTP headers is that you can apply it to web documents that are not HTML pages, such as PDF files, videos, or images. Moreover, this method allows you to target whole sections of your site at once.

In addition, the x-robots-tag supports regular expressions (regex). In other words, you can target pages that should not be indexed by identifying what they have in common – for example, URLs that contain specific parameters or strings.
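
As an illustration, here is a minimal sketch of an Nginx rule that noindexes any URL whose query string contains a hypothetical sessionid parameter – the parameter name is just an example, and the exact configuration will depend on your server setup:

location / {
  # $args holds the query string; match it case-insensitively
  if ($args ~* "sessionid=") {
    add_header x-robots-tag "noindex";
  }
}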

On the other hand, you need access to your server to implement the x-robots-tag.

Adding the tag also requires technical skills and is more complex than adding robots meta tags to a website’s HTML.

How do you check your implementation of the noindex tag?

If you want to check whether noindex or other robots directives have been implemented correctly, the method depends on how they were added to the page.

So, if the noindex tag was added to the HTML of the page, you can check its source code, while for HTTP headers, you can use the Inspect option in Chrome's developer tools. These will show you which directives are present on a particular page.

Other options include entering the URL into the URL Inspection Tool in Google Search Console or using the Link Redirect Trace browser extension.
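
If you have command-line access, you can also check the response headers directly with a tool like curl (the URL below is a placeholder):

curl -I https://example.com/whitepaper.pdf

The -I flag fetches only the headers, so you can quickly confirm whether x-robots-tag: noindex appears in the response.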

More information about using the noindex tag

Here are some additional guidelines for using the noindex tag and details on its properties:

  1. When you don't include noindex in your code, the default is that robots can index your page.
  2. Watch out for errors in the code, such as misplaced commas – bots won't understand your directives if the syntax is wrong.
  3. Add the tags in your HTML code or in the HTTP response headers, but not both. Using both can have a negative impact if the directives in the two places conflict with each other – in that case, Googlebot will follow the more restrictive one.
  4. You can use the noimageindex directive, which works similarly to noindex but only prevents the images on a specific page from being indexed (see the example after this list).
  5. Over time, bots begin to treat noindex as noindex, nofollow. Many people noindex pages but combine the tag with the follow directive to make sure that bots still crawl the links on the page. However, Google has explained that noindex, follow will eventually be treated as noindex, nofollow because, at some point, Google stops crawling the links on noindexed pages. As a result, the pages those links point to may not be indexed, and the ranking signals passed to them can be diminished, which may negatively affect their rankings.
  6. Do not use noindex in robots.txt files. Although this rule was never officially supported, search engine bots used to follow noindex directives in robots.txt files. However, in September 2019, Google announced that it had retired all code that handled unsupported and unpublished rules in robots.txt – including noindex.
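
For example, the noimageindex directive mentioned in point 4 is added like any other robots meta value:

<meta name="robots" content="noimageindex">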

Noindex tags vs robots.txt files vs canonical tags

Noindex tags, robots.txt files, and canonical tags are often mentioned together because they can all be used to control the crawling and/or indexing of pages.

However, they do have some distinct characteristics that make them suitable in different situations.

We have established that the noindex tag controls whether specific pages on a website get indexed, operating at the page level.

Let’s look at how this compares to robots.txt files and canonical tags.

Robots.txt files

Robots.txt files can be used to control how search engine bots crawl parts of your website at the directory level.

Specifically, robots.txt files contain directives for search engine robots, most importantly "disallow" and "allow" rules. If robots follow a disallow directive, they won't crawl the blocked pages, and those pages typically stay out of the index.
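
For example, a simple robots.txt file that blocks all bots from a hypothetical /admin/ directory looks like this:

User-agent: *
Disallow: /admin/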

Robots.txt directives are widely used to save a website's crawl budget.

Be careful when implementing noindex tags and setting rules in robots.txt files. For the noindex command to be effective, the specified page must be available for crawling, which means it cannot be blocked by robots.txt.

If the crawler can't access the page, it won't see the noindex tag and therefore won't respect it. The page can then end up indexed and appear in search results – for example, if other pages link to it.

To noindex a page, allow it to be crawled in robots.txt and use the noindex meta tag to keep it out of the index – Googlebot will then see and follow the noindex directive.
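
As a minimal illustration, assuming a hypothetical /thank-you/ page, make sure no robots.txt rule blocks it and place the noindex tag on the page itself:

# robots.txt – no Disallow rule covers /thank-you/
User-agent: *
Disallow:

<!-- in the <head> of /thank-you/ -->
<meta name="robots" content="noindex">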

Canonical tags

Canonical tags are HTML elements that tell search engines which of several similar pages is the primary version that should be indexed. They are placed on the secondary pages and point to the canonical URL – as a result, those secondary pages should not be included in the index.
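
Placed in the <head> of a secondary page, a canonical tag might look like this (the URL is a placeholder):

<link rel="canonical" href="https://example.com/primary-page/">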

Canonical tags may limit the indexing of non-essential pages; however, Google will not always honor them. For example, if Google finds more links pointing to another page, it may treat that page as more important than the declared canonical URL and consider it the canonical version instead.

Also, canonical tags can only be detected by bots while crawling. Unlike robots.txt files, they cannot be used to prevent a page from being crawled.

One fundamental difference between canonical tags and noindex tags is that canonicalization consolidates ranking signals into a single URL. Noindexed pages, on the other hand, eventually stop passing ranking signals – which is vital with regard to internal links, because a noindexed page won't pass ranking signals to the URLs it links to.

Conclusion

Making low-quality pages non-indexable is one of the best SEO practices for improving your indexing strategy – and using the noindex meta tag is one of the best ways to keep a page out of Google's index.

With the tag, you can prevent unimportant pages from being indexed, helping search engine crawlers focus on your most valuable content.

This makes the noindex tag one of the essential tools in SEO, which is why we audit all of your noindex tags as part of our technical SEO services.

Effective crawling and indexing of your website is key to getting the most out of the organic traffic that valuable pages can direct to your site. To learn more about the indexing process, be sure to read our guide to SEO indexing next!
