Let’s fix the Access Forbidden Pages Error (403)
“Access Forbidden (403)” is a status in Google Search Console. It means that some of your pages were not indexed because your server denied Googlebot access to them.
This is not typical, so this status may be an indication that your website requires a technical review.
Access Forbidden (403)
The usual indexing process begins with Googlebot detecting the URL. Google does not include it in the index immediately but crawls it to find out as much information as possible about its content.
Thanks to crawling, the search engine knows which queries are worth showing your pages for and whether they are of value to users.
Google rarely indexes a page it hasn’t crawled. And when that happens, it’s bad for SEO. Learn more by reading my article on the “Indexed, though blocked by robots.txt” status.
To crawl a page, Googlebot must behave much like a user’s browser: it sends a request for the URL to your server. Servers respond to such requests with HTTP status codes, which tell browsers and crawlers if and how they can access the contents of that URL.
The 403 status code is one of the possible server responses. It means that:
- Your server understands the request and knows where to find the page.
- The browsers or crawlers that make the request need permission to access that specific resource.
- Your server rejected the request because the credentials provided are not sufficient to grant that permission.
A 403 status code can be perfectly normal: it is a way to protect sensitive data from unauthorized visitors. However, when the server returns this status code to Googlebot, it indicates a problem.
Googlebot never provides any credentials while making the request, so in its case a 401 status code would be more appropriate. Code 401 means that the request was not completed due to lack of valid authentication credentials.
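The distinction between the two codes can be put into a few lines of Python. This is only an illustrative sketch of the rule described above, not how any particular server decides; the function name `access_status` and its two flags are made up for this example.

```python
from http import HTTPStatus


def access_status(has_credentials: bool, is_permitted: bool) -> HTTPStatus:
    """Pick a status code following the 401 vs. 403 distinction in the text.

    Both parameters are hypothetical flags used only for illustration.
    """
    if is_permitted:
        # The requester is allowed to see the resource.
        return HTTPStatus.OK
    if not has_credentials:
        # No credentials were sent at all, which is Googlebot's situation.
        return HTTPStatus.UNAUTHORIZED  # 401
    # Credentials were sent but do not grant access to this resource.
    return HTTPStatus.FORBIDDEN  # 403
```

A request without credentials to a protected page, as Googlebot would send it, maps to 401 in this sketch, while a logged-in user without the right permissions maps to 403.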
What causes this error? There are two possibilities:
- No search engine or server is perfect, so your server may simply be returning a 403 code instead of the more accurate 401 code. A page returning 401 is still not indexed; however, the problem can be solved by changing the server settings.
- Behind the 403 response, there is a deeper technical problem on your website, the source of which must be investigated.
How to troubleshoot “Access Forbidden (403)” pages?
You can find your affected pages with the “Access Forbidden (403)” status in the Page Indexing report. It’s easy to access from the left navigation bar in Google Search Console.
After clicking on the status name, you’ll see a graph showing how the number of affected pages has changed over time and the list of URLs. You can export the list using the button in the upper right corner.
What is very useful is that, before opening the list of URLs with the “Blocked due to access forbidden (403)” status, you can filter the report to show only the pages you have included in your sitemap.
This way, you can easily identify the URLs that need immediate fixing. Since you have them included in your sitemap, they are strategically important to you and must be indexed in order to bring organic traffic to your site.
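If you prefer to work with the exported list offline, the same filtering can be sketched in Python. The example below assumes the export was saved as CSV with a column named `URL` and that your sitemap is a standard XML sitemap; the column name and the function names are assumptions for illustration, so adjust them to match your actual files.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> set:
    """Collect every <loc> value from a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    }


def exported_urls(csv_text: str, column: str = "URL") -> list:
    """Read the URL column of the exported report (column name is assumed)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[column].strip() for row in reader if row.get(column)]


def priority_urls(csv_text: str, sitemap_xml: str) -> list:
    """Return affected URLs that are also listed in the sitemap."""
    in_sitemap = sitemap_urls(sitemap_xml)
    return [url for url in exported_urls(csv_text) if url in in_sitemap]
```

The URLs returned by `priority_urls` are the ones that are both affected by the 403 status and listed in your sitemap, so they are a reasonable place to start.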
Should all “Access Forbidden (403)” pages be indexed?
Evaluating your most important URLs leads to the first step in troubleshooting “Access Forbidden (403)”: determining whether the affected pages should be in Google’s index at all.
There are three possible scenarios:
1. You may want to avoid indexing pages that contain data that should not be found in a Google search.
However, having your server return a 403 status is not the best way to keep them out of the index. If you want pages to remain unindexed without adding clutter to your site, block them with the noindex tag.
2. There may be pages on your website that you want to appear in search while blocking their full content from users who are not logged in. A good example is a paywalled news article.
Googlebot will never log into your website, so to get these pages indexed you need to give Googlebot access to them without putting a login wall in its way. This means changing your server settings to treat the crawler differently than your users’ browsers.
It should be noted that Google is cautious about situations where Googlebot is shown different content than users. That’s why you need to provide structured data to let the crawler know that it’s dealing with paywalled content.
You can find instructions on adding structured data for subscriptions and paywalled content in Google’s guidelines. If you don’t follow them, you risk getting a manual penalty.
3. Finally, there may be pages on your site that you want to be public but are still returning a 403 status code to Googlebot.
Fixing these pages can take a long time, as it is not always possible to find the cause of the error right away.
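For the first two scenarios, the markup involved is small enough to sketch. The Python below generates a robots noindex meta tag and a JSON-LD snippet that marks part of an article as paywalled. The function names, the headline, and the `.paywall` CSS selector are placeholders invented for this example; check the property names against Google’s current guidelines before relying on them.

```python
import json


def robots_noindex_tag() -> str:
    """Meta tag for pages that should stay out of the index (scenario 1)."""
    return '<meta name="robots" content="noindex">'


def paywalled_article_json_ld(headline: str, paywall_selector: str) -> str:
    """Build a JSON-LD script tag flagging paywalled content (scenario 2).

    `paywall_selector` is the CSS selector of the element that wraps the
    paywalled part of the page; the value you pass is site-specific.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        "isAccessibleForFree": False,
        "hasPart": {
            "@type": "WebPageElement",
            "isAccessibleForFree": False,
            "cssSelector": paywall_selector,
        },
    }
    # Escape "</" so a value can never close the script tag early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"
```

Both snippets belong in the page’s HTML, for example `paywalled_article_json_ld("Example headline", ".paywall")` for an article whose paywalled section sits inside an element with the class `paywall`.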
Reasons why public pages return 403 status code
|Possible cause||How to fix it|
|Errors in your .htaccess file||The .htaccess file lets you change your server configuration when using shared hosting. Your content management system will usually generate it automatically. Deactivate your .htaccess file and create a new one. Then, crawl your pages with the Googlebot user agent to see your website from its viewpoint and make sure the problem goes away.|
|Faulty WordPress plugins||The “Access Forbidden (403)” status on WordPress pages may be caused by an incompatible plugin. Try to deactivate plugins one by one to find the root of the mess.|
|Wrong IP address||The error may occur if your domain name points to an incorrect IP address. Check your A records.|
|Malware infection||Scan your websites for signs of malware infection. Malware can generate and maintain errors in your .htaccess file.|
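To see what status code your server returns to a request carrying Googlebot’s user agent, a short script is often enough. The sketch below uses only Python’s standard library; `status_for` and `build_request` are hypothetical helpers, and the user agent string is the commonly published Googlebot token, which you should check against Google’s current documentation. Keep in mind that a server or firewall may also filter by IP address, so a spoofed user agent sent from your own machine will not always reproduce what the real Googlebot sees.

```python
import urllib.error
import urllib.request

# Commonly published Googlebot user agent string (verify before use).
GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)


def build_request(url: str, user_agent: str = GOOGLEBOT_UA) -> urllib.request.Request:
    """Create a GET request that identifies itself with the given user agent."""
    return urllib.request.Request(
        url, headers={"User-Agent": user_agent}, method="GET"
    )


def status_for(url, user_agent=GOOGLEBOT_UA, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code the server sends for this URL.

    `opener` is injectable so the function can be exercised without
    touching the network.
    """
    request = build_request(url, user_agent)
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises for 4xx/5xx responses; the code is what we want.
        return error.code
```

Running `status_for` on an affected URL once with the default user agent and once with a regular browser user agent lets you compare the two responses after each fix from the table.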
The long-term solution to your indexing problems
The above solutions will help you take care of indexing of certain pages and fix the “Access Forbidden (403)” status temporarily. However, they do not guarantee that the problem will not return.
The best way to maintain satisfactory index coverage is to perform regular technical SEO audits. Let us help you nip any growing threats in the bud.
The “Blocked due to access forbidden (403)” error occurs when Googlebot cannot crawl your page because the server rejected its request. It can be fixed by:
- adding a noindex tag to pages that should remain unindexed,
- changing your server settings and providing appropriate structured data for pages you want to be indexed but still protected by a login wall,
- investigating your .htaccess file, WordPress plugins, A records, and website security for pages you want to be indexed and visible to every visitor.
The “Access Forbidden (403)” status could be a sign of deeper technical SEO issues holding your website back. Get in touch with Go Start Business to investigate and beat them.