Google discovers the content on your website by following links from one page to the next. Googlebot keeps following links until all of the pages have been crawled, which then allows them to be indexed. An index coverage issue (also called a crawl error or URL error) is specific to a particular page: when Googlebot tried to crawl the URL, it was able to resolve your DNS, connect to your server, fetch and read your robots.txt file, and then request the URL, but something went wrong after that. These issues can negatively affect how your pages appear in Google and Bing search results, so we encourage you to fix them.
Some common index coverage issues / crawl errors / URL errors are listed below:
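To make the link-following idea concrete, here is a minimal sketch of page discovery as a breadth-first walk over links. It is only an illustration, not how Googlebot is actually implemented; the `SITE` dictionary and the `discover_pages` function are hypothetical names, and the site is modeled as an in-memory map instead of real HTTP requests.

```python
from collections import deque

def discover_pages(link_graph, start):
    """Return every page reachable from `start`, in breadth-first order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        # Follow each link on the page that has not been seen yet.
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order

# Hypothetical site: each key is a page, each value lists the pages it links to.
SITE = {
    "/": ["/about", "/blog"],
    "/about": ["/"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": [],
    "/orphan": ["/"],  # no other page links here
}

print(discover_pages(SITE, "/"))
```

Note that `/orphan` is never reached in this sketch, because no other page links to it. A crawler that relies only on links cannot find a page that nothing points to.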
1. 404 Not Found Errors / Page Not Found Errors:
The 404 not found error / Page not found error is a standard HTTP response code that indicates the client requested a page the server could not find.
When you see 404 not found errors or Page not found errors for your URLs, it means Googlebot requested a page that does not exist on your server, for example because the page was deleted or moved without a redirect, or because a link points to a mistyped URL. Usually, when a visitor requests a page on your site that doesn’t exist, a web server returns a 404 (not found) error.
2. Soft 404 Errors:
A soft 404 error isn’t an official response code sent to a web browser. It’s just a label Google adds to a page within its index. Some poorly configured servers return a 200 (OK) code for a missing page when they should return a 404 response code. If the HTTP response header, which visitors never see, carries a 200 code even though the web page clearly states that the page isn’t found, the page might be crawled and indexed, which is a waste of resources for Google.
To combat this issue, Google notes the characteristics of 404 pages and attempts to discern whether a page really is a 404 page, regardless of the status code it returns. In other words, Google learned that if it looks like a 404, smells like a 404, and acts like a 404, then it’s probably a genuine 404 page.
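As a rough illustration of the idea, a soft 404 check could compare the status code with what the page text says. This is only a sketch under stated assumptions: Google's real detection is not public, and the `looks_like_soft_404` function and the `NOT_FOUND_PHRASES` list are hypothetical.

```python
# Hypothetical phrases that suggest a "not found" page; a real list would be longer.
NOT_FOUND_PHRASES = ("page not found", "doesn't exist", "does not exist")

def looks_like_soft_404(status_code, body_text):
    """Return True when a page reports 200 but its text reads like an error page."""
    if status_code != 200:
        # A real 404 (or any other non-200 code) is not a *soft* 404.
        return False
    text = body_text.lower()
    return any(phrase in text for phrase in NOT_FOUND_PHRASES)

print(looks_like_soft_404(200, "<h1>Page Not Found</h1>"))
print(looks_like_soft_404(404, "<h1>Page Not Found</h1>"))
print(looks_like_soft_404(200, "<h1>Welcome to our shop</h1>"))
```

The fix on the server side is to make the missing-page template return a genuine 404 status code, so that no heuristic is needed.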
3. Server Errors:
Server errors, or HTTP status codes from 500 to 599, are returned by a web server when it is aware that an error has occurred or is otherwise unable to process the request.
These are common server errors:
- 500 Internal Server Error
- 502 Bad Gateway
- 503 Service Unavailable
- 504 Gateway Timeout
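When going through server logs, it can help to turn a raw status code into a readable label. The sketch below uses Python's standard `http.HTTPStatus` enum for the official phrases; the `describe_server_error` function and the "Unrecognized Server Error" fallback label are hypothetical names chosen for this example.

```python
from http import HTTPStatus

def describe_server_error(status_code):
    """Return 'CODE Phrase' for a 5xx status code, or None for any other code."""
    if not 500 <= status_code <= 599:
        return None
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        # The code is in the 5xx range but has no standard name.
        phrase = "Unrecognized Server Error"
    return f"{status_code} {phrase}"

for code in (500, 502, 503, 504, 404):
    print(code, "->", describe_server_error(code))
```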
4. Access Denied:
An access denied error means exactly what it says: Googlebot was unable to access a page. You want Google to see every page you intend to have indexed, so the good news is that this error is relatively easy to correct.
Now that you are familiar with index coverage issues / crawl errors / URL errors, you should have a good basis for troubleshooting issues with your web servers or applications.
If you encounter any index coverage issues / crawl errors / URL errors that were not mentioned in this post, or if you know of other likely solutions to the ones that were described, feel free to discuss them in the comments!
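One way to spot candidate pages is to scan your own crawl results for status codes that signal a refused request. The sketch below assumes that 401 (Unauthorized) and 403 (Forbidden) are the codes of interest; that mapping is an assumption for illustration and not Google's definition, and `find_access_denied`, `ACCESS_DENIED_CODES`, and `SAMPLE_RESULTS` are hypothetical names.

```python
# Assumption: these status codes are treated as "access denied" in this sketch.
ACCESS_DENIED_CODES = {401, 403}

def find_access_denied(crawl_results):
    """Return the URLs whose status code indicates that access was refused."""
    return [url for url, status in crawl_results if status in ACCESS_DENIED_CODES]

# Hypothetical crawl results as (URL, status code) pairs.
SAMPLE_RESULTS = [
    ("/", 200),
    ("/account", 401),
    ("/admin", 403),
    ("/old-page", 404),
]

print(find_access_denied(SAMPLE_RESULTS))
```

Pages that turn up in such a list and are meant to be public usually need their login requirement or server permissions reviewed.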