What do the status codes shown by the seoClarity crawler mean?

Introduction

Status codes are responses issued by a server to a client's request. A status code is a three-digit code; the first digit defines the class of the response.

Standard Status Codes

The HTTP response status codes can be broken down into the following categories: 

1XX - Informational - Request was received and is still being processed.
2XX - Success - Request was successful. Client's request was received, understood and accepted by the server. 
3XX - Redirection - Request was received by the server but further steps are needed from the client's end to successfully process the request. 
4XX - Client Error - Request was received by the server but cannot be processed. Typically means that the request from the client was either incorrect, had bad syntax or cannot be fulfilled. 
5XX - Server Error - Request from the client was valid but the server failed to fulfill it.
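
As a rough illustration of how the first digit determines the class, here is a minimal Python sketch. The function name `classify_status` and the `STANDARD_CLASSES` table are hypothetical names chosen for this example; they are not part of the seoClarity platform or any library.

```python
# Hypothetical helper: map a standard HTTP status code to its class
# using only the first digit, as described in the list above.
STANDARD_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def classify_status(code: int) -> str:
    """Return the class name for a three-digit status code."""
    if not 100 <= code <= 999:
        raise ValueError(f"not a three-digit status code: {code}")
    first_digit = code // 100
    # Codes outside the five standard classes are reported as "Unknown".
    return STANDARD_CLASSES.get(first_digit, "Unknown")


if __name__ == "__main__":
    for sample in (200, 301, 404, 503):
        print(sample, classify_status(sample))
```

Integer division by 100 keeps the lookup independent of the specific code, so a code such as 418 still resolves to its class even if the individual code is unfamiliar.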

Custom Status Codes

There is, however, one more class of status code you may encounter when our crawler traverses your site. When the errors received from the site do not fall within the standard responses, they are captured and indicated using a custom status code. We classify these custom responses under the 9XX category.

9XX - Custom - Shows a custom response received when crawling the client site.
The following custom status codes may be shown as a crawl status:
  1. 900 - Max Page Size Exceeded - Request failed because the size of the HTML was above 8 MB.
  2. 901 - Unsupported Content Type - Crawler received a Content-Type that is not text/html. For example, a PDF document (application/pdf) would produce this status.
  3. 940 - Bad Request Error - The request was invalid due to malformed input.
  4. 980 - System Timeout Error - The system exceeded the allotted time for processing the request.
  5. 981 - Dependency Failure - An upstream or downstream service dependency failed.
  6. 982 - Site Unreachable - The target site could not be reached.
  7. 983 - Malformed Site Response - The response from the target site was not well-formed or parseable.
  8. 984 - Navigation Failure - Navigation blocked by client site.
  9. 985 - Resource Limit Exceeded - The resource quota or limits were exceeded.
  10. 996 - Blocked by robots.txt - The page could not be crawled due to settings in the robots.txt file. The client needs to review the robots.txt settings.
  11. 998 - JavaScript Crawl Timeout - Page did not finish loading within the JavaScript timeout it was assigned.
  12. 999 - HTTP Fetch Failed - Page could not be fetched or the request timed out.
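
If you export crawl results and want to label the custom codes in your own scripts, a simple lookup table is usually enough. The sketch below is only an illustration: the names `CUSTOM_STATUS_CODES` and `describe_custom_status` are hypothetical, and the labels are copied from the list above rather than from any seoClarity API.

```python
# Hypothetical lookup table built from the custom status code list above.
CUSTOM_STATUS_CODES = {
    900: "Max Page Size Exceeded",
    901: "Unsupported Content Type",
    940: "Bad Request Error",
    980: "System Timeout Error",
    981: "Dependency Failure",
    982: "Site Unreachable",
    983: "Malformed Site Response",
    984: "Navigation Failure",
    985: "Resource Limit Exceeded",
    996: "Blocked by robots.txt",
    998: "JavaScript Crawl Timeout",
    999: "HTTP Fetch Failed",
}


def describe_custom_status(code: int) -> str:
    """Return a label for a 9XX crawl status, or a fallback if unlisted."""
    if code in CUSTOM_STATUS_CODES:
        return CUSTOM_STATUS_CODES[code]
    if 900 <= code <= 999:
        # In the 9XX range but not in the list above.
        return "Unlisted custom status"
    return "Not a custom status"


if __name__ == "__main__":
    for sample in (900, 996, 950, 404):
        print(sample, describe_custom_status(sample))
```

The fallback branches keep the helper from failing on standard codes or on 9XX values that are not in the list.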

