Setting up a Site Audit

Overview

A Site Audit crawls pages on your site and returns a summary of the audit results through Site Audit Reports. Site Audit Details provides a detailed analysis of the pages crawled, including redirect chain analysis, audits for duplicate content, canonical, hreflang, pagination and schema, crawl comparisons, and a lot more.

Requesting A Crawl

Click the + New Site Audit button at the top right of the page.

Basic Settings tab

The bare essential information needed to initiate a new Site Audit can be found in this tab.

Project Type: Select Existing Project to reuse a previously set up project, including its custom settings, or New Project to set up a fresh crawl with no inherited settings.

Project Name: If you selected an existing project, that project's name is displayed. For a new project, enter a name in the text field.

Language: The crawl data is tokenized and stored based on the language entered here, which allows efficient broad match searching of the title, meta description, h1 and h2. The default language is English.

Choose what to crawl: Crawls can be based on a specific URL, sitemap(s), an RSS feed, or an uploaded CSV list.

    Starting URL: Select the protocol (http or https) and input the URL where the crawl should begin. A validation string will appear to confirm the current status code of the URL.

    Sitemap(s): Select the protocol (http or https) and input the URL where the sitemap is located.

    RSS: Select the protocol (http or https) and input the URL where the RSS feed is located.

    Upload CSV: If you already know which URLs you want to crawl, list them in a single column and upload the list as a .csv file.
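If your URL list comes from another tool, a short script can produce the single-column file described above. The following is a minimal Python sketch, not an official template: the function name write_url_csv and the example.com URLs are hypothetical, and it assumes the upload expects one URL per row with no header row.

```python
import csv
import io


def write_url_csv(urls, handle):
    """Write one URL per row in a single column.

    Blank entries and exact duplicates are skipped. Assumption: no header
    row is written; adjust if your upload requires one.
    Returns the number of rows written.
    """
    writer = csv.writer(handle, lineterminator="\n")
    seen = set()
    count = 0
    for url in urls:
        url = url.strip()
        if not url or url in seen:
            continue
        seen.add(url)
        writer.writerow([url])
        count += 1
    return count


# Usage: build the CSV in memory; pass an open file handle instead to save it.
buffer = io.StringIO()
rows = write_url_csv(
    ["https://www.example.com/", "https://www.example.com/products"],
    buffer,
)
print(rows, "URLs written")
```

To save the list to disk, open the target file with newline="" and pass that handle in place of the in-memory buffer.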

Crawler Type: The Standard Crawl is the most common crawler and functions like most crawlers out there. The JavaScript Enabled crawl renders JavaScript when crawling, similar to how a browser would.

      Standard Crawl: Crawls the source of each page without any rendering, similar to the vast majority of crawlers out there. Use this to check maximum compatibility.
      JavaScript Crawl: An advanced version of our crawler that renders every page exactly as it would appear in a browser. Use this to check for issues that Google may encounter with its own JavaScript crawl capabilities. Crawling speed will be slightly slower since the crawler has to wait for JavaScript to finish rendering on each page.
            Block Resources: The resource URLs passed in this option are blocked from rendering when the JavaScript on the page is loaded. This field accepts multiple URL patterns (one per line).
            JavaScript Timeout: The amount of time the crawler should wait for the page to render before continuing. Google will typically wait up to 3 seconds. Using a value of 5 seconds is recommended.
Crawl Speed: This is the number of pages crawled per second. The time it takes to crawl a site will depend on this and the number of URLs. Speeds greater than 8 pages per second will establish a cluster crawl where multiple pages are crawled simultaneously. 

    Advanced: Limit the number of pages crawled per day
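To get a rough sense of how crawl speed and a daily page limit affect duration, the arithmetic can be sketched in Python. This is only a back-of-the-envelope estimate under stated assumptions: the function names are hypothetical, the crawler is assumed to sustain the configured speed, and server response time, JavaScript rendering and retries are ignored.

```python
import math


def estimate_crawl_seconds(total_urls, pages_per_second):
    """Lower-bound crawl duration in seconds at a constant crawl speed."""
    if total_urls < 0 or pages_per_second <= 0:
        raise ValueError("total_urls must be >= 0 and pages_per_second > 0")
    return total_urls / pages_per_second


def days_with_daily_limit(total_urls, daily_page_limit):
    """Minimum number of days needed when pages per day are capped."""
    if total_urls < 0 or daily_page_limit <= 0:
        raise ValueError("total_urls must be >= 0 and daily_page_limit > 0")
    return math.ceil(total_urls / daily_page_limit)


# Usage: 100,000 URLs at 5 pages per second, or capped at 30,000 pages per day.
hours = estimate_crawl_seconds(100_000, 5) / 3600
print(f"about {hours:.1f} hours at full speed")
print(days_with_daily_limit(100_000, 30_000), "days with the daily limit")
```

Real crawls will usually take longer than this estimate, so treat the result as a minimum when planning.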

Crawl Depth: Custom sets the number of links (levels) away from the starting URL that the crawl will look for pages. You can also switch to limiting the crawl by the number of pages crawled. Full Site Crawl will crawl all URLs found for that domain (depending on configuration, this could take a significant amount of time). Crawl only pages uploaded/found only appears for CSV or Sitemap crawls and will crawl just the URLs specified in the CSV or sitemap.
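The idea of depth as the number of links away from the starting URL can be illustrated with a small breadth-first traversal. This Python sketch is illustrative only and does not describe how the crawler is implemented: the link graph and the function name urls_within_depth are hypothetical, and it assumes the starting URL counts as depth 0.

```python
from collections import deque


def urls_within_depth(links, start, max_depth):
    """Return {url: depth} for every URL reachable within max_depth links.

    `links` maps each URL to the URLs it links to. Assumption: the
    starting URL is depth 0, and each URL keeps its shortest distance.
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if depths[url] == max_depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(url, []):
            if target not in depths:
                depths[target] = depths[url] + 1
                queue.append(target)
    return depths


# Usage with a hypothetical four-level site.
site = {
    "/": ["/a", "/b"],
    "/a": ["/c"],
    "/c": ["/d"],
}
print(urls_within_depth(site, "/", 2))
```

With a depth of 2 in this example, the page /d is never reached because it sits three links away from the starting URL.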

Description: This optional text field allows for any additional notes to be entered related to the crawl project.

Additional Settings

For more advanced information on setting up a crawl, review the Site Audit Projects article.


Kill Switch

The ability to stop or pause a crawl is available via the Site Audits page of the particular crawl. Please note it can take up to 60 minutes to completely stop. In case of questions, contact support@seoclarity.net.


Crawl Walkthrough Video



