Site Audit Details

Site Audit Details Overview

Site Audit Details provides a variety of crawl-based reports and analyses covering issues that can impact the health of a site.

Site Audit Details Use Cases

Identify and fix potential user experience issues.
Identify issues with site speed or performance of mobile pages.
Discover under-performing pages to update and assets that should not be in Google's index.
How to Check the Indexability of Your Site
How to Do an Hreflang Audit

Background and Requirements for Site Audit Details

A crawl must be run before crawl data can be viewed. Note: Site Audit Details contains data for all crawls run after June 6, 2020.

Site Audit Details

To view specific crawl data, select the project and date from the dropdown menu.

Project Selector: Select a project to view specific crawl summaries.

Date Selector: If multiple crawls have been run, select which crawl date you would like to view data from.

View Clarity Audit: Click this button to view the audit summary (this data can be viewed in Clarity Audits).

Compare: Compare the crawl details for 2 crawls within a project. This can be selected only when you have data for 2 crawls run within a project.

Crawl Configuration: This shows the details of the configuration of the crawl.

Details tab

Details Chart

Pages Crawled: This displays the total number of pages found for the selected crawl and the depth (links away from the starting URL) of the crawl.

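Depth Crawled: Displays how deep the crawl went, measured as the number of links away from the starting URL. More generally, crawl depth is the extent to which a search engine indexes pages within a website.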

Response Status: Displays a summary of the status codes found for the selected crawl.

Protocol: Displays a summary of the URL protocols found for the selected crawl.

Details Table

Title/URL: Displays the URL and Title of the page.

Status: Displays the status code found for that page on the date of the crawl.

Count of Issues: Displays the count of issues found for each URL. We also capture and display the HTML fragment that triggered the issue.
URLs that are not indexable will not be crawled for issues.

Viewing HTML fragments for crawl issues: https://kb.seoclarity.net/portal/en/kb/articles/view-html-fragments-for-crawl-issues-found

Depth: Displays the depth, which is approximately the number of clicks away from the homepage this page is.

Description: Displays the meta description found in the page's code.

Canonical: Displays the canonical URL found in the page's code.

Robots: Displays the robot tags found in the page's code.

H1: Displays the primary header found in the page's code.

H2: Displays the secondary header found in the page's code.

Word Count: This displays the count of words found on a page.

URL Length: This displays the length for each URL.

Title Length: This displays the length of the page's title for each URL.

Description Length: This displays the length of the page's description for each URL.

AMP URL: This displays the AMP URL found for each page.

View Port Content: This displays the content of the View Port tag found on the page.

H1 Length: This displays the length of the primary header found in the page's code.
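
The Depth column described above (approximately the number of clicks from the starting page) can be illustrated with a breadth-first search over a site's link graph. This is a minimal sketch under stated assumptions, not a description of how the product computes depth: the `links` map and the `crawl_depths` function are hypothetical names introduced for illustration.

```python
from collections import deque


def crawl_depths(links, start_url):
    """Return {url: depth}, where depth is the fewest clicks from start_url.

    `links` is a hypothetical adjacency map: page URL -> list of URLs it links to.
    """
    depths = {start_url: 0}
    queue = deque([start_url])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # in BFS, the first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical three-level site used only for illustration.
site = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/c"],
    "https://example.com/b": ["https://example.com/c", "https://example.com/"],
}
print(crawl_depths(site, "https://example.com/"))
```

Breadth-first order matters here: a page reachable by several paths is assigned the depth of its shortest path.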

Detail Table Downloads

Copy Visible Rows: Select this option to copy the table rows that are currently visible (based on the number of entries displayed) to your clipboard. This can be used to paste into other data repositories.

Download All: Choose this option to extract table data in its entirety based on your selected table. For example, if you are in the H1 tab and click Download All, you will get a download of each URL, its status, and the H1 on the page. Similar logic applies to all other options available in the detail dropdown.

Download Full Details: Select this option to extract data from all tables available in the Details Tab in a single file.

Download Sitemap: Select this option to download the sitemap based on the URLs in the details table. (Note: The sitemap can be downloaded after applying filters as well.)

Download Issues: Select this option to download all issues found per URL.

Redirects tab

Redirects charts

Redirected Pages: Displays the count of Redirected Pages found in a crawl.

Percent of Total Crawled Pages: This percentage is the count of redirected pages divided by the total count of pages found in a crawl.

Redirect Types: Displays a pie chart of the redirect types by status code for the selected crawl.

Redirect Count: Displays the count of redirects for the selected crawl.

Redirects table

Title/URL: Displays the URL and Title of the page.

Status Code: Displays the redirect status code found for that page on the date of the crawl.

Redirect Final URL: Displays the final URL found from the redirect.

Redirect Times: Displays the count of consecutive redirects.

Long Redirect: Displays True if the count of redirects is greater than or equal to 4.

Mixed Redirects: Displays True if there are multiple types of redirects found. For example, a 301 and a 302 redirect are found for the same URL.
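
The Redirect Times, Long Redirect, and Mixed Redirects columns can be sketched as simple checks over a redirect chain. The function below is an illustrative sketch only: `summarize_redirect_chain` and its `status_codes` input are hypothetical names, and the threshold of 4 is taken from the Long Redirect definition above.

```python
def summarize_redirect_chain(status_codes, long_threshold=4):
    """Summarize the redirect chain for one starting URL.

    `status_codes` is a hypothetical list of the 3xx codes seen, in order,
    before the final URL was reached, e.g. [301, 302].
    """
    redirect_times = len(status_codes)
    return {
        "redirect_times": redirect_times,
        # Long Redirect: 4 or more consecutive redirects.
        "long_redirect": redirect_times >= long_threshold,
        # Mixed Redirects: more than one distinct redirect type in the chain.
        "mixed_redirects": len(set(status_codes)) > 1,
    }


print(summarize_redirect_chain([301, 302]))
print(summarize_redirect_chain([301, 301, 301, 301]))
```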

Indexability tab

Indexability charts

Indexable Pages: Displays a count of URLs that are indexable by search engines.

Error Reasons: This displays the count of error reasons found during the crawl. 3xx means Redirection, 4xx means Client error and 5xx means Server error.

Blocked Reasons: This displays the count of blocked reasons found during the crawl.

By Robots.txt: This indicates the count of pages that are disallowed by Robots.txt.

By Robots Meta Tag: This indicates the count of pages that are blocked by the Robots Meta Tag on the page.

By X-Robots Header: This indicates the count of pages that are blocked by an X-Robots Header on the page.

Canonical: This indicates the count of pages that are not indexable because of a canonical to another page.

Indexability By Depth: Displays the count of URLs that are and are not indexable by depth (number of clicks away they are from the starting URL).

Indexability table

Title/URL: Displays the URL and Title of the page.

Status Code: Displays the status code found for that page on the date of the crawl.

Blocked by Robots Meta Tag: Displays Yes if the robots directive found on the URL is noindex.

Blocked by Robots.txt: Displays Yes if the URL is disallowed by robots.txt.

Blocked by X-Robots header: Displays Yes if the X-Robots header directive for the URL is noindex.

Robots Meta Tag Value: Displays the value of the robots meta tag where available.

Canonical Type: Displays the canonical URL for the page.
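
The blocked reasons in this tab can be approximated with Python's standard library. This is a rough sketch, not the product's logic: `blocked_reasons` and its parameters are hypothetical names, robots.txt rules are evaluated with `urllib.robotparser`, and the meta tag and header checks assume a simple substring match on `noindex`.

```python
from urllib.robotparser import RobotFileParser


def blocked_reasons(url, robots_txt, meta_robots="", x_robots_tag="", user_agent="*"):
    """Return the list of reasons a URL would be treated as blocked.

    All inputs are supplied by the caller (hypothetical crawl results):
    the robots.txt body, the robots meta tag content, and the
    X-Robots-Tag response header value.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    reasons = []
    if not parser.can_fetch(user_agent, url):
        reasons.append("robots.txt")
    if "noindex" in meta_robots.lower():
        reasons.append("robots meta tag")
    if "noindex" in x_robots_tag.lower():
        reasons.append("x-robots header")
    return reasons


robots = "User-agent: *\nDisallow: /private/\n"
print(blocked_reasons("https://example.com/private/page", robots))
print(blocked_reasons("https://example.com/public", robots, meta_robots="NOINDEX, follow"))
```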

Title/Meta/H1 tab

Title/Meta/H1 charts

Pages with Duplicate Title/Meta/H1: Displays the count of duplicates.
Percent of Total Crawled pages: Displays the percent of pages with duplicate Title/Meta/H1 when compared to the total count of pages found in a crawl.
Title/Meta/H1 Issues: This displays a summary of the pages that are duplicated, not set, and unique. Using titles as an example, Duplicate means the count of pages with duplicate titles, Not set means the count of pages in which no title is present, and Unique means the count of pages with a unique title.
Total Indexable Pages vs Duplicate Title/Meta/H1: Displays a donut chart with the percent of indexable pages vs those with duplicate elements.
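
The Duplicate, Not set, and Unique categories can be illustrated with a short sketch using titles. The function and input names are hypothetical, and treating whitespace-only titles as Not set is an assumption made for this example.

```python
from collections import Counter


def classify_titles(titles_by_url):
    """Label each URL's title as 'Duplicate', 'Not set', or 'Unique'.

    `titles_by_url` is a hypothetical map of URL -> title (or None).
    """
    counts = Counter(
        title.strip() for title in titles_by_url.values() if title and title.strip()
    )
    labels = {}
    for url, title in titles_by_url.items():
        cleaned = (title or "").strip()
        if not cleaned:
            labels[url] = "Not set"
        elif counts[cleaned] > 1:
            labels[url] = "Duplicate"
        else:
            labels[url] = "Unique"
    return labels


pages = {
    "https://example.com/a": "Shoes",
    "https://example.com/b": "Shoes",
    "https://example.com/c": "",
    "https://example.com/d": "Hats",
}
print(Counter(classify_titles(pages).values()))
```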

Duplicate Title/Meta/H1 table

Duplicate Title/Meta/H1: Displays the duplicated page element.

Page Count: Displays the number of pages with the duplicated page element.

Canonical Audit charts

Canonicals: This displays the count of pages with Canonical URLs found during the crawl.

Canonical Types: Displays a percentage summary of the type of canonical. Self means the canonical is self-referring, Other means that the canonical url on the page is pointing to a different page, None means no canonical is present, and External means the canonical is pointing to a page with a different domain.

Issues Count: These are the total on page and off page issues found on a crawl.

On Page: These are the issues found while analyzing the source of the page. We also capture the HTML fragment for issues. For more information, see https://kb.seoclarity.net/portal/en/kb/articles/view-html-fragments-for-crawl-issues-found

Off Page: These are the issues found while crawling the canonical URL. Off Page issues only populate if the canonical audit option is selected while setting up the crawl.

Canonical Audit table

Title/URL: Displays the URL and Title of the page.

Status Code: Displays the status code found for that page on the date of the crawl.

Canonical Type: Displays Same if the canonical is self-referring, None if no canonical is present, and Other if the canonical is different than the URL.

Canonical: Displays the canonical URL for that page.

Canonical Status Code: This is the status of the canonical url if it is different than the URL.

On Page Issues: These are the issues found while analyzing the source of the page.

Off Page Issues: These are the issues found while crawling the canonical URL. Off Page issues only populate if the canonical audit option is selected while setting up the crawl.
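
The canonical types described in this section (Self, Other, None, External) can be sketched as a small classifier. This is an illustration under assumptions: `canonical_type` is a hypothetical helper, URLs are compared as exact strings, and "different domain" is approximated by comparing hostnames, which treats subdomains as different.

```python
from urllib.parse import urlsplit


def canonical_type(page_url, canonical_url):
    """Classify a page's canonical as 'None', 'Self', 'External', or 'Other'."""
    if not canonical_url:
        return "None"
    if canonical_url == page_url:
        return "Self"
    # Assumption: a different hostname counts as a different domain.
    if urlsplit(canonical_url).hostname != urlsplit(page_url).hostname:
        return "External"
    return "Other"


print(canonical_type("https://example.com/a", "https://example.com/a"))
print(canonical_type("https://example.com/a", "https://example.com/b"))
print(canonical_type("https://example.com/a", "https://other.example.org/a"))
print(canonical_type("https://example.com/a", None))
```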

Hreflang tab

This report checks for rel-alternate-hreflang annotations on a page. It includes how many rel-alternate-hreflang entries are on a page and which countries and languages they cover. The report also checks whether the pages listed in the rel-alternate-hreflang entries are self-referring and whether any entries are not retrievable because of a response code error.
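
The on-page part of such a check can be sketched with Python's built-in HTML parser. The example below is a simplified illustration, not the report's implementation: `HreflangCollector` and `audit_hreflang` are hypothetical names, only `<link rel="alternate" hreflang="...">` tags in the HTML are considered, and self-reference is judged by an exact URL match. It does not fetch the referenced URLs, so response code errors are out of scope.

```python
from html.parser import HTMLParser


class HreflangCollector(HTMLParser):
    """Collect (hreflang, href) pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.entries = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "alternate" in rel and attr.get("hreflang") and attr.get("href"):
            self.entries.append((attr["hreflang"].lower(), attr["href"]))


def audit_hreflang(page_url, html):
    """Count hreflang tags, x-default entries, and self-references on a page."""
    collector = HreflangCollector()
    collector.feed(html)
    entries = collector.entries
    return {
        "tags_found": len(entries),
        "x_default": sum(1 for lang, _ in entries if lang == "x-default"),
        "self_reference": sum(1 for _, href in entries if href == page_url),
    }


sample = """
<head>
<link rel="alternate" hreflang="en-us" href="https://example.com/us/">
<link rel="alternate" hreflang="de-de" href="https://example.com/de/">
<link rel="alternate" hreflang="x-default" href="https://example.com/">
</head>
"""
print(audit_hreflang("https://example.com/us/", sample))
```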

Hreflang Charts

Hreflang: This shows the number of pages found in the crawl that had hreflang implemented.

Hreflang Attributes: The chart contains the top 10 hreflang attributes detected.

On Page Issues: These are the issues found while analyzing the source of the page. We also capture the HTML fragment for issues. For more information, see https://kb.seoclarity.net/portal/en/kb/articles/view-html-fragments-for-crawl-issues-found

Off Page Issues: These are the issues found while crawling and analyzing the hreflang URL. Off Page issues only populate if the Hreflang audit option is selected while setting up the crawl.

Hreflang table

Page: Displays the URL of the crawled page.

Tags Found: Displays the count of Hreflang tags found. Select the number to display the details.

X-Default: Displays the count of X-Default metadata. Select the number to display the details.

Self-Reference Language: Displays the count of self-referencing language metadata. Select the number to display the details.

On Page Issues: Displays the count of issues found on the page with the hreflang tags. Select the number to display the details.

Off Page Issues: Displays the count of issues found while crawling and validating the hreflang links.

Pagination Audit tab

Pagination Audit charts

Paginated Pages: Displays the count of pages with pagination.

Paginated Start Pages: Displays the count of pages where pagination begins.

Pagination Audit table

Title/URL: Displays the URL and Title of the page.

Status: Displays the status code found for that page on the date of the crawl.

Paginated: Displays True if the URL has pagination, False if it does not.

Next: Displays the URL of the next page in the sequence.

Previous: Displays the URL of the previous page in the sequence.

Canonical: Displays the canonical URL for that page.

Parent Page Details tab

Parent Page Details table

Title/URL: Displays the URL and Title of the page.

Status: Displays the status code found for that page on the date of the crawl.

Parent Page Count: Displays the count of pages with a link to that URL. This count displays all pages containing the searched URL.
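
Parent Page Count amounts to counting inbound links per URL. A minimal sketch, assuming a hypothetical `links` map of page URL to the URLs it links to, and assuming each linking page is counted once no matter how many times it links to the target:

```python
def parent_page_counts(links):
    """Return {url: number of distinct pages that link to it}.

    `links` is a hypothetical adjacency map: page URL -> list of linked URLs.
    """
    parents = {}
    for page, targets in links.items():
        for target in set(targets):  # count each linking page once per target
            parents.setdefault(target, set()).add(page)
    return {url: len(pages) for url, pages in parents.items()}


link_map = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/b", "https://example.com/b"],
}
print(parent_page_counts(link_map))
```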

Schema tab

Schema table

URL: Displays the URL of the crawled page.

Twitter Markup: Displays the count of unique Twitter markup metadata. Select the number to display the details.

OpenGraph Markup: Displays the count of unique Facebook markup metadata. Select the number to display the details.

Performance tab

Performance charts

Crawl Performance: Displays the count of pages crawled and the depth (links away from the starting URL) of the crawl.

Time to First Byte: Displays the duration from the user's browser making an HTTP request to the first byte being returned by the server.

Download Time: Displays the count of pages with a fast, medium, and slow download time.

Crawl Rate: Displays the count of pages crawled per hour.

Download Latency: Displays the average speed at which the crawler was able to access the crawled URLs based on the host server.

Performance table

Title/URL: Displays the URL and Title of the page.

TTFB: Displays the duration from the user's browser making an HTTP request to the first byte being returned by the server for that URL.

Download Time: Displays the duration of the download for the URL.

Resources Loaded: This applies to JavaScript crawls only. We capture and display the count of resource URLs found per page, along with each resource URL, its status, whether it was cached, the request type, and the resource type.

For more information about resource URLs, see https://kb.seoclarity.net/portal/en/kb/articles/view-details-of-all-resource-urls-rendered-in-javascript-crawls

Page Size: Displays the size of the page in kilobytes.

Custom tab

This tab displays details for any custom search that was set up for the crawl. This report consists of 2 tabs: Content Extraction and Content Match.

Content Extraction: This tab contains the pages, content, and links based on the div, CSS, or XPath entered while setting up a crawl.

Content Extraction Table

Title/URL: Displays the URL and Title of the page.

Status Code: Displays the status code of the page.

Word Count: Displays the count of words for custom content found.

Content: Displays the custom content found based on the crawl settings.

Links: Displays the custom links found based on the crawl settings.

Content Match: This tab contains the pages and count of occurrences found on a page based on the search string entered while setting up a crawl.

Content Match Table

Title/URL: Displays the URL and Title of the page.

Status Code: Displays the status code of the page.

Occurrences: Displays the count of occurrences for the string being searched for.
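
Counting occurrences of a search string on a page can be sketched as follows. This is an illustration only: `count_occurrences` is a hypothetical helper, and case-insensitive matching by default is an assumption, since the matching rules are not stated here.

```python
def count_occurrences(page_text, search_string, case_sensitive=False):
    """Count non-overlapping occurrences of search_string in page_text."""
    if not search_string:
        return 0
    if not case_sensitive:
        # Assumption: matching ignores case unless requested otherwise.
        page_text, search_string = page_text.lower(), search_string.lower()
    return page_text.count(search_string)


text = "Free shipping. FREE returns. free gift."
print(count_occurrences(text, "free"))
print(count_occurrences(text, "free", case_sensitive=True))
```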

Structured Schema Tab

This tab shows all Structured Schema found on each page in the crawl.

Structured Schema Chart

Structured Schema: This shows the number of pages with schema found in the crawl along with the total number of pages crawled. The donut chart shows the percentage of pages with schema that were found in the crawl.

Schema Types Found: The table shows the types of schema found in the crawl, along with the total number of pages found per schema type.

Format: These are the formats of the schema found in the crawl, along with the number of pages found for each format.

Structured Schema Table

Title/URL: Displays the URL and Title of the page.

Schema Found: The total number of schema types found per page.

Clicking on the count of schema found shows you all the schema types that were found on the page along with the format that it was implemented in.

Pausing Crawls

Crawls can be temporarily paused. After 7 days, paused crawls are automatically stopped.

Data Retention

Site Audits crawl data is retained for 13 months.

When Site Audit HTML data is saved, it is retained for 30 days.

Related Articles
Setting up a Site Audit
Site Audit Report
Site Health: Hreflang Audit
Site Health: View Details of all Resource Urls rendered in Javascript crawls
Site Health: Indexability - How to Check the Indexability of your site
