HTTP Status Code
You can see the status code for each crawled web page. A 2xx code indicates that the page loaded successfully.
3xx codes mean further action must be taken to fulfill the request, such as following a page redirect.
4xx codes are client errors, such as a broken or missing page (404). 5xx codes are server errors, which occur when the server fails to fulfill a valid request.
- Filter the pages with the status code
Pages that return 4xx and 5xx codes can hurt your site’s performance. Use the filters to show only the pages with 4xx and 5xx errors so you can fix them.
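As a rough illustration of the filtering described above, here is a minimal Python sketch. It assumes the crawl results are available as (URL, status code) pairs; the `crawl` list, the example.com URLs, and the function names are hypothetical and are not part of any particular tool.

```python
from collections import defaultdict


def status_class(code: int) -> str:
    """Map a status code to its class, e.g. 404 -> '4xx'."""
    return f"{code // 100}xx"


def group_by_class(pages):
    """Group crawled URLs by status class (2xx, 3xx, 4xx, 5xx)."""
    groups = defaultdict(list)
    for url, code in pages:
        groups[status_class(code)].append(url)
    return dict(groups)


def pages_to_fix(pages):
    """Keep only the pages that returned a 4xx or 5xx code."""
    return [(url, code) for url, code in pages if 400 <= code <= 599]


# Hypothetical crawl results, for illustration only.
crawl = [
    ("https://example.com/", 200),
    ("https://example.com/old", 301),
    ("https://example.com/missing", 404),
    ("https://example.com/api", 500),
]

for url, code in pages_to_fix(crawl):
    print(code, url)
```

Grouping by class first gives a quick overview of the whole crawl, while `pages_to_fix` narrows the list to the error pages that need attention.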
Canonicalization
It is crucial to have a search-engine-friendly canonical tag, rel="canonical", on your web pages.
A search engine can distinguish between duplicate and original content by comparing the content across different pages and websites. Google also recommends using canonical tags.
Self-referencing canonical tags help avoid duplicate content issues. To set one, add a canonical tag to the page that points to the page’s own URL.
If another site copies your content and claims it as its own, the self-canonical tag helps signal that your page is the original.
If you republish content from another website, add a canonical tag that points to the source. Google can then treat that source as the original.
According to the report, 67% of pages have been self-canonicalized, and 33% don’t have canonical tags.
Filter these pages and add a canonical tag to protect your web pages from duplicate content.
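To make the check concrete, the sketch below shows one way to classify a page as self-canonicalized, canonicalized to another URL, or missing a canonical tag. It uses only Python's standard `html.parser` module. The class and function names are hypothetical, and the URL comparison is deliberately simple (it only ignores a trailing slash), so treat it as an assumption-laden example and not as how any specific crawler works.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, or be absent.
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel:
            self.canonical = attr.get("href")


def canonical_status(page_url: str, html: str) -> str:
    """Return 'self', 'other', or 'missing' for a page's canonical tag."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return "missing"
    # Simplifying assumption: URLs match if they differ only by a trailing slash.
    same = finder.canonical.rstrip("/") == page_url.rstrip("/")
    return "self" if same else "other"


# Hypothetical page markup, for illustration only.
html_self = (
    '<html><head>'
    '<link rel="canonical" href="https://example.com/page/">'
    '</head></html>'
)
print(canonical_status("https://example.com/page", html_self))
```

Pages that come back as `missing` are the ones to filter and fix by adding a canonical tag.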