If you’re using WordPress, Google Search Console may report hundreds or even thousands of pages that are not indexed. It’s worth checking which pages these are, but many of them are not a concern. Here’s why:
Alternate page with proper canonical tag
This page correctly points to the canonical page, which is indexed, so there is nothing you need to do. Alternate language pages are not detected by Search Console. (Source: Google Search Console documentation)
This is often the largest group of non-indexed pages in the report. Even a small blog can produce hundreds of them. An example of what you might see here is:
In the example above, someone most likely linked to the homepage with UTM tags appended. Since the utm_source says Twitter, we can assume Google found a link on Twitter that points to that URL. The canonical tag on that page correctly points to “https://reggiodigital.com”. We can’t control how other people link to the site, so anything can end up appended to a URL.
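To make the idea concrete, here is a minimal Python sketch of how a tagged link collapses back to the URL the canonical tag points to. It only illustrates the concept and is not how Google chooses a canonical. The function name `strip_tracking` and the sample URL are assumptions made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: only "utm_" parameters are treated as tracking tags here.
TRACKING_PREFIXES = ("utm_",)

def strip_tracking(url: str) -> str:
    """Return the URL with tracking query parameters and any fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the URL from the remaining query parameters, dropping the fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

# Hypothetical tagged link of the kind someone might share on Twitter.
print(strip_tracking("https://reggiodigital.com/?utm_source=twitter"))
```

Running this over the "Alternate page with proper canonical tag" URLs in an export should leave mostly pages you already recognise, which is a quick way to confirm the group is harmless.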
Excluded by ‘noindex’ tag
When Google tried to index the page it encountered a ‘noindex’ directive and therefore did not index it. (Source: Google Search Console documentation)
WordPress generates far more pages than the ones you create via Pages and Posts. For example, https://reggiodigital.com/search/thisisarandomstring/ is not a page created on this site, but it can still be visited if you enter the URL. It shows that no results were found, but a page is rendered nonetheless. We definitely don’t want this page, or any other search results page, indexed by Google. Other examples are paginated URLs like /page/3/.
As a result, you may find many URLs excluded this way, and that’s good!
If you’re using Yoast, you may find that /tag/ URLs are set to noindex. This may or may not be intentional, depending on your goals. It’s common to noindex tags, as there can be hundreds or thousands of them on a site.
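If you want to confirm that a page really carries a noindex directive, a rough check can be sketched in Python using only the standard library. It assumes you already have the page’s HTML and, optionally, the value of its X-Robots-Tag response header. The names `RobotsMetaParser` and `has_noindex` are made up for this example, and the sample HTML is hypothetical.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]

def has_noindex(html: str, x_robots_tag: str = "") -> bool:
    """True if the HTML or the X-Robots-Tag header value contains 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # Simplification: user-agent prefixes in the header (e.g. "googlebot: noindex")
    # are not handled by this sketch.
    header = [d.strip().lower() for d in x_robots_tag.split(",")]
    return "noindex" in parser.directives or "noindex" in header

# Hypothetical markup similar to what an SEO plugin might output on a tag archive.
sample = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(sample))
```

If the check returns True for a URL you actually want in Google, that is the case worth fixing in your SEO plugin settings. Otherwise the exclusion is working as intended.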
Crawled – currently not indexed
WordPress creates post-specific comment feeds for all posts.
https://reggiodigital.com/google-search-console-explaining-pages-arent-indexed-errors/feed/, for example, is a link you can visit, and these feeds are created intentionally. They can be safely ignored in Google Search Console because we wouldn’t want them indexed anyway.
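When the report lists hundreds of URLs, it can help to sort them into the WordPress-generated groups described so far (feeds, search results, pagination, tags) and see what is left. The Python sketch below is one way to do that, assuming the default pretty-permalink paths shown in this post. The pattern list, the `classify` function and the "review" label are illustrative choices, not anything Search Console provides.

```python
import re
from urllib.parse import urlsplit

# Assumption: default WordPress permalink structures, as in the URLs shown above.
PATTERNS = [
    ("feed", re.compile(r"/feed/?$")),
    ("search", re.compile(r"^/search/")),
    ("pagination", re.compile(r"/page/\d+/?$")),
    ("tag", re.compile(r"^/tag/")),
]

def classify(url: str) -> str:
    """Label a URL by the first matching pattern, or 'review' if none match."""
    path = urlsplit(url).path
    for label, pattern in PATTERNS:
        if pattern.search(path):
            return label
    return "review"

urls = [
    "https://reggiodigital.com/google-search-console-explaining-pages-arent-indexed-errors/feed/",
    "https://reggiodigital.com/search/thisisarandomstring/",
    "https://reggiodigital.com/page/3/",
]
for url in urls:
    print(classify(url), url)
```

Anything labelled "review" is what deserves a closer look by hand. The other groups are the ones this post describes as safe to ignore.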
Page with redirect
Google detected a URL that redirects to another page. You don’t want redirected URLs to be indexed, so this is normal. Links to these URLs may come from within the site or from other sites.
Blocked due to other 4xx issue
Depending on your hosting environment, you’re likely to find
https://www.seenicwander.com/wp-admin/admin-ajax.php here. This is intentional, and the URL should be blocked.