The XML sitemap is one of the first things Google bots look for when they find your website. This map is basically a “tour guide” for your website: it tells Google how to navigate your pages, helps the bots find all of them so they can be indexed, and shows the priority of each page based on your settings. As expected, these maps are very good for SEO. They help the bots find new or updated pages and index them accordingly, even if your internal linking is not set up properly or, for whatever reason, is not working.
What are XML sitemaps?
Especially on a more complex website, some pages can end up with no internal links pointing to them. This can happen either on purpose or by mistake, but either way, the lack of internal links makes these pages hard to find for both your users and Google bots. This is where an XML sitemap comes in handy. The map lists every page of a website, so Google can find them all. More importantly, each entry shows when the page was last modified, so Google crawlers understand what updates, if any, have been made to the site since the last crawl.
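To make the idea concrete, here is a minimal sketch of what such a map looks like and how one could be generated with Python's standard library. The function name `build_sitemap` and the example.com URLs and dates are made up for illustration; only the `urlset`, `url`, `loc` and `lastmod` element names and the namespace come from the sitemaps protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML (as a string) for an iterable of (url, lastmod) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod is expected as a W3C date string, e.g. YYYY-MM-DD.
        ET.SubElement(entry, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages, including one that has no internal links pointing to it.
pages = [
    ("https://example.com/", "2024-05-01"),
    ("https://example.com/orphaned-page", "2024-04-12"),
]
print(build_sitemap(pages))
```

In practice most sites let a CMS plugin generate this file, so a script like this is mainly useful for understanding what the plugin produces.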
Google bots periodically crawl the internet in search of new content and websites to index. Whenever they return to your site, they first check your sitemap for the latest updates and compare them with the data collected during the last crawl. If you have new pages or new content, they analyze it and then index it.
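The comparison described here can be illustrated with a small sketch. This is not how Google's crawler is implemented; it only shows the idea of checking each entry's last modified date against the date of a previous crawl. The function name `changed_since`, the sample URLs and the dates are assumptions, and the sketch assumes every `lastmod` value starts with a full YYYY-MM-DD date.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def changed_since(sitemap_xml, last_crawl):
    """Return the URLs whose <lastmod> is later than last_crawl (a date)."""
    root = ET.fromstring(sitemap_xml)
    changed = []
    for entry in root.findall("sm:url", NS):
        loc = entry.findtext("sm:loc", namespaces=NS)
        lastmod = entry.findtext("sm:lastmod", namespaces=NS)
        # Entries without a URL or a <lastmod> value are skipped in this sketch.
        if not loc or not lastmod:
            continue
        # Keep only the date part, so values with a time component also work.
        modified = date.fromisoformat(lastmod.strip()[:10])
        if modified > last_crawl:
            changed.append(loc.strip())
    return changed


# Hypothetical sitemap content and crawl date.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-05-01</lastmod></url>
  <url><loc>https://example.com/about</loc><lastmod>2023-11-20</lastmod></url>
</urlset>"""
print(changed_since(sample, date(2024, 1, 1)))
```

A script like this can also be handy on your side, for example to confirm that the dates in your sitemap really change when you update a page.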
However, a single XML sitemap can hold at most 50,000 URLs. So if your website has more than 50,000 posts, you have to split them across several sitemaps and list those in a sitemap index file. But don’t worry, you can have as many sitemaps as you need. As we can see in the Yoast example above, they have a different sitemap for posts, pages, authors, videos, etc.
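As a rough sketch of how the 50,000-URL limit can be handled, the snippet below splits a list of URLs into chunks and builds a sitemap index that points to one sitemap file per chunk. The helper names `chunk_urls` and `build_sitemap_index`, the file names such as `post-sitemap1.xml`, and the example.com URLs are assumptions for illustration; the `sitemapindex`, `sitemap` and `loc` element names come from the sitemaps protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit


def chunk_urls(urls, size=MAX_URLS):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document listing the given sitemap file URLs."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical site with 120,000 posts: more than one file can hold.
urls = [f"https://example.com/post-{n}" for n in range(120_000)]
chunks = chunk_urls(urls)
print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]

# One sitemap file per chunk; the file naming scheme is made up.
sitemap_files = [
    f"https://example.com/post-sitemap{i}.xml"
    for i in range(1, len(chunks) + 1)
]
print(build_sitemap_index(sitemap_files))
```

Each chunk would then be written to its own sitemap file, and only the index needs to be handed to search engines.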
Why should you use XML sitemaps?
Google points out that XML sitemaps are useful for “really large websites” with big archives, few external links, and those using rich media content. However, no matter the size of the website, a sitemap is always handy if you wish Google to be able to find your content. Also, make sure you update the sitemap periodically so you keep the crawlers informed.
In addition, the XML sitemap shows search engines whether there is duplicate content on your website and which of the copies the bots need to focus on. While duplicate content doesn’t affect your ranking directly, if the crawlers find the copy before the original, they will index the copy, and you will end up ranking with the wrong link. However, XML sitemaps show the last modified date, and from those dates search engines know immediately which pages have new content and which don’t. Hence, when they crawl the duplicate content, they know it is a copy based on an older page, even if they haven’t had a chance to index the original yet.
Use Google Search Console to keep search engines up to date
To make things even easier, Google included an XML sitemap section in Google Search Console because, much like you, Google also wants to check every page of your website. So, whenever you create a sitemap, make sure to go into your Google Search Console account and add the link there. This way, Google will know right away that you made an update and will hasten the crawl instead of waiting for the “scheduled” check.
Make sure you put good effort into building the XML sitemap. The process can also help you restructure the site: it is a chance to go over your most visited pages again and insert the correct links. When most of your pages can be reached through the website itself, you can rely on a good XML sitemap to help Google find any forgotten leftovers and to help you further optimize the crawling of your website.