A “sitemap” is a file you submit to Google to tell the search engine which pages on your website you want it to crawl. A sitemap can’t keep pages out of the index; it only lists the pages you want found. Your site doesn’t need a sitemap in order for most search engines to crawl and index it. However, a sitemap gives Google more information about your site, such as metadata, image and video details, and other site context that makes it easier for the search engine to know how (and how often) to crawl your site and show it in search engine results pages (SERPs).

You likely won’t need to create a sitemap yourself if you’re using a hosting environment that creates and manages sitemaps on behalf of its users, such as Google Sites. If you manage your site independently, you can view, test, and add new sitemaps via the Sitemaps report in the Google Search Console.

In this post, we’ll help you decide whether you actually need a sitemap, and if so, how to create, test, and manage it within the Google Search Console.

How to decide if your website needs a sitemap

You don’t need a sitemap in order for your web pages to appear in the search engine results page. If your site is user-friendly and organized in a coherent manner – meaning that it flows logically and has no “orphaned pages” that aren’t linked from any other page – Googlebot will automatically crawl your site and index it for the SERPs.

However, there are several instances where having a sitemap can be helpful for indexing purposes. Those instances are:

  1. Your site is new. Sites with a low number of backlinks (like new websites) aren’t as easy to index as more mature sites that have been linked to by lots of other pages. Sitemaps tell the search engine how to index sites even without backlinks.
  2. Your site is large. The bigger the site, the more difficult it is for Google to tell when changes have been made or new pages have been added. A sitemap can help Google know how to crawl big sites.
  3. It has orphaned pages. Any pages that aren’t linked to by other pages on your site will be difficult for Google to access and crawl. A sitemap can make these pages more discoverable for the search engine.
  4. It has rich media elements. Sites that show up in Google News or that use rich media content can be difficult for Google to format correctly in search results. A sitemap can give Google the extra information it needs to display that content properly.

How to build and submit your sitemap

If you decide that your site could use a sitemap for easier crawling, there are a few steps you should take to ensure the process is successful.

First, you need to determine which pages on your site you want to tell Google to crawl. Once you’ve nailed that down, you need to determine which pages are the canonical versions so Google knows not to double index anything.

Next, choose a sitemap format. Google Search Console supports several different sitemap formats and requires all sitemaps to follow standard sitemap protocol. You can create your own sitemap or use a third-party tool to generate it for you if you’re not confident in your ability to follow all protocols.
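If you go with the XML format, a few lines of Python can generate a basic file. This is a minimal sketch, assuming a short hard-coded list of pages: the function name `build_sitemap` and the example.com URLs are placeholders of ours, not part of any Google tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the standard sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a UTF-8 encoded XML sitemap (as bytes) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        url_element = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & for us when serializing.
        ET.SubElement(url_element, "loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical pages; replace with the canonical URLs you picked earlier.
pages = [
    "https://www.example.com/",
    "https://www.example.com/products?category=shoes&page=2",
]
sitemap_bytes = build_sitemap(pages)
print(sitemap_bytes.decode("utf-8"))
```

You would save the returned bytes as something like sitemap.xml on your web host. A third-party generator does roughly the same thing while handling more of the optional fields.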

Regardless of the format you choose, a single sitemap must be 50 MB or smaller (uncompressed) and contain 50,000 or fewer URLs. If you’re just starting out with Google Search Console, it’s unlikely you’ll have a site that’s larger than this, but if you do, you can break your URL list into multiple sitemaps. In this case, you also have the option of creating a sitemap index file that gives Google a list of all of your sitemaps. You can submit as many sitemaps as you need to Google as long as they are formatted correctly. You can learn more about splitting your sitemap into multiple files via Google’s sitemap tutorial page.
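To make the splitting step concrete, here is a rough sketch in Python. It assumes you already have your full list of URLs in memory; the helper names (`chunk_urls`, `build_sitemap_index`) and the sitemap file names are hypothetical. It only enforces the 50,000-URL limit, so you would still need to check each file’s size against the 50 MB limit yourself.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive groups of at most `size` URLs."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a UTF-8 encoded sitemap index (as bytes) pointing at each sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap_element = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap_element, "loc").text = sitemap_url
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# Hypothetical site with 120,000 pages, which needs three sitemap files.
all_urls = [f"https://www.example.com/page-{n}" for n in range(120_000)]
chunks = chunk_urls(all_urls)
sitemap_locations = [
    f"https://www.example.com/sitemap-{i}.xml" for i in range(1, len(chunks) + 1)
]
index_bytes = build_sitemap_index(sitemap_locations)
print(len(chunks))  # 3
```

Each chunk would be written out as its own sitemap file, and the index file is the one you would then point Google to.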

The last step is to make your sitemap accessible to Google. You can do this by referencing it in your robots.txt file or by submitting it directly to Google via the Search Console Sitemaps tool. If you choose the robots.txt route, specify the path to your sitemap with a line that looks like this:

Sitemap: http://yoursite.com/sitemap_location.xml

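You could then use the “ping” function to ask Google to crawl the sitemap by sending an HTTP GET request to the address below. Note that Google has since deprecated this ping endpoint, so don’t rely on it; the robots.txt entry or a Search Console submission is the dependable route.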

http://www.google.com/ping?sitemap=<your sitemap URL>
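If you take the robots.txt route, it can be worth confirming that the Sitemap line is written in a way a parser actually picks up. The sketch below uses Python’s standard `urllib.robotparser` module (its `site_maps()` method needs Python 3.8 or newer). The helper name `sitemaps_in_robots` and the sample file contents are our own, and Python’s parser is not Google’s, so treat this as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt):
    """Return the sitemap URLs declared in the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap lines are present.
    return parser.site_maps() or []


# Sample robots.txt contents; the Disallow rule is only an illustration.
robots_txt = """\
User-agent: *
Disallow: /private/
Sitemap: http://yoursite.com/sitemap_location.xml
"""
print(sitemaps_in_robots(robots_txt))
# ['http://yoursite.com/sitemap_location.xml']
```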

Guidelines for building your sitemap

You can view a full list of sitemap requirements and guidelines on Google’s Search Console Help site. However, we’ve included a few guidelines here that we think you should be aware of before you begin the process of creating a sitemap.

  • Ensure that all of your URLs are consistent and fully qualified. Google will crawl your site exactly as you list it, so make sure you’re not forgetting pieces of your URLs that could make it more difficult for the search engine to locate. For example, if your site is https://www.yourexamplesitehere.com/, don’t tell Google that your URL is https://yourexamplesitehere.com. Even that slight change could affect how your site is crawled and how it shows up in SERPs. If your site is accessible from both versions, let Google know this using proper canonicalization methods.
  • If you have multiple versions of your site for various languages or regions, tell Google this using hreflang tags.
  • Ensure that your sitemap files are UTF-8 encoded and that all URLs are escaped properly. The Google Search Console Help site tells us more about this:

    “Non-alphanumeric and non-latin characters. We require your sitemap file to be UTF-8 encoded (you can generally do this when you save the file). As with all XML files, any data values (including URLs) must use entity escape codes for the characters listed in the table below.  A sitemap can contain only ASCII characters; it can’t contain upper ASCII characters or certain control codes or special characters such as * and {}. If your sitemap URL contains these characters, you’ll receive an error when you try to add it.”

  • Use multiple sitemaps if your site has more than 50,000 URLs or if a single sitemap would be larger than 50 MB uncompressed. Smaller files also help prevent your server from being overloaded.
  • If you do use multiple sitemaps, submit a single sitemap index file to Google that lists all of your sitemaps, instead of submitting all of the sitemaps individually.
  • If your site has different media types such as images, news, and video, use sitemap extensions to help Google crawl those elements.
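The first and third guidelines above lend themselves to an automated check. The following is a minimal sketch, assuming your preferred domain is the https://www version; the function names (`find_inconsistent_urls`, `to_sitemap_loc`) are hypothetical. It percent-encodes non-ASCII characters in the path and query and then applies XML entity escaping. It does not handle non-ASCII domain names.

```python
from urllib.parse import quote, urlsplit, urlunsplit
from xml.sax.saxutils import escape


def find_inconsistent_urls(urls, preferred_origin):
    """Return URLs whose scheme or host differs from the preferred origin."""
    expected = urlsplit(preferred_origin)
    return [
        url for url in urls
        if (urlsplit(url).scheme, urlsplit(url).netloc)
        != (expected.scheme, expected.netloc)
    ]


def to_sitemap_loc(url):
    """Percent-encode and XML-escape a fully qualified URL for a <loc> value."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"Not a fully qualified URL: {url!r}")
    # Keeping % in `safe` leaves already-encoded sequences untouched.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")
    encoded = urlunsplit((parts.scheme, parts.netloc, path, query, fragment))
    # escape() handles &, < and >; the mapping adds both quote characters.
    return escape(encoded, {"'": "&apos;", '"': "&quot;"})


urls = [
    "https://www.yourexamplesitehere.com/about",
    "https://yourexamplesitehere.com/contact",  # missing "www."
]
print(find_inconsistent_urls(urls, "https://www.yourexamplesitehere.com/"))
# ['https://yourexamplesitehere.com/contact']
print(to_sitemap_loc("https://www.example.com/ümlaut.html?item=12&desc=vacation_hawaii"))
# https://www.example.com/%C3%BCmlaut.html?item=12&amp;desc=vacation_hawaii
```

If you build the file with an XML library that escapes text for you, skip the `escape()` step so that ampersands aren’t escaped twice.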

How to use the Sitemap report in Search Console

After you’ve submitted your sitemap, Google must process it. Once that is complete, sitemaps submitted through Search Console will appear in the Sitemaps report.

If you navigate to the Sitemaps report landing page, you’ll see a full list of all sitemaps you’ve submitted to Google, but only those that were submitted through the Search Console. If you submitted your sitemap via google.com/ping or your robots.txt file, you won’t see those sitemaps in the Sitemaps report. For every sitemap in this view, you’ll also be able to see when Google last read it, how many URLs it indexed, and the type of sitemap you submitted (this will generally be “Sitemap,” “Sitemap index,” or “unknown” if Google is having trouble reading the file you submitted).

In addition, you’ll see the status of Google’s last crawl of your site. There are several possible statuses for this:

  • Processed Successfully – Google was able to clearly read your sitemap and crawled the site with no errors.
  • Has Issues – Googlebot encountered one or more errors with your sitemap and has queued only those URLs for which it encountered no errors.
  • Couldn’t be Fetched – There was an error with your sitemap that prevented Google from crawling it.

If errors were found in your sitemap, you can click its row in the table to drill down and view more information, including which errors were encountered.

What to do if you don’t see your sitemap in Search Console

If you aren’t seeing your sitemap in the Sitemap report and you think you’ve submitted it correctly, you can complete the following steps to diagnose the problem:

  1. Ensure that you’ve submitted the correct, complete domain for your site. Ensure that you’ve accounted for all versions of your URL and have told Google which is the preferred domain. You should only submit a sitemap for your preferred domain.
  2. Check that you’re looking in the right place. If you submitted the sitemap yourself, it should be visible in the By Me tab. Otherwise, you’ll find it in the All tab.
  3. Again, check that you have actually submitted your sitemap through Search Console. If you submitted it another way, you won’t be able to see it in this view.

How to delete a sitemap in Search Console

If, for whatever reason, you need to delete a sitemap from your Search Console, navigate to the sitemap table, check the box next to the sitemap you wish to delete, and click “Delete.” However, simply deleting that sitemap from Search Console won’t stop Google from reading it. If you want Google to stop reading that sitemap, you need to block access to it in your robots.txt file or remove the sitemap altogether from your web host.
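If you block the sitemap through robots.txt, you can sanity-check the rule before publishing it. This sketch again leans on Python’s standard `urllib.robotparser`; the helper name `is_blocked`, the file name old-sitemap.xml, and the example.com domain are assumptions for illustration. Python’s parser may not match Google’s behavior in every edge case.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, url, user_agent="Googlebot"):
    """Return True if the robots.txt text disallows `url` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical rule that blocks a retired sitemap file for all crawlers.
robots_txt = """\
User-agent: *
Disallow: /old-sitemap.xml
"""
print(is_blocked(robots_txt, "https://www.example.com/old-sitemap.xml"))  # True
print(is_blocked(robots_txt, "https://www.example.com/sitemap.xml"))      # False
```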

You can learn more about creating and managing your sitemap in Search Console via the Search Console Help site.

Get expert help managing your sitemap

Ready to submit your sitemap but not confident you have the skills to write it correctly and manage it within Search Console? We can help. Big Leap helps small businesses just like yours manage all aspects of their site and attract new clients through smart SEO strategies. Sign up for your free Big Leap SEO consultation today.

Latest posts by Meg Monk (see all)