SEO XML Sitemap

9 Feb 2015, last update: 24 Jun 2017

Multilingual SEO XML Sitemap for Sitecore

On the Sitecore Marketplace there are several Sitemap modules available for Sitecore. But do they do what you want?
The Sitemap XML module is a great module, but it has strengths and limitations:

  • Supported by Sitecore
  • It creates a sitemap.xml file
  • It uses the “__Updated” field in the Statistics section of the page item
  • Support for multi-site
  • Does nothing with changefreq and priority
  • No support for multi-language sites
  • No correct support for component-based pages
  • Does not generate https URLs


This article is about creating your own sitemap that works differently from the popular Marketplace module and is as simple as possible. Because of the integration needed with your site, it is difficult to do this in a module.

The existing SEO XML sitemap modules for Sitecore generate wrong modification dates and are not multi-language. With some simple config and code you can create your own highly integrated sitemap.xml and use other properties from the schema defined on sitemaps.org, with additional page config.

The Sitemap.xml specifications.

 Tag            Required or optional   Description
 <loc>          required               The URL with protocol
 <lastmod>      optional               datetime
 <changefreq>   optional               always, hourly, daily, weekly, monthly, yearly, never
 <priority>     optional               from 0.0 to 1.0

Why do you want a sitemap.xml? Because your SEO tool says so? Simply having one or not makes little difference. What matters is how you use it.

The lastmod is difficult because a page can contain many components or other references. For this reason it is a good idea to skip this optional tag. The alternatives are to accept incorrect dates for component-based pages, or to accept a longer render time and loop through the references to find the last modified date.
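
If you do want a lastmod, the "loop through the references" idea can be sketched in plain C#. This is a minimal sketch, not part of the original code: the class and method names are hypothetical, and how you collect the update dates of the referenced items (for example from the “__Updated” field of each component datasource) is an assumption left to your own site.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper, not part of the article's XmlSitemap class.
public static class LastModSketch
{
    // Returns the most recent date among the page itself and everything it references.
    public static DateTime GetLastModified(DateTime pageUpdated, IEnumerable<DateTime> referencedUpdated)
    {
        // The page date is always in the sequence, so Max never sees an empty list.
        return new[] { pageUpdated }
            .Concat(referencedUpdated ?? Enumerable.Empty<DateTime>())
            .Max();
    }

    // Formats a date as a <lastmod> element in W3C Datetime notation (UTC).
    public static string ToLastModTag(DateTime updated)
    {
        var text = updated.ToUniversalTime()
            .ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
        return "<lastmod>" + text + "</lastmod>";
    }
}
```

The extra loop is exactly the render-time cost mentioned above, so it pairs well with caching the sitemap output.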

Only index the pages you want.

A sitemap.xml and a robots.txt are also nice for hackers. If not used properly, they may contain some interesting links for them. So you need to know what to index and what not. For the “Sitemap XML module” you can define templates to index or ignore. There is a risk that you expose test pages or landing pages that are not intended to be indexed by search engines. You can prevent indexing with a <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> on the page, but it is of course better to exclude the page from the sitemap. A solution is to create a field “Hide in SEO XML Sitemap” and place it on your pages to control the behavior.

The changefreq tag

Changefreq is only a hint for search engines. To save bandwidth and CPU cycles you can set a less frequent value (for example yearly or never) on archived pages. And by using different values you can increase the likelihood that the pages that change most are visited more frequently by search engines. A solution is to create a field "XML Sitemap Change Frequency" and place it on your pages to control the behavior.
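
Because the field value comes from content editors, it can help to check it against the seven values the schema allows before writing it out. The following is a small sketch under that assumption; the class name is hypothetical and the "daily" fallback mirrors the default used in the code later in this article.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the article's XmlSitemap class.
public static class ChangeFreqSketch
{
    // The seven values allowed by the sitemaps.org schema.
    private static readonly HashSet<string> Allowed = new HashSet<string>(StringComparer.Ordinal)
    {
        "always", "hourly", "daily", "weekly", "monthly", "yearly", "never"
    };

    // Returns a valid changefreq value, or the fallback when the field is empty or invalid.
    public static string Normalize(string fieldValue, string fallback = "daily")
    {
        var value = (fieldValue ?? string.Empty).Trim().ToLowerInvariant();
        return Allowed.Contains(value) ? value : fallback;
    }
}
```

For example, ChangeFreqSketch.Normalize("Weekly ") yields "weekly", while an empty or misspelled value falls back to "daily".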

The Priority tag

As mentioned for changefreq, the priority setting is also just a guide for Google, or any other search engine, to follow when indexing your site. The priority you assign to a page does not affect the position of your URLs in the result pages of a search engine. Search engines can use this information to choose between URLs on the same site. You can use this tag to increase the likelihood that your most important pages are present in a search index.
A solution is to create a field "XML Sitemap Priority" and place it on your pages to control the behavior.

Note: Assigning a high priority to all of the URLs on your site will have no effect. Since the priority is relative, you can use it only to distinguish between URLs on your site.
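
The allowed range of 0.0 to 1.0 can also be checked in code before the value is written to the sitemap. This is only a sketch with a hypothetical class name; it assumes the field stores the number with a dot as decimal separator and returns null for anything unusable, so the caller can simply leave the priority tag out.

```csharp
using System.Globalization;

// Hypothetical helper, not part of the article's XmlSitemap class.
public static class PrioritySketch
{
    // Parses a field value such as "0.8" and returns it with one decimal,
    // or null when it is empty, not a number, or outside 0.0 to 1.0.
    public static string Normalize(string fieldValue)
    {
        double value;
        if (!double.TryParse(fieldValue, NumberStyles.Float, CultureInfo.InvariantCulture, out value))
        {
            return null;
        }
        if (double.IsNaN(value) || value < 0.0 || value > 1.0)
        {
            return null;
        }
        return value.ToString("0.0", CultureInfo.InvariantCulture);
    }
}
```

Parsing with the invariant culture keeps the output independent of the server's regional settings, which matters because the schema expects a dot.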

The code to create your own SEO XML sitemap.

Note: there is another, more extensive version of this code, now called the SEO Processor, which also includes a robots.txt; see SEO Processor.

The following sitemap works with fields on page items to control the changefreq and priority, plus an option to hide a page.

  • Support for multi-site
  • Uses changefreq and priority
  • Optional support for multi-language sites
  • Support for component-based pages
  • Optional support for lastmod (add some code)
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;
using Sitecore.Data;
using Sitecore.Data.Fields;
using Sitecore.Data.Items;
using Sitecore.Data.Managers;   // LanguageManager
using Sitecore.Globalization;   // Language
using Sitecore.Links;

namespace Mirabeau.Website.Helpers
{
    public static class XmlSitemap
    {
        //single language
        public static string GetXml()
        {
            Database db = global::Sitecore.Context.Database;
            var homeitem = global::Sitecore.Context.Item.GetHomeItem();
            var query = string.Format("fast:{0}//*", EscapeSitecoreFastQueryPath(homeitem.Paths.FullPath));
            var detailList = new List<Item>(db.SelectItems(query));
            detailList.Add(homeitem);

            var options = global::Sitecore.Links.LinkManager.GetDefaultUrlOptions();
            options.AlwaysIncludeServerUrl = true;
            return CreateSiteMapUrls(detailList, options);
        }

        //multi language
        public static string GetXml(List<string> languagelist)
        {
            Database db = global::Sitecore.Context.Database;
            string sitemapLinks = string.Empty;
            foreach (var language in languagelist)
            {
                Language currentSiteLanguage;
                if (!Language.TryParse(language, out currentSiteLanguage))
                {
                    // Skip language names that Sitecore cannot parse.
                    continue;
                }
                Sitecore.Context.SetLanguage(currentSiteLanguage, true);
                var homeitem = global::Sitecore.Context.Item.GetHomeItem();
                var query = string.Format("fast:{0}//*", EscapeSitecoreFastQueryPath(homeitem.Paths.FullPath));
                var detailList = new List<Item>(db.SelectItems(query));
                detailList.Add(homeitem);
                var options = global::Sitecore.Links.LinkManager.GetDefaultUrlOptions();
                options.AlwaysIncludeServerUrl = true;
                options.LanguageEmbedding = LanguageEmbedding.Always;
                options.Language = Language.Parse(language);
                options.EmbedLanguage(LanguageManager.GetLanguage(language));
                sitemapLinks += CreateSiteMapUrls(detailList, options);
            }
            return sitemapLinks;
        }

        private static string CreateSiteMapUrls(List<Item> detailList, UrlOptions urlOptions)
        {
            StringBuilder returnString = new StringBuilder();

            const string defaultpagechange = "daily";

            // Sitecore fields: each page template must contain these fields.
            var HideInSeoXmlSitemap = "Hide in SEO XML Sitemap";
            var XmlSitemapPriority = "XML Sitemap Priority";
            var XmlSitemapChangeFreq = "XML Sitemap Change Frequency";

            foreach (Item item in detailList)
            {
                if (!item.GetCheckBoxValueDefaultTrue(HideInSeoXmlSitemap))
                {
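                    // GetCheckBoxValueDefaultTrue returns true when the field is missing,
                    // so items without the field (such as component items) are skipped.
                    // The URL is escaped so that characters such as & do not break the XML.
                    var url = System.Security.SecurityElement.Escape(LinkManager.GetItemUrl(item, urlOptions));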
                    var prio = item.GetStringValue(XmlSitemapPriority);
                    var changefreq = item.GetStringValue(XmlSitemapChangeFreq);
                    if (string.IsNullOrEmpty(changefreq))
                    {
                        changefreq = defaultpagechange;
                    }
                    if (string.IsNullOrEmpty(prio))
                    {
                        returnString.AppendFormat("<url><loc>{0}</loc><changefreq>{1}</changefreq></url>\n", url, changefreq);
                    }
                    else
                    {
                        returnString.AppendFormat("<url><loc>{0}</loc><changefreq>{1}</changefreq><priority>{2}</priority></url>\n", url, changefreq, prio);
                    }
                }
            }
            return returnString.ToString();
        }

        #region Extension methods and helpers

        public static Item GetHomeItem(this Item item)
        {
            global::Sitecore.Sites.SiteContext site = global::Sitecore.Context.Site;

            if (site == null)
            {
                return null;
            }

            global::Sitecore.Data.Database db = global::Sitecore.Context.Database;
            return db.GetItem(site.StartPath);
        }

        public static string GetStringValue(this Item item, string fieldName)
        {
            if (item != null && item.Fields[fieldName] != null &&
              !string.IsNullOrEmpty(item.Fields[fieldName].Value))
            {
                return item.Fields[fieldName].Value;
            }

            return string.Empty;
        }

        public static bool GetCheckBoxValueDefaultTrue(this Item item, string fieldName)
        {
            CheckboxField checkBox = item.Fields[fieldName];
            if (checkBox != null)
            {
                return checkBox.Checked;
            }
            return true;
        }

        public static string EscapeSitecoreFastQueryPath(string path)
        {
            return Regex.Replace(path, @"([^/]+)", "#$1#").Replace("#*#", "*");
        }

        #endregion

    }
}
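
As an alternative to building each entry with AppendFormat, System.Xml.Linq can take care of XML escaping. This is only a sketch and not the author's code: the class name is hypothetical, and the elements are created without a namespace on the assumption that they are written inside a urlset element that already declares the sitemap namespace, as the layouts in this article do.

```csharp
using System.Xml.Linq;

// Hypothetical helper, not part of the article's XmlSitemap class.
public static class UrlEntrySketch
{
    // Builds one <url> entry; XElement escapes characters such as & in the URL.
    public static string Build(string loc, string changefreq, string priority)
    {
        var url = new XElement("url",
            new XElement("loc", loc),
            new XElement("changefreq", changefreq));

        // The priority tag is optional, so it is only added when a value is given.
        if (!string.IsNullOrEmpty(priority))
        {
            url.Add(new XElement("priority", priority));
        }

        return url.ToString(SaveOptions.DisableFormatting);
    }
}
```

A URL with a query string, such as https://www.example.com/search?a=1&b=2 (example.com is a placeholder), comes out with the ampersand written as &amp;amp; so the sitemap stays well-formed.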

The Layout (MVC) for single language

@using Mirabeau.Website.Helpers
@{
   Response.ContentType = "text/xml";
}<?xml version="1.0" encoding="UTF-8"?>
<urlset
      xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
      http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
@Html.Raw(XmlSitemap.GetXml())
</urlset>

 

Example Layout for multilanguage

@using Mirabeau.Website.Helpers
@{
   Response.ContentType = "text/xml";
}<?xml version="1.0" encoding="UTF-8"?>
<urlset
      xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
            http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
@Html.Raw(XmlSitemap.GetXml(new List<string>() {"en","nl-NL"}))
</urlset>

Sitecore Items

Below are the Sitecore templates. Create a changefreq list to use as the source of a droplist. Then create the XML Sitemap fields, and use the changefreq list as the source for the droplist field XML Sitemap Change Frequency.

[Screenshot: Sitemap XML template]

Create a validation rule and a default value for the XML Sitemap Priority field.

[Screenshot: field validation]

 

Create the Sitemap layout and link it to the view. Create a Sitemap template and set the Sitemap layout in its __Standard Values. Then create the sitemap item below the homepage, with sitemap as its name.

Now you can request the sitemap at /sitemap.aspx. To change that to /sitemap.xml, add xml to the "Allowed extensions".

Make an .xml URL possible for Layout rendering

Place this in a file /App_Config/Include/Mirabeau.SitemapXml.config

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <preprocessRequest help="Processors should derive from Sitecore.Pipelines.PreprocessRequest.PreprocessRequestProcessor">
        <processor type="Sitecore.Pipelines.PreprocessRequest.FilterUrlExtensions, Sitecore.Kernel">
          <param desc="Allowed extensions (comma separated)">aspx, ashx, asmx, xml</param>
        </processor>
      </preprocessRequest>
    </pipelines>
  </sitecore>
</configuration>

 

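Because of Sitecore security hardening I don’t allow xml files

No problem: tell the search engines the URL of the sitemap by using a robots.txt. The Sitemap line must contain the full URL, including protocol and host.

# www.example.com is a placeholder; use your own host name.
User-agent: *
Sitemap: https://www.example.com/sitemap.aspx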
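
The sitemaps.org protocol expects the Sitemap line in robots.txt to hold the complete URL of the sitemap. On a multi-site setup the host differs per site, so the line can be generated instead of hard-coded. A minimal sketch, assuming you already know the root URL of the current site; the class name is hypothetical and www.example.com is a placeholder.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the article's XmlSitemap class.
public static class RobotsTxtSketch
{
    // Builds a robots.txt body with an absolute Sitemap URL for the given site root.
    public static string Build(Uri siteRoot, string sitemapPath)
    {
        // Combining the root with the path gives the absolute URL.
        var sitemapUrl = new Uri(siteRoot, sitemapPath);

        var sb = new StringBuilder();
        sb.Append("User-agent: *\n");
        sb.Append("Sitemap: ").Append(sitemapUrl.AbsoluteUri).Append("\n");
        return sb.ToString();
    }
}
```

Calling RobotsTxtSketch.Build(new Uri("https://www.example.com/"), "/sitemap.aspx") produces a Sitemap line that points to https://www.example.com/sitemap.aspx.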

Pipeline or Layout

This prototype code uses a layout. It is also possible to use a pipeline; see the code of the Sitemap XML module. Using a layout is a very simple standard solution: no config is needed, and you can also use the Sitecore layout cache.

Multilingual sites

On multi-language sites you should place a URL for every language in the sitemap.xml. This code can do that and is easy to adapt to your needs.

Related Links

Multisite Multilingual SEO Processor
The Module with Sitecore support Sitemap XML
Zero configuration required and easy to install XML Sitemap Generator
Using the Sitemap XML Module in a Hardened, Multi-site Environment
Easy a pipeline without writing to disk Simple Sitemap XML
Generate a Google sitemap for a Million plus Sitecore items site
Ultimate Sitemap XML based on the Sitemap XML with related meta fields