Adding Sitemaps to a Jekyll site

When working on a Jekyll site, you will probably want to generate a sitemap.xml automatically so that you can submit it to Google’s Search Console. There are plugins for this, but it is in fact very easy to build yourself, and doing so lets you tailor the generation options exactly to your liking.

In today’s post we’ll have a look at how Jekyll can generate a sitemap with simple Liquid tags.

Generating a basic sitemap.xml with Jekyll

Let’s start off by creating the file sitemap.xml in the root directory of your Jekyll site. Add front matter at the top, so that Jekyll runs the file through Liquid. We’ll be adding Liquid tags in a minute, and without front matter the file would be copied to the output with those tags left unprocessed:

---
layout: null
---

Now we build the framework for our sitemap. It must have an XML header and a <urlset> element. Inside that element, we will loop through all of our posts:


---
layout: null
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  {% for post in site.posts %}
  {% endfor %}
</urlset>

You can have Jekyll build your site right away and see the sitemap.xml file appear in the _site folder. Let’s add <url> elements for all our posts now, but only for published posts:


<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  {% for post in site.posts %}
    {% unless post.published == false %}
    <url>
      <loc>{{ site.url }}{{ post.url }}</loc>
      <lastmod>{{ post.date | date_to_xmlschema }}</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.5</priority>
    </url>
    {% endunless %}
  {% endfor %}
</urlset>

This will generate a sitemap of all of our posts, using each post’s publication date as the value of the <lastmod> element and setting the change frequency and priority to monthly and 0.5, respectively.
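If the site is served from a subpath, site.url on its own leaves out site.baseurl. One possible variation on the <loc> line is sketched below. It assumes Jekyll 3.3 or later, where the absolute_url filter is available, and it is an alternative to the concatenation used above rather than something the templates in this post depend on:

```liquid
{% comment %} Assumes Jekyll 3.3+; absolute_url prepends site.url and site.baseurl. {% endcomment %}
<loc>{{ post.url | absolute_url }}</loc>
```

When baseurl is empty, this should produce the same address as {{ site.url }}{{ post.url }}.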

We can add static pages to our sitemap.xml, too. While doing this, we’ll take care to strip index.html from page URLs, so that the home page is listed as / rather than /index.html:


  {% for page in site.pages %}
  <url>
    <loc>{{ site.url }}{{ page.url | remove: "index.html" }}</loc>
    {% if page.date %}
      <lastmod>{{ page.date | date_to_xmlschema }}</lastmod>
    {% else %}
      <lastmod>{{ site.time | date_to_xmlschema }}</lastmod>
    {% endif %}
    <changefreq>monthly</changefreq>
    <priority>0.3</priority>
  </url>
  {% endfor %}

Note that pages don’t usually have dates associated with them, so unless a date is present, we’ll use the site’s update time.
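The date fallback can also be written on a single line with Liquid’s default filter. This is a minimal sketch of the same logic, not a required change; it assumes a Liquid version that ships the default filter (3.0 or later):

```liquid
{% comment %} Falls back to site.time when page.date is nil, false or empty. {% endcomment %}
<lastmod>{{ page.date | default: site.time | date_to_xmlschema }}</lastmod>
```

The if/else version is more verbose but leaves room for extra branches, which comes in handy once more sources for the date are added.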

Generating a sitemap.xml with custom post inclusion, customisable change frequency and priority

There may be some published posts or pages that we don’t want to show up in our sitemap. Also, we may want to give certain content a higher priority. Ideally, these things are configured through the YAML front matter of each post and page:

---
sitemap:
  lastmod: 2018-05-25
  priority: 0.7
  changefreq: 'weekly'
---
or
---
sitemap:
  exclude: 'yes'
---

We can add some conditional logic to our sitemap.xml to use these attributes, if present. Note that we add an exclude attribute to the sitemap’s own front matter too, to prevent it from including itself.


---
layout: null
sitemap:
  exclude: 'yes'
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  {% for post in site.posts %}
    {% unless post.published == false or post.sitemap.exclude == "yes" %}
    <url>
      <loc>{{ site.url }}{{ post.url }}</loc>
      {% if post.sitemap.lastmod %}
        <lastmod>{{ post.sitemap.lastmod | date: "%Y-%m-%d" }}</lastmod>
      {% elsif post.date %}
        <lastmod>{{ post.date | date_to_xmlschema }}</lastmod>
      {% else %}
        <lastmod>{{ site.time | date_to_xmlschema }}</lastmod>
      {% endif %}
      {% if post.sitemap.changefreq %}
        <changefreq>{{ post.sitemap.changefreq }}</changefreq>
      {% else %}
        <changefreq>monthly</changefreq>
      {% endif %}
      {% if post.sitemap.priority %}
        <priority>{{ post.sitemap.priority }}</priority>
      {% else %}
        <priority>0.5</priority>
      {% endif %}
    </url>
    {% endunless %}
  {% endfor %}
  {% for page in site.pages %}
    {% unless page.sitemap.exclude == "yes" or page.name == "feed.xml" %}
    <url>
      <loc>{{ site.url }}{{ page.url | remove: "index.html" }}</loc>
      {% if page.sitemap.lastmod %}
        <lastmod>{{ page.sitemap.lastmod | date: "%Y-%m-%d" }}</lastmod>
      {% elsif page.date %}
        <lastmod>{{ page.date | date_to_xmlschema }}</lastmod>
      {% else %}
        <lastmod>{{ site.time | date_to_xmlschema }}</lastmod>
      {% endif %}
      {% if page.sitemap.changefreq %}
        <changefreq>{{ page.sitemap.changefreq }}</changefreq>
      {% else %}
        <changefreq>monthly</changefreq>
      {% endif %}
      {% if page.sitemap.priority %}
        <priority>{{ page.sitemap.priority }}</priority>
      {% else %}
        <priority>0.3</priority>
      {% endif %}
    </url>
    {% endunless %}
  {% endfor %}
</urlset>

Note, also, that I’ve added feed.xml as a special case of a file to exclude from the sitemap. This file is generated by the Jekyll Atom feed plugin, and we don’t have access to its YAML front matter to give it an exclude attribute.
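The repeated if/else blocks for the change frequency and priority can be shortened with assign and the default filter. The fragment below is a sketch for the pages loop, with freq and prio as illustrative variable names that do not appear in the template above:

```liquid
{% comment %} Sketch only: "freq" and "prio" are illustrative variable names. {% endcomment %}
{% assign freq = page.sitemap.changefreq | default: "monthly" %}
{% assign prio = page.sitemap.priority | default: 0.3 %}
<changefreq>{{ freq }}</changefreq>
<priority>{{ prio }}</priority>
```

The same pattern would apply to the posts loop, with post in place of page and 0.5 as the fallback priority.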

© 2023 Alan Reid