When working on a Jekyll site, you will probably want to generate a sitemap.xml automatically so that you can submit it to Google Search Console. There are plugins for this, but it is very easy to build yourself, and doing so lets you tailor the generation options exactly to your liking.
In today’s post we’ll have a look at how Jekyll can generate a sitemap with simple Liquid tags.
Generating a basic sitemap.xml with Jekyll
Let’s start off by creating the file sitemap.xml in the root directory of your Jekyll site. Add front matter at the top so that Jekyll processes the file with Liquid. We’ll be adding Liquid tags in a minute, and without front matter the file would be copied to the output as-is, with the tags left unrendered:
---
layout: null
---
Now we build the framework for our sitemap. It must have an XML declaration and a <urlset> element. Inside that element, we will loop through all of our posts:
---
layout: null
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{% for post in site.posts %}
{% endfor %}
</urlset>
You can have Jekyll build your site right away and see the sitemap.xml file appear in the _site folder.
Let’s now add a <url> element for each of our posts, skipping any post that is marked as unpublished:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{% for post in site.posts %}
{% unless post.published == false %}
<url>
<loc>{{ site.url }}{{ post.url }}</loc>
<lastmod>{{ post.date | date_to_xmlschema }}</lastmod>
<changefreq>monthly</changefreq>
<priority>0.5</priority>
</url>
{% endunless %}
{% endfor %}
</urlset>
This generates a sitemap of all our posts, using each post’s publication date as the value of the <lastmod> element and setting the change frequency and priority to monthly and 0.5, respectively.
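If the site is served from a subpath (a project site with a baseurl configured, for example), site.url alone will not produce the full address. As a minimal sketch, assuming Jekyll 3.3 or later, the built-in absolute_url filter can build the <loc> value instead, since it prepends both site.url and site.baseurl:

```liquid
{% for post in site.posts %}
  {% unless post.published == false %}
  <url>
    {% comment %} absolute_url prepends site.url and site.baseurl to the post path {% endcomment %}
    <loc>{{ post.url | absolute_url }}</loc>
    <lastmod>{{ post.date | date_to_xmlschema }}</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
  {% endunless %}
{% endfor %}
```

For a site without a baseurl, this renders the same addresses as concatenating site.url and post.url.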
We can add static pages to our sitemap.xml, too. While doing this, we’ll take care to strip index.html from the URL of the home page (and of any other index page):
{% for page in site.pages %}
<url>
<loc>{{ site.url }}{{ page.url | remove: "index.html" }}</loc>
{% if page.date %}
<lastmod>{{ page.date | date_to_xmlschema }}</lastmod>
{% else %}
<lastmod>{{ site.time | date_to_xmlschema }}</lastmod>
{% endif %}
<changefreq>monthly</changefreq>
<priority>0.3</priority>
</url>
{% endfor %}
Note that pages don’t usually have dates associated with them, so unless a date is present, we fall back to site.time, the time at which the site was built.
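The same fallback can be written more compactly with Liquid’s default filter, which substitutes its argument when the input is nil, false or empty. This is only a sketch of an equivalent form of the pages loop, not a required change:

```liquid
{% for page in site.pages %}
  <url>
    <loc>{{ site.url }}{{ page.url | remove: "index.html" }}</loc>
    {% comment %} Uses page.date when it is set, otherwise the site build time {% endcomment %}
    <lastmod>{{ page.date | default: site.time | date_to_xmlschema }}</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.3</priority>
  </url>
{% endfor %}
```

Which form to use is a matter of taste: the explicit if/else is easier to extend with further branches, while the filter chain keeps each element on one line.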
Generating a sitemap.xml with custom post inclusion, customisable change frequency and priority
There may be some published posts or pages that we don’t want to show up in our sitemap. Also, we may want to give certain content a higher priority. Ideally, these things are configured through the YAML front matter of each post and page:
---
sitemap:
  lastmod: 2018-05-25
  priority: 0.7
  changefreq: 'weekly'
---
or
---
sitemap:
  exclude: 'yes'
---
We can add some conditional logic to our sitemap.xml to use these attributes, if present. Note that we also add an exclude attribute to the sitemap’s own front matter, to prevent it from including itself.
---
layout: null
sitemap:
  exclude: 'yes'
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{% for post in site.posts %}
{% unless post.published == false or post.sitemap.exclude == "yes" %}
<url>
<loc>{{ site.url }}{{ post.url }}</loc>
{% if post.sitemap.lastmod %}
<lastmod>{{ post.sitemap.lastmod | date: "%Y-%m-%d" }}</lastmod>
{% elsif post.date %}
<lastmod>{{ post.date | date_to_xmlschema }}</lastmod>
{% else %}
<lastmod>{{ site.time | date_to_xmlschema }}</lastmod>
{% endif %}
{% if post.sitemap.changefreq %}
<changefreq>{{ post.sitemap.changefreq }}</changefreq>
{% else %}
<changefreq>monthly</changefreq>
{% endif %}
{% if post.sitemap.priority %}
<priority>{{ post.sitemap.priority }}</priority>
{% else %}
<priority>0.5</priority>
{% endif %}
</url>
{% endunless %}
{% endfor %}
{% for page in site.pages %}
{% unless page.sitemap.exclude == "yes" or page.name == "feed.xml" %}
<url>
<loc>{{ site.url }}{{ page.url | remove: "index.html" }}</loc>
{% if page.sitemap.lastmod %}
<lastmod>{{ page.sitemap.lastmod | date: "%Y-%m-%d" }}</lastmod>
{% elsif page.date %}
<lastmod>{{ page.date | date_to_xmlschema }}</lastmod>
{% else %}
<lastmod>{{ site.time | date_to_xmlschema }}</lastmod>
{% endif %}
{% if page.sitemap.changefreq %}
<changefreq>{{ page.sitemap.changefreq }}</changefreq>
{% else %}
<changefreq>monthly</changefreq>
{% endif %}
{% if page.sitemap.priority %}
<priority>{{ page.sitemap.priority }}</priority>
{% else %}
<priority>0.3</priority>
{% endif %}
</url>
{% endunless %}
{% endfor %}
</urlset>
Note, also, that I’ve added feed.xml as a special case of a file to exclude from the sitemap. This file is generated by the Jekyll Atom feed plugin, so we can’t edit its YAML front matter to give it an exclude attribute.
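The repeated if/else blocks can also be shortened with Liquid’s default filter. The following is a sketch rather than a drop-in replacement: it assumes the same optional sitemap front matter keys as above, and it formats every <lastmod> as a plain date (which the sitemap protocol permits) instead of mixing date-only values with full timestamps:

```liquid
{% for post in site.posts %}
  {% unless post.published == false or post.sitemap.exclude == "yes" %}
  <url>
    <loc>{{ site.url }}{{ post.url }}</loc>
    {% comment %} Prefer sitemap.lastmod, then fall back to the post date {% endcomment %}
    <lastmod>{{ post.sitemap.lastmod | default: post.date | date: "%Y-%m-%d" }}</lastmod>
    {% comment %} Per-post overrides, with the same defaults as before {% endcomment %}
    <changefreq>{{ post.sitemap.changefreq | default: "monthly" }}</changefreq>
    <priority>{{ post.sitemap.priority | default: 0.5 }}</priority>
  </url>
  {% endunless %}
{% endfor %}
```

The pages loop can be shortened the same way, using 0.3 as the default priority and adding site.time as a final fallback for <lastmod>.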