How to Create a Sitemap in Gatsby
A sitemap is a file that lists the pages of your site. Search engines such as Google and Bing use this file to crawl the links you provide. Sitemaps are usually written in XML format.
Search engines can’t always discover the links on your website on their own. Perhaps your website is very big, and it takes them time to find every link. Or maybe it’s the opposite: your website is small and no external sites link to it. In both cases, a sitemap can help your website appear in search results.
How do you know if you need a sitemap? If you need some arguments to decide, Google has written some guidelines you can use. But since you are here, you are probably more interested in creating one.
Prerequisites
You need to have the Gatsby CLI installed. It is also recommended to set up a project using the gatsby-starter-blog starter. The code examples use this starter, so it will be easier to follow along.
Step 1 — Installing Sitemap Plugin
In Gatsby, you generate a sitemap using the official plugin for creating sitemaps.
Run the following command in your terminal from your project’s root directory.
npm install gatsby-plugin-sitemap
Add the plugin to the gatsby-config.js file in the root folder of your project. You must also specify siteUrl in siteMetadata for the plugin to work.
// gatsby-config.js
module.exports = {
  siteMetadata: {
    siteUrl: `https://www.example.com`,
  },
  plugins: ['gatsby-plugin-sitemap']
}
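The siteUrl value matters because every sitemap entry needs an absolute URL, while Gatsby only knows page paths such as /hello-world/. The sketch below shows the idea using Node’s built-in URL class. The helper name toAbsoluteUrl is hypothetical, and the plugin’s own joining logic may differ in detail.

```javascript
// Hypothetical helper: combine the site's base URL with a page path.
// This only illustrates why siteUrl is required; it is not the plugin's code.
function toAbsoluteUrl(siteUrl, pagePath) {
  // The URL constructor resolves a path against a base URL.
  return new URL(pagePath, siteUrl).href;
}

console.log(toAbsoluteUrl('https://www.example.com', '/hello-world/'));
// prints "https://www.example.com/hello-world/"
```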
This plugin only creates a sitemap for the production version of your site. You need to build your project to test if the plugin works.
Build your project first.
gatsby build
Now, run it in production mode.
gatsby serve
You should see a message in your terminal with a URL to your website.
Unless your website has thousands of links, the plugin should generate 2 sitemap files.
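If you prefer to check from the command line, a small Node script can list the generated files. This is a sketch that assumes the default setup, where gatsby build writes the site to the public folder and the plugin places its files in public/sitemap. The function name listSitemapFiles is made up for this example.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: return the names of the XML files in a directory,
// or an empty array if the directory does not exist yet.
function listSitemapFiles(dir) {
  if (!fs.existsSync(dir)) {
    return [];
  }
  return fs
    .readdirSync(dir)
    .filter(name => name.endsWith('.xml'))
    .sort();
}

// Assumes the script runs from the project root after `gatsby build`.
const files = listSitemapFiles(path.join(process.cwd(), 'public', 'sitemap'));
console.log(files.length > 0 ? files : 'No sitemap files found. Did you run gatsby build?');
```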
Open /sitemap/sitemap-index.xml on the site that is being served. You should see a sitemap index file.
<sitemapindex>
  <sitemap>
    <loc>https://www.example.com/sitemap/sitemap-0.xml</loc>
  </sitemap>
</sitemapindex>
A sitemap index file contains links to other sitemaps, which in turn contain the actual links to your site’s pages. The index lets a large website be split into many smaller sitemaps. You don’t need to worry about this, because the plugin takes care of it.
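To make the splitting concrete, here is a simplified sketch of the idea. The Sitemap protocol caps a single sitemap file at 50,000 URLs, so a generator has to cut a long list into chunks and list each chunk in the index. The function names and the limit of 3 in the usage lines are made up for illustration, and this is not the plugin’s actual implementation.

```javascript
// Split a list of URLs into chunks of at most `limit` entries.
function splitIntoSitemaps(urls, limit) {
  const chunks = [];
  for (let i = 0; i < urls.length; i += limit) {
    chunks.push(urls.slice(i, i + limit));
  }
  return chunks;
}

// Build the <loc> values that a sitemap index would contain, one per chunk.
function buildIndexEntries(siteUrl, chunkCount) {
  return Array.from(
    { length: chunkCount },
    (_, i) => `${siteUrl}/sitemap/sitemap-${i}.xml`
  );
}

const urls = ['/a/', '/b/', '/c/', '/d/', '/e/', '/f/', '/g/'];
const chunks = splitIntoSitemaps(urls, 3); // 7 URLs, 3 per file: 3 chunks
console.log(buildIndexEntries('https://www.example.com', chunks.length));
```

With only a handful of pages, everything fits into a single chunk, which is why a small site ends up with just the index and sitemap-0.xml.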
To see the actual sitemap, open /sitemap/sitemap-0.xml.
<urlset>
  <url>
    <loc>https://www.example.com/hello-world/</loc>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
  . . .
</urlset>
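To check which pages ended up in the sitemap without reading the XML by eye, you could pull out the loc values with a short script. The sketch below uses a regular expression, which is good enough for a quick check of a well-formed sitemap but is not a real XML parser. extractLocs is a hypothetical helper name.

```javascript
// Hypothetical helper: collect the text of every <loc> element.
function extractLocs(xml) {
  const matches = xml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g);
  return Array.from(matches, match => match[1]);
}

const sample = `
<urlset>
  <url>
    <loc>https://www.example.com/hello-world/</loc>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
</urlset>`;

console.log(extractLocs(sample));
```

In practice you would pass in the contents of the generated sitemap-0.xml instead of the sample string.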
The Sitemap protocol recommends placing your sitemap in the root directory of your site. To do that, set the plugin’s output option to / in gatsby-config.js.
plugins: [
  {
    resolve: 'gatsby-plugin-sitemap',
    options: {
      output: '/'
    }
  }
]
Rebuild your site and serve it again. You should now see your sitemap by opening /sitemap-0.xml.
Step 2 — Adding Advanced Configuration
As you might have noticed, the plugin sets changefreq to daily and priority to 0.7 by default. If you want to change these values, you need some extra configuration. You can also show when a page was last modified by adding the lastmod property.
The first thing you need to do is define a GraphQL query inside gatsby-config.js. Add it to the gatsby-plugin-sitemap options in your plugins array. The query must fetch your site’s URL, which you previously specified in siteUrl. It should also fetch any data about your pages that you want to use in the sitemap. Since you want the sitemap to show links, grab the path of every page in your site.
You can also get the date property from MarkdownRemark nodes, which you can use to set the lastmod property in your sitemap.
// gatsby-config.js
{
  resolve: 'gatsby-plugin-sitemap',
  options: {
    output: '/',
    query: `
      {
        site {
          siteMetadata {
            siteUrl
          }
        }
        allSitePage {
          nodes {
            path
          }
        }
        allMarkdownRemark {
          nodes {
            frontmatter {
              date
            }
            fields {
              slug
            }
          }
        }
      }`
  }
}
The query also fetches the slug from MarkdownRemark nodes, which blog posts rely on. When you create pages from blog posts, you use their slug to build the page path. This means the slug directly links a blog post to its corresponding SitePage node. It will make sense in a bit, so keep following along.
The next part is preparing the objects that will go into the sitemap. Add a resolvePages function right under the query. It takes the query result as its only argument. To shorten your code, use JavaScript destructuring to extract the allSitePage nodes into allPages and the allMarkdownRemark nodes into allPosts.
// gatsby-config.js
options: {
  output: '/',
  query: `{ . . . }`, // same query as before
  resolvePages: ({
    allSitePage: { nodes: allPages },
    allMarkdownRemark: { nodes: allPosts },
  }) => {
    // Map each post's slug to its publication date.
    const pathToDateMap = {};
    allPosts.forEach(post => {
      pathToDateMap[post.fields.slug] = { date: post.frontmatter.date };
    });
    // Attach the date to every page whose path matches a post slug.
    const pages = allPages.map(page => {
      return { ...page, ...pathToDateMap[page.path] };
    });
    return pages;
  }
}
The function first builds a pathToDateMap object that maps each post’s slug, for example /post-slug/, to its publication date.
It then uses the map array method to connect the publication dates to their pages. For every page, it creates a new object holding the page’s path and, if the path exists in pathToDateMap, the corresponding date. Finally, resolvePages returns this array, which holds the pages you want to put into the sitemap.
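To see what resolvePages produces, you can run the same logic on hand-written data outside of Gatsby. The sample nodes below are invented and only mimic the shape the query returns. The /about/ page stands in for any page that is not a blog post.

```javascript
// Same logic as the resolvePages function above, extracted for a dry run.
const resolvePages = ({
  allSitePage: { nodes: allPages },
  allMarkdownRemark: { nodes: allPosts },
}) => {
  const pathToDateMap = {};
  allPosts.forEach(post => {
    pathToDateMap[post.fields.slug] = { date: post.frontmatter.date };
  });
  // Pages without a matching post are returned unchanged (no date).
  return allPages.map(page => ({ ...page, ...pathToDateMap[page.path] }));
};

// Invented sample data in the shape of the GraphQL query result.
const queryResult = {
  allSitePage: {
    nodes: [{ path: '/hello-world/' }, { path: '/about/' }],
  },
  allMarkdownRemark: {
    nodes: [
      {
        frontmatter: { date: '2015-05-01T22:12:03.284Z' },
        fields: { slug: '/hello-world/' },
      },
    ],
  },
};

console.log(resolvePages(queryResult));
```

Only the /hello-world/ object gains a date property, because it is the only page whose path matches a post slug.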
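The last step is the serialize function, which receives each page object returned by resolvePages and turns it into a sitemap entry. Here, pages that have a date (the blog posts) get a higher priority and a lastmod value.
// gatsby-config.js
// . . .
options: {
  // . . .
  serialize: ({ path, date }) => {
    const entry = {
      url: path,
      changefreq: 'daily',
      priority: 0.5,
    };
    // Only blog posts carry a date, so only they get lastmod.
    if (date) {
      entry.priority = 0.7;
      entry.lastmod = date;
    }
    return entry;
  }
}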
You have complete control over the values you want to see in each entry of your sitemap. The object you return from the serialize function represents one entry in the final sitemap.
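As a rough illustration of how a returned object ends up in the file, the sketch below runs the serialize function from this step on two sample pages and prints a simplified url element for each. The renderEntry helper and the sample pages are hypothetical and exist only for this example. The plugin does the real XML generation itself, and, as the earlier sitemap-0.xml output shows, the final loc values are absolute URLs rather than bare paths.

```javascript
// The serialize function from this step.
const serialize = ({ path, date }) => {
  const entry = { url: path, changefreq: 'daily', priority: 0.5 };
  if (date) {
    entry.priority = 0.7;
    entry.lastmod = date;
  }
  return entry;
};

// Hypothetical renderer: turn one entry into a simplified <url> element.
function renderEntry({ url, changefreq, priority, lastmod }) {
  const lines = [
    `  <loc>${url}</loc>`,
    lastmod ? `  <lastmod>${lastmod}</lastmod>` : null,
    `  <changefreq>${changefreq}</changefreq>`,
    `  <priority>${priority}</priority>`,
  ].filter(Boolean);
  return ['<url>', ...lines, '</url>'].join('\n');
}

// Invented sample pages: one blog post with a date, one plain page.
const pages = [
  { path: '/hello-world/', date: '2015-05-01T22:12:03.284Z' },
  { path: '/about/' },
];
pages.map(serialize).forEach(entry => console.log(renderEntry(entry)));
```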
Whew, that was a lot of work. Congratulations!
Checking Your Site on Search Engines
To see if search engines have found your site, you can use a special query. Open Google, Bing or DuckDuckGo and type site:your-website.com, substituting your-website.com with your own domain. The results should show the pages of your website that the search engine has found.
Don’t see any results? Don’t worry. You can manually submit your sitemap to help search engines find your site sooner. Here are guides for Google and Bing, which should be a good start. The rest is up to you and I hope you find success.