
Our strategy

Our SEO tooling combines automation with manual fine-tuning options.

We automate as much as possible by generating most configuration files automatically. In addition, we let editors control which URLs are indexed, with manual controls centralised in a single user interface in Sanity.

Glossary

SEO has its own set of tools and terms. Let’s briefly explain those that are relevant to this article.

Crawler / Search Engine Bot

A program that scans the internet looking for pages to index.
A crawler finds URLs from several sources:

  1. Pages linked from other (indexed) pages. For example, when the homepage is crawled, all the links on it are followed and indexed as well, and the behaviour repeats recursively.

  2. Manual suggestions via online tools

  3. The sitemap file

robots.txt

A file hosted at the root of a market’s website which contains rules disallowing specific URLs or folders from being indexed.

...

It also links to the sitemap.
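For illustration, a minimal robots.txt could look like this (the domain and paths below are made up, not our actual rules):

    User-agent: *
    Disallow: /checkout
    Disallow: /account/

    Sitemap: https://www.example-market.com/sitemap.xml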

sitemap.xml

A file hosted at the root of a market’s website which contains a list of URLs that a search engine bot should index. The purpose of this file is to suggest URLs to the crawler, nothing more.
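For illustration, a minimal sitemap.xml could look like this (the URLs below are made up):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://www.example-market.com/menu</loc>
      </url>
      <url>
        <loc>https://www.example-market.com/store-locator</loc>
      </url>
    </urlset>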

...

Info

Consider robots and sitemap as the yin and yang ☯️ of URL crawling.
They are not strict opposites, but they serve opposite purposes:
the robots file disallows URLs and the sitemap suggests them.

How we do it

Instead of relying on manual creation and maintenance of the aforementioned files, we generate them (independently of the build and release process) following this approach (a simplified sketch follows the list):

  1. We retrieve all the static pages defined in Sanity. Depending on their “Include in sitemap” flag, we add them to the sitemap or to the robots file.

  2. We fetch all the dynamic (menu, categories, store locator, etc.) pages and add them to the sitemap.

  3. We read the additional URLs and excluded URLs lists that can be found in Sanity configuration:
    Desk > Marketing Content > Features > Feature SEO ([markets-domain]/desk/marketingContent;features;featureSeo)
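The following TypeScript sketch illustrates how the three steps could be combined. The function names, types, and data shapes (fetchStaticPages, fetchDynamicPages, fetchSeoConfig, StaticPage, SeoConfig) are assumptions for illustration only, not the actual implementation.

    // Hypothetical sketch only; names and data shapes are assumptions.
    type StaticPage = { path: string; includeInSitemap: boolean };
    type SeoConfig = { additionalUrls: string[]; excludedUrls: string[] };

    // Stubs standing in for the real Sanity queries.
    declare function fetchStaticPages(): Promise<StaticPage[]>;
    declare function fetchDynamicPages(): Promise<string[]>;
    declare function fetchSeoConfig(): Promise<SeoConfig>;

    async function generateSeoFiles(baseUrl: string): Promise<{ robotsTxt: string; sitemapXml: string }> {
      const staticPages = await fetchStaticPages();   // 1. static pages defined in Sanity
      const dynamicPaths = await fetchDynamicPages(); // 2. menu, categories, store locator, ...
      const seoConfig = await fetchSeoConfig();       // 3. Feature SEO lists in Sanity

      // Static pages go to the sitemap or to the robots file depending on their flag.
      const sitemapPaths = new Set<string>([
        ...staticPages.filter((p) => p.includeInSitemap).map((p) => p.path),
        ...dynamicPaths,
        ...seoConfig.additionalUrls,
      ]);
      const disallowedPaths = new Set<string>([
        ...staticPages.filter((p) => !p.includeInSitemap).map((p) => p.path),
        ...seoConfig.excludedUrls,
      ]);

      const robotsTxt = [
        'User-agent: *',
        ...[...disallowedPaths].map((path) => `Disallow: ${path}`),
        '',
        `Sitemap: ${baseUrl}/sitemap.xml`,
      ].join('\n');

      const sitemapXml = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
        ...[...sitemapPaths].map((path) => `  <url><loc>${baseUrl}${path}</loc></url>`),
        '</urlset>',
      ].join('\n');

      return { robotsTxt, sitemapXml };
    }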

...

Output

The results of these three steps are merged, and the files are generated and published nightly (and optionally on demand), independently of the app’s release process.
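Continuing the sketch above (still hypothetical, not the actual pipeline), a nightly or on-demand job could publish the generated files like this:

    import { writeFile } from 'node:fs/promises';

    // Hypothetical entry point, run by a scheduled (nightly) job or triggered on demand.
    async function publishSeoFiles(): Promise<void> {
      const { robotsTxt, sitemapXml } = await generateSeoFiles('https://www.example-market.com');
      await writeFile('robots.txt', robotsTxt);
      await writeFile('sitemap.xml', sitemapXml);
    }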

...

Info

To make it easier to test pages, we have created a browser tool that might help:
https://rbictg.atlassian.net/wiki/spaces/ICFE/pages/4459136016/Tools#Show-Live-Robots-Meta

Follow up

Additionally, we are considering whether we could automatically add the “meta robots” tag or header to excluded pages based on their presence in the robots.txt disallow rules.
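For reference, the “meta robots” instruction can be expressed either as an HTML tag or as an HTTP response header (the exact directives would depend on what we decide per page):

    <meta name="robots" content="noindex, nofollow" />

    X-Robots-Tag: noindex, nofollow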

...