Setting Up Sitemap and robots.txt for SEO

Properly configuring your sitemap and robots.txt files is crucial for optimizing your website's visibility to search engines. This guide will walk you through the process of setting up these important SEO elements.

Sitemap Configuration

The sitemap.xml file follows the Sitemaps XML format, helping search engine crawlers index your site more efficiently. We use next-sitemap to generate both the sitemap and robots.txt files.
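
next-sitemap runs after the production build, so the generated files always reflect your latest routes. A minimal sketch of the relevant package.json scripts, assuming a standard npm setup:

  {
    "scripts": {
      "build": "next build",
      "postbuild": "next-sitemap"
    }
  }

After next build completes, the postbuild hook runs next-sitemap, which reads /next-sitemap.config.js and writes the output into the public directory.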

  1. Configuration File Location

    • The sitemap configuration file is located in the project root at /next-sitemap.config.js.
  2. Key Configuration Options

  • siteUrl: Set this to the base URL of your website.

    • Set the environment variable NEXT_PUBLIC_SITE_URL to the base URL of your website.
    • Using an environment variable allows for easier configuration across different environments.
    • Ensure that all URLs in your sitemap align with the final destination after any redirects (e.g., with or without 'www'). This alignment helps Google and other search engines properly crawl and index your site.
  • generateRobotsTxt: Set to true to generate the robots.txt file. The default value in this configuration is true.

  • sitemapSize: The maximum number of URLs per sitemap file. Currently set to 5000; adjust as needed.

    • When the number of URLs exceeds this limit, next-sitemap will create additional sitemap files (e.g., sitemap-0.xml, sitemap-1.xml) and an index file (sitemap.xml).
  • exclude: An array of relative paths to exclude from the sitemap. Common exclude groups are covered in the next section, and a combined config sketch follows this list.
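
Putting these options together, a minimal next-sitemap.config.js might look like the following sketch (the fallback URL is a placeholder; replace it with your own domain):

  /** @type {import('next-sitemap').IConfig} */
  module.exports = {
    // Base URL used for every entry in the generated sitemap
    siteUrl: process.env.NEXT_PUBLIC_SITE_URL || "https://example.com",
    // Also emit a robots.txt alongside the sitemap
    generateRobotsTxt: true,
    // Split into sitemap-0.xml, sitemap-1.xml, ... plus an index file once this limit is exceeded
    sitemapSize: 5000,
    // Relative paths to leave out of the sitemap (see the Exclude Array section below)
    exclude: ["/private", "/private/**"],
  };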

  1. Exclude Array (the groups below are merged into a single array after these examples)
    • Private Pages:

      exclude: [
          "/private", 
          "/private/**",
      ],
      • Adjust the route names if your private pages live under a different path.
    • Login Pages:

      exclude: [
          "/login",
          "/login/*", 
      ],
      • Adjust the route names if your login pages live under a different path.
    • SEO-related Files:

      exclude: [
          "/manifest.webmanifest",
          "**/apple-icon.*",
          "**/icon.*",
          "**/twitter-image.*",
          "**/opengraph-image.*",
      ],
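
Putting the three groups together, the complete exclude array in /next-sitemap.config.js looks like this:

  exclude: [
      "/private",
      "/private/**",
      "/login",
      "/login/*",
      "/manifest.webmanifest",
      "**/apple-icon.*",
      "**/icon.*",
      "**/twitter-image.*",
      "**/opengraph-image.*",
  ],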

robots.txt Configuration

The robots.txt file tells search engine crawlers which URLs they can access on your site.

  • Generation: Set generateRobotsTxt to true in the configuration file. The default value is true, so no manual changes are typically necessary; if you need finer control over the output, see the sketch below.
  • Location: Automatically generated in the root of the public directory.
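
next-sitemap also accepts a robotsTxtOptions object for customizing the generated robots.txt. A sketch with an illustrative policy (the disallowed path is an example, not part of the default configuration):

  /** @type {import('next-sitemap').IConfig} */
  module.exports = {
    siteUrl: process.env.NEXT_PUBLIC_SITE_URL || "https://example.com",
    generateRobotsTxt: true,
    robotsTxtOptions: {
      // Each policy becomes a User-agent block in the generated robots.txt
      policies: [
        // Allow everything except an example private area
        { userAgent: "*", allow: "/", disallow: "/private" },
      ],
    },
  };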

By following these configuration steps, you'll ensure that search engines can effectively crawl and index your site, potentially improving your search engine visibility and rankings.