We share, receive and see URLs all the time, so it’s nice to have human-readable URLs like example.tld/yeah instead of example.tld/yeah.html.

This blog runs Vitepress on an Nginx server, and clean URLs work rather well there (with a minor trade-off).


Vitepress cleanUrls option on an Nginx server

  • nginx
  • url-rewriting
  • vitepress
  • url
  • http
  • redirect

Setting Vitepress cleanUrls to true removes the trailing .html from URLs. The internal links of your Vitepress app change from <a href="/something.html"> to <a href="/something">.

However, cleanUrls does not change the file names or the folder structure of the generated HTML files, which means your server must now be able to serve the /something.html file when the /something URL is requested.

Here’s how I solved it, assuming the app is hosted on its own domain or subdomain (example.tld or subdomain.example.tld). I did not test hosting it under a sub-path such as example.tld/path/to/app.

At the time of writing, I’m using Vitepress 1.0.0-rc.24 and Nginx 1.25.

Full configuration

If you don’t want to read the step-by-step section, here are the important parts of the configuration. Adapt them to your needs and make sure to understand the trade-off for index pages.

nginx
server {
    index index.html;
    rewrite ^(.+)/$ $1 permanent;

    if ($request_uri ~ ^/(.*)index\.html(\?|$)) {
        return 301 /$1;
    }

    if ($request_uri ~ ^/(.*)\.html(\?|$)) {
        return 301 /$1;
    }

    location / {
        error_page 404 /404.html;
        try_files $uri $uri.html $uri/ =404;
    }
}

The same, with code comments:

nginx
server {
    index index.html;
    # and other things…

    # Remove the trailing slash (permanent 301 redirect). 
    rewrite ^(.+)/$ $1 permanent;

    # Remove the trailing `index.html`. 
    if ($request_uri ~ ^/(.*)index\.html(\?|$)) {
        return 301 /$1;
    }

    # Remove the trailing `.html`. 
    if ($request_uri ~ ^/(.*)\.html(\?|$)) {
        return 301 /$1;
    }

    location / {
        # When the HTTP status code is 404, answer with the `/404.html` file.
        error_page 404 /404.html;

        # When `foo/bar` (which is `$uri`) is requested, 
        # try to serve the first existing file among the list: 
        # `foo/bar`, `foo/bar.html` or `foo/bar/index.html`. 
        # Otherwise answer with a 404 code.
        try_files $uri $uri.html $uri/ =404;
    }
}

Step-by-step

The first step is enough to get clean URLs; the following steps add refinements and safeguards.

Step 1: the gist

Make sure your Nginx location block has this try_files rule:

nginx
server {
    index index.html;
    # and other things…

    location / {
        # When `foo/bar` (which is `$uri`) is requested, 
        # try to serve the first existing file among the list: 
        # `foo/bar`, `foo/bar.html` or `foo/bar/index.html`. 
        # Otherwise answer with a 404 code.
        try_files $uri $uri.html $uri/ =404; 
    }
}

The try_files list in detail:

  • $uri is for exact matches like assets: when the browser requests logo.svg, we search for a logo.svg file.
  • $uri.html adds a missing .html (foo/bar becomes foo/bar.html), which is the reverse operation of Vitepress cleanUrls.
  • $uri/ adds a trailing / (foo/bar/) to instruct Nginx to “look into the foo/bar directory”, so Nginx ends up searching for foo/bar/index.html, as defined by the index directive. This one is needed for the root of the website (example.tld must serve example.tld/index.html).
  • =404 throws a 404 status if none of these files can be found.
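
To make the lookup order concrete, here is the same rule annotated with a few sample requests. The file layout named in the comments (logo.svg, foo/bar.html) is hypothetical, and the outcomes follow the description in the list above rather than a separately tested setup.

```nginx
location / {
    # Hypothetical build output: index.html, logo.svg, foo/bar.html
    #
    # GET /logo.svg -> `$uri`      finds logo.svg
    # GET /foo/bar  -> `$uri.html` finds foo/bar.html
    # GET /         -> `$uri/`     finds the root directory,
    #                              then `index` serves index.html
    # GET /missing  -> nothing matches, so `=404` applies
    try_files $uri $uri.html $uri/ =404;
}
```

The order matters: Nginx stops at the first entry that exists, so exact file matches always win over the `.html` fallback.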

At this step, most paths are working with and without .html:

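Markdown          | Valid paths                                          | Invalid paths
------------------|------------------------------------------------------|--------------
/index.md         | /, /index.html                                       | -
/notes.md         | /notes, /notes.html                                  | /notes/ (403)
/notes/my-note.md | /notes/my-note, /notes/my-note/, /notes/my-note.html | -
-                 | anything not found                                   | Nginx 404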

We need 3 improvements:

  • the Vitepress 404 page instead of the Nginx 404 message (step 2);
  • a fix for the 403 when there’s a trailing slash (step 3);
  • no more paths with .html (step 4).

Step 2: the 404 page

Vitepress ships with a 404.html page by default. Let’s serve it using the error_page directive:

nginx
server {
    index index.html;
    # and other things…

    location / {
        # When the HTTP status code is 404, answer with the `/404.html` file.
        error_page 404 /404.html; 

        try_files $uri $uri.html $uri/ =404;
    }
}

Step 3: avoiding the 403 errors

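To avoid the 403 on paths with a trailing slash (e.g. my-blog.com/notes/), we can redirect them to the same path without the trailing slash, using the rewrite directive in the server block. The pattern ^(.+)/$ needs at least one character before the final slash, so the root path / is left untouched.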

nginx
server {
    index index.html;
    # and other things…

    # Remove the trailing slash (permanent 301 redirect).
    rewrite ^(.+)/$ $1 permanent; 

    location / {
        error_page 404 /404.html;
        try_files $uri $uri.html $uri/ =404;
    }
}

WARNING

Note that the permanent keyword does a 301 redirect, which is permanent and might be cached for a very long time by browsers and search engines. If you are not sure about this rewriting rule, consider the redirect keyword to get a 302 (temporary) redirect instead of a 301.
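
As a sketch of the temporary variant mentioned in this warning, the same rule can use the `redirect` flag, which makes Nginx answer with a 302. Only the flag differs from the rule in this step. Switching back to `permanent` once you are happy with the behaviour is a suggestion, not something tested here.

```nginx
server {
    index index.html;
    # and other things…

    # Remove the trailing slash with a temporary 302 redirect
    # (`redirect` instead of `permanent`).
    rewrite ^(.+)/$ $1 redirect;

    location / {
        error_page 404 /404.html;
        try_files $uri $uri.html $uri/ =404;
    }
}
```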

Step 4: redirect /slug.html to /slug

For now, /slug and /slug.html are both working, so let’s redirect the one with the trailing .html.

https://stackoverflow.com/questions/38228393/nginx-remove-html-extension

nginx
server {
    # other things…

    rewrite ^(.+)/$ $1 permanent;

    # Remove the trailing `index.html`.
    if ($request_uri ~ ^/(.*)index\.html(\?|$)) { 
        return 301 /$1; 
    } 

    # Remove the trailing `.html`.
    if ($request_uri ~ ^/(.*)\.html(\?|$)) { 
        return 301 /$1; 
    } 

    location / {
        error_page 404 /404.html;
        try_files $uri $uri.html $uri/ =404;
    }
}
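
The `(\?|$)` part of both patterns lets them match URLs that carry a query string, but `return 301 /$1;` only sends back the captured path, so the query string is not part of the redirect target. If that matters for your site, a possible variant (an untested sketch, not part of the configuration used on this blog) appends it with the built-in `$is_args` and `$args` variables:

```nginx
# Sketch only: same two redirects, keeping the query string.
# `$is_args` is "?" when the request has arguments and empty otherwise;
# `$args` holds the arguments themselves.
if ($request_uri ~ ^/(.*)index\.html(\?|$)) {
    return 301 /$1$is_args$args;
}

if ($request_uri ~ ^/(.*)\.html(\?|$)) {
    return 301 /$1$is_args$args;
}
```

Both blocks belong in the server block, in place of the two `if` blocks of this step.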

WARNING

As explained in the warning of the previous step, you might prefer return 302 over return 301.
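
For reference, a sketch of the temporary version of these redirects: the patterns are unchanged and only the status code differs. The blocks go in the server block, in place of the two `if` blocks of this step.

```nginx
# Remove the trailing `index.html` (temporary 302 redirect).
if ($request_uri ~ ^/(.*)index\.html(\?|$)) {
    return 302 /$1;
}

# Remove the trailing `.html` (temporary 302 redirect).
if ($request_uri ~ ^/(.*)\.html(\?|$)) {
    return 302 /$1;
}
```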

The trade-off for index pages

If you have index pages like notes/index.md, you have to move them to notes.md. Otherwise you’ll hit an infinite redirect loop: the rewrite rule removes the trailing slash from /notes/, and the slash is then added back because /notes is a directory.

In other words, move notes/index.md to notes.md:

.
├─ index.md
├─ notes
│  ├─ index.md    (remove: moved to notes.md)
│  ├─ post-1.md
│  └─ post-2.md
├─ about.md
└─ notes.md       (add: the former notes/index.md)

Readings

Aside from Nginx documentation for the HTTP core module, I was helped by: