
Nginx, Mojolicious, and Dealing with robots.txt

This is a story about how I suffered by blindly following an example.

Beginning

Mojolicious is an awesome Perl web framework, and nginx is a very popular web server. Nginx is so popular that the Mojolicious team wrote up a guide with an example configuration for proxying requests from nginx to Mojolicious. I used this code happily and completed the rest of the Growing guide. Everything was great, and I continued working on my new website.

The Problem

I realized that I wanted to serve a robots.txt file until the project was complete, and I suddenly hit a block: how do I serve robots.txt without passing the request to Mojolicious? The immediate answer is nginx’s try_files directive. You give it a list of locations to try, typically built from the $uri variable, and an ultimate fallback. Here’s one of the examples from the nginx website:

try_files $uri $uri/ /error.php?c=404 =404;

This tries the file referenced by the URI first, then checks whether a directory exists with that name, then falls back to the error.php handler. If none of those work, it throws the standard 404 “Not Found” error page. You can also specify a named location as the fallback. Here’s a shortened example of my configuration:
upstream mojo {
    server 127.0.0.1:3000;
}

server {
    listen 80;
    server_name localhost;

    # Document root holding static files like robots.txt
    # (the path is illustrative).
    root /var/www/myapp/public;

    try_files $uri $uri/ @proxy;

    location @proxy {
        proxy_pass http://mojo;
    }
}

In this scenario, the try_files line first looks for a file with the given name, then for a directory, and otherwise hands the request to the @proxy location, which passes it upstream to the server listening on 127.0.0.1:3000 (in this case, Mojolicious’s development server, morbo).
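As an aside, a shortened config like this leaves out the proxy headers a real deployment usually sets so the application can see the original host and client address. Here’s a minimal sketch of a fuller @proxy block, keeping the same upstream name (the exact headers your app wants may differ):

location @proxy {
    # Preserve the original Host header and client address for the app.
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_pass http://mojo;
}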

This let nginx serve the robots.txt file fine, but visiting the site to see the index page from Mojolicious returned a 403 “Forbidden” response. It turns out nginx tries the URI, finds no file, then tries the directory, and the $uri/ check succeeds for the index page because the document root itself is a directory. With autoindexing disabled, nginx refuses to list it and returns a 403. Requests for specific URIs were passed to Mojolicious fine; it was only the index page that broke.
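You can confirm that diagnosis by temporarily enabling directory listings: with autoindex on, the $uri/ check is allowed to succeed, and / returns a directory listing instead of a 403. A diagnostic sketch only, not a fix:

server {
    listen 80;
    server_name localhost;
    root /var/www/myapp/public;

    # Diagnostic only: with listings enabled, the $uri/ check
    # succeeds for / and returns a directory listing instead
    # of a 403 Forbidden.
    autoindex on;

    try_files $uri $uri/ @proxy;

    location @proxy {
        proxy_pass http://mojo;
    }
}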

The Answer

My gut reaction was to cheat: pass every request to Mojolicious and specifically outline how to handle robots.txt. There are a lot of examples online of people doing similar things by spelling out the extensions they want served statically (a sketch of that approach is at the end of this post). But my only goal was to serve files statically if they existed in the document root, and otherwise pass the request on to Mojolicious. Since directory indexes are disabled anyway, there’s no reason the try_files condition should ever hold true for $uri/. So I removed it:

try_files $uri @proxy;

This works fine, and it took way too long to figure out for something so obvious. Every time I’ve used try_files in the past, I’ve needed both pieces ($uri and $uri/), so I had to stare at it for a while before the obvious answer clicked.
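For completeness, here’s roughly what the cheat mentioned above would have looked like. This is a sketch of the pattern from those online examples, not my actual config, and the extensions and paths are illustrative:

# Hand everything to Mojolicious by default.
location / {
    proxy_pass http://mojo;
}

# Special-case the one static file I actually cared about.
location = /robots.txt {
    root /var/www/myapp/public;
}

# The pattern many examples use: serve known static extensions
# straight from disk.
location ~* \.(css|js|png|jpg|gif|ico)$ {
    root /var/www/myapp/public;
}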
