Deploying production and staging sites using Docker is quick and simple, but running one repo across two servers that need two different environments can be tricky.
My Docker setup involves a container on AWS, with deployments handled through Docker Cloud. When I deploy a change or a new feature to production or staging, the entire environment is pushed.
This lets me create new servers with minimal effort.
This also means that my infrastructure is in flux. Because the site does not have a database or any uploaded content, it's very fast and has a simple Dockerfile. But it also means I face some challenges when dealing with staging and production environments.
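As a rough sketch, a Dockerfile for a site like this (no database, no uploads) can bake everything into the image at build time. The base image and paths below are assumptions for illustration, not my actual setup, which also wires up PHP-FPM:

# Hypothetical minimal Dockerfile: with no database or uploads,
# everything ships inside the image, so a deploy pushes the whole environment.
FROM nginx:alpine
COPY nginx.conf /etc/nginx/conf.d/default.conf
COPY public/ /usr/share/nginx/html/

Because nothing lives outside the image, spinning up a fresh server is just a matter of pulling and running the latest build.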
I have publicly accessible staging and production servers, which means the staging site runs the risk of being indexed by Google. The site does about 2 million page views a month and has over 10,000 inventory pages. If Google were to index the staging site and pick up those inventory pages, it would make a huge mess of our event tracking and analytics, and it would be confusing for visitors who stumble across the wrong site.
One solution was to dynamically serve the robots.txt file, but PHP-FPM would require me to allow that text file to execute as PHP, and that is a terrible idea.
The best and fastest solution was to use the power of Nginx and rewrite any request for robots.txt to a staging-specific version.
# Remap robots.txt for staging
if ($http_host = "staging.site.com") {
    rewrite ^/robots.txt /robots-staging.txt last;
}
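For context, a snippet like this sits inside the server block, so the same config can be shipped to both servers and behave correctly based on the request's Host header. This is only a sketch; the server_name values and listen port are assumptions, not my real config:

server {
    listen 80;
    server_name site.com staging.site.com;

    # Remap robots.txt for staging (the if block from above)
    if ($http_host = "staging.site.com") {
        rewrite ^/robots.txt /robots-staging.txt last;
    }

    # ... the rest of the site config (PHP-FPM handoff, static assets, etc.)
}

Both robots.txt and robots-staging.txt ship in the same image, so the exact same build runs in both environments.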
My robots-staging.txt file looks like this:
User-agent: *
Disallow: /
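A quick way to confirm the rewrite behaves as expected, using the placeholder hostnames from the sketch above:

# Staging should answer with the blanket Disallow rule
curl http://staging.site.com/robots.txt
# Production should still return the normal robots.txt
curl http://site.com/robots.txt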
This won't completely block users or every search engine, and it can easily be ignored by malware and malicious bots, but it should help!