Using .htaccess rules to redirect old URLs to a new site

Backstory

I recently migrated my old WordPress blog to Ghost (tutorial on that coming soon) and I didn’t want to lose all of the links that exist out there in the process. I don’t care so much about search engine optimization (although Google juice never hurts), but the idea of people following bookmarks to my site only to get a 404 makes me sad. Redirecting the whole domain would be pretty simple: just go to the Redirects option in cPanel and it’s straightforward. But that’s still not ideal. People who have my RSS feed in their feed reader likely wouldn’t get the update, and anyone following an old link who ends up on the home page isn’t going to know where to find what they wanted to see, other than by searching around. So I wrote a few lines in an .htaccess file to dynamically redirect to specific locations based on where the person entered the old site.

What is .htaccess?

We run Apache on Reclaim Hosting, and when an account is provisioned the server creates a directive telling Apache what your domain is and where the files for that domain are on the server, along with a handful of other things. This is how multiple sites can be hosted from a single server: Apache reads the domain in the requested URL, looks it up in its list of domains and folders, and then serves the contents of the right folder.
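To make that concrete, here is a minimal sketch of what such a per-domain directive can look like. The folder paths, the username, and the port are assumptions for illustration only; this is not the actual configuration Reclaim Hosting generates.

```apache
# Hypothetical virtual host entry (paths and port are illustrative assumptions).
<VirtualHost *:80>
    # The domain Apache should answer for
    ServerName archive.timmmmyboy.com

    # Where the files for that domain live on the server
    DocumentRoot /home/exampleuser/public_html/archive

    <Directory /home/exampleuser/public_html/archive>
        # Allow per-folder .htaccess files to override settings
        AllowOverride All
    </Directory>
</VirtualHost>
```

Apache keeps one block like this per hosted domain and picks the block whose ServerName matches the domain in the request.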

An .htaccess file allows you to write directives that override the rules that already exist in the Apache configuration for the folder the file lives in. So instead of someone having to type in the full address ending in /index.php, they can just type http://timowens.io and a rule in .htaccess will tell the server that index.php is what it should display. WordPress includes an .htaccess by default that makes those pretty permalinks you’re used to. We can use the .htaccess to get pretty specific about how we want to rewrite URLs, though.
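As a rough illustration of both ideas, the sketch below shows a DirectoryIndex line that makes index.php the default page, followed by the classic block WordPress writes for pretty permalinks. Treat the WordPress block as approximate: the exact lines vary a little between WordPress versions, and whether these directives are allowed in .htaccess depends on the server's AllowOverride setting.

```apache
# Serve index.php (falling back to index.html) when a folder is requested
DirectoryIndex index.php index.html

# BEGIN WordPress (classic permalink block; newer versions may add lines)
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# Leave direct requests for index.php alone
RewriteRule ^index\.php$ - [L]
# If the request is not an existing file or folder...
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# ...hand it to WordPress, which works out which post to show
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```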

My WordPress site was at http://archive.timmmmyboy.com (I had also played with Anchor and moved my WordPress blog at the time to a subdomain). All of my posts have the following structure:

http://archive.timmmmyboy.com/2013/11/writing-collaborative-documentation-with-dokuwiki-and-github/

So the base domain, followed by the year, followed by the month, followed by the name of the post. Now my Ghost blog has this format for posts:

http://blog.timowens.io/writing-collaborative-documentation-with-dokuwiki-and-github/

Adding the Redirect

No year or month, but otherwise we have something to work with. Basically what we want to do is have Apache grab the post name from the end of the URL and append it to the domain of the new blog. There are also other links, like the RSS feed at /feed, that need their own rule. Here’s what my full .htaccess file looks like now with the redirects in place:

RewriteEngine On
RewriteRule ^([0-9]+)/([0-9]+)(.*)$ http://blog.timowens.io$3 [R=301,L]
RewriteRule ^feed$ http://blog.timowens.io/rss [R=301,L]
RewriteRule ^feed/$ http://blog.timowens.io/rss [R=301,L]
RewriteRule ^(.*)$ http://blog.timowens.io/ [R=301,L]

That’s it! Let’s go line by line and see what it’s doing.

RewriteEngine On

We need to tell Apache we’re going to be rewriting some URLs, so this has to come before any RewriteRule directives.

Adding the Rewrite Rule

RewriteRule ^([0-9]+)/([0-9]+)(.*)$ http://blog.timowens.io$3 [R=301,L]

RewriteRule is the directive that creates the rewrite. The ^ anchors the pattern to the start of the path, which in an .htaccess file is the part of the URL after the domain with the leading slash removed. ([0-9]+)/([0-9]+) is a regular expression that means “one or more digits, a slash, then one or more digits”, which matches the year and the month. (.*) is our final wildcard, “anything coming after that pattern”, and the $ anchors the pattern to the end of the path. Each set of parentheses stores whatever it matched in a numbered variable: $1, $2, and $3.

The second part of that rule allows us to grab one of those variables (in this case the third, the one that holds our post name along with its leading slash) and append it to the end of our new URL. [R=301,L] tells the server to answer with a 301 Permanent Redirect, which is what Google and other visitors see when they request the old URL, and the L makes this the last rule processed for that request. It’s highly recommended that you use 302 Temporary Redirects until you’re confident you’ve got it right. Browsers cache permanent redirects, so a mistake in a 301 can stick around on your own machine. Temporary redirects generally aren’t cached, and search engines won’t treat the move as final while you’re testing.
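To see the captured variables in action, here is the same rule annotated with what each one holds for the example post, and switched to R=302 for the testing phase. This sketch assumes the .htaccess file sits in the root folder of archive.timmmmyboy.com, so the path the pattern sees starts at the year.

```apache
RewriteEngine On

# Example request:
#   http://archive.timmmmyboy.com/2013/11/writing-collaborative-documentation-with-dokuwiki-and-github/
# Path the pattern is tested against (leading slash removed in .htaccess):
#   2013/11/writing-collaborative-documentation-with-dokuwiki-and-github/
# Captured variables:
#   $1 = 2013
#   $2 = 11
#   $3 = /writing-collaborative-documentation-with-dokuwiki-and-github/
# Redirect target:
#   http://blog.timowens.io/writing-collaborative-documentation-with-dokuwiki-and-github/

# Testing version: 302 (temporary) instead of 301 (permanent)
RewriteRule ^([0-9]+)/([0-9]+)(.*)$ http://blog.timowens.io$3 [R=302,L]
```

Because $3 already begins with a slash, the target is written as http://blog.timowens.io$3 with no slash in between.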

RewriteRule ^feed$ http://blog.timowens.io/rss [R=301,L]
RewriteRule ^feed/$ http://blog.timowens.io/rss [R=301,L]

The link to my RSS feed is different, so this redirect says “If someone visits /feed, redirect here”. There are two lines because sometimes people put the slash on the end and sometimes they don’t. That could probably be consolidated into a single rule by making the trailing slash optional in the regex.
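One possible consolidation, offered as a sketch rather than what is running on the site: a ? after the slash makes it optional, so a single rule covers both forms.

```apache
# The ? makes the preceding slash optional, so this one rule
# matches both "feed" and "feed/".
RewriteRule ^feed/?$ http://blog.timowens.io/rss [R=301,L]
```

It would replace the two feed lines and needs to stay above the catchall rule so it gets the first chance to match.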

RewriteRule ^(.*)$ http://blog.timowens.io/ [R=301,L]

Finally, a catchall. “Whatever URL they visited, if there isn’t a post or directive to send them to a specific location, redirect to the homepage.” The rules are checked in order and the earlier ones end with L, so a request only reaches this line if none of them matched. Basically I’d rather someone get sent from my old site to the homepage if nothing more specific applies than just show them a 404 error.

That’s it! Once you’ve tested it and know it’s working, switch from 302 Temporary Redirects to 301 Permanent Redirects and you’re all set! A quick note about .htaccess files: because of the dot at the beginning of the filename, some programs may not show the file by default, since that’s the convention for “hidden files”. In the File Manager in cPanel you can check the option to show hidden files before entering.