Wednesday, July 22, 2009

Redirecting most old Postnuke pages to the new Zikula site

The first order of the day is to redirect old pages to the new pages.

For this, go to your old sitemap and look at the links you have. If your site is not very large, you can redirect each page individually from the old permalink structure to the new one.

Create a file called .htaccess in your root folder (where your index.php is) and put the following into it:
Options FollowSymlinks
RewriteEngine on

Then we write in our redirects. If there is already an .htaccess in the root folder, just edit it to add your redirects (make sure the above lines are already present, or add them in):

RewriteRule ^Article1\.htm$ http://www.foobar.com/foo/bar/ [R=301,L]

http://foobar.com/Article1.htm is where the article used to be (the pattern only needs the path relative to your root folder, not the whole url) and http://www.foobar.com/foo/bar/ is the new location we are now sending it to. Refer to your old sitemap, and add one such line for each old page, pointing it to its new location.

The R=301 flag marks the redirect as permanent. This tells the search engines about the new locations, and also redirects any referral links pointing to the old locations, along with the traffic and good karma.
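
Put together, a small .htaccess with one line per old page might look like the sketch below. Only the Article1 line comes from the example above; the Article2 and Article3 names and their targets are made up for illustration, so substitute the entries from your own sitemap.

```apache
Options FollowSymlinks
RewriteEngine on

# One line per old page, each a permanent (301) redirect.
# Article2 and Article3 and their targets are hypothetical examples.
RewriteRule ^Article1\.htm$ http://www.foobar.com/foo/bar/ [R=301,L]
RewriteRule ^Article2\.htm$ http://www.foobar.com/foo/baz/ [R=301,L]
RewriteRule ^Article3\.htm$ http://www.foobar.com/news/qux/ [R=301,L]
```

The backslash before .htm makes the dot match a literal dot rather than any character.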

For a very large site, this method may not be practical. In that case, you have to make tough choices. Save what you can, and pray for the rest. Some suggestions:

  1. Figure out the pages with the most links and traffic, and be sure to redirect those specifically.
  2. Set up your new sitemap, and point the url for the old sitemap to the new one, so that search engines coming out of habit to the old sitemap go directly to the new information.
  3. See if there are patterns you can roughly redirect (rather than dishing out 404s). For example, anything with a certain module name goes to the page for that module.
  4. Where possible, edit the content of the index pages that catch these rough redirects. For example, the main downloads page, where all downloads are getting redirected, can contain an index of downloads and their new links for quick reference and manual click through.
  5. It helps to have your 404 error (page not found) page describe what's happening, assure visitors that things will soon be smooth again, and invite them in the meanwhile to navigate using the very convenient options you will provide.
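
Suggestions 2 and 5 can each be handled with a single line in .htaccess. This is only a sketch: the file names sitemap.html, sitemap.xml and 404.html are assumptions, so replace them with whatever your old and new sites actually use.

```apache
RewriteEngine on

# Suggestion 2: send the old sitemap url to the new sitemap.
# Both file names here are assumed, not taken from a real site.
RewriteRule ^sitemap\.html$ http://www.foobar.com/sitemap.xml [R=301,L]

# Suggestion 5: serve a custom "page not found" page that explains the move.
# /404.html is a hypothetical page you would create yourself.
ErrorDocument 404 /404.html
```

Using a local path (rather than a full url) for ErrorDocument lets the server keep returning a real 404 status while showing your explanation.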

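You can use RedirectMatch for pattern redirects like those in suggestion 3. Give it the 301 status, otherwise the redirect is treated as temporary:

RedirectMatch 301 (.*)\.pdf$ http://www.foobar.com/downloads/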


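Be sure to put your RedirectMatch list after your specific redirects, so that individual pages go to their accurate locations before the slack gets taken care of. RewriteRule and RedirectMatch are handled by different Apache modules (mod_rewrite and mod_alias), so the order in the file may not be the whole story. Test a few of your specific redirects to confirm that they still win.

So download number 64 may not redirect to the exact page, but all downloads will go to the main downloads page. This is better than 404s. With some study of the older and newer link structures, it will be possible to use regex to get quite accurate redirects, particularly if you aren't using short urls (though that kind of defeats half the point of the upgrade). For example, this keeps the file name and moves every gif under the downloads folder: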


RedirectMatch 301 ^/(.*)\.gif$ http://www.foobar.com/downloads/$1.gif
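
If your old links were not short urls, the module name usually sits in the query string (the part after the ?), which RedirectMatch does not look at. In that case a RewriteCond can do the matching. The sketch below assumes old links shaped like modules.php?op=modload&name=Downloads&file=index; check your own old sitemap for the actual shape before using it.

```apache
RewriteEngine on

# Assumed old url shape: /modules.php?op=modload&name=Downloads&file=index
# Match any request to modules.php whose query string names the Downloads module.
RewriteCond %{QUERY_STRING} (^|&)name=Downloads(&|$)
# The trailing ? drops the old query string from the new location.
RewriteRule ^modules\.php$ http://www.foobar.com/downloads/? [R=301,L]
```

One such pair of lines per old module name gives every module its own landing page without listing each item.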


Test. Test. Test.

Issues with redirects will be visible instantly, so if things work, they work. If not, tweak. There are abundant resources on the net about .htaccess redirects, so search. Sometimes inexplicable issues crop up that are peculiar to your server, software, installed modules, or other things, so if something works for the rest of the world but not for you, don't lose hope. There are enough websites in the world that someone or the other has run into your settings too, and you are likely to find references that work.

Last, but most important:
  • Keep a sharp eye on hits from search engines to your new url structure - these indicate that the engine is now aware of the change on your site.
  • Keep an eye on your traffic sources - referrals in particular - and share updated links where possible. Most people will appreciate having updated information to replace outdated information (though sometimes it just might trigger some lazy ones to delete your link and leave it at that, so see how well you know them).
  • Keep an eye on your error log for two things:
      1. 404s: which pages are getting large numbers of 404s? You may want to redirect them specifically, or as a pattern to some page.
      2. 500s: Internal Server Error - these can happen when your .htaccess contains a mistake (a typo in a directive, for example) or, sometimes, when it gets too large. They will crash your site till they are resolved, so treat them as urgent.

Following this should make much of your site reachable again from your earlier links, while the transition to the new links happens smoothly in the background.
