[thelist] 301 redirection mess -- two wholesale site-URL-structure changes within 8 months

Mohan Arun 437341 at gmail.com
Mon Apr 4 20:41:49 CDT 2011


-------------------------------------------
>Example: visitor seeks "../page-name.htm" from a pre-September URL. They
hit our server, which first redirects the request to
"../cases.php?CaseID=123", but then since that page no longer exists, the
.htaccess file then takes the request for "../cases.php?CaseID=123" and
301-redirects it to "../case/123/page-name" -- or have I got the wrong end
of the stick here?
--------------------------------------------

What you describe here is a 301 redirect chain.
All you have to do is complete the chain without sending
the search crawler into an infinite spin (i.e., avoid a '301 loop').
Keep the chain to at most 3 redirects and you should be fine.
See the near-official word from Google:
http://www.google.com/support/forum/p/Webmasters/thread?tid=6545835c4607b651&hl=en
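
To make the "no loop, few hops" rule concrete, here is a minimal sketch
in Python that walks a redirect map and complains about loops or
over-long chains. The function name, the max_hops default and the map
itself are assumptions for illustration; the URLs are the ones from the
quoted example, written as site-relative paths.

```python
def follow_chain(redirects, start, max_hops=3):
    """Follow 301 redirects from `start` and return the URLs visited.

    `redirects` maps an old URL to the URL it redirects to (hypothetical
    data, standing in for the .htaccess rules and cases.php behaviour).
    Raises ValueError on a loop or when the chain exceeds `max_hops`.
    """
    path = [start]
    seen = {start}
    url = start
    while url in redirects:
        url = redirects[url]
        if url in seen:
            raise ValueError("redirect loop at " + url)
        path.append(url)
        seen.add(url)
        if len(path) - 1 > max_hops:
            raise ValueError("too many hops: " + " -> ".join(path))
    return path


# The chain from the quoted example: two hops, no loop.
redirects = {
    "/page-name.htm": "/cases.php?CaseID=123",
    "/cases.php?CaseID=123": "/case/123/page-name",
}
chain = follow_chain(redirects, "/page-name.htm")
```

Running the same check over every pre-September URL would show which
old pages end up somewhere sensible and which ones spin.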

In your cases.php file, make this modification:
$caseId = (int) $_GET['CaseID'];
Look up the "page-name" for this CaseID in the database.
Set the HTTP status code '301 Moved Permanently' in your PHP code itself
(no need to bother with .htaccess again).
Set the Location: header in your PHP code and send the new URL (the one with
the /page-name) in the HTTP header itself.
This should inform the crawler that, the next time it wants the page,
it must request the new URL it received in the Location: header.
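
The steps above can be sketched roughly as follows. This is a
language-neutral illustration in Python, not the actual cases.php; in
PHP the status and Location would be sent with header(). The function
name, the page_names dictionary (a stand-in for the database lookup),
the example.com base URL and the 404 for an unknown CaseID are all
assumptions.

```python
def redirect_for_case(case_id, page_names, base="http://www.example.com"):
    """Return (status, headers) for an old cases.php?CaseID=... request.

    `page_names` stands in for the database lookup of CaseID -> page-name.
    An unknown CaseID gets a 404 here; that choice is an assumption.
    """
    slug = page_names.get(case_id)
    if slug is None:
        return 404, {}
    # Permanent redirect: the new URL goes in the Location header.
    location = "%s/case/%d/%s" % (base, case_id, slug)
    return 301, {"Location": location}


page_names = {123: "page-name"}  # hypothetical lookup table
status, headers = redirect_for_case(123, page_names)
```

The point is that the old cases.php URL answers with a single 301
straight to the final /case/123/page-name address.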
See "301 moved permanently"
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
"The requested resource has been assigned a new permanent URI and any future
references to this resource SHOULD use one of the returned URIs. Clients
with link editing capabilities ought to automatically re-link references to
the Request-URI to one or more of the new references returned by the server,
where possible."

A search crawler is supposed to obey the 301 and NOT request
the old .htm file again.
You can see whether the search engine is still requesting your .htm files
by looking through the server logs.
If it still does, raise it with Google via their webmaster tools forum.
Then submit a 'URL removal request' to remove all those .htm pages from
its index.
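
For the log check, a small script along these lines could list the
.htm requests a crawler is still making. It is only a sketch: it
assumes the Apache combined log format, matches the crawler by a
"Googlebot" substring in the user agent, and the sample lines are made
up for illustration.

```python
import re

# Request line, status, size, referer and user agent of a combined-format entry.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (\S+) [^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"'
)


def crawler_htm_hits(lines, agent="Googlebot"):
    """Return (path, status) for each .htm request made by `agent`."""
    hits = []
    for line in lines:
        m = LOG_LINE.search(line)
        if not m:
            continue  # not a combined-format GET/HEAD entry
        path, status, user_agent = m.group(1), m.group(2), m.group(3)
        if agent in user_agent and path.split("?")[0].endswith(".htm"):
            hits.append((path, int(status)))
    return hits


# Made-up sample entries, not real log data.
sample = [
    '66.249.66.1 - - [04/Apr/2011:20:41:49 -0500] "GET /page-name.htm HTTP/1.1" '
    '301 245 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [04/Apr/2011:20:42:10 -0500] "GET /case/123/page-name HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '10.0.0.5 - - [04/Apr/2011:20:43:00 -0500] "GET /other.htm HTTP/1.1" '
    '301 245 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]
hits = crawler_htm_hits(sample)
```

If the list keeps showing the same .htm paths week after week, the
crawler has not yet picked up the 301s.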

Now, for a near-crazy thought.
If you ask me, ideally the search engine's
webmaster tools should let you maintain your site's list of URLs
in its index manually, so we can check whether what
was supposed to be done automatically did in fact
happen as intended (basically
an .htaccess that is maintained by the search engine in addition
to the .htaccess on the web server).

I have been without sleep for some time so you must take
my sleep-deprived thought with a 'grain of salt'
http://en.wikipedia.org/wiki/Grain_of_salt

- Mohan Arun
Blog: http://mohanarun.com
Tweets: http://twitter.com/437341

