[thelist] Search engines and dynamic sites using query strings

Daniel J. Cody djc at starkmedia.com
Tue Jan 30 14:16:31 CST 2001


Ben Gustafson wrote:


> A possible solution (perhaps read: hack) I came up with for the search
> engine issue is to write an ASP script that creates static HTML pages of all
> the pages of the site, which are for the search engines to index. The files
> contain the page's content within comment tags, and a Meta Refresh tag that
> redirects the browser to the "real" dynamic version of the corresponding
> page on the site. An example of one of these pages is at
> http://www.lionbridge.com/company/p432l1.htm .

The major problem with this is that a robot won't follow a meta 
refresh tag, and therefore won't index the page with the 'real' content 
on it. Also, I see your redirect page doesn't have any links on it to 
other pages on your site. A spider would follow those links and thereby 
index the rest of your content automatically. And if you have to run an 
ASP script manually for every page, that's a serious pain in the 
ass.
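
To make the point about links concrete, here is a rough sketch of a 
generator for static pages that a spider can crawl from one to the next. 
It is in Python rather than ASP purely for illustration, and the function 
name, page text and file paths are all made up. It also assumes the 
content goes in the visible body rather than inside comment tags, which 
is a different choice from the one described above.

```python
from html import escape


def render_static_page(title, body_text, other_pages):
    """Return a static HTML page whose body links to the other static pages.

    other_pages maps a link label to a (hypothetical) static file path.
    """
    links = "\n".join(
        f'<li><a href="{escape(path, quote=True)}">{escape(label)}</a></li>'
        for label, path in other_pages.items()
    )
    return (
        "<html><head>"
        f"<title>{escape(title)}</title>"
        "</head><body>\n"
        f"<h1>{escape(title)}</h1>\n"
        f"<p>{escape(body_text)}</p>\n"
        f"<ul>\n{links}\n</ul>\n"
        "</body></html>"
    )


# Example with invented data: one page that links to one sibling page.
page = render_static_page(
    "Company", "About the company.", {"Services": "/services/p2.htm"}
)
```

Because every generated page carries plain links to the others, a spider 
that reaches one of them can find the rest without following a redirect.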

 
> How have others approached the issue of search engines not indexing pages
> with query strings? What are people's opinions on the off-the-shelf tools
> (such as xBuilder) that allegedly create static versions of dynamic sites?

We had the same problem when we redesigned the evolt.org site a couple 
of months back. Lots of content, but spiders weren't getting to it because 
of our query string URLs. We ended up doing our own hack with CF that 
routes every request through a 404 document, which grabs the parameters 
out of the URL string and passes them to a template that parses the 
relevant info out of them. So what used to be the spider-unfriendly

http://evolt.org/index.cfm?menu=8&catid=25&cid=4572

is now the spider-happy

http://www.evolt.org/article/Visiting_The_Ghost/25/4572/index.html

Of course, even though it appears to be a four-level directory hierarchy, 
none of those directories (or index.html, for that matter) really exist. 
But they exist as a link on a web page, which means a spider will follow 
it and a search engine will index it, which is our goal.
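
To illustrate the idea, here is a minimal sketch of the parsing step a 
404 handler like this would need. It is not the actual CF code from 
evolt.org, just Python standing in for it. The function name is 
hypothetical, and the mapping from "article" to menu=8 is a guess based 
only on the two example URLs above. The title segment is treated as 
decoration for readers and spiders, and is ignored.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from the first path segment to the old "menu" value.
# Only "article" -> 8 is suggested by the two example URLs; it is a guess.
SECTION_TO_MENU = {"article": 8}


def params_from_path(url):
    """Recover the old query-string parameters from a spider-friendly URL.

    Returns a dict, or None if the path doesn't have the assumed
    /section/title/catid/cid/index.html shape (i.e. a genuine 404).
    """
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) != 5 or segments[-1] != "index.html":
        return None
    section, _title, catid, cid, _index = segments
    if section not in SECTION_TO_MENU:
        return None
    if not (catid.isdecimal() and cid.isdecimal()):
        return None
    return {
        "menu": SECTION_TO_MENU[section],
        "catid": int(catid),
        "cid": int(cid),
    }


print(params_from_path(
    "http://www.evolt.org/article/Visiting_The_Ghost/25/4572/index.html"
))
```

A handler built this way would pass the resulting values to the same 
template the query string used to feed, and fall through to a normal 
"not found" page whenever the function returns None.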

It's a really interesting subject that I've been spending a lot of time 
learning about as well. Hope that answers some questions or gives you 
an idea or two. Hopefully someone else pipes up about other things to 
try :)

.djc.




