[thelist] Search engines and good code REPLY

John C Bullas jcbullas at nildram.co.uk
Mon Oct 20 01:47:14 CDT 2003


At 01:24 20/10/2003, Diane Soini wrote:
>The company I work for is looking at getting a better search engine for 
>the corporate web site. It

<300 pages gets you Atomz for free (for commercial use too?)

http://www.atomz.com

Check out the search on http://www.ime.org.uk; you can customise the living 
s**t out of the thing...

FB

>appears that they are leaning toward the Google appliance. That got me to 
>wondering... A lot of folks on this list keep saying that using good 
>markup improves the searchability of your web site and gives

YUP as a good generalization

>you better rankings as a result. I'm wondering if there is some empirical 
>evidence out there to prove this. The company is also looking into a web 
>site redesign, and the Internet Marketing Manager seems to think that the 
>only thing that we'll have to worry about is making sure all pages have 
>metatags and all non-HTML has metadata associated with it. He's a good 
>page-layout artist, but his

Use index/noindex and follow/nofollow meta tags, and set up a robots.txt 
file to steer the engines that take notice of it. Keep stuff like SSI 
inserts in a directory away from the root, and block indexing on any 
directories you want kept out of the way. Set up a custom 404 page too, as 
a lot of crawlers (not the Atomz one) make mistakes and end up requesting 
pages that no longer exist.

===== example =====
Mon Oct 20 2003 12:15:44 am BST

66.196.72.49 tried to load www.fatblokeracing.org/404page.shtml

User Agent = Mozilla/5.0 (Slurp/cat; slurp at inktomi.com; 
http://www.inktomi.com/slurp.html)

Referring URL:
================

That page was removed long ago... the crawler was working from memory!
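To make the robots.txt advice above concrete, here is a rough sketch of how a well-behaved crawler would read such a file, using Python's standard urllib.robotparser. The directory names (/includes/, /private/) and the example.org host are made up for illustration; they are not the actual layout of any site mentioned in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep SSI inserts and a private area out of the index.
# The directory names are illustrative assumptions only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /includes/
Disallow: /private/
"""

# Parse the rules from the text above instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="Slurp"):
    """Return True if a bot that abides by robots.txt may fetch this path."""
    return parser.can_fetch(agent, "http://www.example.org" + path)


if __name__ == "__main__":
    for path in ("/index.html", "/includes/header.shtml", "/private/notes.html"):
        print(path, "allowed" if allowed(path) else "blocked")
```

This only models bots that honour robots.txt; anything that ignores the file will still request whatever it likes, which is where the custom 404 page earns its keep.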

>opinions about good coding are that it is completely worthless to even 
>consider it. It would be nice to have a good counter-argument about that. 
>I'm curious if you open the box on the Google appliance, do they actually 
>tell you that using proper markup is going to help? Somehow I doubt it 
>since you get

Google publishes guides to its own spidering technique. AFAIK, if you 
don't "pay", all your custom box does is select from that existing output.

We get loads of Google and other search referrals:

http://www.sitemeter.com/default.asp?action=stats&site=s10internetmini&report=11&visit=1

>plenty of poorly-coded results in a google search. But I'm curious 
>nonetheless. Anybody have

You can control what gets "read" if the searchbot abides by robots.txt AND 
stuff like noindex tags.
I think it has helped on the IME site, as we have restricted "abiding bot" 
access to certain directories.
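For the noindex/nofollow side of this, a quick way to see what a page is telling abiding bots is to pull out its robots meta tag. The sketch below uses Python's standard html.parser; the sample page and the helper names (RobotsMetaParser, robots_directives) are my own illustration, not anything taken from the IME site.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots" content="..."> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def robots_directives(html):
    """Return the set of robots directives declared in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page: keep it out of the index, but let bots follow its links.
PAGE = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    "<body><p>Internal notes</p></body></html>"
)

if __name__ == "__main__":
    print(sorted(robots_directives(PAGE)))
```

A page with no robots meta tag yields an empty set, which engines generally treat the same as "index, follow".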

>evidence they could share to prove that good markup really improves the 
>searchability?
>
>D


