[thelist] Search engines and good code REPLY
John C Bullas
jcbullas at nildram.co.uk
Mon Oct 20 01:47:14 CDT 2003
At 01:24 20/10/2003, Diane Soini wrote:
>The company I work for is looking at getting a better search engine for
>the corporate web site. It
<300 pages gets you Atomz for free (for commercial use too?)
http://www.atomz.com
Check out the search on http://www.ime.org.uk to see it in action: you can
customise the living s**t out of the thing...
FB
>appears that they are leaning toward the Google appliance. That got me to
>wondering... A lot of folks on this list keep saying that using good
>markup improves the searchability of your web site and gives
YUP as a good generalization
>you better rankings as a result. I'm wondering if there is some empirical
>evidence out there to prove this. The company is also looking into a web
>site redesign, and the Internet Marketing Manager seems to think that the
>only thing that we'll have to worry about is making sure all pages have
>metatags and all non-HTML has metadata associated with it. He's a good
>page-layout artist, but his
Use index/noindex and follow/nofollow meta tags, and set up a robots.txt
file to steer the engines that take notice of it. Keep stuff like SSI
inserts away from the root directory, block indexing on directories you
want kept out of the way, and set up a custom 404 page too, as a lot of
crawlers (not the Atomz one) make mistakes and end up requesting pages that
no longer exist.
-===== example ===
Mon Oct 20 2003 12:15:44 am BST
66.196.72.49 tried to load www.fatblokeracing.org/404page.shtml
User Agent = Mozilla/5.0 (Slurp/cat; slurp at inktomi.com;
http://www.inktomi.com/slurp.html)
Referring URL:
================
That page was removed long ago... the crawler was working from memory!
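
As a rough sketch of the robots.txt side of this, the snippet below uses
Python's standard urllib.robotparser to check what a well-behaved crawler
would be allowed to fetch. The robots.txt contents, the /ssi/ and /private/
directory names and the example.org URLs are made up for illustration; they
are not the real rules from any of the sites mentioned here. It only tells
you what an "abiding" bot should do, not what a sloppy one actually does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep every abiding crawler out of two directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /ssi/
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/ssi/header.html", "/private/notes.html"):
        url = "http://www.example.org" + path
        print(path, is_allowed(ROBOTS_TXT, "Slurp", url))
```

Running the same check against your real robots.txt before you publish it is
a cheap way to catch a rule that blocks more (or less) than you meant.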
>opinions about good coding are that it is completely worthless to even
>consider it. It would be nice to have a good counter-argument about that.
>I'm curious if you open the box on the Google appliance, do they actually
>tell you that using proper markup is going to help? Some how I doubt it
>since you get
Google publishes guides to its own spidering technique. AFAIK, all your
custom box does if you don't "pay" is to be selective of that existing
output.
We get loads of Google and other search referrals:
http://www.sitemeter.com/default.asp?action=stats&site=s10internetmini&report=11&visit=1
>plenty of poorly-coded results in a google search. But I'm curious
>nonetheless. Anybody have
You can control what gets "read" if the searchbot abides by robots.txt AND
stuff like noindex tags.
I think it has helped on the IME site, as we have restricted "abiding bot"
access to certain directories.
>evidence they could share to prove that good markup really improves the
>searchability?
>
>D