[thelist] SE spidering spectra URL's

Raymond K. Camden rcamden at allaire.com
Thu Nov 23 21:34:19 CST 2000


Instead of ?var=val&var2=val2, use this format:

foo.cfm/var/val/var2/val2

Then, examine the CGI.PATH_INFO variable to parse your data. In the example
I used above, you would parse out the current template, foo.cfm, then loop
through each element and assign a variable name and variable value.

So, a real example:

press_release.cfm/id/300/method/display

Would tell press_release.cfm that the ID is 300 and the method is display.
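
On the linking side, a page that lists press releases could write its links in this format directly. What follows is only a rough sketch: the GetReleases query and its ID and Title columns are hypothetical stand-ins (a real page would use CFQUERY), and it assumes press_release.cfm sits in the web root. It also assumes the values are simple, like numeric IDs, since a value containing a slash would break the scheme.

```cfml
<!--- Stand-in data so the sketch runs on its own; all names are hypothetical --->
<CFSET GetReleases = QueryNew("ID,Title")>
<CFSET Temp = QueryAddRow(GetReleases, 1)>
<CFSET Temp = QuerySetCell(GetReleases, "ID", 300)>
<CFSET Temp = QuerySetCell(GetReleases, "Title", "Sample release")>

<CFOUTPUT QUERY="GetReleases">
	<!--- Spider-friendly link: no "?" or "&" anywhere in the URL --->
	<A HREF="/press_release.cfm/id/#ID#/method/display">#HTMLEditFormat(Title)#</A><BR>
</CFOUTPUT>
```

The HREF starts with a slash on purpose. Once a visitor is on press_release.cfm/id/300/method/display, the browser treats the extra segments as directories, so plain relative links and image paths on that page would resolve against the wrong location.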

I've attached a custom tag that will automatically translate /var/val into
URL.VAR=Val. Note, however, that the tag is pretty old and may not be the
best code.

P.S. Of course, this isn't really a Spectra or ColdFusion problem, it's a
spider problem. ASP and PHP sites will have the same issues with spiders.
Does anyone here actually know someone who works for one of these companies?
I'd love to hear a good reason for NOT indexing ?var=val type URLs,
especially in this day and age.

path2url.cfm:
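<!--- Strip everything up to and including the template name from PATH_INFO
      (some web servers include the script name there) --->
<CFSET Path = REReplace(CGI.PATH_INFO,".*#ListLast(CGI.CF_TEMPLATE_PATH,"\/")#","")>

<!--- Nothing after the template name: leave the URL scope alone --->
<CFIF NOT Len(Path)><CFEXIT></CFIF>

<CFIF ListLen(Path,"/") GTE 2>
	<!--- Walk the list two items at a time: name, then value.
	      A trailing name with no value, or an invalid name, is silently skipped. --->
	<CFLOOP INDEX="X" FROM="1" TO="#ListLen(Path,"/")#" STEP="2">
		<CFTRY>
			<CFSET "URL.#ListGetAt(Path,X,"/")#" = ListGetAt(Path,X+1,"/")>
			<CFCATCH>
			</CFCATCH>
		</CFTRY>
	</CFLOOP>
<CFELSEIF Path IS NOT "/">
	<!--- A single element such as /print becomes a flag: URL.print = True --->
	<CFSET Path = Replace(Path,"/","")>
	<CFSET "URL.#Path#" = True>
</CFIF>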
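
For what it's worth, calling the tag from the top of press_release.cfm might look something like the sketch below. It assumes path2url.cfm is saved next to the calling template (or in the CustomTags directory) so that CF_PATH2URL resolves. The default values and the branching are illustrative only, not part of the tag.

```cfml
<!--- Top of press_release.cfm: turn /id/300/method/display
      into URL.id and URL.method --->
<CF_PATH2URL>

<!--- Fall back to defaults when the path carries no values
      (these defaults are assumptions for the example) --->
<CFPARAM NAME="URL.ID" DEFAULT="0">
<CFPARAM NAME="URL.Method" DEFAULT="display">

<!--- Val() returns 0 for anything non-numeric, so a junk ID is rejected --->
<CFIF URL.Method IS "display" AND Val(URL.ID) GT 0>
	<CFOUTPUT>Showing press release #Val(URL.ID)#</CFOUTPUT>
<CFELSE>
	Unknown press release or method.
</CFIF>
```

Because the tag writes into the URL scope, the rest of the template can stay exactly as it was when it read ?id=300&method=display.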
=======================================================================
Raymond Camden, Principal Spectra Compliance Engineer for Allaire

Email   : jedimaster at allaire.com
ICQ UIN : 3679482

"My ally is the Force, and a powerful ally it is." - Yoda


> -----Original Message-----
> From: thelist-admin at lists.evolt.org
> [mailto:thelist-admin at lists.evolt.org]On Behalf Of Peter Van Dijck
> Sent: Thursday, November 23, 2000 8:57 AM
> To: thelist at lists.evolt.org
> Subject: [thelist] SE spidering spectra URL's
>
>
> Am I correct to think that SE's don't spider URL's with querystrings? And
> if so, in many sites, if we want them to be spidered we need to give it
> url's without querystrings? (like wired.com)
> The approach we're taking is this:
> we are building a Spectra site with lots of daily articles, e.g.
> there will
> be URL's.
> However, we make a daily copy of these articles into normal url's.
> Then, we will submit a page with links to these url's to the se's.
> That should work.
> Am I missing a better way, or anything???
> Thanks guys!!
> Peter (sick and speccing)




