[thelist] SE spidering spectra URL's
Raymond K. Camden
rcamden at allaire.com
Thu Nov 23 21:34:19 CST 2000
Instead of ?var=val&var2=val2, use this format:
foo.cfm/var/val/var2/val2
Then, examine the CGI.PATH_INFO variable to parse your data. In the example
I used above, you would strip out the current template, foo.cfm, then loop
through the remaining elements in pairs, assigning each variable name its
value.
So, a real example:
press_release.cfm/id/300/method/display
Would tell press_release.cfm that the ID is 300 and the method is display.
I've attached a custom tag that will automatically translate /var/val into
URL.VAR=Val. Note, however, that the tag is pretty old and may not be the
best code.
P.S. Of course, this isn't really a Spectra or ColdFusion problem, it's a
spider problem. ASP and PHP sites will have the same issues with spiders.
Does anyone here actually know someone who works for one of these companies?
I'd love to hear a good reason for NOT indexing ?var=val type URLs,
especially in this day and age.
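path2url.cfm:
<!--- Strip everything up to and including the template name from PATH_INFO --->
<CFSET PATH = REReplace(CGI.PATH_INFO,".*#ListLast(CGI.CF_TEMPLATE_PATH,"\")#","")>
<!--- Nothing after the template name, so nothing to translate --->
<CFIF NOT Len(Path)><CFEXIT></CFIF>
<CFIF ListLen(Path,"/") GTE 2>
  <!--- Walk the /var/val pairs two elements at a time --->
  <CFLOOP INDEX="X" FROM="1" TO="#ListLen(Path,"/")#" STEP="2">
    <!--- A trailing name with no value, or an invalid name, throws; ignore it --->
    <CFTRY>
      <CFSET "URL.#ListGetAt(Path,X,"/")#" = ListGetAt(Path,X+1,"/")>
      <CFCATCH>
      </CFCATCH>
    </CFTRY>
  </CFLOOP>
<CFELSEIF Path IS NOT "/">
  <!--- A single element is treated as a flag and set to True --->
  <CFSET Path = Replace(Path,"/","")>
  <CFSET "URL.#Path#" = True>
</CFIF>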
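
A minimal sketch of how press_release.cfm might call the tag. It assumes
path2url.cfm is saved in the same directory as the calling template (or in the
custom tags directory) so that it can be invoked as CF_PATH2URL. The CFPARAM
defaults and the output line are illustrative assumptions, not part of the
original tag:

```cfm
<!--- press_release.cfm (illustrative sketch) --->
<!--- Translate /id/300/method/display into URL.id and URL.method --->
<CF_PATH2URL>

<!--- Assumed defaults, in case the path carried no values --->
<CFPARAM NAME="URL.id" DEFAULT="0">
<CFPARAM NAME="URL.method" DEFAULT="display">

<CFOUTPUT>Release #URL.id#, method #URL.method#</CFOUTPUT>
```

Because the tag writes into the URL scope, the rest of the template can keep
reading URL.id and URL.method exactly as it would with ?id=300&method=display.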
=======================================================================
Raymond Camden, Principal Spectra Compliance Engineer for Allaire
Email : jedimaster at allaire.com
ICQ UIN : 3679482
"My ally is the Force, and a powerful ally it is." - Yoda
> -----Original Message-----
> From: thelist-admin at lists.evolt.org
> [mailto:thelist-admin at lists.evolt.org]On Behalf Of Peter Van Dijck
> Sent: Thursday, November 23, 2000 8:57 AM
> To: thelist at lists.evolt.org
> Subject: [thelist] SE spidering spectra URL's
>
>
> Am I correct to think that SE's don't spider URL's with querystrings? And
> if so, in many sites, if we want them to be spidered we need to give it
> url's without querystrings? (like wired.com)
> The approach we're taking is this:
> we are building a Spectra site with lots of daily articles, e.g. there will
> be URL's.
> However, we make a daily copy of these articles into normal url's.
> Then, we will submit a page with links to these url's to the se's.
> That should work.
> Am I missing a better way, or anything???
> Thanks guys!!
> Peter (sick and speccing)
>
>