[thelist] reporting site's links

rudy Rudy_Limeback at maritimelife.ca
Fri Oct 20 15:45:33 CDT 2000


>  I need to create lists or an array that has all the hrefs and 
> what is between <a> and </a> for them for a site with about 140 pages.

hi erik

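first of all, you realize that there may be different relative urls for 
the same page, right?  so you don't want to grab the href value as it is 
written in the source, but rather the absolute url it resolves to

for example, depending on which directory the linking page sits in, all 
of these can point to the same page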

   <a href="index.htm">Home</a>
   <a href="/index.htm">Home</a>
   <a href="../index.htm">Home</a>
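
a minimal sketch of what "resolves to" means, assuming a current browser or node where the standard URL constructor is available. the example.com base address and the resolveAll name are made up for illustration

```javascript
// Hypothetical address of the page that contains the links.
const base = "http://example.com/products/list.htm";

// Resolve each href against the page address, the way a browser would.
function resolveAll(hrefs, baseUrl) {
  return hrefs.map((href) => new URL(href, baseUrl).href);
}

const resolved = resolveAll(["/index.htm", "../index.htm", "index.htm"], base);
console.log(resolved);
```

from this particular page, the first two hrefs resolve to the same url and the third resolves to a different page inside /products/, which is why comparing the raw href strings is not enough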

also, there could be variations in the link text for the same page, for 
example

   <a href="/index.htm">Home</a>
   <a href="/index.htm">Return to Home Page</a>
   <a href="/index.htm">Main Page</a>
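
one way to cope with that is to group the link text under each resolved url. this is only a sketch: groupLinkText is a made-up name, and it assumes each link object carries an already resolved href plus a textContent, which is how anchors in document.links behave in current browsers

```javascript
// Collect the distinct link texts used for each destination url.
// `links` is any array-like of objects with `href` and `textContent`.
function groupLinkText(links) {
  const byUrl = new Map();
  for (const link of Array.from(links)) {
    const text = (link.textContent || "").trim();
    if (!byUrl.has(link.href)) {
      byUrl.set(link.href, new Set());
    }
    byUrl.get(link.href).add(text);
  }
  return byUrl;
}

// Plain objects standing in for anchors (example.com is illustrative).
const sample = [
  { href: "http://example.com/index.htm", textContent: "Home" },
  { href: "http://example.com/index.htm", textContent: "Return to Home Page" },
  { href: "http://example.com/index.htm", textContent: "Main Page" },
  { href: "http://example.com/index.htm", textContent: "Home" },
];
const grouped = groupLinkText(sample);
```

here grouped holds a single url with three distinct link texts, since the repeated "Home" is stored only once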

having gotten this far, i am now prepared to give you only half a solution 
(sorry) -- try the "list all links" bookmarklet to resolve the urls

it is only half a solution because it doesn't show the link text -- 
perhaps one of the elite javascript gurus on thelist (among whom i do not 
count myself) can augment this to grab the link text too

  bookmarklet: list all links

  source: http://bookmarklets.com/tools/data/index.phtml

  description: Creates a new page with just the links (the URLs) that 
appear in whichever page you are viewing. You can save the newly created 
page by choosing "Save As..." from the File menu.

  cut&paste code (one long line -- rejoin it if your mail client wraps it): 
javascript:WN7z=open('','Z6','width=400,height=200,scrollbars,resizable,menubar');DL5e=document.links;with(WN7z.document){write('<base target=_blank>');for(lKi=0;lKi<DL5e.length;lKi++){write(DL5e[lKi].toString().link(DL5e[lKi])+'<br><br>')};void(close())}
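
for the missing half, here is a rough sketch of how the link text could be collected alongside each url. it is not part of the bookmarklets.com tool: listLinksWithText and toTabSeparated are made-up names, and it assumes a current browser where each anchor in document.links exposes a resolved href and a textContent

```javascript
// Build one { url, text } record per link.
// `links` is any array-like of objects with `href` and `textContent`.
function listLinksWithText(links) {
  return Array.from(links, (link) => ({
    url: String(link.href),
    // Collapse runs of whitespace so wrapped link text stays on one line.
    text: (link.textContent || "").replace(/\s+/g, " ").trim(),
  }));
}

// Render the records as tab-separated lines, one link per line.
function toTabSeparated(rows) {
  return rows.map((row) => row.url + "\t" + row.text).join("\n");
}

// In a browser console this prints the report for the current page;
// outside a browser the guard simply skips it.
if (typeof document !== "undefined") {
  console.log(toTabSeparated(listLinksWithText(document.links)));
}
```

the tab-separated output pastes straight into a spreadsheet, so running it on each of the 140 pages would still be manual work. links that wrap only an image come out with empty text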


rudy
r937.com



