[thelist] XPATH - get only attribute *values*

aardvark evolt at roselli.org
Fri Aug 5 10:48:00 CDT 2005


On 5 Aug 2005 at 16:22, James Hardy wrote:
[...]
> <LINK href="http://www.bbc.co.uk/">
> 	<TEXT>TV</TEXT>
> </LINK>
> <LINK href="http://www.cnn.com/">
> 	<TEXT>TV</TEXT>
> </LINK>
> <LINK href="http://www.aol.com/">
> 	<TEXT>ISP</TEXT>
> </LINK>
> <LINK href="http://www.demon.net/">
> 	<TEXT>ISP</TEXT>
> </LINK>
> <LINK href="http://www.five.tv/">
> 	<TEXT>TV</TEXT>
> </LINK>
> ...
> 
> the XPath expression
> //LINK[TEXT='TV']/@href
> will return three element nodes which evaluate to the following strings
> 
> href="http://www.bbc.co.uk/"
> href="http://www.cnn.com/"
> href="http://www.five.tv/"
[....]

i suspect it's actually returning them without the href and quotes... 
or at least it should be...

if your XML actually looks like this:

<LINK href="href='http://www.five.tv/'">
	<TEXT>TV</TEXT>
</LINK>

then i could see it returning href='http://www.five.tv/'... but 
otherwise, it should return the text in the quotes of the attribute 
you're requesting, not the entire attribute and its name...
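
a quick way to check what the attribute value really is... this is only a sketch in python using the standard xml.etree.ElementTree module, not whatever processor you're running. the <LINKS> wrapper element is my assumption so the fragment parses as one document, and since ElementTree's XPath subset has no /@href step, it selects the LINK elements and reads the attribute from each one:

```python
import xml.etree.ElementTree as ET

# Sample data from the quoted message. The <LINKS> root is an assumption,
# added only so the fragment is a single well-formed document.
XML = """<LINKS>
<LINK href="http://www.bbc.co.uk/">
	<TEXT>TV</TEXT>
</LINK>
<LINK href="http://www.cnn.com/">
	<TEXT>TV</TEXT>
</LINK>
<LINK href="http://www.aol.com/">
	<TEXT>ISP</TEXT>
</LINK>
<LINK href="http://www.demon.net/">
	<TEXT>ISP</TEXT>
</LINK>
<LINK href="http://www.five.tv/">
	<TEXT>TV</TEXT>
</LINK>
</LINKS>"""


def tv_hrefs(xml_text):
    """Return the href value of every LINK whose TEXT child is 'TV'."""
    root = ET.fromstring(xml_text)
    # Select the matching LINK elements, then read the attribute value.
    return [link.get("href") for link in root.findall(".//LINK[TEXT='TV']")]


hrefs = tv_hrefs(XML)
for href in hrefs:
    print(href)
```

each value comes back as the bare URL, with no href= and no quotes...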

however, if my example above is the case, then use substring() with 
string-length()...

substring(//LINK[TEXT='TV']/@href, 7,
          string-length(//LINK[TEXT='TV']/@href) - 7)

the first 7 cuts off the href=' (six characters, so the value starts at 
position 7), and string-length() minus 7 (six for the href=' plus one 
for the ending quote) is how many characters to keep...

substring() takes the string first, then the position of the starting 
character (counting from 1) as the second argument, and the length (in 
characters) to return as the third...
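
to sanity-check the arithmetic, here's a rough python stand-in for substring()... it's simplified to positive whole-number arguments (the real XPath function also rounds and handles odd cases), and the sample value is the hypothetical one from my example above. also keep in mind that in XPath 1.0 a node-set passed to substring() is converted using only its first node, so the expression works on one href at a time:

```python
def xpath_substring(s: str, start: int, length: int) -> str:
    """Simplified XPath 1.0 substring() for positive integer arguments."""
    # XPath counts characters from 1, Python slices count from 0.
    begin = start - 1
    return s[begin:begin + length]


# Hypothetical attribute value with the name and quotes embedded in it.
value = "href='http://www.five.tv/'"

# Start at position 7 (just past href=') and keep string-length - 7
# characters, which drops the six-character prefix and the closing quote.
url = xpath_substring(value, 7, len(value) - 7)
print(url)  # prints http://www.five.tv/
```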


