[thelist] perl regexp/XML question

Jeremy Ashcraft ashcraft at 13monkeys.com
Thu, 06 Jan 2000 15:15:31 -0600


I need the help of a perl jockey greater than I.  I'm writing a function
to parse the key-value pairs from the attributes in an XML tag (e.g. <tag
attr1="val1" attr2="val2" attr3="val3" ...>).  The way I'm doing it now
is a real hack:

sub parseAttr
{
    my($xmltext) = @_;
    my %ret;     #fresh hash on every call so old pairs don't leak in

    my @arr = split(/\"/, $xmltext);

    pop @arr;     #get rid of "/>" on end
    $arr[0] =~ s/^\S+//g;    #get rid of "<tag" in first element so it's just attr=

    foreach my $arr (@arr) {
        if($arr =~ /^\s/)     #if it starts with a space (eg attr=)
        {
            $arr =~ s/\s//g;   #remove whitespace
            chop $arr;          #chop off =
        }
    }
    for(my $i=0; $i < $#arr; $i+=2) {
        $ret{$arr[$i]} = $arr[$i+1];      #assign key-value to hash
    }
    return %ret;
}

Nice, huh?  Anyway, what I want to know is what regexp I can use to
match every attr="value" pair in an XML tag and extract each
' attr="value" ' substring.

Something like m/(regexp I'm looking for)/g
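
A minimal sketch of that kind of m//g match follows. It is only an
illustration under stated assumptions: values are double-quoted and contain
no embedded double quotes, attribute names use word characters, ':', '.'
or '-', and the sub name extract_attrs is hypothetical. It is not a
substitute for a real XML parser.

```perl
use strict;
use warnings;

# Hypothetical helper: collect attr="value" pairs from a single tag.
# Assumes double-quoted values with no embedded double quotes.
sub extract_attrs {
    my ($xmltext) = @_;
    my %attrs;

    # With /g in a while loop, each iteration picks up the next match;
    # $1 is the attribute name and $2 is the value without its quotes.
    while ($xmltext =~ /\s([\w:.-]+)\s*=\s*"([^"]*)"/g) {
        $attrs{$1} = $2;
    }
    return %attrs;
}

my $tag = '<tag attr1="val1" attr2="val2" attr3="val3"/>';

# Key-value pairs as a hash
my %h = extract_attrs($tag);
print "$_ = $h{$_}\n" for sort keys %h;

# In list context, m//g returns every captured substring at once,
# here each whole attr="value" piece.
my @pairs = $tag =~ /\s([\w:.-]+\s*=\s*"[^"]*")/g;
print "$_\n" for @pairs;
```

The while loop form is handy when the pairs go straight into a hash; the
list-context form is shorter when only the raw substrings are wanted.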

Confused?  Good.  I'd like to have a more elegant way to do this than
that nasty hack I've written.

(And yes, I've tried the XML::Parser module, and it has problems with <tag
/> as opposed to <tag></tag>.)

Thanks in advance.

--
Jeremy Ashcraft
web developer/geek
ashcraft@13monkeys.com
http://www.13monkeys.com