[thelist] Regex frustration....

Sam-I-Am sam at sam-i-am.com
Thu Sep 27 09:21:08 CDT 2001


Regexps start to bring real value when you are fluent enough to compose
the pattern faster than you could do the job manually.
If you have to stop and look up every idiom or metacharacter each time,
and test repeatedly on dummy data, the learning curve is discouraging.
That said, it really is worth it. I remember saying exactly what you are
saying about 2-3 years ago, and I've persisted in the face of deadlines
and impatient colleagues enough times to have a basic but useful grasp
of regexps.

For the HTML world, where you rarely care about the optimization and
performance concerns that are the bulk of "Mastering Regular
Expressions", a few simple tips get you a long way:

<tip type="starting with regular expressions" author="sam-i-am">
Most of the time when editing or cleaning up markup, you really only
need a pattern a little bit more flexible than your editor's various
toolbar options and helpers. 
This simple kind of pattern gets a lot of use:

<([^>]+)>

in other words - match a tag; a '<' followed by anything except '>',
followed by '>'. (And remember the tagname and attributes). 
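
A quick way to try this pattern is a few lines of Python with the re
module. This is only a sketch; the sample markup and variable names
are made up for illustration:

```python
import re

# Hypothetical sample markup, invented for this example.
html = '<table border="1"><tr><td class="x">hi</td></tr></table>'

# <([^>]+)> : a '<', then one or more characters that are not '>',
# then a '>'. The parentheses capture the tag name and attributes.
tag_pattern = re.compile(r'<([^>]+)>')

# findall returns the captured group for every match in the string.
tags = tag_pattern.findall(html)
print(tags)
```

Each item in the result is whatever sat between the angle brackets, so
closing tags show up with their leading slash.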

Building from this, you can match any particular tag: 
<(td\b[^>]*)> - i.e. a td tag with optional attributes.
(<\/?td\b[^>]*>) - a start or end td tag (and remember the whole thing).
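
To check td patterns like these against sample markup, here is a rough
Python sketch. It writes "optional attributes" as td\b[^>]* and
"optional slash" as /?, which is one possible spelling and an
assumption on my part; the sample row is invented:

```python
import re

# Hypothetical sample row, invented for this example.
html = '<tr><td>a</td><td class="n" align="right">1</td><tdata>x</tdata></tr>'

# Opening td tag with optional attributes.
# \b keeps 'td' from also matching a longer name such as '<tdata>'.
open_td = re.compile(r'<(td\b[^>]*)>')

# Opening or closing td tag; /? means "an optional slash".
# The outer parentheses remember the whole tag.
any_td = re.compile(r'(</?td\b[^>]*>)')

opening = open_td.findall(html)
both = any_td.findall(html)
print(opening)
print(both)
```

A pattern that insists on whitespace after the tag name (td\s+) would
skip a bare <td>, which is why the sketch uses a word boundary instead.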

Couple this with a substitution: 
s/<(tr\s+[^>]*)>/<tr>/g - strip all tr attributes

s/<img\s+src="([^"]+)"[^>]*>/<a href="\1">\1<\/a>/gi - make a list of
links out of a list of img tags (assuming src is the first attribute).
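
The same two substitutions can be sketched with Python's re.sub, which
replaces every match by default (like the /g flag). The sample strings
and variable names below are hypothetical:

```python
import re

# Hypothetical sample markup, invented for this example.
rows = '<tr bgcolor="#fff" valign="top"><td>x</td></tr>'
imgs = '<img src="a.gif" alt="A">\n<IMG SRC="b.gif">'

# Equivalent of s/<(tr\s+[^>]*)>/<tr>/g : strip all tr attributes.
clean_rows = re.sub(r'<(tr\s+[^>]*)>', '<tr>', rows)

# Equivalent of the img-to-link substitution with /gi.
# re.IGNORECASE plays the role of /i; \1 is the captured src value.
links = re.sub(r'<img\s+src="([^"]+)"[^>]*>',
               r'<a href="\1">\1</a>',
               imgs, flags=re.IGNORECASE)

print(clean_rows)
print(links)
```

Note that the img pattern only matches when src comes straight after
the tag name; an img tag that starts with another attribute is left
untouched.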

and so on. 
Knowing regexps just helps. Those stupid, tedious jobs that end up
taking 30 minutes can sometimes get done in 1. And occasionally you're
in a position to cut hours off everyone's task with a simple regexp.
Even if you can only cut out a couple of operations and end up going
through the rest by hand, it's still a great tool for your toolbox that
can save enough time to let you concentrate on the real problems.

(disclaimer: this tip was a stream of consciousness kind of thing. I've
not tested the above regexps)
</tip>




Andy Warwick wrote:
> 
> On 2001-09-26 22:38, Chris George at chrisg at gsnet.com wrote:
> 
> > Oh my freaking gosh.
> >
> > Here's a solution:
> 
> <snip>
> 
> Chris,
> 
> If you do reconsider, and spend some time learning regular expressions, I
> can recommend the book "Mastering Regular Expressions" by Jeffrey E. F.
> Friedl (O'Reilly, ISBN: 1-56592-257-3).
> 
> It helped me a lot when I was starting out, and has proved a life saver in
> many instances. When you see some of the cool stuff you can do with regular
> expressions and log files, not to mention validation of user input and
> automated, widescale tag changes, you'll become an instant convert. Honest.
> 
> Andy W
> 