[thelist] regex: removing multiline comments

Chris W. Parker cparker at swatgear.com
Fri Jan 17 14:40:11 CST 2003


hi.

today has been my first day actually trying to use and understand
regular expressions. i've read a lot about them in the past but never
really got a chance to try them out. i would just like to thank my
interest in regular expressions (hehe) and html-kit, which has allowed me
to use them in a search and replace.

what i am trying to do is debloat my css file. i thought one good way to
do that would be to remove all the unnecessary white space.

removing \t's and \n's and replacing \s{2,}'s with one space is easy. the
hard part is finding one regex that will remove all the comments,
specifically multiline comments and two comments on one line. i've also
found out what being greedy means. ;)
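
a minimal sketch of the whitespace cleanup described above, assuming
python's re module as a stand-in for html-kit's search and replace (the
function name and sample string are made up for illustration, and
html-kit's regex dialect may behave differently):

```python
import re

def collapse_whitespace(css):
    # drop tabs and newlines outright, as described in the post
    css = re.sub(r"[\t\n]", "", css)
    # squeeze any run of two or more whitespace characters into one space
    return re.sub(r"\s{2,}", " ", css)

sample = "body {\n\tcolor:   red;\n}\n"
print(collapse_whitespace(sample))  # body {color: red;}
```

one caveat: deleting newlines outright joins whatever sat on either side
of them, so replacing them with a space first may be the safer order.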

if i have a comment like this /* hello */ i assume i should be able to
use /\*.*\*/ to remove it. unfortunately this also matches...

/* a comment */not a comment/* another comment */
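
a small sketch of why that happens, using python's re module as an
assumed stand-in for html-kit's engine:

```python
import re

line = "/* a comment */not a comment/* another comment */"

# .* is greedy: it runs to the end of the line, then backs up only as far
# as the LAST */, so the single match covers everything in between
greedy_match = re.search(r"/\*.*\*/", line)
print(greedy_match.group(0) == line)  # True
```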

i've read about using ?: (or something like that) to prevent greediness,
however i don't think html-kit supports that because it complains about
a "nested repeat". that's one obstacle.
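
for reference, the non-greedy form is usually written by putting a ?
after the quantifier (as in .*?), while (?: ... ) is a non-capturing
group. the sketch below uses python's re module, which is an assumption
here: html-kit's engine may reject either pattern. the second pattern
avoids non-greedy syntax entirely by never letting the match run past
the first */ (the names LAZY and NO_LAZY are just labels for this
example):

```python
import re

line = "/* a comment */not a comment/* another comment */"

# lazy quantifier: .*? stops at the first */ it can reach
LAZY = r"/\*.*?\*/"

# no lazy quantifier needed: consume non-stars, then stars, and only
# keep going if those stars were not followed by a slash
NO_LAZY = r"/\*[^*]*\*+([^/*][^*]*\*+)*/"

lazy_result = re.sub(LAZY, "", line)
no_lazy_result = re.sub(NO_LAZY, "", line)
print(lazy_result)     # not a comment
print(no_lazy_result)  # not a comment
```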

the second one is that of multiline comments with more than two lines.
two line comments are fine, i just can't get more than two lines to
match correctly. a three line comment is like this...

/* here is the start
	here is some more
	here is the end */

the regex i used for two lines is /\*.*\n.*\*/ but i have been unable to
get one that works for 3+ lines.
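
the two-line pattern hardcodes exactly one line break, so every extra
line would need another \n.* added. a sketch of one way around that,
assuming an engine where . can be told to match newlines (python's
re.DOTALL flag here; whether html-kit offers an equivalent is not known):

```python
import re

css = (
    "/* here is the start\n"
    "\there is some more\n"
    "\there is the end */\n"
    "body { color: red; }\n"
)

# DOTALL lets . match \n, so one pattern covers any number of lines;
# the lazy .*? keeps it from swallowing text between two comments
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)

stripped = COMMENT.sub("", css)
print(repr(stripped))  # '\nbody { color: red; }\n'
```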


anyone have the answer?

thanks,
chris.


