[thelist] Re: regex: removing multiline comments

Kelly Hallman khallman at wrack.org
Fri Jan 17 15:45:01 CST 2003


On Fri, 17 Jan 2003, Chris W. Parker wrote:
> removing \t's \n's and replacing \s{2,}'s with one space is easy. the
> hard part is finding one regex that will remove all the comments.
> specifically multiline comments and two comments on one line. i've also
> found out what being greedy means. ;)
>
> if i have a comment like this /* hello */ i assume i should be able to
> use /\*.*\*/ to remove it. unfortunately this also matches...

Instead of using .* to grab the innards of the comment, try .*?
...so: /\*.*?\*/ may do the trick...it tries to make the smallest match
possible, versus the greediness of .*
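
To see the difference, here is a quick sketch in Python (my choice for
illustration only; the sample line and variable names are made up, and your
engine's syntax may differ). It runs the greedy and the non-greedy pattern
over one line that holds two comments:

```python
import re

# Hypothetical sample: two comments on one line.
line = "a = 1; /* first */ b = 2; /* second */ c = 3;"

# Greedy: .* runs to the LAST */ on the line, so "b = 2;" is lost too.
greedy = re.sub(r"/\*.*\*/", "", line)

# Non-greedy: .*? stops at the FIRST */, so each comment is removed separately.
lazy = re.sub(r"/\*.*?\*/", "", line)

print(greedy)
print(lazy)
```

The greedy version swallows the code between the two comments, while the
non-greedy one leaves it alone.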

> the second one is that of multiline comments with more than two lines.
> the regex i used for two lines is /\*.*/n.*\*/ but i have been unable to
> get one that works for 3+ lines.

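Untested, but might work for your situation: /\*[\s\S]*?\*/
(Inside brackets a dot is just a literal period, so [\n.] won't do it.
[\s\S] matches any character at all, newlines included.)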

perl (which has one of the more robust regex engines in existence) has a
modifier, /s, that lets . match newlines, which might be what you're looking
for. (The /m modifier is something else: it only changes where ^ and $
match.) Not to say that you need to use perl for this, but whether . matches
across lines is usually a flag or option separate from the regex itself.
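
As a rough illustration of that flag idea, again in Python (an assumption on
my part, not necessarily what you're using; the sample text is invented),
re.DOTALL is the option that lets . cross line breaks. The [\s\S] form needs
no flag at all:

```python
import re

# Hypothetical sample: one three-line comment and one single-line comment.
source = "x = 1; /* one\n   two\n   three */ y = 2; /* four */ z = 3;"

# No flag: . stops at newlines, so only the single-line comment is removed.
without_flag = re.sub(r"/\*.*?\*/", "", source)

# With DOTALL: . also matches newlines, so both comments are removed.
with_flag = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)

# Flag-free alternative: [\s\S] matches any character, newline included.
portable = re.sub(r"/\*[\s\S]*?\*/", "", source)

print(repr(without_flag))
print(repr(with_flag))
print(repr(portable))
```

Keeping the match non-greedy matters even more here, since with newlines in
play a greedy match would run from the first /* in the file to the last */.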

Different applications use different regular expression engines, and
not all features are supported by all engines.  Your mileage may vary.

--
Kelly Hallman
http://wrack.org/



