[thelist] Re: regex: removing multiline comments

Kelly Hallman khallman at wrack.org
Fri Jan 17 15:45:01 CST 2003

On Fri, 17 Jan 2003, Chris W. Parker wrote:
> removing \t's \n's and replacing \s{2,}'s with one space is easy. the
> hard part is finding one regex that will remove all the comments.
> specifically multiline comments and two comments on one line. i've also
> found out what being greedy means. ;)
> if i have a comment like this /* hello */ i assume i should be able to
> use /\*.*\*/ to remove it. unfortunately this also matches...

Instead of using .* to grab the innards of the comment, try .*?
...so: /\*.*?\*/ may do the trick...it tries to make the smallest match
possible, versus the greediness of .*
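
A quick way to see the difference, sketched here in Python (its re
module follows Perl-style syntax for this case). The sample string and
variable names are made up for illustration, not taken from your code:

```python
import re

# Hypothetical sample: two comments on one line.
src = "a = 1; /* first */ b = 2; /* second */ c = 3;"

# Greedy: .* runs from the first /* to the LAST */, eating "b = 2;" too.
greedy = re.sub(r"/\*.*\*/", "", src)

# Non-greedy: .*? stops at the first */ it can, so each comment
# is matched (and removed) separately.
lazy = re.sub(r"/\*.*?\*/", "", src)

print(repr(greedy))
print(repr(lazy))
```

The greedy version loses the code between the two comments; the
non-greedy one keeps it.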

> the second one is that of multiline comments with more than two lines.
> the regex i used for two lines is /\*.*/n.*\*/ but i have been unable to
> get one that works for 3+ lines.

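Untested, but might work for your situation: /\*[\s\S]*?\*/
([\s\S] matches any character, newlines included; a . inside [] is
only a literal dot, so a class like [\n.] won't do it.)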
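
Here is a rough Python sketch of the multi-line case, assuming a
Perl-style engine. The sample string is invented. It compares a class
containing a dot, [\n.], against [\s\S], which matches any character
including newlines:

```python
import re

# Hypothetical sample: a three-line comment plus a one-line comment.
src = "x = 1; /* line one\n   line two\n   line three */ y = 2; /* tail */ z = 3;"

# Inside [] the dot is a literal ".", so this class only matches
# newlines and dots; neither comment is matched and nothing changes.
dot_class = re.sub(r"/\*[\n.]*?\*/", "", src)

# [\s\S] is "whitespace or non-whitespace", i.e. any character at all,
# so the non-greedy match can cross line breaks.
any_char = re.sub(r"/\*[\s\S]*?\*/", "", src)

print(repr(dot_class))
print(repr(any_char))
```

The same pattern handles comments of any number of lines, since
nothing in it counts newlines.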

Perl (which has one of the more robust regex engines in existence)
has a flag for this: the /s modifier lets . match newlines, so .*? can
span multiple lines. Not to say that you need to use Perl for this, but
multi-line matching is usually a flag or option separate from the regex
itself.
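
As an illustration of the flag approach, Python spells it re.DOTALL
(Perl's equivalent is the /s modifier). This is only a sketch with an
invented sample string; check what your own engine calls the option:

```python
import re

# Hypothetical sample: one multi-line comment, one single-line comment.
src = "/* one\ntwo\nthree */ keep /* four */"

# Without the flag, . stops at newlines, so only the single-line
# comment is removed and the multi-line one survives.
no_flag = re.sub(r"/\*.*?\*/", "", src)

# With DOTALL, . also matches newlines, so the same pattern
# removes both comments.
with_flag = re.sub(r"/\*.*?\*/", "", src, flags=re.DOTALL)

print(repr(no_flag))
print(repr(with_flag))
```

The regex itself is identical in both calls; only the option changes.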

Different applications use different regular expression engines, and
not all features are supported by every engine.  Your mileage may vary.

Kelly Hallman
