[thelist] Re: regex: removing multiline comments
Kelly Hallman
khallman at wrack.org
Fri Jan 17 15:45:01 CST 2003
On Fri, 17 Jan 2003, Chris W. Parker wrote:
> removing \t's \n's and replacing \s{2,}'s with one space is easy. the
> hard part is finding one regex that will remove all the comments.
> specifically multiline comments and two comments on one line. i've also
> found out what being greedy means. ;)
>
> if i have a comment like this /* hello */ i assume i should be able to
> use /\*.*\*/ to remove it. unfortunately this also matches...
Instead of using .* to grab the innards of the comment, try .*?
...so: /\*.*?\*/ may do the trick...it tries to make the smallest match
possible, versus the greediness of .*
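
To see the difference between the greedy and non-greedy forms, here is a minimal sketch in Python (my choice for illustration; the thread doesn't say which tool is in use, and the sample line is made up):

```python
import re

# Hypothetical sample: two comments on one line.
line = "a = 1; /* first */ b = 2; /* second */ c = 3;"

# Greedy: .* runs to the LAST */ on the line, so the code
# between the two comments is swallowed as well.
greedy_result = re.sub(r"/\*.*\*/", "", line)

# Non-greedy: .*? stops at the FIRST */ it can reach, so each
# comment is removed separately.
lazy_result = re.sub(r"/\*.*?\*/", "", line)

print(greedy_result)
print(lazy_result)
```

The greedy version loses "b = 2;" along with the comments; the non-greedy version keeps it.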
> the second one is that of multiline comments with more than two lines.
> the regex i used for two lines is /\*.*/n.*\*/ but i have been unable to
> get one that works for 3+ lines.
Untested, but might work for your situation: /\*(.|\n)*?\*/
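
One caveat when spanning lines without a flag: inside a character class the dot is just a literal dot, so a class like [\n.] only matches newlines and periods. An alternation such as (?:.|\n) matches any character. A rough sketch in Python (chosen for illustration only, with a made-up sample):

```python
import re

# Hypothetical sample: one comment spread over three lines.
text = "x = 1;\n/* line one\n   line two\n   line three */\ny = 2;"

# [\n.] is "newline or literal dot", so this finds nothing here.
class_match = re.search(r"/\*[\n.]*?\*/", text)

# (?:.|\n) is "any character, or a newline", taken non-greedily.
any_char = re.compile(r"/\*(?:.|\n)*?\*/")
stripped = any_char.sub("", text)

print(class_match)
print(repr(stripped))
```

Some engines also accept [\s\S] as an "anything" class, which does the same job.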
Perl (which has one of the more robust regex engines in existence)
has the /s modifier, which lets . match newlines too; that might be what
you're looking for. Not to say that you need to use Perl for this, but
matching across lines is usually a flag or option separate from the regex
itself.
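
As an illustration of the flag approach, Python's re module calls the equivalent option re.DOTALL (Perl spells it /s). A small sketch with a made-up input, assuming Python is an acceptable stand-in for whatever engine is actually in use:

```python
import re

# Hypothetical sample: a three-line comment and a one-line comment.
code = "int a; /* one\n two\n three */ int b; /* four */ int c;"

pattern = r"/\*.*?\*/"

# Without the flag, . stops at a newline, so only the
# single-line comment is removed.
without_flag = re.sub(pattern, "", code)

# With DOTALL, . also matches newlines, so both comments go.
with_flag = re.sub(pattern, "", code, flags=re.DOTALL)

print(repr(without_flag))
print(repr(with_flag))
```

The pattern itself stays the same in both calls; only the option changes.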
Different applications use different regular expression engines, and
not every engine supports every feature. Your mileage may vary.
--
Kelly Hallman
http://wrack.org/