[thelist] Regex Help

VOLKAN ÖZÇELİK volkan.ozcelik at gmail.com
Sat Feb 4 01:29:35 CST 2006


I've come up with something like:

([\s\S]+?)\.( |\r|\n)

Which is:

match one or more characters of any kind (whitespace or
non-whitespace), non-greedily,
followed by a full stop,
which is then followed by a space, a carriage return or a newline.
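
To see the pieces in action, here is a minimal sketch. It assumes Python's
re module, since the message does not say which language the regex runs in,
and the sample text is made up for illustration.

```python
import re

# The pattern from the message, unchanged.
pattern = re.compile(r"([\s\S]+?)\.( |\r|\n)")

# Hypothetical sample text, not taken from the original thread.
sample = "Lorem ipsum dolor sit amet. Consectetur adipiscing elit.\n"

first = pattern.search(sample)
sentence = first.group(1)    # the text before the full stop
terminator = first.group(2)  # the whitespace character after the full stop

print(repr(sentence), repr(terminator))
```

Because the quantifier is non-greedy, the first group stops at the earliest
full stop that is followed by one of the three whitespace characters.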

The first capturing group, ([\s\S]+?), gives the sentence.
One caveat, though: the last sentence of the last paragraph will not be
matched, since it is followed by EOF instead of a whitespace character.
That can be handled, though, by appending an extra space to the text.
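
A small sketch of that caveat and the workaround, again assuming Python's re
module; the helper name sentences() and the sample text are hypothetical.

```python
import re

# The pattern from the message, unchanged.
pattern = re.compile(r"([\s\S]+?)\.( |\r|\n)")

def sentences(text):
    """Return the first capturing group of every match in text."""
    return [m.group(1) for m in pattern.finditer(text)]

# Hypothetical sample: the last sentence ends at EOF, with no whitespace.
sample = "First sentence. Second sentence.\nLast sentence."

without_padding = sentences(sample)     # the last sentence is missed
with_padding = sentences(sample + " ")  # the extra space makes it match

print(without_padding)
print(with_padding)
```

Appending the space is harmless when the text already ends in whitespace,
because a lone trailing space contains no full stop and so cannot match.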

Though IMHO it is not possible to match without an extra space in
your situation.
(That's why word processors *require* a whitespace character after
each punctuation mark for grammar and syntax checking.)

I think I understand your point:

If, say, someone types

"... congue hendrerit et nam sit.Magna ...."

by mistake, you want the regex to match two sentences instead of one
large one (note that sit.Magna has no space after the period).

Though I cannot see an easy way of handling that without relying on
the space in the regex.
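
For illustration, this is how the pattern above behaves on such input. It is
only a sketch assuming Python's re module, and the sample is a made-up
variant of the quoted text, with an invented ending so that it is terminated
by a full stop and a space.

```python
import re

# The pattern from the message, unchanged.
pattern = re.compile(r"([\s\S]+?)\.( |\r|\n)")

# Hypothetical sample: no space after "sit.", and an invented ending.
sample = "Congue hendrerit et nam sit.Magna aliqua. "

matches = [m.group(1) for m in pattern.finditer(sample)]

# The full stop in "sit.Magna" is not followed by whitespace, so the
# pattern skips it and returns one large sentence instead of two.
print(matches)
```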

HTH,
--
Volkan Ozcelik
+>Yep! I'm blogging! : http://www.volkanozcelik.com/volkanozcelik/blog/
+> My projects/studies/trials/errors : http://www.sarmal.com/


