[thelist] Ereg

Matt Warden mwarden at gmail.com
Wed May 17 19:54:13 CDT 2006


John Hicks wrote:
> Matt Warden wrote:
>>$subject = 'HI, my name is, "Warden, Matt"';
>>$foo = preg_replace('/"([^",\\\\]*),([^",]*)"/', '"${1}\,${2}"', $subject);
>>
> 
> 
> But that only transforms quoted strings containing only one comma.
> 
> What about quoted strings containing more than one comma?

Right you are.

The regex allows you to run multiple passes, but obviously that is not
ideal.

My regex also has a bug: it won't match this:

"hel\lo, my friend"

because the first character class excludes the backslash, so the match
fails at \.
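
For what it's worth, one way around both limitations might be to match
each quoted string as a whole and escape the commas inside it in a
callback. This is only a sketch: escape_quoted_commas() is a made-up
name, it assumes the quotes are balanced and that there are no escaped
quotes inside a quoted string, and the closure needs PHP 5.3 or later.

```php
<?php
// Hypothetical helper (not from the original thread): escape every
// comma that sits inside a double-quoted substring.
// Assumes balanced quotes and no escaped quotes (\") inside them.
function escape_quoted_commas($subject) {
    return preg_replace_callback(
        '/"[^"]*"/',                       // one whole quoted string
        function ($m) {
            // The lookbehind skips commas that are already escaped.
            // '\\\\,' is the replacement for a literal backslash + comma.
            return preg_replace('/(?<!\\\\),/', '\\\\,', $m[0]);
        },
        $subject
    );
}

echo escape_quoted_commas('HI, my name is, "Warden, Matt"'), "\n";
echo escape_quoted_commas('"a,b,c" and "hel\lo, my friend"'), "\n";
```

Commas outside the quotes are left alone, and a backslash elsewhere in
the quoted string no longer stops the match.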

Interesting problem. Maybe something like...

$count = 0;
$results = array();
$atoms = explode(',', $subject);
$total = count($atoms);
for ($i = 0; $i < $total; $i++) {
	$results[$count] = $atoms[$i];
	$quotepos = strpos($atoms[$i], '"');
	// if quote found but no second quote found...
	// (search from $quotepos + 1, otherwise strpos() just
	// finds the same quote again)
	if ($quotepos !== FALSE
		&& strpos($atoms[$i], '"', $quotepos + 1) === FALSE) {
		// then there was at least 1 split within an atom.
		// append the next atoms, moving the pointer along
		// with us, until the matching " is found.
		// the bound stops us running off the end of $atoms
		// if the closing quote is missing
		for ($i = $i + 1; $i < $total; $i++) {
			// repair the false split
			$results[$count] .= '\,' . $atoms[$i];
			// i would put this in the loop condition,
			// but i'm not sure whether the condition
			// is tested with type-equality or not
			if (strpos($atoms[$i], '"') !== FALSE) break;
		} // end inner for
	}
	$count++;
} // end outer for

echo implode(',', $results);

Not a regex, I know. And, again, not tested. But hopefully this helps
the OP.
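
On the type-equality question in the code comment: as far as I know, a
loop condition is evaluated as a plain boolean, so a bare strpos()
result of 0 (quote at the very start of the atom) would count as false.
A small illustration, with $atom as a made-up sample value:

```php
<?php
// Made-up sample atom: the quote is the first character.
$atom = '"Warden';
$pos = strpos($atom, '"');

var_dump($pos);            // int(0)
var_dump($pos == false);   // bool(true):  loose comparison treats 0 as false
var_dump($pos === false);  // bool(false): strict comparison tells them apart
```

So the test can go in the loop condition, as long as it is written out
with !== FALSE rather than relying on the bare return value.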

--
Matt Warden
Oxford, OH, USA
http://mattwarden.com


This email proudly and graciously contributes to entropy.


