[thelist] php pattern matching

Simon Ould evolt.org at isurus.com
Sun Feb 13 16:06:57 CST 2011


Hi Bob,

The examples below make the assumption that a "}" will never occur
within the structured data.

<?php
$raw_data = '{12345|tom smith|whatever title} oh what a beautiful morning, yadda yadda';

// Group 1: everything between the braces.
// Group 2: the descriptive text after the closing brace (\n and \r allowed).
// The U modifier makes the quantifiers ungreedy, so group 1 stops at the first "}".
$result = preg_match('~^\{([[:print:]]*)\}([[:print:]\n\r]*$)~U', $raw_data, $split_data);

if (!$result) {
    echo 'No match found';
} else {
    echo "<pre>\$split_data:\n" . print_r($split_data, true) . "</pre>\n";
}
?>

Output from the above script:
$split_data:
Array
(
   [0] => {12345|tom smith|whatever title} oh what a beautiful morning, yadda yadda
   [1] => 12345|tom smith|whatever title
   [2] =>  oh what a beautiful morning, yadda yadda
)

That is, $split_data[1] contains the structured information and
$split_data[2] contains the descriptive text (which may contain CR/LF
characters). Note that $split_data[2] keeps the space that follows the
closing brace.
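
A possible variation, offered as a sketch rather than a drop-in
replacement: as far as I know, [[:print:]] does not match tab characters,
and in the default "C" locale it does not match bytes outside ASCII
either, so text containing tabs or accented characters would fail to
match. A negated character class writes the "no } inside the braces"
assumption into the pattern itself. The function name split_record and
the array keys are made up for illustration, the ltrim() call is my own
addition, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not from the original examples): splits one record
// into its structured part and its descriptive text.
function split_record(string $raw): ?array
{
    // \{ and \}  literal braces
    // [^}]*      anything except "}", so the U modifier is not needed
    // (.*)       the rest; the s modifier lets "." match CR and LF as well
    if (!preg_match('~^\{([^}]*)\}(.*)$~s', $raw, $m)) {
        return null;
    }

    // ltrim() drops the whitespace that follows the closing brace.
    return ['structured' => $m[1], 'text' => ltrim($m[2])];
}

$parts = split_record("{12345|tom smith|whatever title} oh what a\tbeautiful\r\nmorning, café");
print_r($parts);
```

If the leading whitespace matters to you, drop the ltrim() call and the
text comes back exactly as it appeared after the closing brace.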



If you are also going to extract the individual parts of the structured
data, you can do that in the same call, for example:
<?php
$raw_data = '{12345|tom smith|whatever title} oh what a beautiful morning, yadda yadda';

// Group 1: the whole structured part; groups 2-4: its three pipe-separated fields.
// Group 5: the descriptive text after the closing brace (\n and \r allowed).
$result = preg_match('~^\{(([[:print:]]*)\|([[:print:]]*)\|([[:print:]]*))\}([[:print:]\n\r]*$)~U', $raw_data, $split_data);

if (!$result) {
    echo 'No match found';
} else {
    echo "<pre>\$split_data:\n" . print_r($split_data, true) . "</pre>\n";
}
?>

Output from the above script:
$split_data:
Array
(
   [0] => {12345|tom smith|whatever title} oh what a beautiful morning, yadda yadda
   [1] => 12345|tom smith|whatever title
   [2] => 12345
   [3] => tom smith
   [4] => whatever title
   [5] =>  oh what a beautiful morning, yadda yadda
)

That is:
$split_data[1] contains the complete structured information.
$split_data[2] contains the first part of the structured information.
$split_data[3] contains the second part of the structured information.
$split_data[4] contains the third part of the structured information.
$split_data[5] contains the descriptive text.
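
If the number of fields might vary, another option is to capture only
the whole structured string and split it afterwards. This is a rough
sketch that assumes a field never contains a literal "|". The variable
names $id, $name and $title are purely illustrative, and the short
destructuring syntax needs PHP 7.1 or later (older versions can use
list() instead).

```php
<?php
// $structured stands in for $split_data[1] from the examples above.
$structured = '12345|tom smith|whatever title';

// Split on the pipe and trim each piece in case of stray spaces.
$fields = array_map('trim', explode('|', $structured));

// Illustrative names only; the real meaning of each field is up to you.
[$id, $name, $title] = $fields;

echo "$id / $name / $title\n";
```

count($fields) tells you how many parts were found, which is handy if
you want to reject records that do not have exactly three.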


The descriptive text may contain CR/LF characters.
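
If all you need is the descriptive text (your question 2), stripping the
leading braces with preg_replace may be enough. This is only a sketch:
the sample input is made up, the line-ending normalisation is an
optional extra of mine, and it still assumes there is no "}" inside the
braces.

```php
<?php
// Made-up sample with a CR LF inside the descriptive text.
$raw_data = "{12345|tom smith|whatever title} first line\r\nsecond line";

// Remove a leading {...} block plus any whitespace that follows it.
// Without the m modifier, ^ only matches at the very start of the string.
$text = preg_replace('~^\{[^}]*\}\s*~', '', $raw_data);

// Optional: normalise CR LF and bare CR to LF.
$text = str_replace(["\r\n", "\r"], "\n", $text);

echo $text, "\n";
```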


HTH,

 Simon


> From: Bob Meetin <bobm at dottedi.biz>
...snip...
> {12345|tom smith|whatever title} oh what a beautiful morning, yadda yadda
>
> The curly brackets will be used to contain some particular information
> always the same structure. Anything external to whatever is contained
> within the curly brackets {} is descriptive text.  The descriptive text
> may be a long passage, many lines.
>
> What is the PHP syntax used to:
>
> 1) Extract anything between the curly brackets into a simple string?
> 2) Remove the part within the curly brackets leaving the descriptive text?
>
> -Thx, Bob
>
>

