[thelist] [fwd] Regex Revisited

Andrew Chadwick andrew.chadwick at prnewswire.co.uk
Thu Sep 27 06:24:18 CDT 2001


On Thu, Sep 27, 2001 at 06:52:23AM -0400, Adrian Fischer wrote:
> I'm trying to get the first 100 characters from a paragraph which I
> extract from a MySQL db.  I got it working with this bit of code:
> 
>   $$varDBInfo[2]=~/^(.{0,100})/;
>   $short_text=$1;
>   print "This the extract $short_text";
> 
> This is what I think it's saying: ^=start of string, (.)=any
> character except newline, {0,100}=between 0 and 100 of them.
> 
> The next hurdle seems to be if they format the text like I have in
> this email, [Hi gang,\n\n] with a greeting then a comma and a new
> line before the message actually starts.  How can I change the code
> to allow for all that if they should do it but still get the 100
> characters if they don't?  Are we talking if statements?  I'm in Perl
> by the way.
> 
> Do I need to add in a \n \r in there somewhere?
[...]

Assuming the \n for your system is the \n you have in your data[1],

  ($_) = $sth->fetchrow_array;   # or whatever you use to get stuff
  s/^[^\n,]+,[\t\040]*\n{2,}//;  # lop off what you said
  print;

should do the trick. No need to test anything first: s/// does nothing
if its pattern doesn't match.
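
To tie that back to the original 100-character question, here is a
minimal sketch that strips the greeting and then truncates. The sub
name short_text and the sample strings are made up for illustration.
It uses substr rather than /^(.{0,100})/ because "." stops at the
first newline unless you add /s, so the regex version would capture
only "Hi gang," from a message formatted that way.

```perl
use strict;
use warnings;

# Hypothetical helper: strip an optional greeting line, then keep at
# most $max characters of what remains.
sub short_text {
    my ($text, $max) = @_;
    $max = 100 unless defined $max;

    # Same substitution as above; a no-op when there is no greeting.
    $text =~ s/^[^\n,]+,[\t\040]*\n{2,}//;

    # substr does not care about newlines, unlike "." without /s.
    return substr($text, 0, $max);
}

my $with    = "Hi gang,\n\nI'm trying to get the first 100 characters.";
my $without = "I'm trying to get the first 100 characters.";

print short_text($with), "\n";      # greeting removed
print short_text($without), "\n";   # text passes through unchanged
```

Both calls print the same sentence, with or without the greeting.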

Ask yourself if that's what you really want, though. It could quite
easily do the wrong thing and strip any one-line first para that ends
in a comma and is separated from the rest of the text by one or more
blank lines (i.e. two or more newlines). The first line of this reply
is an example.
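
A rough illustration of that false positive, plus one possible way to
narrow the match. The stricter pattern is only an assumption about
what counts as a greeting (a line starting with hi, hello, hey or
dear); the sub names and sample text are invented for the example.

```perl
use strict;
use warnings;

# The substitution from above: strips ANY comma-terminated first line
# that is followed by a blank line.
sub strip_any {
    my ($text) = @_;
    $text =~ s/^[^\n,]+,[\t\040]*\n{2,}//;
    return $text;
}

# Hypothetical stricter variant: only strip lines that start with a
# greeting word. The word list is a guess, adjust to taste.
sub strip_greeting {
    my ($text) = @_;
    $text =~ s/^(?:hi|hello|hey|dear)\b[^\n,]*,[\t\040]*\n{2,}//i;
    return $text;
}

my $text = "Assuming your line endings match,\n\nthe rest of the message.";

print strip_any($text), "\n";        # first line is lost
print strip_greeting($text), "\n";   # text is left alone
```

The stricter version trades false positives for false negatives: a
greeting it doesn't know about ("Greetings all,") stays in the extract.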


[1] It's easy enough to massage it to the local end-of-line convention
    if it isn't.
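
    One way to do that massaging, as a sketch: turn CRLF and lone CR
    into "\n" before running the substitution. The sub name
    normalize_eol is made up, and this assumes the data only ever
    uses CRLF, CR or LF as line endings.

```perl
use strict;
use warnings;

# Hypothetical helper: convert CRLF (\015\012) and bare CR (\015)
# line endings to "\n".
sub normalize_eol {
    my ($text) = @_;
    $text =~ s/\015\012?/\n/g;
    return $text;
}

my $raw = "Hi gang,\r\n\r\nThe message starts here.";
my $fixed = normalize_eol($raw);

# Now the greeting pattern can see two consecutive newlines.
$fixed =~ s/^[^\n,]+,[\t\040]*\n{2,}//;
print $fixed, "\n";
```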

-- 
Andrew Chadwick, UNIX/Internet Programmer, PR Newswire Europe, Oxford
--
The views or opinions above are solely mine and are not necessarily those
of PR Newswire Europe. The message may contain privileged or confidential
information; if you are not a named recipient, notify me, and do not copy,
use, or disclose this message. <andrew.chadwick at prnewswire.co.uk>.



