[thelist] Perl string help

Bill Moseley moseley at hank.org
Wed Feb 14 09:30:22 CST 2007


On Wed, Feb 14, 2007 at 02:52:14PM +0000, Struan Donald wrote:
> FWIW Bill's regex solution is quite likely quicker and it's certainly
> a more Perlish solution. And if you can use the Text::Wrap solution
> then that's even better as Text::Wrap is much more likely to handle
> edge cases.

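More Perlish, but not faster at all. In the run below the plain loop is
about 3.7 times as fast as the regex and roughly 60 times as fast as
Text::Wrap: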

Text length is 26200
regex returns 108 lines
text_wrap returns 107 lines
loop returns 107 lines
Benchmark: timing 1000 iterations of loop, regex, text_wrap...
      loop:  1 wallclock secs ( 0.40 usr +  0.00 sys =  0.40 CPU) @ 2500.00/s (n=1000)
     regex:  1 wallclock secs ( 1.48 usr +  0.00 sys =  1.48 CPU) @ 675.68/s (n=1000)
 text_wrap: 25 wallclock secs (23.75 usr +  0.03 sys = 23.78 CPU) @ 42.05/s (n=1000)

But speed may not be that critical when processing a form posting.
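
If the relative rates are what matter, Benchmark's cmpthese() prints them as a comparison chart rather than raw timings. The following is only a minimal sketch: the sample text and the two stand-in subs (rindex_scan, regex_scan) are made up for illustration and are not the routines from this post.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample input, not the text used in this post.
my $text = 'some words to split at a space ' x 100;

# Two stand-in candidates (assumed names): each looks for the last
# space within the first 250 characters of the text.
my %candidates = (
    rindex_scan => sub { my $pos = rindex( substr( $text, 0, 250 ), ' ' ); },
    regex_scan  => sub { my ($head) = $text =~ /^(.{1,249})\s/; },
);

# A negative count means "run each candidate for at least that many
# CPU seconds". cmpthese prints the chart and also returns its rows.
my $rows = cmpthese( -1, \%candidates );
```

The chart lists each candidate's rate along with the percentage difference against every other candidate, which saves doing the division by hand.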


#!/usr/bin/perl
use strict;
use warnings;
use Text::Wrap;
use Data::Dumper;
use Benchmark;


my $t = 'Can anyone help me with a Perl string function?  I need to take the text from a textarea and split the string at the last blank space before 250 characters.  So if I have a 1000 character string, it would be cut into about 4 or 5 strings at spaces between words.';

my $piece_length = 250;


my $long_text = $t x 100;

print "Text length is ", length( $long_text ), "\n";

my %tests = (
    regex       => \&regex,
    text_wrap   => \&text_wrap,
    loop        => \&loop,
);

print "$_ returns ", scalar @{ $tests{$_}->()}, " lines\n"
    for keys %tests;


timethese( 1000, \%tests );


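sub regex {
    my $text = $long_text;
    my @lines;
    my $len = $piece_length - 1;

    # Repeatedly strip up to $len characters that are followed by whitespace.
    # /o compiles the pattern once, which is safe because $len never changes.
    push @lines, $1 while $text =~ s/(.{1,$len})\s+//o;

    # The pattern requires trailing whitespace, so when the input does not
    # end in whitespace the last word is never consumed above and lands here
    # as a piece of its own. That probably accounts for the extra line.
    push @lines, $text if length $text; # orphaned text

    return \@lines;
}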

sub text_wrap {
    my $text = $long_text;

    # Text::Wrap keeps every line to at most $columns - 1 characters.
    $Text::Wrap::columns = $piece_length;
    return [split /\n/, wrap('','', $text )];
}

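sub loop {
    my $text = $long_text;

    my @pieces;
    my $start = 0;

    while ( ( length($text) - $start ) > $piece_length ) {
        # Find the last space within the next $piece_length characters.
        my $split = rindex( substr( $text, $start, $piece_length ), ' ' );

        if ( $split < 0 ) {
            # No space in this window: rindex returned -1 and $start would
            # never advance, so cut at the limit instead.
            push @pieces, substr( $text, $start, $piece_length );
            $start += $piece_length;
            next;
        }

        push @pieces, substr($text, $start, $split);
        $start += $split + 1;   # skip the space itself
    }

    # Whatever is left fits within the limit.
    push @pieces, substr($text, $start);

    return \@pieces;
}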
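
As a side note on the regex approach: a global match in list context can collect the pieces without modifying the string, and allowing end-of-string as a terminator avoids the orphaned last word. This is only a sketch under assumptions. The name split_at_spaces and the sample input are hypothetical, $max is the longest piece allowed, and no single word is assumed to be longer than $max (the leading part of an overlong word would be skipped).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original benchmark script.
# Assumes no single word is longer than $max characters.
sub split_at_spaces {
    my ( $text, $max ) = @_;

    # Each piece is 1..$max characters and must be followed by either
    # whitespace or the end of the string. With /g in list context the
    # match returns every captured piece.
    my @pieces = $text =~ /(.{1,$max})(?:\s+|\z)/g;

    return \@pieces;
}

my $pieces = split_at_spaces( 'aaa bbb ccc', 7 );
print join( '|', @$pieces ), "\n";   # prints "aaa bbb|ccc"
```

Because the string is never rewritten, this form may also be cheaper than repeated substitution, but that is untested here and would need its own benchmark run.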







-- 
Bill Moseley
moseley at hank.org



