[thelist] Perl string help
Bill Moseley
moseley at hank.org
Wed Feb 14 09:30:22 CST 2007
On Wed, Feb 14, 2007 at 02:52:14PM +0000, Struan Donald wrote:
> FWIW Bill's regex solution is quite likely quicker and it's certainly
> a more Perlish solution. And if you can use the Text::Wrap solution
> then that's even better as Text::Wrap is much more likely to handle
> edge cases.
More Perlish, but not faster at all:
Text length is 26200
regex returns 108 lines
text_wrap returns 107 lines
loop returns 107 lines
Benchmark: timing 1000 iterations of loop, regex, text_wrap...
loop: 1 wallclock secs ( 0.40 usr + 0.00 sys = 0.40 CPU) @ 2500.00/s (n=1000)
regex: 1 wallclock secs ( 1.48 usr + 0.00 sys = 1.48 CPU) @ 675.68/s (n=1000)
text_wrap: 25 wallclock secs (23.75 usr + 0.03 sys = 23.78 CPU) @ 42.05/s (n=1000)
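To put those timings in proportion, here is a small sketch that divides each CPU time by the fastest one. The figures are copied from the output above; the hash names are only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# CPU seconds for 1000 iterations, copied from the benchmark output above.
my %cpu = ( loop => 0.40, regex => 1.48, text_wrap => 23.78 );

# Express each time as a multiple of the fastest entry (loop).
my %slowdown = map { $_ => $cpu{$_} / $cpu{loop} } keys %cpu;

# Print fastest first.
printf "%-10s %5.1fx\n", $_, $slowdown{$_}
    for sort { $cpu{$a} <=> $cpu{$b} } keys %cpu;
```

On this run the regex version costs roughly 3.7 times the loop, and Text::Wrap roughly 59 times.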
But speed may not be that critical when processing a form posting.
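About the line counts: regex returns one more line than the other two. As far as I can tell that is because the pattern in the script below insists on trailing whitespace, so the final chunk always gets split at its last space and the last word ends up on a line of its own. A possible variant, sketched here under that assumption, also lets a piece end at the end of the string. The name regex_split and its arguments are made up for the example and are not part of the benchmarked script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not from the benchmarked script: split $text into
# pieces shorter than $max characters, breaking at whitespace.
sub regex_split {
    my ( $text, $max ) = @_;
    my $len = $max - 1;
    my @lines;

    # A piece may end at whitespace OR at the end of the string, so a final
    # chunk that already fits is taken whole.
    push @lines, $1 while $text =~ s/(.{1,$len})(?:\s+|\z)//;

    # Leftover the pattern could not consume (normally nothing).
    push @lines, $text if length $text;
    return \@lines;
}

my $pieces = regex_split( 'aaa bbb', 8 );
print scalar(@$pieces), " piece(s): @$pieces\n";
```

With a limit of 8, the seven-character input comes back as a single piece, where the original pattern would have returned "aaa" and "bbb" separately. I have not benchmarked this variant.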
#!/usr/bin/perl
use strict;
use warnings;
use Text::Wrap;
use Data::Dumper;
use Benchmark;
my $t = 'Can anyone help me with a Perl string function? I need to take the text from a textarea and split the string at the last blank space before 250 characters. So if I have a 1000 character string, it would be cut into about 4 or 5 strings at spaces between words.';
my $piece_length = 250;
my $long_text = $t x 100;
print "Text length is ", length( $long_text ), "\n";
my %tests = (
    regex     => \&regex,
    text_wrap => \&text_wrap,
    loop      => \&loop,
);
print "$_ returns ", scalar @{ $tests{$_}->() }, " lines\n"
    for keys %tests;
timethese( 1000, \%tests );
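sub regex {
    my $text = $long_text;
    my @lines;
    my $len = $piece_length - 1;
    # Strip up to $len characters followed by whitespace off the string, repeatedly.
    push @lines, $1 while $text =~ s/(.{1,$len})\s+//o;
    push @lines, $text if length $text;  # orphaned text after the last whitespace
    return \@lines;
}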
sub text_wrap {
    my $text = $long_text;
    $Text::Wrap::columns = $piece_length;
    return [ split /\n/, wrap( '', '', $text ) ];
}
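sub loop {
    my $text = $long_text;
    my @pieces;
    my $start = 0;
    while ( ( length($text) - $start ) > $piece_length ) {
        # Last space inside the next $piece_length characters.
        # Assumes there is one; rindex returns -1 otherwise.
        my $split = rindex( substr( $text, $start, $piece_length ), ' ' );
        push @pieces, substr( $text, $start, $split );
        $start += $split + 1;  # skip the space itself
    }
    push @pieces, substr( $text, $start );  # whatever is left fits in one piece
    return \@pieces;
}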
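
One edge case the loop above does not cover: if a window of $piece_length characters contains no space at all, rindex returns -1, $start stops advancing and the while loop never ends. Here is a rough sketch of a guarded version. The hard break at the limit is my assumption (the original question does not say what should happen to an overlong word), and split_at_spaces is a made-up name that takes its inputs as arguments.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical variant of loop() that cannot spin forever.
sub split_at_spaces {
    my ( $text, $max ) = @_;
    my @pieces;
    my $start = 0;
    while ( length($text) - $start > $max ) {
        my $split = rindex( substr( $text, $start, $max ), ' ' );
        if ( $split < 0 ) {
            # No space in this window: fall back to a hard break at $max
            # characters (an assumption, see above).
            push @pieces, substr( $text, $start, $max );
            $start += $max;
            next;
        }
        push @pieces, substr( $text, $start, $split );
        $start += $split + 1;    # skip the space itself
    }
    push @pieces, substr( $text, $start );    # remainder fits in one piece
    return \@pieces;
}

print "$_\n" for @{ split_at_spaces( 'aaa bbb ccc abcdefghijkl', 8 ) };
```

Text that does contain spaces in every window is split exactly as before; only the no-space case changes.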
--
Bill Moseley
moseley at hank.org