[thelist] Re: Regex remove repeats
Chris Nicholls
chris at axe.dircon.co.uk
Wed Jan 26 08:59:50 CST 2005
>>I'm stuck on writing a regex that is going to look ahead to see if the
>>next item in the comma separated list is the same as the last and, if
>>it is, to remove it.
This isn't a regex, but one solution, simpler and more scalable,
depending on your scripting language, would be:
1. Split string into list
2. Create an empty "holding" array and an empty hash/struct.
3. Loop through the list and check whether each value exists as a key in
your new struct. If not, add the value to the holding array and set a
flag in the struct saying the value has been "seen".
4. Write the holding array, which now contains only unique values, back
out to a new list.
In Perl:
    my @new_list;   # will hold unique values
    my %hash;       # will hold flags marking "seen" values

    # your input list
    my $input = 'ac/dc,ac/dc,ac/dc,ac/dc,david , david bowie,dixie hicks,dixie';

    # split on comma and optional spaces before or after comma
    for my $key (split /\s*,\s*/, $input) {
        if (!$hash{$key}) {
            # add not-yet-encountered values to new list array
            push @new_list, $key;
            # set flag in hash to say we've now seen this value
            $hash{$key} = 1;
        }
    }

    # turn array into new list
    my $unique_list = join ',', @new_list;
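
If you want it shorter, the same "seen" hash idea can be folded into a
single grep. This is only a sketch using the same sample input; the
names %seen and @unique are mine, not anything from the original
question. Like the loop version, it drops every repeat in the list, not
just repeats that sit next to each other.

```perl
use strict;
use warnings;

# Same sample input as the loop version.
my $input = 'ac/dc,ac/dc,ac/dc,ac/dc,david , david bowie,dixie hicks,dixie';

my %seen;   # counts how many times each value has turned up so far

# grep keeps a value only if its count was still 0 (false) before the ++,
# i.e. only the first time that value is encountered.
my @unique = grep { !$seen{$_}++ } split /\s*,\s*/, $input;

my $unique_list = join ',', @unique;
print "$unique_list\n";   # prints: ac/dc,david,david bowie,dixie hicks,dixie
```

The order of first appearance is preserved, since grep walks the list
left to right.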
-Chris