[thelist] Regex remove repeats

Jason Handby jasonh at corestar.co.uk
Wed Jan 26 06:41:28 CST 2005


> Hi all
> 
> I've got a large amount of repeating data that I need to 
> strip down so that there's only one instance of each name.  
> For example, I've got:
> 
> ac/dc,ac/dc,ac/dc,ac/dc,david bowie,david bowie,dixie 
> chicks,dixie chicks,black sabbath,zz top,zz top,zz top,zz top
> 
> I'm stuck on writing a regex that is going to look ahead to 
> see if the next item in the comma separated list is the same 
> as the last and, if it is, to remove it.
> 
> Any help greatly appreciated.

How's this?

	s/([^,]*,)(?:\1+)/\1/g

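It requires that your string of data has a comma on the end, because each
name is matched together with the comma that follows it. Without one, the
final copy in a run at the very end of the string is left behind as a
duplicate. Hope that's not a problem.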

It works in the Regex Coach ( http://www.weitz.de/regex-coach/ ), and should
also work in ASP.NET, but I haven't tested it in PHP.
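
For anyone who wants to try the idea outside the Regex Coach, here is a rough
sketch in Python (not one of the environments mentioned above, so treat it as
an illustration only). The helper name collapse_repeats is made up for this
example. It adds the trailing comma for you and strips it again afterwards.
The ANCHORED pattern is my own variation, not part of the substitution above:
it only lets a match start at the beginning of an item, since the plain
pattern can start in the middle of one (for example "zz top,top," would lose
its "top").

```python
import re

# The pattern from the substitution above, unchanged.
ORIGINAL = re.compile(r'([^,]*,)(?:\1+)')

# Assumed variation: the negative lookbehind only allows a match to begin
# at the start of the string or directly after a comma.
ANCHORED = re.compile(r'(?<![^,])([^,]*,)\1+')


def collapse_repeats(data, pattern=ANCHORED):
    """Collapse runs of identical adjacent items in a comma separated string."""
    # Append the trailing comma the pattern relies on, then drop it again.
    collapsed = pattern.sub(r'\1', data + ',')
    return collapsed[:-1]


if __name__ == '__main__':
    data = ('ac/dc,ac/dc,ac/dc,ac/dc,david bowie,david bowie,'
            'dixie chicks,dixie chicks,black sabbath,'
            'zz top,zz top,zz top,zz top')
    print(collapse_repeats(data))
    # ac/dc,david bowie,dixie chicks,black sabbath,zz top
```

Like the substitution above, this only removes repeats that sit next to each
other, which is fine for the sample data because it is already grouped.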



Jason


