[thelist] converting whitespace to single space between printable characters

Chris at globet.com Chris at globet.com
Wed Dec 7 09:37:05 CST 2005


Alex

> I'm trying to reduce the whitespace between printable 
> characters to a single space.
> 
> What I want to do is get all of the words in a text so as 
> to make a histogram of word frequencies.
> 
> I'm doing this by exploding the string and using a space as the 
> delimiter. 

Caveat: I'm not a PHP programmer. Having said this, would it be easier
to use regular expressions to create a collection of words thus:

preg_match_all('/\w+/', $text, $matches);

After which you could analyse the words collected in $matches[0].
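
As a rough sketch of that idea, the snippet below collects the words and counts them. The sample text and the variable names $words and $histogram are made up for illustration, and lower-casing the words is an assumption about how the histogram should treat case.

```php
<?php
// Hypothetical sample text; any string would do.
$text = "The cat sat on the mat.\nThe dog sat too.";

// Collect every run of word characters; $matches[0] holds the full matches.
preg_match_all('/\w+/', $text, $matches);

// Assumption: "The" and "the" should be counted as the same word.
$words = array_map('strtolower', $matches[0]);

// Count how often each word occurs, then sort by descending frequency.
$histogram = array_count_values($words);
arsort($histogram);

print_r($histogram);
```

Note that \w also matches digits and underscores, and it splits words such as "don't" at the apostrophe, so the pattern may need adjusting for real text.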

> That only works for words that are separated by 
> spaces, so I find that I get words that have been yoked 
> together because they were separated by a return character.
> 
> How would I prepare the text before I use 'explode'?

If you must use this method, you can use regular expressions, replacing
strings conforming to the following pattern with a single space:

/[\n\r\t ]+/

To be on the safe side you could then replace strings conforming to the
following pattern, which matches any run of whitespace characters, with a
single space:

/\s+/
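
A minimal sketch of how this might look before calling explode, assuming preg_replace is used for the substitution. The sample string and the names $normalised and $words are hypothetical.

```php
<?php
// Hypothetical sample with mixed whitespace between the words.
$text = "one  two\tthree\r\nfour\n\n five ";

// Collapse every run of whitespace to one space, then trim both ends
// so that explode does not produce empty leading or trailing entries.
$normalised = trim(preg_replace('/\s+/', ' ', $text));

// A single space is now a reliable delimiter.
$words = explode(' ', $normalised);

print_r($words);
```

If the intermediate string is not needed, preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY) should give the same list of words in one step.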

HTH

Chris Marsh
Web Developer
http://www.globet.com/
Tel: +44 20 8246 4804 Ext 828
Fax: +44 20 8246 4808