[thelist] Search Function on Website

kasimir-k evolt at kasimir-k.fi
Tue May 16 05:36:55 CDT 2006


Justin Zachan wrote on 16/05/2006 4:56:
> Does anyone have suggestions on how best to deal with allowing the search
> function to allow for spelling mistakes???

There are various ways to determine the similarity of two given strings 
(which is what you want to do to allow for spelling mistakes).
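
As a quick illustration of "various ways", PHP itself ships a few string
comparison functions. The snippet below is only a sketch: the two words are
made up for the example, and which measure suits a given search depends on
the data.

```php
<?php
// Hypothetical example words: a misspelled search term and the indexed word.
$typed  = 'recieve';
$target = 'receive';

// Edit distance: minimum number of single-character insertions,
// deletions or substitutions needed to turn one string into the other.
$distance = levenshtein($typed, $target);

// similar_text() returns the number of matching characters and
// writes a similarity percentage (0 to 100) into its third argument.
$common = similar_text($typed, $target, $percent);

// Phonetic keys: words that sound alike tend to get the same key.
$sameSoundex   = soundex($typed) === soundex($target);
$sameMetaphone = metaphone($typed) === metaphone($target);

printf("levenshtein: %d\n", $distance);
printf("similar_text: %d chars, %.1f%%\n", $common, $percent);
printf("soundex equal: %s\n", $sameSoundex ? 'yes' : 'no');
printf("metaphone equal: %s\n", $sameMetaphone ? 'yes' : 'no');
```

A lower levenshtein() value means a closer match, while for similar_text()
a higher percentage does.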

Simon White has a nice introduction to the subject here:
http://www.catalysoft.com/articles/MatchingSimilarStrings.html
And he presents his own approach here:
http://www.catalysoft.com/articles/StrikeAMatch.html

Below is a PHP version of the algorithm:

/**
*	String similarity based on common character pairs
*	@param		string $str0
*	@param		string $str1
*	@return		float similarity value, from 0 to 1
*/
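function strSim($str0, $str1) {
    $pairs = array();
    // Build the list of adjacent character pairs for each of the two strings
    for ($p = 0; $p < 2; $p++) {
        $pairs[$p] = array();
        // Collapse runs of whitespace and pad with one space on each side,
        // so the first and last characters also form pairs
        $str = ' ' . trim(preg_replace('/\s+/', ' ', ${"str$p"})) . ' ';
        for ($i = 0, $ii = strlen($str) - 1; $i < $ii; $i++) {
            // Upper-case the pairs so the comparison is case-insensitive
            $pairs[$p][] = strtoupper(substr($str, $i, 2));
        }
    }
    $intersection = 0;
    // Total number of pairs in both strings (not a set union)
    $union = count($pairs[0]) + count($pairs[1]);
    for ($i = 0, $ii = count($pairs[0]); $i < $ii; $i++) {
        if (($key = array_search($pairs[0][$i], $pairs[1])) !== false) {
            $intersection++;
            // Remove the matched pair so it cannot be counted twice
            unset($pairs[1][$key]);
        }
    }
    return 2 * $intersection / $union;
}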
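
To tie this back to the search question, one possible way to use such a
function is to score every indexed term against the query and keep the ones
above some cut-off. This is only a sketch: rankMatches(), the term list and
the 0.3 threshold are made up for illustration, and the closure needs
PHP 5.3 or later. If strSim() is not defined, it falls back to the built-in
similar_text() so that the snippet still runs on its own.

```php
<?php
/**
 * Hypothetical helper: score each candidate against the query and
 * return those at or above $threshold, best match first.
 * $similarity is any callable taking two strings and returning 0..1.
 */
function rankMatches($query, array $candidates, $similarity, $threshold = 0.5) {
    $scored = array();
    foreach ($candidates as $candidate) {
        $score = call_user_func($similarity, $query, $candidate);
        if ($score >= $threshold) {
            $scored[$candidate] = $score;
        }
    }
    arsort($scored); // sort by score, highest first, keeping the keys
    return $scored;
}

// Use strSim() when it is available; otherwise fall back to similar_text().
$similarity = function_exists('strSim')
    ? 'strSim'
    : function ($a, $b) {
        similar_text(strtoupper($a), strtoupper($b), $percent);
        return $percent / 100;
    };

// Made-up index terms and a misspelled query.
$terms = array('search', 'research', 'server', 'sketch');
print_r(rankMatches('serach', $terms, $similarity, 0.3));
```

As a hand-worked check of the function above: strSim('France', 'French')
builds 7 pairs for each string, 3 of which are shared (" F", "FR", "NC"),
so the result is 2 * 3 / 14, roughly 0.43.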


hth,
.k


