[thelist] Search closest match across multiple columns (mysql)

Phil Turmel pturmel-webdev at turmel.org
Wed Apr 16 10:16:46 CDT 2008


Phil Turmel wrote:
> Dave,
> 
> Matt Warden wrote:
>> On Tue, Apr 15, 2008 at 5:49 PM, Dave Stevens
>> <evolt at davestevens.co.uk> wrote:
>>> So, for example, assuming my table contains this data:
>>>
>>> name                 height         weight         dateofbirth Joe
>>> Bloggs        60                50                1980-05-02 John
>>> Smith        70                65                1972-01-17 Jane
>>> Doe          45                20                 1987-12-14
>>>
>>> If my user entered:
>>>
>>> height: 50 weight: 35 date of birth: 1981-11-01
>>>
>>> My application would return Jane Doe first, then Joe Bloggs and
>>> finally John Smith.
>>
>> Why? We need to know how you determine that.
>>
>> Most likely, you are doing something like:
>>
>> select name from mytable order by (height-50)*weight1 +
>> (weight-35)*weight2 + (dob-'1981-11-01')*weight3 desc
>>
>> weight1, weight2, and weight3 determine how much height, weight, and 
>> dob matter in determining a match. This is something you have to 
>> determine. Do you have an idea of these?
>>
> 
> Matt's recommendation is clearly the *Right Answer* (tm).  If you're
> scratching your head as to what weightings to start with, I suggest you use
> the reciprocal of standard deviation as your weight.  This will give you a
> reasonably "natural" spread on each criteria.  You'll likely need to tweak
> the dateofbirth weighting some, as birth dates are not normally distributed
> in the general population.
> 
> HTH,
> 
> Phil
> 
> 

Hmmm.  Let me amend this....  You'll want absolute values on each component,
and an ascending index.

Something like this:

SELECT name
  FROM  mytable
  ORDER BY ABS(height-50)*W1 + ABS(weight-35)*W2 + ABS(dob-'1981-11-01')*W3
  LIMIT 10

You can get starting points for W1, W2, and W3 with:

SELECT 1/STD(height) AS W1,
  1/STD(weight) AS W2,
  1/STD(dob-'2000-01-01') AS W3

HTH,

Phil

-- 
Need to contact me offlist? 
  Drop -webdev or you probably won't get through.



More information about the thelist mailing list