[thelist] Search closest match across multiple columns (mysql)

Phil Turmel pturmel-webdev at turmel.org
Wed Apr 16 10:00:58 CDT 2008


Dave,

Matt Warden wrote:
> On Tue, Apr 15, 2008 at 5:49 PM, Dave Stevens
> <evolt at davestevens.co.uk> wrote:
>> So, for example, assuming my table contains this data:
>> 
>> name                 height         weight         dateofbirth Joe
>> Bloggs        60                50                1980-05-02 John
>> Smith        70                65                1972-01-17 Jane
>> Doe          45                20                 1987-12-14
>> 
>> If my user entered:
>> 
>> height: 50 weight: 35 date of birth: 1981-11-01
>> 
>> My application would return Jane Doe first, then Joe Bloggs and
>> finally John Smith.
> 
> Why? We need to know how you determine that.
> 
> Most likely, you are doing something like:
> 
> select name from mytable order by (height-50)*weight1 +
> (weight-35)*weight2 + (dob-'1981-11-01')*weight3 desc
> 
> weight1, weight2, and weight3 determine how much height, weight, and 
> dob matter in determining a match. This is something you have to 
> determine. Do you have an idea of these?
> 

Matt's recommendation is clearly the *Right Answer* (tm).  If you're
scratching your head as to what weightings to start with, I suggest you use
the reciprocal of standard deviation as your weight.  This will give you a
reasonably "natural" spread on each criteria.  You'll likely need to tweak
the dateofbirth weighting some, as birth dates are not normally distributed
in the general population.

HTH,

Phil


-- 
Need to contact me offlist? 
  Drop -webdev or you probably won't get through.



More information about the thelist mailing list