[thelist] Search closest match across multiple columns (mysql)

Phil Turmel pturmel-webdev at turmel.org
Wed Apr 16 10:00:58 CDT 2008


Matt Warden wrote:
> On Tue, Apr 15, 2008 at 5:49 PM, Dave Stevens
> <evolt at davestevens.co.uk> wrote:
>> So, for example, assuming my table contains this data:
>> name                 height         weight         dateofbirth Joe
>> Bloggs        60                50                1980-05-02 John
>> Smith        70                65                1972-01-17 Jane
>> Doe          45                20                 1987-12-14
>> If my user entered:
>> height: 50 weight: 35 date of birth: 1981-11-01
>> My application would return Jane Doe first, then Joe Bloggs and
>> finally John Smith.
> Why? We need to know how you determine that.
> Most likely, you are doing something like:
> select name from mytable order by (height-50)*weight1 +
> (weight-35)*weight2 + (dob-'1981-11-01')*weight3 desc
> weight1, weight2, and weight3 determine how much height, weight, and 
> dob matter in determining a match. This is something you have to 
> determine. Do you have an idea of these?

Matt's recommendation is clearly the *Right Answer* (tm).  If you're
scratching your head as to what weightings to start with, I suggest you use
the reciprocal of standard deviation as your weight.  This will give you a
reasonably "natural" spread on each criteria.  You'll likely need to tweak
the dateofbirth weighting some, as birth dates are not normally distributed
in the general population.



Need to contact me offlist? 
  Drop -webdev or you probably won't get through.

More information about the thelist mailing list