[thelist] SQL Server 2008 - How to get the 95% percentile?

Luther, Ron Ron.Luther at hp.com
Tue Jun 14 16:49:39 CDT 2011


Anthony Baratta led us on a nice review of "trim mean"

>>but the idea is to remove 5% of the total "population" to get rid of "outliers". e.g. 


Hi Anthony!

Oh yeah.  I remember that.  We used to call that kind of statistic a "trim mean".  [Back when I knew something about statistics] ... and you are exactly right, it is used to get rid of outliers.  It lets you throw a Bill Gates out of your random sample when you're trying to estimate something like 'average net worth'.

Or judge a figure skating competition by throwing out the high and low values.

>>The quick and dirty is to remove the "highest" 5%. But obviously you could go the other way and remove the bottom 5%. What I don't remember is, can you go both ways by 2.5%?

Yep.  Exactly so.  In a 5% "trim mean" you remove the 2.5% highest values and the 2.5% lowest values.  
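
For anyone who wants to see the arithmetic, here is a minimal Python sketch of that two-sided trim. The function name trim_mean, the total_percent parameter, and the sample numbers are all made up for illustration; rounding the per-tail count down is just one reasonable choice.

```python
def trim_mean(values, total_percent=5):
    """Mean after dropping total_percent/2 percent of the values from each end."""
    ordered = sorted(values)
    # Values to drop from EACH tail, rounded down.
    # Integer arithmetic avoids floating-point surprises (5% total -> 2.5% per tail).
    k = len(ordered) * total_percent // 200
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Hypothetical sample: 39 ordinary net worths plus one "Bill Gates".
sample = [50_000] * 39 + [50_000_000_000]

plain_mean = sum(sample) / len(sample)
trimmed = trim_mean(sample)  # drops one value from each end of the 40
```

With these made-up numbers the plain mean comes out above a billion, while the trimmed mean is back at 50,000, which is the whole point of throwing out the outlier.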



The percentile, on the other hand, is a different color of stripy horsed animal!  It's the "I scored in the 95th percentile, so my score is in the top 5% in the country on this exam!"  Or the "I'm paying a bundle for car insurance because my driving record puts me in the 90th percentile for 'high risk'".
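
To make the contrast concrete, here is a rough Python sketch of a percentile using the "nearest rank" definition. That is only one of several definitions in use (others interpolate between neighbouring values), and the function name and test scores are hypothetical.

```python
def percentile_nearest_rank(values, p):
    """Smallest value such that at least p percent of the data is <= it.

    p is a whole-number percentage from 1 to 100 (an assumption of this sketch).
    """
    ordered = sorted(values)
    n = len(ordered)
    # Ceiling of p * n / 100, done in integer arithmetic.
    rank = (p * n + 99) // 100
    return ordered[max(rank, 1) - 1]


# Hypothetical exam scores: 1, 2, ..., 100.
scores = list(range(1, 101))
cutoff = percentile_nearest_rank(scores, 95)
```

Note that nothing gets thrown away and nothing gets averaged here. The result is a single cutoff value taken from the data, which is why it answers a different question than the trim mean does.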

Cheers!
RonL.

IIRC, (and it's been a VERY long time for me as well),  I think there was a similar technique where you ranked things by the absolute value of the deviation from the mean.  Then trimmed.  Then recomputed the mean.  Don't remember what it was called.  {It's not a true "trim mean" because in a 5% case you might end up trimming 4% from the 'high' end and only 1% from the 'low'.  It concentrated on removing outliers first.}  But it's been a very long while so I may be misremembering.
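
As best I can reconstruct that half-remembered technique, it might look like the Python sketch below. This is a guess at the idea, not a named or standard method; the function name and the data are hypothetical.

```python
def mean_after_deviation_trim(values, percent=5):
    """Drop the percent of values farthest from the mean, then recompute the mean."""
    center = sum(values) / len(values)
    # Rank by absolute deviation from the original mean, closest first.
    by_deviation = sorted(values, key=lambda v: abs(v - center))
    # Number of values to drop in total, rounded down.
    k = len(values) * percent // 100
    kept = by_deviation[:len(by_deviation) - k]
    return sum(kept) / len(kept)


# Hypothetical data: 38 ordinary values and two large outliers.
data = [10] * 38 + [1000, 2000]
result = mean_after_deviation_trim(data)  # drops both 1000 and 2000
```

Here both dropped values come from the 'high' end, because those are the two farthest from the mean. A symmetric 5% trim of the same 40 values would instead drop one 10 and the 2000, leaving the 1000 in.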



