[thelist] SQL Server 2008 - How to get the 95% percentile?
Luther, Ron
Ron.Luther at hp.com
Tue Jun 14 16:49:39 CDT 2011
Anthony Baratta led us on a nice review of "trim mean"
>>but the idea is to remove 5% of the total "population" to get rid of "outliers". e.g.
Hi Anthony!
Oh yeah. I remember that. We used to call that kind of statistic a "trim mean". [Back when I knew something about statistics] ... and you are exactly right, it is used to get rid of outliers. It lets you throw a Bill Gates out of your random sample when you're trying to estimate something like 'average net worth'.
Or judge a figure skating competition by throwing out the high and low values.
>>The quick and dirty is to remove the "highest" 5%. But obviously you could go the other way and remove the bottom 5%. What I don't remember is, can you go both ways by 2.5%?
Yep. Exactly so. In a 5% "trim mean" you remove the 2.5% highest values and the 2.5% lowest values.
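A minimal Python sketch of that symmetric trim, for illustration only. The function name, the `total_percent` parameter, and the sample data are all made up here. It follows the convention in this thread, where the percentage is the total amount removed and is split evenly between the two ends; some texts instead quote the percentage removed from each end.

```python
def trimmed_mean(values, total_percent=5):
    """Mean after dropping total_percent/2 percent of the values from each end."""
    ordered = sorted(values)
    # Number of values to drop from each tail, rounded down.
    k = int(len(ordered) * total_percent / 200)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical sample: 39 ordinary net worths plus one Bill Gates.
sample = [50_000 + 1_000 * i for i in range(39)] + [50_000_000_000]

print(sum(sample) / len(sample))   # plain mean, dominated by the one outlier
print(trimmed_mean(sample, 5))     # drops one value from each end of the 40
```

With 40 values, a 5% trim drops exactly one value from the top and one from the bottom, so the single huge value no longer drives the result.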
The percentile, on the other hand, is a different color of stripy horsed animal! It's the "I scored in the 95th percentile, so my score is in the top 5% in the country on this exam!" Or the "I'm paying a bundle for car insurance because my driving record puts me in the 90th percentile for 'high risk'".
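To make the contrast concrete, here is a small Python sketch of the percentile idea. There are several competing percentile definitions; this one assumes the simple "nearest rank" rule, and the function names and exam scores are hypothetical.

```python
import math

def percentile_rank(scores, score):
    """Percent of scores that are less than or equal to `score`."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100.0 * at_or_below / len(scores)

def nearest_rank_percentile(scores, p):
    """Smallest score such that at least p percent of the scores are <= it."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

exam_scores = list(range(1, 101))  # hypothetical scores 1..100

print(nearest_rank_percentile(exam_scores, 95))  # the score at the 95th percentile
print(percentile_rank(exam_scores, 95))          # where a score of 95 ranks
```

Nothing is thrown away here: a percentile locates a value within the sorted data, while a trimmed mean discards the tails and averages what is left.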
Cheers!
RonL.
IIRC, (and it's been a VERY long time for me as well), I think there was a similar technique where you ranked things by the absolute value of the deviation from the mean. Then trimmed. Then recomputed the mean. Don't remember what it was called. {It's not a true "trim mean" because in a 5% case you might end up trimming 4% from the 'high' end and only 1% from the 'low'. It concentrated on removing outliers first.} But it's been a very long while so I may be misremembering.
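One possible reading of that half-remembered technique, sketched in Python. This is a guess at the procedure from the description above, not a named or standard method: rank values by absolute deviation from the original mean, drop the most distant ones, then recompute the mean. The function name and sample data are invented for illustration.

```python
def deviation_trimmed_mean(values, total_percent=5):
    """Drop the total_percent of values farthest from the mean, then re-average."""
    mean = sum(values) / len(values)
    # Rank by distance from the original mean, closest first.
    by_deviation = sorted(values, key=lambda v: abs(v - mean))
    # Number of values to drop in total, rounded down.
    k = int(len(values) * total_percent / 100)
    kept = by_deviation[:len(by_deviation) - k]
    return sum(kept) / len(kept)

# Hypothetical sample: 18 ordinary values and two large outliers.
sample = list(range(10, 28)) + [500, 900]

print(deviation_trimmed_mean(sample, 10))  # both dropped values come from the high end
```

In this sample a 10% trim removes two values, and both are high-end outliers, because they sit farthest from the mean. A symmetric trim would have removed one value from each end and kept the 500.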