I'm no statistician, but I do load test a lot of systems and report upon them.
It's common practice to report on a particular measure (response times, throughput, etc.) using the average of the data at or below the 90th percentile, i.e. throwing out the slowest 10%. This is done to remove the outliers: the two-day response time for a call that normally takes 100ms.
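As a rough sketch of what I mean, here are both reporting styles in Python. The function name and the NumPy usage are just my own illustration, not something out of any particular load testing tool:

```python
import numpy as np

def report(response_ms):
    """Summarize a list of response times (in ms) two common ways."""
    data = np.sort(np.asarray(response_ms, dtype=float))

    # The 90th percentile itself: 90% of the calls were at least this fast.
    p90 = np.percentile(data, 90)

    # The average after throwing out the slowest 10% of samples.
    keep = int(len(data) * 0.9)
    avg_fastest_90 = data[:keep].mean()

    return p90, avg_fastest_90
```

With nine samples around 100ms and one two-day outlier, the plain mean gets blown out by the outlier, while the average of the fastest 90% stays right around 100ms.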
I got to thinking: if the slowest 10% is obviously wrong, why isn't the fastest 10%? Throwing out only the slow outliers seems like cheating.
I think a good compromise between simplicity and accuracy would be to throw out the slowest 5% and the fastest 5%.
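The both-ends version is just as easy to compute. Again, a sketch of my own, nothing official:

```python
import numpy as np

def symmetric_trimmed_mean(response_ms, cut=0.05):
    """Drop the slowest and fastest `cut` fraction of samples, then average the rest."""
    data = np.sort(np.asarray(response_ms, dtype=float))
    k = int(len(data) * cut)   # number of samples to drop from each end
    return data[k:len(data) - k].mean()
```

For what it's worth, this is what statisticians call a trimmed mean, and scipy.stats.trim_mean(data, 0.05) does the same job if you have SciPy handy.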
A statistician would know better, I'm sure. But explaining the 90th percentile to your boss/customer is generally hard enough.