Wilson Confidence Interval for 5 Star Rating

The Wilson confidence interval takes binary votes as input: TRUE or FALSE, i.e. "upvotes" and "downvotes" respectively. From these votes it generates a rating.

For my project, I think the WCI is great. However, a binary upvote/downvote is not enough to describe what I am evaluating.

So I want to use a 5-star rating, and this is where I need someone to refute my logic. I think that if I were to implement a 5-star rating using the WCI, the following should work without breaking the internals of the confidence interval.

For each star in the rating widget, we assign a unique integer value. Each value is considered either positive (upvotes) or negative (downvotes). Thus, the following values:

1/5 stars: -2
2/5 stars: -1
3/5 stars: 1
4/5 stars: 2
5/5 stars: 3

To summarize the above values: a minimum vote of 1 star counts as 2 downvotes. A 2-star vote counts as 1 downvote. For a middle vote of 3 stars we give 1 upvote. For 4 stars we give 2 upvotes. And for the maximum of 5 stars we give 3 upvotes.

Please refute this logic: why will it not work? Maybe it contradicts the average person's understanding of a star rating system?

+7
algorithm rating
3 answers

First, try to understand the intuition behind the WCI or, even simpler, the normal approximation interval ( http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval ).

The intuition behind all of these interval computations is simple. You calculate the sample mean and its standard deviation (the standard error). The interval is mean ± z * std.

In your case, calculating the mean is simple: it is the average rating. Suppose p1 is the fraction of 1-star ratings, and likewise p2, ..., p5, so that p1 + p2 + ... + p5 = 1. Suppose also that you compute these statistics from n samples. The mean of your data is 1 * p1 + 2 * p2 + ... + 5 * p5.

The variance of that mean is (E(x^2) - (E(x))^2) / n = ((p1 * 1^2 + p2 * 2^2 + ... + p5 * 5^2) - (1 * p1 + 2 * p2 + ... + 5 * p5)^2) / n

Since std = sqrt(var), this is enough to calculate the normal approximation interval. I will let you work on extending this to the WCI.
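The mean and variance formulas above can be sketched in Python as follows. This is only a minimal illustration: the function name `normal_interval`, the `counts` layout (index 0 holds the number of 1-star ratings) and the default z = 1.96 (roughly a 95% interval) are my own assumptions, not part of the original answer.

```python
import math

def normal_interval(counts, z=1.96):
    """Normal approximation interval for the mean star rating.

    counts[i] is the number of (i + 1)-star ratings (hypothetical layout).
    """
    n = sum(counts)
    if n == 0:
        raise ValueError("need at least one rating")
    probs = [c / n for c in counts]                      # p1 .. p5
    mean = sum((i + 1) * p for i, p in enumerate(probs))
    second_moment = sum((i + 1) ** 2 * p for i, p in enumerate(probs))
    var_of_mean = (second_moment - mean ** 2) / n        # (E(x^2) - E(x)^2) / n
    half_width = z * math.sqrt(max(var_of_mean, 0.0))    # guard against tiny negative rounding
    return mean - half_width, mean + half_width

low, high = normal_interval([2, 3, 10, 25, 60])
print(f"mean rating lies roughly in [{low:.3f}, {high:.3f}]")
```

Sorting items by the lower end of this interval penalizes items that have few ratings, which is the same idea the Wilson lower bound applies to binary votes.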

+2

It’s easy to think of the following “workaround,” which converts a multi-level rating system into a binary up/down rating (which can then be scored using the lower bound of the Wilson score confidence interval):

Say you have a popular 5-star rating system. Thus, we have several votes, each of which has a value: 1, 2, 3, 4 or 5.

To "convert" these ratings to up/down votes, use the following rule:

For each star rating, add:
*     : 0.00 to up votes and 1.00 to down votes (i.e. a full down vote)
**    : 0.25 to up votes and 0.75 to down votes
***   : 0.50 to up votes and 0.50 to down votes
****  : 0.75 to up votes and 0.25 to down votes
***** : 1.00 to up votes and 0.00 to down votes (i.e. a full up vote)

After converting the 5-star ratings to up and down votes, we can proceed with the usual score calculation described in Evan Miller's article.
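As a rough sketch of the conversion above, here is how it might look in Python. The names `UP_FRACTION`, `wilson_lower_bound` and `star_score` are hypothetical, and z = 1.96 is an assumed default. Note that the Wilson formula is derived for binary outcomes, so feeding it fractional votes is an approximation.

```python
import math

# Fraction of an up vote contributed by each star value, per the rule above;
# the remainder (1 - fraction) goes to down votes.
UP_FRACTION = {1: 0.00, 2: 0.25, 3: 0.50, 4: 0.75, 5: 1.00}

def wilson_lower_bound(up, total, z=1.96):
    """Lower bound of the Wilson score interval for the proportion up / total."""
    if total == 0:
        return 0.0
    p = up / total
    z2 = z * z
    centre = p + z2 / (2 * total)
    margin = z * math.sqrt((p * (1 - p) + z2 / (4 * total)) / total)
    return (centre - margin) / (1 + z2 / total)

def star_score(ratings, z=1.96):
    """Score a list of 1..5 star ratings via fractional up votes."""
    up = sum(UP_FRACTION[r] for r in ratings)
    return wilson_lower_bound(up, len(ratings), z)

print(star_score([5, 5, 4, 3, 5, 1]))
```

With this mapping, the same all-5-star record scores higher with 100 ratings than with 10, because the interval narrows as votes accumulate.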

I am not a statistician or a mathematician, so I would like to hear from other people whether this makes sense and what the problems with this approach could be.

+2

The biggest problem with this scheme is that one 5-star rating weighs as much as three 3-star ratings. As a result, an item with 300 3-star ratings (which should be a mediocre score) will have the same score as an item with 100 5-star ratings (which should be a great score).
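The collision is easy to check with the question's integer mapping. In this small sketch, `STAR_TO_VOTES` and `to_up_down` are hypothetical names chosen for illustration. Both items end up with identical up/down counts, so any score computed from those counts, Wilson or otherwise, cannot tell them apart.

```python
# Mapping proposed in the question: negative values are downvotes,
# positive values are upvotes.
STAR_TO_VOTES = {1: -2, 2: -1, 3: 1, 4: 2, 5: 3}

def to_up_down(ratings):
    """Convert a list of 1..5 star ratings into (upvotes, downvotes)."""
    up = sum(STAR_TO_VOTES[r] for r in ratings if STAR_TO_VOTES[r] > 0)
    down = sum(-STAR_TO_VOTES[r] for r in ratings if STAR_TO_VOTES[r] < 0)
    return up, down

mediocre = to_up_down([3] * 300)   # 300 three-star ratings
great = to_up_down([5] * 100)      # 100 five-star ratings
print(mediocre, great, mediocre == great)
```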

What you can do instead is calculate a Wilson confidence interval for each possible score. The lower bound of each interval then serves as the weight of that score in a weighted average.
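The answer does not spell out the details, so the following is only one possible reading of it: for each star value, treat "this rating was given" as a success, take the Wilson lower bound of that proportion, and use those bounds as weights in the average. The function names and the z = 1.96 default are assumptions of mine.

```python
import math

def wilson_lower_bound(successes, total, z=1.96):
    """Lower bound of the Wilson score interval, clamped at 0."""
    if total == 0:
        return 0.0
    p = successes / total
    z2 = z * z
    centre = p + z2 / (2 * total)
    margin = z * math.sqrt((p * (1 - p) + z2 / (4 * total)) / total)
    return max(0.0, (centre - margin) / (1 + z2 / total))

def weighted_star_average(counts, z=1.96):
    """counts[i] is the number of (i + 1)-star ratings (hypothetical layout)."""
    n = sum(counts)
    # One Wilson lower bound per star value, used as that value's weight.
    weights = [wilson_lower_bound(c, n, z) for c in counts]
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0
    return sum((i + 1) * w for i, w in enumerate(weights)) / total_weight

print(weighted_star_average([10, 0, 0, 0, 90]))
```

Under this reading, star values backed by only a handful of votes get a small weight, so a few stray ratings move the average less than they would in a plain mean.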

+1