This depends rather on what the purpose of scoring is. If it is to filter one's reading, there is an inherent contradiction:
* the reader wants some filtering, so they can just read the good stuff
* the filtering is done by the readers, which requires they read more than just the good stuff
How effective can this ever be? It seems to be a weakness in all public-contribution-based systems (including Google search to some extent, because of PageRank...).
(I posted this thought somewhere before, but I don't know whether it wasn't thought any good or whether hardly anyone read it.)
I like this trivial way to compute scores for similar applications:
score = upvotes*(upvotes/(upvotes+downvotes))
Edit: some example output follows.
10 up, 10 down = 5.0
50 up, 50 down = 25.0
100 up, 0 down = 100.0
70 up, 30 down = 49.0
2 up, 1 down = 1.3333
10 up, 20 down = 3.3333
This removes some of the bias due to time, but keeps the idea that a lot of votes is still a hint of interest. Ah, and the math is trivial ;)
Note that this is just the vanilla formula; you can alter the weight of the different parameters by changing the math a bit.
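For anyone who wants to play with it, here is a minimal Python sketch of the vanilla formula. The function name `simple_score` and the choice to return 0 for an item with no votes are my own assumptions; the formula as posted would divide by zero in that case.

```python
def simple_score(upvotes: int, downvotes: int) -> float:
    """Upvotes weighted by the fraction of all votes that were upvotes."""
    total = upvotes + downvotes
    if total == 0:
        # Assumption: an item with no votes scores 0
        # (the formula as posted is undefined here).
        return 0.0
    return upvotes * (upvotes / total)


# The same (up, down) pairs as in the example output.
for up, down in [(10, 10), (50, 50), (100, 0), (70, 30), (2, 1), (10, 20)]:
    print(f"{up} up, {down} down = {simple_score(up, down):.4f}")
```

Changing the weights then comes down to editing the single `return` line.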
Needless to say, this is only the first step toward the actual sorting. You may not want to order by score but by rank, where rank is something like this (if the items you are sorting must be fresh):
up    down   score
100   100    50
20    0      20
100   200    33
100   300    25
100   400    20
Wouldn't you want a comment with 20 upvotes and no downvotes to rank above a comment with 100 upvotes and 300 downvotes? It gets even worse for higher numbers:
You can change this factor by modifying the equation a bit. In the vanilla version the number of votes does play a very important role, but it's simple to hack it to avoid this problem, for instance by using a logarithm instead of making everything linear like I did.
This still doesn't solve the problem of "newer" comments staying at the bottom of the list and never getting the chance to gain traction, no matter how good they are.
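One possible reading of the logarithm idea, as a rough sketch only: damp the raw upvote count with a logarithm and keep the ratio linear. The name `log_score` and the use of `log1p` are assumptions for illustration, not the exact formula the poster had in mind.

```python
import math


def log_score(upvotes: int, downvotes: int) -> float:
    """Ratio of upvotes, scaled by a logarithmically damped upvote count."""
    total = upvotes + downvotes
    if total == 0:
        # Assumption: an item with no votes scores 0.
        return 0.0
    # log1p(x) = log(1 + x), so a single upvote still counts for something.
    return math.log1p(upvotes) * (upvotes / total)


# 20 up / 0 down now ranks above 100 up / 300 down.
print(log_score(20, 0) > log_score(100, 300))
```

Volume still matters here (100 up / 30 down beats 10 up / 3 down), but far less than in the linear version.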
Solving this is not hard: mix a time factor in. "hot" uses this approach, but it sucks because for something like comments you want to add a "barrier", that is, for the first 24 hours time will do its work, but it will affect the rating less as the whole discussion gets older.
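A rough sketch of what such a "barrier" could look like, purely as an illustration. The function `rank`, its parameters, and the linear fade of the time weight are all assumptions of mine; nothing here is the "hot" algorithm or a formula given in this thread.

```python
def rank(score: float, comment_age_hours: float, thread_age_hours: float,
         barrier_hours: float = 24.0) -> float:
    """Penalise older comments, but only while the thread itself is young."""
    # Weight of the time factor: 1.0 for a brand-new thread,
    # fading linearly to 0.0 once the thread is older than the barrier.
    time_weight = max(0.0, 1.0 - thread_age_hours / barrier_hours)
    # With time_weight == 0 the rank is simply the score.
    return score / (1.0 + time_weight * comment_age_hours)


# In a 2-hour-old thread, a fresh comment outranks an equally scored older one.
print(rank(10, 0.5, 2) > rank(10, 2, 2))
# In a 48-hour-old thread, only the score counts.
print(rank(10, 30, 48) == 10)
```

The `score` argument can come from whichever scoring formula you prefer; the barrier only decides how much age is allowed to matter.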
Won't modifying the rank by mixing in a time factor make comments with a good score move away from the top over time, even if they still have a good score? While the math is trivial with your suggestion, it tends to be biased in favor of the volume of upvotes (regardless of the upvote/downvote ratio), and then biased in favor of new comments (again, regardless of upvote/downvote relevance). While this will solve the sorting requirements of some applications, it doesn't really solve the problem most comment voting systems have, where volume of votes is rewarded over quality and/or newer comments are never positioned well enough to gain any traction over comments posted in the first hour. If you end up hacking your formulas a lot to fix these shortcomings, wouldn't it be just as easy to say the problem is not trivial and then leverage the complicated but statistically sound formula suggested by the author?
Yes, I think your point is correct, but I don't like the concept behind the new reddit formula, because I really think that 100 upvotes and 30 downvotes are really different from 10 upvotes and 3 downvotes. Also, in my opinion time should absolutely appear in the equation on a discussion site like reddit or HN, since new comments, if as good as old ones, should be able to go up. And this last trick is, in my opinion, the best way to avoid stagnation, where the same top comments get voted up again and again. I'll experiment a bit with this, since I'm very interested in these issues. I happen to run oknotiize.virgilio.it, the main social news site in Italy, and while we have voting on comments we currently use chronological sorting. The votes are just used to colour the background red or green. Now I want to add sorting, so I'll try to design an algorithm with the following goals:
a) time matters, so good old comments tend to go down, but this only happens in the first 24/48 hours; in the long run everything is ordered much more by score.
b) the score of a comment is roughly up/down, but it tends to get better with more votes.
c) still, much better up/down ratios tend to win against comments that merely have a lot of votes. For instance, 100 up and 50 down will win strongly against 10 up and 5 down, but will have more or less the same score as 15 up and 5 down.
Before you go reinventing the wheel, here are the results from running your examples through the formula suggested by Evan Miller in the link...
10 up 10 down 0.327403766068315
50 up 50 down 0.418847795168265
100 up 0 down 0.973657278603792
70 up 30 down 0.620167865061336
2 up 1 down 0.253533865071156
10 up 20 down 0.210836926288844
and a few more to see some other interesting cases...
100 up 30 down 0.703333001615963
10 up 3 down 0.541934244292211
30 up 100 down 0.175849385303401
1 up 0 down 0.269865944074627
1 up 1 down 0.120866317496523
2 up 0 down 0.425030603165412
100 up 0 down 0.973657278603792
100 up 100 down 0.442235043128361
200 up 0 down 0.986652839030471
As you can see, as more votes are cast, the confidence that this represents the ultimate ranking improves. With fewer votes, a high up/down ratio is good, but not as good as a high up/down ratio with more total votes (because there is more statistical confidence with more data points).
Finally, below are the test cases that broke your initial implementation.
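For reference, the figures listed above are consistent with the lower bound of a Wilson score interval at z ≈ 1.645 (a 90% two-sided interval). For example, 1 up and 0 down gives 1/(1 + z²) ≈ 0.2699. Here is a minimal Python sketch under that assumption; the function name and the `confidence` default are mine, inferred from the numbers rather than stated anywhere in the thread.

```python
from math import sqrt
from statistics import NormalDist


def wilson_lower_bound(upvotes: int, downvotes: int,
                       confidence: float = 0.90) -> float:
    """Lower bound of the Wilson score interval for the upvote fraction."""
    n = upvotes + downvotes
    if n == 0:
        # Assumption: an item with no votes scores 0.
        return 0.0
    # Two-sided z value; roughly 1.645 for confidence = 0.90.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    phat = upvotes / n  # observed fraction of upvotes
    centre = phat + z * z / (2 * n)
    margin = z * sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)
    return (centre - margin) / (1 + z * z / n)


for up, down in [(10, 10), (100, 0), (10, 3), (1, 0)]:
    print(f"{up} up {down} down {wilson_lower_bound(up, down):.6f}")
```

Raising `confidence` makes the bound more cautious, so items with few votes are pushed further down.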
That isn't a very good sorting measure. It simply means you were the first one to comment.
You can currently game Hacker News this way. If there are already comments in a topic, don't start a new thread; instead, reply to the top thread no matter what it says. You will get more replies and more votes.
But only for a lot of second-level children, not children of one reply. reddit's stupid pun threads usually have one reply after another as a staircase instead of many replies to one quality comment.
I always liked Gnus's way of scoring individual news articles by subject, author, etc. Having that would be really nice.
Just allow folks to score whatever/whoever they feel like, rather than foisting your version of 'goodness' on others. Make low-scoring articles fade into the background...
I hope this can lead to longer-lasting discussions. Sometimes I find a thread late and want to add something, but it seems pointless when the last comments were added more than a day ago. I think this site could be better if there were a way to keep threads active, and maybe have them summarized and condensed every so often.