How should entries be ranked?
category: parties [glöplog]
I hate being the guy who starts such a discussion again, but the elephant in the room was too big to ignore ...
Over in the Evoke 2019 thread, steam kindly provided the raw vote data for the PC 4k compo, which had really skewed results, with the obvious winner entry coming in 5th. This data confirmed what I always thought: When ranking compos, using the vote average instead of the sum would be much more fair. Let's sort the results by average:
Quote:
(columns: title, vote sum, vote count, average)
1. Eisenerz 538 117 4.5983
2. Glitch Rider 630 147 4.2857
3. 완노비 555 135 4.1111
4. One Way Trip 591 145 4.0759
5. Druckbestampfung 555 143 3.8811
That looks much more reasonable to me!
The problem with the current voting system is that due to how it works, the votes are implicitly weighted by how many people voted for that particular entry. As a result, if somebody votes for some, but not all entries in a compo, the unvoted entries get the worst possible result. Or, put differently, not voting for specific entries is not the neutral option, it's the worst option; literally "0 out of 5 stars".
I'm not at all comfortable with this. If I missed an entry during the compo or don't remember it while voting, I'd like to have a true neutral option, and that neutral option should also be the default for unvoted entries. You get both of these properties by keeping the voting system as it currently is, but using the vote average instead of the sum for the final ranking.
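To make the difference concrete, here is a minimal Python sketch that ranks the five quoted entries both ways. Only the numbers come from steam's data; the function names are made up for illustration, and the full compo may well have had more entries than these five.

```python
# (title, vote sum, vote count) as quoted above
entries = [
    ("Eisenerz", 538, 117),
    ("Glitch Rider", 630, 147),
    ("완노비", 555, 135),
    ("One Way Trip", 591, 145),
    ("Druckbestampfung", 555, 143),
]

def rank_by_sum(entries):
    # current behaviour: an unvoted entry effectively receives 0 from that voter
    return sorted(entries, key=lambda e: e[1], reverse=True)

def rank_by_average(entries):
    # proposed behaviour: only the votes actually cast influence the score
    return sorted(entries, key=lambda e: e[1] / e[2], reverse=True)

for label, ranking in (("sum", rank_by_sum(entries)),
                       ("average", rank_by_average(entries))):
    print(f"ranked by {label}:")
    for place, (title, total, count) in enumerate(ranking, start=1):
        print(f"  {place}. {title} {total} {count} {total / count:.4f}")
```

Among these five, ranking by sum puts Eisenerz last, while ranking by average puts it first.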
I initially wanted to write this post in the Evoke thread, but then it would have looked like criticism of Evoke in particular, which it totally isn't. It's a broader issue that affects all parties that use the default Partymeister voting system. I'm also not picking on Partymeister - I don't know who defined the current voting system and why, but they certainly had their reasons to design it as it is. I'd love to hear about them, and humbly suggest a simple change that I see as a clear improvement.
I'm afraid you're trying to find a technical solution to a social problem.
I only voted for some demos this time: the ones I remembered. So it seems now that this was actually the wrong thing to do.
X already uses the average score. But then again, there's no perfect solution if people just don't vote.
ftp://ftp.scs-trc.net/pub/c64/Party/2018/X/X18_Results.txt
Quote:
I'm afraid you're trying to find a technical solution to a social problem.
Read again; this is not about protection against namevoting or similar abuse of the voting system. It's just about improving the outcome for people who vote consciously.
The X way looks good at first glance, provided at least some people vote. But with too few voters, nothing works.
I wholeheartedly agree with KeyJ. Result calculations based on vote average would make much more sense.
Average doesn't work if there are a lot of entries, and people don't vote for the shitty ones.
Eg: if the 27th worst intro gets 5 stars from its creator (and no one else!), then it can easily win a compo.
So, instead of rating, it is better to vote for the order, because this way you rate _all_ the prods.
Quote:
Average doesn't work if there are a lot of entries, and people don't vote for the shitty ones.
Eg: if the 27th worst intro gets 5 stars from its creator (and no one else!), then it can easily win a compo.
Pretty much this.
Quote:
So, instead of rating, it is better to vote for the order, because this way you rate _all_ the prods.
Don't agree with this. Voting for the order is harder cognitively, especially for compos with a lot of entries. Doing a fair comparison with "the last ten entries" is really hard. So you'll most likely end up ranking the stuff you like the most, and then just randomly assign the rest of the entries.
It's way easier to assign a score (let's say 1-5) to the entries as you see them than trying to maintain a constantly updated ranked ordering in your head through a compo.
One thing to consider: Which option works better when there are few votes? When there are lots of voters, both methods will converge obviously, but what about small parties? With the limited 1-5 star system, will there potentially be several entries with an average of 5 points simply because everyone liked them and couldn't decide?
i also think that it is "weird" to select 0-5 points during livevoting without knowing what else is to be played in the compo. i never quite got whether my "5" means "awesome shit, should get a meteoriks" or "given this competition, was one of the best". i also think people handle this differently.
an average is always a problem if you have few votes. i could win the compo by giving myself the only vote (a 5, obviously) and no one else voting. so then you'd have to do stuff like: needs at least 10% of possible votes (number of registered votekeys). but apart from the 10% being completely arbitrary, that actually might pose a problem with smaller compos or parties, where 10% again might be (less than) one person.
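A minimal Python sketch of such a quorum rule, purely as an illustration. The function name, the data and the choice to push entries below the quorum to the end (ordered by plain sum) are all assumptions, not how any party system actually works.

```python
import math

def rank_by_average_with_quorum(entries, votekeys, min_percent=10):
    """entries: list of (title, vote_sum, vote_count) tuples."""
    # Round up so the quorum is never "less than one person".
    quorum = max(1, math.ceil(votekeys * min_percent / 100))
    eligible = [e for e in entries if e[2] >= quorum]
    below = [e for e in entries if e[2] < quorum]
    ranked = sorted(eligible, key=lambda e: e[1] / e[2], reverse=True)
    # Assumption: entries that miss the quorum are appended, ordered by sum.
    ranked += sorted(below, key=lambda e: e[1], reverse=True)
    return quorum, ranked

# Hypothetical compo: the last entry has a single self-vote of 5.
entries = [
    ("crowd favourite", 600, 140),
    ("solid second", 520, 130),
    ("self-voted", 5, 1),
]

quorum, ranked = rank_by_average_with_quorum(entries, votekeys=150)
print(quorum, [title for title, _, _ in ranked])

# With only 8 votekeys, 10% rounds up to a single vote and the rule stops helping.
small_quorum, small_ranked = rank_by_average_with_quorum(entries, votekeys=8)
print(small_quorum, [title for title, _, _ in small_ranked])
```

With 150 votekeys the quorum is 15 votes and the self-voted entry drops to the end. With 8 votekeys the quorum collapses to one vote and the self-vote wins again, which is exactly the small-party problem described above.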
if it is just about deciding on the ranking in the compo, and not a general "this prod is the best ever" (which might be resolved on other occasions (pouet, jury-awards, fistfights, ...)), then i'd let the voter sort the entries by preference, something like this:
https://en.wikipedia.org/wiki/Instant-runoff_voting
this also does not solve people not voting at all and might be tedious to do, but at least you would have a ranking that most people (that voted) agree upon...
...i bet this needs more thought put into it.
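For reference, a rough Python sketch of instant-runoff counting, extended to produce a full ranking by eliminating entries until none are left. The ballots are hypothetical, and the alphabetical tie-break is an arbitrary assumption; a real implementation would need a properly agreed tie rule.

```python
from collections import Counter

def instant_runoff(ballots):
    """ballots: lists of entry names, most preferred first.
    Returns all entries, winner first (reverse elimination order)."""
    remaining = {name for ballot in ballots for name in ballot}
    eliminated = []
    while remaining:
        tally = Counter({name: 0 for name in remaining})
        for ballot in ballots:
            # Each ballot counts for its highest-ranked entry still in the race.
            for name in ballot:
                if name in remaining:
                    tally[name] += 1
                    break
        # Drop the entry with the fewest votes (ties: alphabetical, arbitrary).
        loser = min(remaining, key=lambda n: (tally[n], n))
        remaining.remove(loser)
        eliminated.append(loser)
    return eliminated[::-1]

# Hypothetical ballots from nine voters.
ballots = [["A", "B", "C"]] * 4 + [["B", "C", "A"]] * 3 + [["C", "B", "A"]] * 2
print(instant_runoff(ballots))
```

Here A leads on first preferences, but once C is eliminated its two ballots move to B, so B ends up first.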
When there are a lot of voters, I can't see any reason not to use the avg rating instead of the sum. And yes, 1-5 stars is too limited, I've said that for years in other places... ;)
What Gargaj and Pohar said.
Quote:
I'm afraid you're trying to find a technical solution to a social problem.
I wholeheartedly agree. The solution to a social problem (when it exists) tends to be social too.
Quote:
Voting for the order is harder cognitively, especially for compos with a lot of entries. Doing a fair comparison with "the last ten entries" is really hard. So you'll most likely end up ranking the stuff you like the most, and then just randomly assign the rest of the entries.
It's way easier to assign a score (let's say 1-5) to the entries as you see them than trying to maintain a constantly updated ranked ordering in your head through a compo.
I also agree with this. Moreover, I'd rather give five stars to my three favourite demos, four to the next two and three to the weaker ones than try to find a "proper" order.
I don't agree with what lug00ber said. From a UI perspective I'd say it's way more intuitive to just sort entries instead of choosing between 1-5. It also solves the problem of people only voting for some entries in a compo, as adjusting the order always affects all entries.
So +1 for sorting/ordering approach from my side.
I agree that "low vote count" situations can be very problematic for average-based ranking. The question is, does this happen in the real world?
I'd really love to run the numbers. Can some party orgas publish or send me the raw voting results, with vote sum and vote count fields per prod at least? Then we can simulate how average-based ranking would turn out in practice.
So if i got it all right, instead of namevoting LJ's prod I should now anti-namevote all others? No problem then \o/
as a simple solution, make the default for prods not voted for yet 2 or 3 stars instead of zero (or 5 stars on a 0 - 10 scale). or if you want to keep it at zero, add a negative vote option (that could be seen as hating though, so it's probably socially better to stay in the positive).
Or use the average as keyj suggested, but only on larger parties.
Why do small parties need voting anyway? Let the orgas or a jury decide the ranking - at least you know who to blame if it doesn't suit you ;P
+1 I prefer the sorting/ordering UI.
Or if you keep the current system (I'm not sure exactly what the UI looks like), you should make it super clear that when you don't vote, you're actually giving a 0/5.
Or even better, you start with all the prods at 1/5:
- it follows on from the current system
- it's very clear from a UI perspective that if you don't explicitly vote for a prod, you are giving it the worst possible mark
- it limits the bias brought up by OP a bit, since overlooked entries get 1 point instead of 0 (while still giving a boost to the "popular" prods)
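A quick Python sketch of how a default mark for unvoted prods would change the sum, using the five Evoke entries from the first post. The total of 150 voters is purely an assumption for illustration (the real number isn't in this thread, only that at least 147 people voted), and the function name is made up.

```python
# (title, vote sum, vote count) from the first post
entries = [
    ("Eisenerz", 538, 117),
    ("Glitch Rider", 630, 147),
    ("완노비", 555, 135),
    ("One Way Trip", 591, 145),
    ("Druckbestampfung", 555, 143),
]

ASSUMED_VOTERS = 150  # hypothetical, not taken from the real data

def rank_with_default(entries, voters, default):
    """Every voter who skipped an entry counts as having given `default` stars."""
    def total(entry):
        _, vote_sum, vote_count = entry
        return vote_sum + default * (voters - vote_count)
    return sorted(entries, key=total, reverse=True)

for default in (0, 1, 3):
    order = [title for title, _, _ in rank_with_default(entries, ASSUMED_VOTERS, default)]
    print(f"default {default}/5: {order}")
```

Under that assumed voter count, Eisenerz is last of the five with a default of 0, third with a default of 1 and second with a default of 3. The exact places depend on the real number of voters, so treat this as a way to play with the idea rather than a result.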
Quote:
When there are a lot of voters, I can't see any reason not to use the avg rating instead of the sum. And yes, 1-5 stars is too limited, I've said that for years in other places... ;)
Quote:
an average is always a problem if you have few votes. i could win the compo by giving myself the only vote (a 5, obviously) and no one else voting. so then you'd have to do stuff like: needs at least 10% of possible votes (number of registered votekeys). but apart from the 10% being completely arbitrary, that actually might pose a problem with smaller compos or parties, where 10% again might be (less than) one person.
if it is just about deciding on the ranking in the compo, and not a general "this prod is the best ever" (which might be resolved on other occasions (pouet, jury-awards, fistfights, ...)), then i'd let the voter sort the entries by preference, something like this:
https://en.wikipedia.org/wiki/Instant-runoff_voting
this also does not solve people not voting at all and might be tedious to do, but at least you would have a ranking that most people (that voted) agree upon...
...i bet this needs more thought put into it.
Now that we're talking about voting systems:
Using the average instead of the total is too vulnerable to a low number of voters. However, I don't know of any method that balances the two (e.g. every use of the STAR voting method that I know of uses the total score instead). On the other hand, as far as I know, an entry having fewer votes but a higher average probably only happens when livevoting borks. But I might be wrong.
For ranked voting methods, the Schulze method or a variant of the Borda count might be desirable over IRV. Or, if you want to select a top-3 and don't care much about the rest, a three-"seat" STV (Schulze STV?) might also be a better option than IRV.
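As an illustration of the simplest of these, here is a minimal Python sketch of a plain Borda count over complete rankings. The ballots are hypothetical, and the point scheme (n-1 points for first place down to 0 for last) is the textbook variant, not something any party is known to use.

```python
def borda_count(ballots):
    """ballots: complete rankings, best first.
    With n entries, first place earns n-1 points and last place 0."""
    scores = {}
    for ballot in ballots:
        n = len(ballot)
        for position, name in enumerate(ballot):
            scores[name] = scores.get(name, 0) + (n - 1 - position)
    # Highest total first; ties keep their first-seen order.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical ballots from nine voters.
ballots = [["A", "B", "C"]] * 4 + [["B", "C", "A"]] * 3 + [["C", "B", "A"]] * 2
print(borda_count(ballots))
```

B wins here because nobody ranks it last, even though A has the most first places.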
Quote:
I don't agree with what lug00ber said. From a UI perspective I'd say it's way more intuitive to just sort entries instead of choosing between 1-5. It also solves the problem of people only voting for some entries in a compo, as adjusting the order always affects all entries.
The problem isn't a UI problem, it's remembering if you like the current entry better or worse than the entry played 7 slots ago or the one played 4 slots ago. Should it be placed at sixth or seventh place in your current ranking?
entry sorting also sucks, selecting a personal top 5 or smth would be easier. make it mandatory to select 5 as well, so you don't have the 'let's put that teenangsty demo on #1 and don't care about the rest!'-laziness/bias in the results :)
Quote:
So, instead of rating, it is better to vote for the order, because this way you rate _all_ the prods.
i agree with this totally.
biggest problem with the other metrics is that if you forget to vote, selfvote/namevote a single entry and dont vote for anything else, it easily shifts results.
also there's a reason why e.g. personality tests usually force you to put certain words/traits/statements/topics in distinct order, even though it is sometimes hard to choose which you would prefer if they are "almost" of equal weight in the respective category.
therefore as proposal:
rank all or nothing (opt-in)
-- forces users to put ALL compo entries in order, otherwise they cant vote for that specific compo (and especially not for a single entry)
for every compo the party organizers can choose a voting mode
-- default ranking = "reverse compo order"
---- for major realtime compos
---- imho most compo orgas have a good feeling about which entries should close the compo
---- gives a pre-sense to the results if desired
-- default ranking = "off"
---- for compos like music, foto, ...
---- no vote at all for any entry in that compo by default
---- similar to today, only selfvote/namevote is compensated due to enforced all-or-nothing ranking
user option to completely ignore a compo when compo mode is "reverse compo order" (opt-out)
-- fairness for the usual "dont care about that compo", "missed that compo", "fuck oldschool/newschool", etc.
-- otherwise you accept the default ranking
all votekeys handed out count to result
-- if votekey is used -> use as of today, according to above modes/options
-- if votekey not used -> implicitly contributes to compos with mode "reverse compo order"
pros:
-- people are forced to actually vote if they dont agree with default ranking (when that mode is active)
-- if people forget or dont care about the vote the "reverse compo order" gives at least some pre-sense to the voting
-- everyone has a preferred ranking in his mind, so just put it in the system fairly (because you cant do otherwise)
-- people cannot bias results by only selfvoting or namevoting one specific entry, as ALL entries need to be ranked
cons:
-- not many -> discuss
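A rough Python sketch of how the ballots for one compo could be assembled under this proposal. All names are made up, and the proposal doesn't say how the resulting rankings are turned into a final result, so the sketch stops at collecting the ballots.

```python
def effective_ballots(play_order, votekeys, submitted, opted_out, mode):
    """play_order: entries in the order they were shown
    votekeys:   every votekey handed out
    submitted:  dict votekey -> ranking (best first) entered by that voter
    opted_out:  votekeys that chose to ignore this compo
    mode:       "reverse compo order" or "off"
    """
    default_ranking = list(reversed(play_order))
    ballots = []
    for key in votekeys:
        if key in submitted:
            ranking = submitted[key]
            # all or nothing: every entry has to appear exactly once
            if sorted(ranking) != sorted(play_order):
                raise ValueError(f"votekey {key}: rank all entries or none")
            ballots.append(ranking)
        elif mode == "reverse compo order" and key not in opted_out:
            # unused votekeys implicitly accept the default ranking
            ballots.append(default_ranking)
        # mode "off" or opted out: this votekey contributes nothing
    return ballots

# Hypothetical compo with three entries and four votekeys.
play_order = ["a", "b", "c"]
votekeys = ["k1", "k2", "k3", "k4"]
submitted = {"k1": ["b", "a", "c"]}
opted_out = {"k2"}

print(effective_ballots(play_order, votekeys, submitted, opted_out, "reverse compo order"))
print(effective_ballots(play_order, votekeys, submitted, opted_out, "off"))
```

In the "reverse compo order" mode, the two unused votekeys that did not opt out each add a ballot with the default ranking, which shows how much weight the default can carry when few people vote.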
Things that require more involvement from people have traditionally worked really well, yes.
Quote:
also there's a reason why e.g. personality tests usually force you to put certain words/traits/statements/topics in distinct order, even though it is sometimes hard to choose which you would prefer if they are "almost" of equal weight in the respective
This.
Also, the cool thing about entry sorting is that it is very clear from a UI perspective: even if you don't sort the end of the list, you do know you are doing something wrong that may mess up the results in the end.
And it avoids questions like "should I put 5 stars for my favorite & 4 stars instead of 5 for my 2nd choice" or "is the party using the sum or the average" :p.
@gopher: here are the cons I see:
- it adds complexity
- I don't see how you can "force" someone to rank everything; people will probably still do what lug00ber said and assign random rankings to the bottom half
- in the default ranking voting mode, you give some weight to the orgas in the results = the orgas become a jury = not what sceners are used to / brings a lot of issues/questions ;).
From what I understand, the underlying issue is: people don't vote enough.
So the main focus should be to remove complexity in the voting process.
For example
1/ Keep the 5-star marks, but drop the implicit 0/5 mark when one doesn't vote // 1/5 is the default mark for all prods.
(Cf my previous message)
As a visitor, it changes basically nothing (you "boost" the sum of the prods you choose to vote on; and if you don't vote for a prod, fine). And it's clearer how what you do or don't do will affect the results.
2/ Switch to explicit ranking, but only to the top 5
(what Maali said)
As a visitor, it's very clear what you are doing / how your vote affects the results.
If you don't remember enough prods to fill a top 5 => abstention the fuck, you were probably too inattentive/drunk during the compo ;).
If you have more than 5 prods in mind => it forces you to make choices
The only con I see is that you will have more ties in the bottom half of big compos. But I don't think it's a big deal (especially if it helps to improve the relevance of the podium).
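A small Python sketch of option 2, just to show the mechanics. The weights (5 points for first place down to 1 for fifth) and the rule that an incomplete top 5 counts as an abstention are assumptions; neither is spelled out above.

```python
from collections import defaultdict

def tally_top_n(ballots, top_n=5):
    """ballots: each voter's top picks, best first."""
    scores = defaultdict(int)
    for ballot in ballots:
        # Mandatory full top N: incomplete or repeated picks count as abstention.
        if len(ballot) != top_n or len(set(ballot)) != top_n:
            continue
        for position, name in enumerate(ballot):
            # Assumed weights: top_n points for first place, down to 1 for last.
            scores[name] += top_n - position
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical ballots; the last one only names a single entry and is ignored.
ballots = [
    ["a", "b", "c", "d", "e"],
    ["b", "a", "c", "e", "d"],
    ["a", "c", "b", "d", "e"],
    ["a"],
]
print(tally_top_n(ballots))
```

Entries that never make anyone's top 5 get no points at all, which is where the extra ties in the bottom half come from.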