Fixing online communities: A biological approach

It’s an old story, repeated many times: evolution in the making. A new online community is born. Its founding members define its core values, writing style, and the topics worth considering. The community grows and, after a while, inevitably fragments into subgroups with markedly distinct interests and values. Posts complaining about irrelevant topics or trolling comments are the first indicator of the changes to come. The outcome seems inevitable: despite the community’s various protective measures, the newcomers take over. The community is changed for good (or for worse). The newcomers win, and the old-comers leave for greener pastures elsewhere.

Communities employ various tactics to protect themselves from users and content they consider harmful. Some require users to earn karma points before they can post, comment, or rank others. Elaborate PageRank-like reputation systems have also been devised. And if everything else fails, good old censorship can be adopted: just rename the trustworthy members of the community from censors to moderators first, and grant them the right to delete content and ban users from the community.

There seems to be a tradeoff between the openness of a community and its vulnerability to vandalism. Censorship can be a very effective protective measure, but the community’s openness and its full potential for growth are inevitably lost. A completely unprotected community, on the other hand, ends up with tons of bad posts and troll comments. Users can still skip them, but their valuable time and their loyalty to the community are wasted.

As the title of this post hints, Nature has dealt with similar problems; evolution has produced robust, resource-efficient solutions to them. We can learn from these solutions and adapt them to our needs. Fundamentally, the problem we are dealing with is a decision problem: is a post worth publishing? Is a comment civil and appropriate?

How bees do it

Imagine a bee that finds a source of food. The happy bee immediately returns to the hive and starts an elaborate dance to guide the other bees to the location of the food. Does the entire hive flock in the direction indicated by the happy bee? Of course not. It would be too risky to deploy all of the hive’s resources based on the opinion of a single bee. The happy bee might err and signal the wrong direction, or the food source might be insufficient for all the bees in the hive. Instead, only a handful of scouts fly in the direction of the food source. If the scouts perform the dance on their return, more bees head towards the target. If not, the target is abandoned. Fewer hive resources are wasted on unpromising sources of food. The decision-making process is simple, robust, and reasonably accurate.

Adopting this principle in online communities is straightforward. In this analogy, the community members are bees, the community is the hive, and the happy bee is a member submitting a post or a comment. A small scout group of community members is selected at random. If the majority of the scout group decides that the post is inappropriate or goes against the community’s founding principles, it is discarded. Since a malicious post, unlike the happy bee’s mistake, is not an honest error, penalties for trolls are in order: the poster and all scouts who upvoted the post are downvoted or even denied the right to post again. If, on the other hand, the scout group accepts the post, the whole community gets to vote on it.
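The scout step can be sketched in a few lines of Python. This is only an illustration under some assumptions that the post does not spell out: the function names are made up, the author is excluded from their own scout group, and a tie counts as a rejection (which matches the majority condition used in the calculation later in the post). Penalties for trolls are left out.

```python
import random

def select_scouts(members, scout_size, author, rng=random):
    """Pick a random scout group for a submission.

    Assumption: the author is not allowed to scout their own post.
    """
    candidates = [m for m in members if m != author]
    return rng.sample(candidates, min(scout_size, len(candidates)))

def scouts_accept(votes):
    """Return True if a strict majority of scouts approved the post.

    `votes` is a list of booleans (True = approve).
    Assumption: a tie is treated as a rejection.
    """
    approvals = sum(1 for v in votes if v)
    return approvals * 2 > len(votes)

# Usage: 100 hypothetical members, member 7 submits a post, 15 scouts review it.
members = list(range(100))
scouts = select_scouts(members, 15, author=7, rng=random.Random(0))
```

Only if `scouts_accept` returns True would the post go on to the whole community for voting.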

How efficient is it

The mechanism described above allocates resources efficiently. Only the scout group spends its time on a submission and is exposed to malicious content. As the members of the scout group are selected at random, there is no need for a privileged class of community censors. The filtering burden per member is low and evenly distributed across the whole community. The scout group of bees is small (15-20). Unless the community is seriously infected by trolls, small scout groups should suffice.

For the math geeks out there: if the community has n members in total and t of them are trolls, then the number of trolls X in a randomly selected scout set of size s is a random variable following the hypergeometric distribution. The probability of selecting exactly k trolls is:
$$P(X = k) = \frac{\binom{t}{k}\binom{n-t}{s-k}}{\binom{n}{s}}$$
To filter out malicious messages, the scout set members vote by majority. The scout set successfully filters out a malicious message if the trolls are not in the majority (i.e. the number of trolls k ≤ s/2) and are outvoted by the other members of the scout set. The probability that the scout set successfully filters a malicious message is therefore given by:
$$P(X \le s/2) = \sum_{k=0}^{\lfloor s/2 \rfloor} \frac{\binom{t}{k}\binom{n-t}{s-k}}{\binom{n}{s}}$$
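For readers who would rather compute than read formulas, here is a minimal Python sketch of the calculation described above. The function names are my own; `math.comb` is the standard library binomial coefficient. Following the text, a scout set with exactly s/2 trolls still counts as a successful filter.

```python
from math import comb

def troll_count_pmf(n, t, s, k):
    """P(X = k): exactly k trolls in a scout set of size s,
    drawn without replacement from n members of whom t are trolls."""
    return comb(t, k) * comb(n - t, s - k) / comb(n, s)

def filtering_probability(n, t, s):
    """P(X <= s/2): trolls do not form a majority of the scout set."""
    return sum(troll_count_pmf(n, t, s, k) for k in range(s // 2 + 1))

# Usage: a community of 1000 members, 300 of them trolls, 20 scouts.
p = filtering_probability(1000, 300, 20)
print(f"filtering rate: {p:.2%}")
```

Changing the three arguments lets you check other community sizes, troll shares, and scout set sizes.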

A caveat: if messages must be filtered immediately, only the part of the community that is currently online should be taken into account when calculating the filtering probability. In that case, n stands for the number of online members and t for the number of online trolls. You need to take this into account if the online presence of your community varies over the daily or weekly cycle. Trolls may then wait for off-peak hours to get a better chance of being included in the scout set and outvoting the other members in the filtering process.
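To get a feel for how much off-peak hours can matter, the sketch below compares two online snapshots. The numbers are purely hypothetical (they are not measurements from any real community): at peak, 1000 members are online and 100 of them are trolls; off-peak, only 100 members are online but 40 of them are trolls who waited for the quiet hours.

```python
from math import comb

def filtering_probability(n, t, s):
    """Probability that at most s/2 of the s scouts are trolls, when scouts
    are drawn from n online members of whom t are trolls."""
    favourable = sum(comb(t, k) * comb(n - t, s - k) for k in range(s // 2 + 1))
    return favourable / comb(n, s)

# Hypothetical snapshots: name -> (online members, online trolls)
snapshots = {"peak": (1000, 100), "off_peak": (100, 40)}

rates = {name: filtering_probability(n, t, 20) for name, (n, t) in snapshots.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.4f}")
```

If the off-peak rate drops too far for comfort, one option is to hold submissions until more members are online; another is to enlarge the scout set during quiet hours.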

So let’s plug in some numbers for filtering malicious messages with 20 scouts:

community size    1000     1000     1000     5000     5000     5000
troll share       10%      20%      30%      10%      20%      30%
filtering rate    100%     99.95%   98.37%   99.99%   99.94%   98.30%

Even when 30% of the community members are trolls, filtering succeeds more than 98% of the time. Not a shabby result for 20 bees in action.
