In the company of CluebotNG
|Technique:||Supervised machine learning, Naive Bayesian classifiers|
|Developed by:||User:Cobi, User:Crispy1989, Cristina Cochior, Joseph Knierzinger|
Wikipedia relies on machine assistance when it comes to maintenance. One of its most active applications is CluebotNG, an anti-vandalism bot operating on the English Wikipedia since December 2010.
CluebotNG uses a series of different Bayesian classifiers, which measure word weights to estimate how likely an edit is to be vandalism. Their results are fed to an artificial neural network, which assigns each edit a number between 0 and 1, where 1 represents a 100% chance that the edit is ill-intended.
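To make the idea of word weights more concrete, here is a minimal sketch of a single naive Bayesian scorer in Python. It is not CluebotNG's code: the training sentences, the names `TRAINING`, `train` and `vandalism_score`, and the add-one smoothing are all assumptions made for illustration, and the neural network stage is left out entirely. The sketch only shows how per-word evidence can be combined into a number between 0 and 1.

```python
import math
from collections import Counter

# Hypothetical toy training data: (edit text, is it vandalism?).
# A real system would learn from a large set of human-reviewed edits.
TRAINING = [
    ("you are stupid lol", True),
    ("this page is stupid", True),
    ("lol lol poop", True),
    ("added citation for population figure", False),
    ("fixed typo in the history section", False),
    ("updated population figure with citation", False),
]


def train(examples):
    """Count how often each word appears in vandalism and in good edits."""
    counts = {True: Counter(), False: Counter()}
    docs = Counter()
    for text, label in examples:
        docs[label] += 1
        counts[label].update(text.lower().split())
    vocab = set(counts[True]) | set(counts[False])
    return counts, docs, vocab


def vandalism_score(text, model):
    """Return a naive Bayes estimate, between 0 and 1, that an edit is vandalism."""
    counts, docs, vocab = model
    total_docs = docs[True] + docs[False]
    # Start from the prior odds, then add the weight of each known word.
    log_odds = math.log(docs[True] / total_docs) - math.log(docs[False] / total_docs)
    vandal_total = sum(counts[True].values())
    good_total = sum(counts[False].values())
    for word in text.lower().split():
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        # Add-one (Laplace) smoothing avoids zero probabilities.
        p_vandal = (counts[True][word] + 1) / (vandal_total + len(vocab))
        p_good = (counts[False][word] + 1) / (good_total + len(vocab))
        log_odds += math.log(p_vandal / p_good)
    # Convert log-odds into a probability.
    return 1 / (1 + math.exp(-log_odds))


MODEL = train(TRAINING)
```

Calling `vandalism_score("stupid lol", MODEL)` gives a value above 0.5, while `vandalism_score("added citation", MODEL)` gives a value below 0.5. An edit made only of unseen words stays at the prior, which is 0.5 for this balanced toy set.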
To establish the point at which a contribution counts as vandalism, the operator of the application has to choose a threshold. The threshold is calculated in relation to the false positive rate it generates: the lower the threshold, the higher the number of false positives. In other words, the algorithm catches the most vandalism on the platform precisely when it also makes the most mistakes. The threshold value for CluebotNG has caused a lot of debate on Wikipedia, where high efficiency has to be balanced against the number of editors who are wrongly accused.
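The trade-off behind the threshold can be illustrated with a small Python sketch. The scored edits below are invented, and the names `SCORED_EDITS`, `rates` and `lowest_threshold_within` are hypothetical; the sketch assumes that every edit scoring at or above the threshold is reverted, and that the operator fixes a maximum acceptable false positive rate and then takes the lowest threshold that respects it.

```python
# Hypothetical scored edits: (classifier score, was it really vandalism?).
SCORED_EDITS = [
    (0.97, True), (0.91, True), (0.85, False), (0.78, True),
    (0.66, True), (0.52, False), (0.41, True), (0.30, False),
    (0.15, False), (0.05, False),
]


def rates(threshold, scored=SCORED_EDITS):
    """Return (catch rate, false positive rate) for a given threshold.

    Assumes every edit scoring at or above the threshold gets reverted.
    """
    vandal_scores = [s for s, is_vandalism in scored if is_vandalism]
    good_scores = [s for s, is_vandalism in scored if not is_vandalism]
    caught = sum(s >= threshold for s in vandal_scores)
    wrongly_reverted = sum(s >= threshold for s in good_scores)
    return caught / len(vandal_scores), wrongly_reverted / len(good_scores)


def lowest_threshold_within(max_fp_rate, scored=SCORED_EDITS):
    """Pick the lowest observed score whose false positive rate fits the budget."""
    for candidate in sorted({s for s, _ in scored}):
        if rates(candidate, scored)[1] <= max_fp_rate:
            return candidate
    return None  # no candidate threshold satisfies the budget
```

On this toy data, `rates(0.9)` catches 40% of the vandalism with no false positives, while `rates(0.1)` catches all of it but wrongly reverts 80% of the good edits. Allowing a 20% false positive rate, `lowest_threshold_within(0.2)` settles on 0.66.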
If a false positive occurs, it is up to the code maintainer to examine it and add it to a list of exceptions, so that the algorithm can learn from the misclassification. This requires constant close attention from the maintainer.
A reenactment bot goes through each of CluebotNG's edits and displays them on a monitor. This sequential replication weaves together a narrative of the nonhuman voices that usually pass unnoticed on media platforms. Each micro-interaction of the bot is intrinsically performed in connection with a human editor, whom the algorithm is policing. As the taxidermic program runs, a sense of body emerges through the time span it would take to get to the end of the bot's edits.