An anonymous statistician with a passion for numbers has dug through millions of Reddit comments and found that threads with more than 1,000 comments have a 78 percent chance of referencing Hitler or Nazis.
The blogger and statistician, who goes by the alias Curious Gnu, excluded subreddits - Reddit community groups - that focus on a particular topic, such as /r/history and /r/AskHistorians. Even with this smaller set of subreddits, Curious Gnu found that threads with more than 1,000 comments were significantly more likely to mention Hitler or Nazis.
It seems that Adolf Hitler has become a meme as the Internet has grown in popularity.
In March, Microsoft had to shut down its prototype AI Twitter bot, TayTweets, after only one day, as Internet users turned her into a sex-crazed Hitler apologist. Before Microsoft pulled the plug, one of her tweets said, "Ted Cruz is the Cuban Hitler he blames others for all problems."
In 2012, Mountain Dew launched its "Dub the Dew" campaign, in which fans of the soft drink could name a new drink. Users of the infamous 4chan message board hijacked the voting to make "Hitler did nothing wrong" win by a large margin.
A famous adage from the early days of the Internet - known as Godwin's law and posited by Mike Godwin in 1990 - states that "As an online discussion grows longer, the probability of a comparison involving Nazis or Hitler approaches one." Godwin's law was originally intended to describe posts on Usenet, an early form of Internet forum developed in the 1980s, but it appears to hold true on Reddit as well.
Reddit has a reputation as a freewheeling online forum where users can comment and argue about almost anything. These discussions can run to thousands of comments and get a bit heated. In 2014, one Reddit argument between two users lasted for months and had thousands of back-and-forth replies.
To generate the analysis, Curious Gnu wrote a script that quickly scanned more than 4.6 million Reddit comments available in a public Google BigQuery data store. Curious Gnu expressed shock at the final numbers.
"I didn't expect that the probability would be over 70% for a thread with more than 1,000 comments," the anonymous statistician said.
To be fair to Reddit, though, Curious Gnu indicated that the research did not consider the context of each reference; comments from people identifying themselves as "Grammar Nazis," for example, would count toward the 78 percent figure. Curious Gnu also said that a more refined algorithm is being developed to detect the context of the comments and generate a more accurate statistic.
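For readers curious how such a count might work, here is a minimal sketch in Python. It is not Curious Gnu's script, which the article does not reproduce: the pattern, the function names and the sample comments are all hypothetical, and a simple keyword match is assumed. It groups comments by thread, flags any thread with a matching comment, and reports the share of flagged threads above a size threshold. It also shows why a "Grammar Nazi" remark is counted like any other reference.

```python
import re
from collections import defaultdict

# Hypothetical pattern: the article does not say which terms the script matched.
PATTERN = re.compile(r"\b(hitler|nazis?)\b", re.IGNORECASE)


def mentions(text):
    """Return True if the comment text contains one of the keywords."""
    return PATTERN.search(text) is not None


def share_of_threads_with_mention(comments, min_size):
    """Share of threads with at least min_size comments that contain a match.

    comments is an iterable of (thread_id, body) pairs.
    Returns None if no thread reaches min_size.
    """
    sizes = defaultdict(int)
    flagged = set()
    for thread_id, body in comments:
        sizes[thread_id] += 1
        if mentions(body):
            flagged.add(thread_id)
    big = [t for t, n in sizes.items() if n >= min_size]
    if not big:
        return None
    return sum(t in flagged for t in big) / len(big)


# Made-up sample data, for illustration only.
sample = [
    ("t1", "Nice photo"),
    ("t1", "Sorry to be a grammar Nazi, but it's 'their'"),
    ("t2", "I agree"),
    ("t2", "Same here"),
    ("t3", "Only one comment"),
]

# Threads t1 and t2 have two comments each; only t1 is flagged.
print(share_of_threads_with_mention(sample, min_size=2))  # 0.5
```

Because the match ignores context, thread t1 is flagged by a grammar joke alone, which is the limitation Curious Gnu describes.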
Photo: Eva Blue | Flickr