NSFW images and videos are all around the web, but nobody is naïve enough to think that human users are in charge of labeling such content as work-unfriendly.
Yahoo made sure that its neural network is well versed in detecting and categorizing content that features a little extra nudity, so users don't have to spend an hour explaining to HR what they were doing on "that" page.
Yahoo engineers Jay Mahadeokar and Gerry Pesavento think it is time for developers to get free access to the machine learning source code that combs out NSFW from LOLcats.
It should be mentioned that Yahoo's algorithm is only part of the solution. The fact is that telling an NSFW image from an innocent one is a truly daunting task, at least when it comes to programming software to do it properly. Humans can do it easily, but their viewpoints and sensitivities are hard to embed in a line of code.
Until Yahoo deployed its algorithm, machines had a rough time telling nude paintings apart from pornographic imagery, but thanks to the tech company's contribution, they are now able to showcase Renaissance art without batting an eye.
Neural networks are an ideal tool for classifying images, as proven by both standard and medical visualization research in recent years. The idea is that a neural network is trained on a large database of very specific imagery, which helps it learn to spot similar images. For dogs, it detects tails and snouts; for cars, it identifies wheels and grilles. For NSFW imagery, the pattern is the same, possibly featuring fewer grilles and more tail.
The result is a system that filters a huge number of images and scores each one on a scale from 0 to 1, with 0 standing for immaculate and 1 for things that make sailors blush.
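For developers wondering what to do with such a score, here is a minimal Python sketch of turning a 0-to-1 value into a decision. It is not Yahoo's code: the function names `label_score` and `split_by_label`, the three labels, and the 0.2 and 0.8 cut-offs are all assumptions made for illustration, and the scores are expected to come from whatever model the developer runs.

```python
from typing import Dict, List

# Hypothetical cut-offs, chosen only for illustration; tune them on your own data.
SAFE_BELOW = 0.2
NSFW_ABOVE = 0.8


def label_score(score: float) -> str:
    """Map a score in [0, 1] to 'safe', 'review' or 'nsfw'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score < SAFE_BELOW:
        return "safe"
    if score > NSFW_ABOVE:
        return "nsfw"
    # Anything in between is ambiguous and is left for a human to check.
    return "review"


def split_by_label(scores: Dict[str, float]) -> Dict[str, List[str]]:
    """Group image names by label, given a mapping of name -> score."""
    buckets: Dict[str, List[str]] = {"safe": [], "review": [], "nsfw": []}
    for name, score in scores.items():
        buckets[label_score(score)].append(name)
    return buckets
```

Calling `split_by_label` on a dictionary of file names and scores yields three lists, so a data set can be cleaned by keeping only the "safe" bucket. The middle band exists because a single hard threshold would force a call on exactly the borderline images that humans disagree about.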
The utility of polishing the image-recognition mechanism goes well beyond simple censorship and keeping things professional. Erotic imagery has a well-earned place on the internet, but it is just as important to cast it aside when tackling large data sets.
Other companies, such as Apple and Google, are already working on their own image-sorting algorithms, some of which fit nicely as sorting tools in the mobile environment.
Such systems could also be deployed for email inspection, and anyone who has ever been pranked by friends sending cheeky photos in a work email knows what that means.
Developers who want to give the algorithm a spin can download it from GitHub, and the Yahoo blog post offers additional details, should they need any.
One thing to remember, though: coders will have to use their own smut database.