A new project led by a Google researcher can determine where a picture was taken without relying on geotags. The deep-learning system, named PlaNet, works out the location shown in an image using only its pixels.
Google computer vision specialist Tobias Weyand developed the system, training it to be significantly better than humans at determining the location of images, even when a picture contains no obvious clues about where it was taken.
Weyand and his team developed PlaNet by first dividing the world into a grid of about 26,000 squares. The squares vary in size depending on how many pictures have been taken within them. Popular locations such as tourist spots and big cities are covered by many small squares, while remote locations are covered by fewer, larger ones.
The team then built a database of 126 million geolocated pictures from the Internet and assigned each one to the square where it was taken. Of these pictures, 91 million were used to teach the neural network to determine where an image was taken using just the picture itself.
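The idea of a grid whose squares shrink where photos are dense can be illustrated with a short Python sketch. It is not the team's actual method: it assumes a plain latitude/longitude quadtree, and the names Cell, build_grid, max_photos_per_cell and max_depth are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """A rectangular cell in degrees (hypothetical helper, not from the article)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat, lon):
        # Half-open bounds so each photo falls into exactly one child cell.
        return self.south <= lat < self.north and self.west <= lon < self.east

    def area(self):
        # Size in square degrees; good enough to compare cells in this sketch.
        return (self.north - self.south) * (self.east - self.west)

    def split(self):
        mid_lat = (self.south + self.north) / 2
        mid_lon = (self.west + self.east) / 2
        return [
            Cell(self.south, self.west, mid_lat, mid_lon),
            Cell(self.south, mid_lon, mid_lat, self.east),
            Cell(mid_lat, self.west, self.north, mid_lon),
            Cell(mid_lat, mid_lon, self.north, self.east),
        ]


def build_grid(photos, max_photos_per_cell, max_depth=10):
    """Split the world until no cell holds more than max_photos_per_cell photos.

    photos is a list of (lat, lon) pairs with lat in [-90, 90) and
    lon in [-180, 180). Returns a list of (cell, photo_count) leaves.
    """
    leaves = []
    stack = [(Cell(-90.0, -180.0, 90.0, 180.0), list(photos), 0)]
    while stack:
        cell, inside, depth = stack.pop()
        if len(inside) <= max_photos_per_cell or depth >= max_depth:
            leaves.append((cell, len(inside)))
            continue
        for child in cell.split():
            child_photos = [p for p in inside if child.contains(*p)]
            stack.append((child, child_photos, depth + 1))
    return leaves
```

With a handful of photos clustered in one city and a single photo in a remote area, the city ends up in a much smaller cell than the remote photo, which mirrors the behavior described above.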
PlaNet was then validated using the remaining 34 million pictures in the database, and then tested in a variety of ways.
When the deep-learning system was fed 2.3 million pictures from Flickr, it was able to determine the location of 3.6 percent of the images at street-level accuracy and 10.1 percent at city-level accuracy. In addition, PlaNet was able to determine the country where the picture was taken for 28.4 percent of the images, and the continent for 48.0 percent.
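To make figures like these concrete, here is a minimal Python sketch of how accuracy at different scales could be computed from predicted and true coordinates. The distance cutoffs (1 km for street, 25 km for city, 750 km for country, 2,500 km for continent) are an assumption based on values commonly used in photo-geolocation research, not something stated in this article, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Assumed cutoffs in kilometers; the article does not define these levels.
THRESHOLDS_KM = {
    "street": 1.0,
    "city": 25.0,
    "country": 750.0,
    "continent": 2500.0,
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accuracy_by_scale(predicted, actual, thresholds=THRESHOLDS_KM):
    """Fraction of predictions whose error is within each distance cutoff.

    predicted and actual are equal-length lists of (lat, lon) pairs.
    """
    errors = [haversine_km(*p, *a) for p, a in zip(predicted, actual)]
    return {
        name: sum(e <= limit for e in errors) / len(errors)
        for name, limit in thresholds.items()
    }
```

A guess counts as correct at a given level when it lands within that level's cutoff of the true location, so the country and continent percentages are always at least as high as the street and city ones.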
PlaNet was also tested against 10 well-traveled people, with both PlaNet and the human players shown the same pictures taken from Google Street View.
Out of the 50 rounds played, PlaNet won 28. In addition, the humans' guesses were off by a median of 2,320.75 kilometers, while the figure for PlaNet was only 1,131.7 kilometers.
How can PlaNet determine the location shown in images so well? Weyand explains that the system's advantage is that it has seen far more places than any human has, which has allowed it to learn subtle cues that humans would not be able to pick out on their own.
Even more remarkable is that PlaNet takes up only 377 MB of space, small enough to fit on a smartphone.