
Google explains how the 3D depth photos in Google Photos work

May 24, 2021

Google Photos, a service that recently launched collages and will stop offering unlimited storage in 2021, added a notable new feature last December: automatic photos with a 3D effect. Google calls them ‘cinematic photos’. The app generates them automatically, and they can be viewed by tapping the recent highlights section.

Google has now explained on its blog how it sets these photos in motion and gives them such a striking 3D effect. As usual, the company relies on its neural networks and computational expertise to keep adding tricks to Google Photos.

The technology behind Google’s ‘cinematic photos’

According to Google, cinematic photos are meant to bring back for the user “the feeling of immersion of the moment in which the photo was taken”, simulating both the movement of the camera and the 3D parallax. So how is a 2D image converted into a 3D one?

Google uses neural networks trained on photographs taken with the Pixel 4 to estimate depth from a single RGB image

Google explains that, as with portrait mode or augmented reality, cinematic photos require a depth map that provides information about the 3D structure of the scene. To achieve the effect on any phone, including those without a dual camera, the company trained a convolutional neural network to predict a depth map from a single RGB image.

With only one point of view (the plane of the photo), the network estimates depth from monocular cues such as the relative sizes of objects, the perspective of the photograph, blur and more. To make this information more complete, Google combined data collected with the Pixel 4’s camera with other photos taken by its team using professional cameras.


Basically, the technique is similar to the Pixel’s portrait mode: the image is analyzed and segmented and, once the background is isolated, movement is simulated by displacing it. In practice the process is much more complex, because various corrections and analyses are required: a few misinterpreted pixels could ruin the final result.
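To make the idea of displacing the background concrete, here is a minimal sketch in Python with NumPy. It is not Google's implementation: the function name `parallax_frame`, the `near_threshold` parameter and the convention that smaller depth values mean "closer to the camera" are all assumptions made for illustration. It splits the picture into two layers using a depth map, shifts the background sideways and pastes the untouched foreground back on top.

```python
import numpy as np

def parallax_frame(image, depth, shift_px, near_threshold=0.5):
    """Return one frame of a naive two-layer parallax effect.

    image: (H, W, 3) array. depth: (H, W) array in [0, 1], where smaller
    values mean closer to the camera (an assumption of this sketch).
    """
    # Pixels closer than the threshold are treated as foreground.
    foreground = depth < near_threshold
    # Shift the whole picture horizontally to act as the moving background.
    # np.roll wraps around at the border and still carries a copy of the
    # subject, so a real system would have to fill in those regions.
    background = np.roll(image, shift_px, axis=1)
    # Paste the unmoved foreground pixels over the shifted background.
    return np.where(foreground[..., None], image, background)

# Tiny 1x4 "photo": pixel values 0..3, with the second pixel as the subject.
image = np.arange(4, dtype=np.uint8).reshape(1, 4, 1).repeat(3, axis=2)
depth = np.array([[0.9, 0.1, 0.9, 0.9]])
frame = parallax_frame(image, depth, shift_px=1)
print(frame[0, :, 0].tolist())  # [3, 1, 1, 2]
```

In the output, the subject (value 1) stays in place while the background moves one pixel to the right, but a shifted duplicate of the subject also appears next to it. That kind of artifact hints at why the real process needs the extra corrections mentioned above.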

More information | Google