Why can’t I see the camera and tripod in a 360 video? What is parallax?
After posting our little planet roller coaster video on YouTube (which received a little more than 200,000 views!), we’ve had a lot of comments, most of them very enthusiastic!
Here are some sample comments:
- “I’ve watched this twice and still can’t work out where they mounted the camera”
- “This is pretty cool! How did you place the go pros? Because it doesn’t look like they’re fastened to anything.”
- “I dont even know whats going on”
As you can see for yourself, the burning question on every viewer’s lips is: where is the camera?
We admit that this video can be disorienting at first sight. Two important questions arise:
Why don’t we see the tripod?
What is the position of the camera in space?
The goal of this post is to give you an intuition of what’s going on here. We’ll split the explanation into two parts.
In this first part, we’ll show how it is possible to hide the tripod by cleverly exploiting parallax. In the second part, we’ll see how to obtain the cool “little planet” effect used in the roller coaster video with a special type of spherical projection: the stereographic projection.
Hopefully, after reading both parts of the article, you’ll have a better idea of what’s going on behind the scenes and understand this weird yet alluring camera effect.
Exploiting parallax to effortlessly remove the tripod (or any close object)
Here’s a very simple experiment you can try at home. Hold a finger up in front of your eyes. Close your left eye and, with your right eye open, align your finger with some reference object in the distant background. Now open your left eye and close the right one. You will notice that your finger is no longer aligned with your reference object.
Now observe that the farther the finger is from your eyes, the less it moves between the two views. That’s why this technique can only be used with objects very close to the camera system. Luckily for us, the tripod is always very close to the cameras.
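To put rough numbers on the finger experiment, here is a minimal sketch that computes the angle between the two lines of sight for an object placed straight ahead, midway between the two viewpoints. The 6.5 cm eye spacing and the sample distances are assumptions chosen for illustration, not measurements from our rig.

```python
import math

def parallax_angle_deg(baseline_m, distance_m):
    """Angle (in degrees) between the two lines of sight to an object
    straight ahead, centered between two viewpoints baseline_m apart."""
    return math.degrees(2.0 * math.atan(baseline_m / (2.0 * distance_m)))

# Assumed spacing between the two viewpoints (roughly the distance between human eyes).
EYE_BASELINE_M = 0.065

# The apparent shift against a very distant background shrinks quickly with distance.
for distance_m in (0.3, 1.0, 10.0, 100.0):
    angle = parallax_angle_deg(EYE_BASELINE_M, distance_m)
    print(f"object at {distance_m:6.1f} m -> parallax of {angle:.3f} degrees")
```

The same formula applies to two neighboring cameras: replace the eye spacing with the distance between their optical centers.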
Now suppose you are given these two images (one per eye): is it possible to reconstruct the background image without occlusion?
Yes, provided you combine both views cleverly so as to keep only the areas which do not contain the finger. To achieve this, it suffices to mark the finger on each image with a label saying “this view does not hold the truth for this region”. When stitching the images, among possible pixel candidates for one region, the stitcher will avoid selecting pixels marked as being wrong (in VideoStitch this is implemented as “masks”).
You can see below the reconstructed view using both images above. Note that missing information in the left image can be retrieved in the right image, and conversely.
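The masking idea can be sketched on a toy example. The snippet below is not how VideoStitch implements masks; it is a simplified illustration that assumes the two views are already aligned on the background, and the function name `combine_views` and the pixel values are made up for the example.

```python
def combine_views(left, right, left_mask, right_mask):
    """Merge two aligned views pixel by pixel.

    A mask value of True means "this view does not hold the truth here".
    Pixels masked in both views cannot be recovered and are set to None.
    """
    combined = []
    for l_row, r_row, lm_row, rm_row in zip(left, right, left_mask, right_mask):
        row = []
        for l_px, r_px, l_bad, r_bad in zip(l_row, r_row, lm_row, rm_row):
            if not l_bad:
                row.append(l_px)      # left view is trusted here
            elif not r_bad:
                row.append(r_px)      # fall back to the right view
            else:
                row.append(None)      # occluded in both views
        combined.append(row)
    return combined

# Toy one-row "images": the background is 10..50, the finger is the value 99.
# Because of parallax, the finger hides a different background pixel in each view.
left_view   = [[10, 99, 30, 40, 50]]
right_view  = [[10, 20, 30, 99, 50]]
left_masks  = [[False, True, False, False, False]]
right_masks = [[False, False, False, True, False]]

print(combine_views(left_view, right_view, left_masks, right_masks))
```

If the finger covered the same background pixel in both views (no parallax), that pixel would come out as `None`: there would be nothing to recover it from.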
Now, let’s see where this useful effect comes from.
What is parallax?
Here is Wikipedia’s formal definition:
Parallax is a displacement or difference in the apparent position of an object viewed along two different lines of sight, and is measured by the angle or semi-angle of inclination between those two lines
Put differently, an object seen from two different points of view will appear to have moved relative to the distant background between the two views.
The following figure (taken from Wikipedia) illustrates this phenomenon.
(Image credits to Booyabazooka)
From the point of view of A and B, the object appears to be at two different positions relative to the background. A will see the object in front of the blue square while B will see it in front of the red square.
Another way of understanding parallax is to think about it in terms of object alignment. Suppose you are viewing two objects at different distances and they are perfectly aligned from where you are standing (so that the near object appears directly in front of the far one, occluding it). Tilt your head slightly and the two objects will be separated by a small angle.
The following system models this situation. Consider the right figure below (you can watch animated versions of the left image and right image). Two cameras C and C’ rotate around a common center (not visible in this figure). Both cameras obey the ideal pinhole model (rays emanating from points in space travel along the straight line joining the point and the camera’s optical center until they hit the camera sensor).
Camera C sees objects P1 and P2 as aligned. C’ is slightly rotated with respect to C. Observe that P1 (the near object) is not imaged at the same position as P2 (the far object). The left image shows that if the cameras shared a common optical center (whatever their orientation), there would be no parallax: the objects would stay aligned in both C and C’.
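The two situations in the figure can be reproduced numerically with a top-down 2D pinhole model. This is only a sketch: the coordinates, the 10 cm offset between the rotation center and the optical centers, and the 20 degree rotation are made-up values, and the helper names are hypothetical.

```python
import math

def bearing_deg(camera_center, camera_yaw_deg, point):
    """Direction of `point` as seen from the camera's optical center,
    measured relative to the camera's viewing direction (2D pinhole model)."""
    dx = point[0] - camera_center[0]
    dy = point[1] - camera_center[1]
    return math.degrees(math.atan2(dy, dx)) - camera_yaw_deg

def separation_deg(camera_center, camera_yaw_deg, p_near, p_far):
    """Angle between the near and far objects in this camera's view."""
    return (bearing_deg(camera_center, camera_yaw_deg, p_near)
            - bearing_deg(camera_center, camera_yaw_deg, p_far))

def optical_center(rig_center, radius, yaw_deg):
    """Optical center of a camera mounted `radius` away from the rotation center."""
    a = math.radians(yaw_deg)
    return (rig_center[0] + radius * math.cos(a),
            rig_center[1] + radius * math.sin(a))

P1 = (1.0, 0.0)    # near object (assumed position, in meters)
P2 = (10.0, 0.0)   # far object, aligned with P1 as seen from C
YAW_DEG = 20.0     # assumed rotation of C' with respect to C

# Left figure: C and C' share the same optical center, only the orientation changes.
shared = separation_deg((0.0, 0.0), YAW_DEG, P1, P2)

# Right figure: both cameras rotate around a point 10 cm behind C's optical center.
rig_center = (-0.1, 0.0)
offset = separation_deg(optical_center(rig_center, 0.1, YAW_DEG), YAW_DEG, P1, P2)

print(f"shared optical center: P1 and P2 are {abs(shared):.3f} degrees apart in C'")
print(f"offset optical center: P1 and P2 are {abs(offset):.3f} degrees apart in C'")
```

With a shared optical center, rotating the camera shifts both objects by the same amount, so they stay aligned. As soon as the optical center moves, the near object shifts more than the far one.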
Side note: our eyes are a natural visual system that exhibits parallax. Each eye sees the scene from a slightly different point of view, which gives us the sensation of depth. This is known as the stereoscopic effect, and it is particularly relevant to panoramic stereoscopic video. We’ll get to that in a later post.
Parallax is a defect of 360 camera rigs
Yes, you read that right! The whole mathematical machinery behind full-sphere panoramic video fundamentally assumes a common optical center for all cameras. This assumption makes it possible to project the videos onto a panoramic sphere so that the entire view can be stitched from the individual videos without stitching errors. To satisfy it, static panorama photographers usually rotate a single camera around a special point (called the “no-parallax point”, which coincides with the entrance pupil of the camera lens).
Unfortunately, real camera rigs cannot fulfill this condition. Indeed, because of the physical size of the cameras there’s no way to pack multiple cameras with their optical centers overlapping exactly. As a consequence, all cameras in a rig are slightly translated relative to one another.
The truth is, a perfect full-spherical camera rig should not exhibit any parallax, because parallax will inevitably introduce stitching errors.
On the other hand (and somewhat paradoxically), parallax is the only reason why we can cleanly remove the tripod or any close object!
In a nutshell, intelligently exploiting a defect in the structure of camera rigs enables us to remove the tripod, or whatever object is holding the camera, from the video.
This also gives some insight into how to design a 360° monoscopic video rig. What really matters to avoid parallax-induced stitching errors is to minimize the distance between the optical centers of every pair of neighboring cameras, so that the overlapping zones can be stitched cleanly. Again, what is essential is not that the rig be small, but rather that the distances between neighboring lenses be as small as possible.
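As a rough back-of-the-envelope check of this design rule, the sketch below estimates how far an object must be from two neighboring cameras for its parallax (measured against a very distant background) to stay below one pixel of the stitched panorama. The 4096-pixel panorama width and the sample spacings are assumptions for illustration, not specifications of any particular rig.

```python
import math

def parallax_deg(baseline_m, distance_m):
    """Parallax angle for an object centered between two optical centers."""
    return math.degrees(2.0 * math.atan(baseline_m / (2.0 * distance_m)))

def min_clean_distance_m(baseline_m, max_error_deg):
    """Distance beyond which the parallax angle stays below max_error_deg."""
    return baseline_m / (2.0 * math.tan(math.radians(max_error_deg) / 2.0))

# Assumed output: a full 360 degree panorama that is 4096 pixels wide.
PANO_WIDTH_PX = 4096
DEG_PER_PIXEL = 360.0 / PANO_WIDTH_PX

# Assumed spacings between the optical centers of two neighboring cameras.
for baseline_m in (0.03, 0.06, 0.12):
    distance_m = min_clean_distance_m(baseline_m, DEG_PER_PIXEL)
    print(f"lenses {baseline_m * 100:4.1f} cm apart: "
          f"sub-pixel parallax beyond about {distance_m:.1f} m")
```

Doubling the spacing between two neighboring lenses doubles the distance below which objects in their overlap zone may show visible stitching errors.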
If you want to learn more about parallax, you should read Rik Littlefield’s excellent “Theory of the No-parallax Point in Panorama Photography”. Paul van Walree’s page on the center of perspective also offers some compelling insight on the subject.
This concludes the first part of this post. Stay tuned because next time we’ll dig into the joys of the stereographic projection and unravel the weird perspective of the roller coaster video!
Image credits: The roller coaster featured image was stitched using original images by Ignacio Ferrando.