plenoptic cameras and the meaning of photography

Raytrix introduced the R11 Lightfield Camera not too long ago. It is still low-res and expensive, but improved versions of these should eventually catch on — they make too much sense not to.

The idea of plenoptic cameras has been thrown around for quite a while. Instead of a conventional camera with a single lens focusing a single object plane onto the in-camera image plane (i.e. the sensor), a plenoptic camera attempts to capture enough additional information to be able to reconstruct “all possible” images that can be obtained from the light entering the same aperture. The most talked-about application is subsequent refocusing; if it were just this, then multi-capture with a mechanical focus sweep on a conventional camera would suffice. Another is stereograms, but again, two spaced shots would suffice for that. A plenoptic camera does more in one shot and makes these merely special post-processing cases.

The simplest conception of a plenoptic camera is essentially an array of mini-cameras (a microlens plus a micropixel array for each logical pixel) that separately captures light from all directions at each point in the image. In between conventional cameras and plenoptic cameras are perhaps smarter, sparser non-regular arrays like these coded aperture systems that hark back to old radio-astronomy literature. Those have good signal-processing properties from a deconvolution perspective, but a full-array plenoptic camera like the R11 seems fully general, and with some future industrial scaling, the saved expenses of a compromise may be inconsequential.
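To make the refocusing “special case” concrete, here is a toy shift-and-sum sketch: treat the capture as a 4D lightfield of sub-aperture views, shift each view in proportion to its offset from the aperture center, and sum. The array layout, the `alpha` parameter, and the integer shifts are illustrative assumptions, not the R11’s actual data format or algorithm.

```python
import numpy as np

def refocus(lightfield, alpha):
    """Toy shift-and-sum refocusing of a 4D lightfield.

    lightfield: array of shape (U, V, X, Y) -- one X-by-Y sub-aperture
    view per (u, v) position in the aperture. alpha selects the
    synthetic focal plane; alpha = 0 just averages the views as-is.
    (Shapes and the shift convention are illustrative assumptions.)
    """
    U, V, X, Y = lightfield.shape
    out = np.zeros((X, Y))
    for u in range(U):
        for v in range(V):
            # Shift each view in proportion to its offset from the
            # aperture centre, then accumulate.
            du = int(round(alpha * (u - U // 2)))
            dv = int(round(alpha * (v - V // 2)))
            out += np.roll(lightfield[u, v], (du, dv), axis=(0, 1))
    return out / (U * V)
```

Objects at the depth matched by `alpha` add up coherently across views and come out sharp; everything else smears, which is exactly the synthetic-aperture blur of a conventional large lens.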

Fine, so a plenoptic camera may make clever use of its given aperture size, but do we really get something for nothing? To answer that, first a digression.

Why does a conventional camera lose information? According to the cartoon version of Fourier optics, a lens is a spatially-variant phase transformer and free space itself is an array of lightfield integrators. One can sort of see that light from the object arriving at locations farther off-axis on the lens will have greater (i.e. higher-frequency) phase variations corresponding to the various points of the object, the result being that the pattern at the front end of the lens is the Fourier Transform of the object. The lens multiplies its aperture shape onto this and, by its phase-transforming capability, sends the product out to the image plane, where the analogous thing happens. The pattern at the image plane is thus the Inverse Fourier Transform of the pattern coming out of the lens. The net result is that the image is the object convolved with the Fourier transform pair of the lens aperture shape, known as the Point Spread Function (PSF).
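That last step can be sketched numerically: take a circular aperture, and its intensity PSF is the squared magnitude of its Fourier transform. Shrinking the aperture spreads energy away from the main lobe, i.e. stronger low-pass filtering. The grid size and aperture radii here are arbitrary illustration values, not any particular camera’s design.

```python
import numpy as np

def intensity_psf(radius, n=256):
    """Intensity PSF of a circular aperture: |FFT(aperture)|^2,
    normalized to unit energy. Radius and grid size are in samples
    (illustration units only)."""
    y, x = np.mgrid[-n // 2:n // 2, -n // 2:n // 2]
    aperture = (x**2 + y**2 <= radius**2).astype(float)
    # Center the aperture for the FFT, transform, and re-center the result.
    field = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(aperture)))
    psf = np.abs(field) ** 2
    return psf / psf.sum()

wide, narrow = intensity_psf(32), intensity_psf(8)
# The smaller aperture puts less of its unit energy at the central peak:
# a broader main lobe, hence a more strongly low-passed image.
```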

So right off the bat, the object has been low-passed by the finite aperture due to diffraction. Nothing can be done about the finite bandwidth, but at least the focused image-plane distance is usually also where the PSF main lobe is narrowest for a given object-plane distance (this coincides with the geometric-optics result). Away from the designated object-plane distance, we no longer get good approximations to Fourier Transforms, so the image is further distorted by non-invertible transformations that can never be uniquely decoded.

A plenoptic camera is supposed to get around this with its microlens array structure because (1) each small-aperture microlens is a large depth-of-field subsystem operating in parallel with the others, (2) total bandwidth is not sacrificed, since the synthetic aperture is still large, and (3) it beats a pinhole array because the microlenses still do the Inverse Fourier Transforms for you, with no deconvolution of potentially non-invertible transformations involved. But in exchange, there is now a design tradeoff between spatial resolution at each depth and depth resolution. So it is not a free gain.
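The tradeoff can be put in back-of-envelope numbers: every sensor pixel spent on an angular (depth) sample under a microlens is a pixel not spent on spatial resolution. The sensor size and microlens patch size below are made-up illustration values, not the R11’s actual spec.

```python
# Back-of-envelope resolution budget for a full-array plenoptic camera:
# each microlens covers an s-by-s patch of sensor pixels, which become
# angular samples for that one logical (spatial) pixel.
def plenoptic_budget(sensor_px, s):
    spatial = sensor_px // (s * s)   # logical pixels: one per microlens
    angular = s * s                  # directional samples per logical pixel
    return spatial, angular

# e.g. a hypothetical 10-megapixel sensor with 5x5 pixels per microlens:
spatial, angular = plenoptic_budget(10_000_000, 5)
# 25x more directional information, at 1/25th the spatial resolution.
```

Once `s` is etched into the hardware, that split is fixed uniformly across the frame, which is the point made in the next paragraph.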

In particular, once the camera is built (at least the current type), we don’t get the option of choosing non-uniform tradeoffs across the image. We get some spatial resolution and some depth resolution that are characteristic of the camera, and then all we can do is degrade them computationally for effect, always throwing away the bulk of the data in each computation, but having the flexibility to choose what to throw away. In a conventional camera, the optical resources are deployed differently. We could obtain high spatial and depth resolution around one particular depth, with spatial and depth resolution decaying at depths away from that. Or we could obtain high spatial resolution at all depths, but no depth resolution. Or something in between. The choices are limited, but given that one of them is what we actually want, the data is combined fully in the desired way. It is only when we think we were mistaken that we regret that information has been “lost” (but really just combined in the “wrong” way). Because the conventional camera has evolved for 100+ years, in capable hands it is more or less matched to exactly what you want to do in photography. Of course you can’t do weird things like having two depths at which things are focused, but that is also weird in a way that our (conventional) eyes are not used to.

This brings us, then, to the question of what photography is. Is it a reproduction of the physics of radiation? Then a plenoptic camera isn’t enough; one would need a hyperspectral holographic recorder of some sort. But it isn’t that. Is it to reproduce what our eyes see? It isn’t that, either, because photographs are somewhat realistic but surely not real. With more and more in-camera and post-processing gimmicks, real may even be unacceptable. Is it just free painting with a really complicated and constrained brush — in other words, masochism? No, it isn’t that, because then no interaction with real-world input would be required. The meaning of photography, so far as I can see, is an art which, not unlike other arts, is the communication of an emotional state. But this emotional state is defined by what the photographer experiences at the acute moment when a particular real-world phenomenon (the scene) is observed. It is what he imagines he sees that he tries his best to manipulate his instrument to reproduce, but this imagination is seeded irrevocably by the transient real phenomenon. Anything else, he could just patiently paint from scratch; but for this, the imagination seeded by the transient real phenomenon, he somehow needs photography.

But does he need post-processing? Does he need a fully capable plenoptic camera so he can do post-processing? Is that still true to the moment or is it okay to have the emotional state last a while (and possibly change) all the way back to the photoshop computer? Heck, why not have photoshop onboard? That would free the imagination. I lean towards yes, if only for the reason that more options can’t hurt. But there is unresolved tension on this question. A camera with low capabilities can have its quirks and constraints become part of the real phenomenon, if the photographer is presumed to feel the world through its viewfinder. Is this a bit recursive? Once the instrument has too much software, then what is to separate photography from virtual brush painting in its onboard photoshop?

