
> The result is FOVO, a new method of rendering 3D space that emulates the human visual system, not cameras.

I don't buy this part. Sure, the human visual system does all kinds of processing that causes us to perceive our surroundings differently than we do a flat image on a screen. But the input to the visual system is still an image on your retina, which obeys the same laws as camera optics.

If you ignore depth of field (which the authors don't seem to be concerned with) then the human eye behaves essentially like a pinhole camera. No matter what happens behind the pinhole, all of the light rays entering the eye must pass through the pinhole from the environment, traveling in straight lines. For any given viewpoint, the relationships of which objects are occluded by which other objects will be exactly the same for an eye as for a camera. But the authors specifically point out that their algorithm doesn't preserve these relationships.

Of course, computer graphics is as much an art as a science. If deviating from the realistic model turns out to give aesthetically pleasing results, then by all means go for it. But the reason it's better wouldn't have anything to do with more closely mimicking what the human eye actually perceives.

In any case, I would question the aesthetic benefits. To my eye, the algorithm seems to distort relative shapes and sizes of objects in a weird way. It looks great in screenshots, but when moving through a scene[1], it creates a subtly unsettling "space-warping" effect.

I also think the comparisons in the article are a little disingenuous, because everyone knows that linear perspective projections with wide FOV look horrible. I'd like to see a comparison against something else, like a stereographic or fisheye projection, which would be both more physically realistic and more efficient to render.
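For concreteness, here's roughly how the two projections differ as pure radial functions of the view angle (my own sketch, nothing to do with FOVO's implementation):

```python
import math

# Image radius (distance from the image center) as a function of the
# view angle theta away from the optical axis, for two projections:

def gnomonic(theta):
    """Linear perspective: r = tan(theta).  Diverges as theta -> 90 deg,
    which is why wide-FOV rectilinear renders stretch badly at the edges."""
    return math.tan(theta)

def stereographic(theta):
    """Stereographic ("fisheye-like"): r = 2 * tan(theta / 2).
    Conformal (locally shape-preserving) and finite well past 90 deg."""
    return 2 * math.tan(theta / 2)

for deg in (10, 45, 80, 89):
    t = math.radians(deg)
    print(f"{deg:3d} deg  gnomonic={gnomonic(t):8.2f}  stereographic={stereographic(t):6.2f}")
```

The gnomonic radius blows up near the edge of a wide field of view while the stereographic one grows gently, which is the whole aesthetic argument for comparing against a fisheye-style baseline.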

[1]: https://www.fovotec.com/architectural



> No matter what happens behind the pinhole, all of the light rays entering the eye must pass through the pinhole from the environment, traveling in straight lines. For any given viewpoint, the relationships of which objects are occluded by which other objects will be exactly the same for an eye as for a camera.

This is too simplified.

1. The light falls onto a curved surface of the retina, not a flat screen behind it, but more importantly

2. Our brains interpret the light that falls and create the experience of a 3D space in front of us.

If you simply flattened out the retina and mapped, point for point, the light that falls on it, it would look a bit like their linear perspective example. The center would be clear, and the edges would be horribly stretched. And yet that's nothing like the way we perceive our vision. Looking at the image on the back of our retina rolled out like a painting, we wouldn't recognize it at all as what we see.

This is attempting to create a flat image that looks like what we see in front of us when we're standing in a space.


I agree that my explanation is simplified, but not in a way that affects my point. Maybe I didn't explain it well, so I'll try again.

Imagine you're trying to draw a flat map of the curved surface of the earth. There are many possible ways you could do it, with differing tradeoffs between preserving sizes, preserving angles, and so on. But any faithful map should respect basic geometric constraints. If your map shows the Statue of Liberty as being in Manhattan, it's quite simply wrong.

Likewise, we can think about which distortions can possibly arise from a physical model of the eye. Regardless of how the curved surface of the retina is shaped, light travels in straight lines until it hits the pupil. That means the geometric relationships between distant and nearby points -- that is, which points "line up" with others -- are constrained to match those straight lines. FOVO does not obey these constraints (as mentioned in their explanation, and as demonstrated in their examples).

> 1. The light falls onto a curved surface of the retina, not a flat screen behind it, but more importantly

I didn't say anything about how the retina is shaped, because it doesn't matter. If a distant point A, a nearby point B, and the viewpoint V are all collinear, then A can't be visible because it's obscured by B. The exact details of where B appears in the resulting image would depend on the shape of the retina, true. But wherever B is, A must be mapped to the same point, which means A must be occluded unless the light rays are traveling in curved paths outside the eye.
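To put the same argument in code: pick any projection you like as a function of view direction, and collinear points still collapse onto the same image point. A quick numpy sketch (my own illustration, not FOVO's code):

```python
import numpy as np

def project(p, radial=np.tan):
    """Project a 3D point (viewpoint V at the origin, looking down +z)
    to 2D image coordinates using an arbitrary radial mapping of its
    view *direction*.  radial=tan gives linear perspective; any
    monotonic function of the polar angle works."""
    d = p / np.linalg.norm(p)        # unit view direction from V
    theta = np.arccos(d[2])          # angle away from the optical axis
    phi = np.arctan2(d[1], d[0])     # azimuth around the axis
    r = radial(theta)
    return np.array([r * np.cos(phi), r * np.sin(phi)])

B = np.array([0.3, 0.2, 1.0])        # nearby point
A = 5.0 * B                          # distant point, collinear with V and B

# Whatever radial mapping we pick, A and B land on the same image point,
# so A stays occluded by B: the choice of projection can't change occlusion.
for radial in (np.tan, lambda t: 2 * np.tan(t / 2), lambda t: t):
    assert np.allclose(project(A, radial), project(B, radial))
```

The shape of the "retina" only enters through the `radial` function, and no choice of it can separate A from B.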

> 2. Our brains interpret the light that falls and create the experience of a 3D space in front of us.

Agreed. But your brain can only perceive things that are based on some transformation of the image that is projected on the retina. If it creates the "experience" of seeing an object whose light is physically obstructed, then it's not a perception, it's a hallucination.

> If you actually simply flattened out the retina and mapped out, point to point, the light that falls, it would look a bit like their linear perspective example.

Not really; it depends on how you do the mapping. Their example shows what an image projected onto an extremely wide flat retina would look like. This is mathematically equivalent to a "gnomonic" map projection, which hugely distorts shapes and distances as you get farther from the center. You could use any other projection (defined as a function mapping between view directions and image coordinates) without breaking the geometric perspective relationships, and without needing to modify the 3D geometry of the scene at rendering time.
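To illustrate that last point: converting a rendered linear-perspective frame to, say, a stereographic one is just a per-pixel radius remap, recoverable from the gnomonic radius alone (hypothetical helper name, my own sketch, not anything from the article):

```python
import math

def gnomonic_to_stereographic(r_gnom):
    """Remap an image-plane radius from linear perspective (r = tan(theta))
    to stereographic (r = 2 * tan(theta / 2)): recover the view angle from
    the radius, then re-project.  Works per pixel; no 3D scene data needed."""
    theta = math.atan(r_gnom)
    return 2 * math.tan(theta / 2)
```

Sweeping this over every pixel's radius re-projects a rectilinear render into a fisheye-style image, at the cost of some resampling; FOVO's claim is precisely that they do something this kind of 2D warp can't.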


I do see your point better now, thank you.

You're right, of course. Our visual perception of the space can't cause things to become occluded when they weren't before, and vice versa.

So maybe that ends up being a side-effect of an implementation detail: in their zeal to explain that achieving a realistic-feeling FOV required changing how light bends through space, they ended up harping too much on this side-effect as a way of describing what they meant.

I think the question is simply whether a better transform exists using just the 2D image. If one exists that feels like a not-too-distorted wide-angle view like our natural vision, then people should be using that. If this turns out to be the best way to achieve that currently, and it has a slight side-effect of a change in occlusion, then this seems like a good thing.


>But the input to the visual system is still an image on your retina, which obeys the same laws as camera optics.

The thing that makes the difference is that the resolution of the retina drops as you move off the center. Which I suppose is what is being simulated here. Or at least it could be efficiently simulated that way: like foveated rendering, except the fovea stays at the center and the rest of the image is kept with its pixels smashed closer together rather than interpolated.
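If it were done that way, it would amount to a pure image-space resample: magnify the center, compress the periphery. A toy numpy sketch (my own made-up radial falloff, not FOVO's method):

```python
import numpy as np

def peripheral_compress(img, strength=0.5):
    """Resample a square image so the center is magnified and the
    periphery squeezed toward the edges, crudely emulating resolution
    falling away from the fovea.  Pure 2D warp; the 3D scene is untouched."""
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    u = (xs - w / 2) / (w / 2)       # normalized coords in [-1, 1]
    v = (ys - h / 2) / (h / 2)
    r = np.sqrt(u**2 + v**2) + 1e-9  # radius of each *output* pixel
    a = np.arctan(1 + strength)
    r_src = np.tan(r * a) / (1 + strength)   # source radius to sample from
    scale = r_src / r
    sx = np.clip((u * scale + 1) * (w / 2), 0, w - 1).astype(int)
    sy = np.clip((v * scale + 1) * (h / 2), 0, h - 1).astype(int)
    return img[sy, sx]               # nearest-neighbor resample
```

Crucially, a warp like this can never change which objects occlude which others; it only moves pixels around.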


If that were the only difference, then FOVO would be equivalent to a 2D image warp. But the authors claim they're doing something different, and their examples demonstrate that they're actually moving objects relative to each other in 3D space. Despite the name, the system seems to have nothing to do with foveated rendering.



