Ray Tracing

Brian_Barnes · January 25, 2013, 2:58pm

Since SDL made this cross platform, I thought I’d give everybody here a
chance to play with my new project. I added a true ray-tracing API
(free and open source) to my 3D engine.

It does something nobody else does (as far as I know) as it handles
rastering effects – normal, specular, and glow mapping – and things
you normally only see in rastering systems like layered effects. And
it’s fast.

But you need at least a fast quad core, or don’t bother.

For OS X (Lion is best):
http://www.klinksoftware.com/download/dim3RT_OSX.zip
For Windows (Tested up to 7):
http://www.klinksoftware.com/download/dim3RT_win32.zip

It’s a 5 level shooter. For those that might care to look at the API,
it’s here: http://www.klinksoftware.com/download/ray_interface.h

I know how hard this would be, but I’d love for this to become a
standard and maybe eventually get hardware support, so if anybody knows
a good place for me to show this to that might help, please send me an
email!

[>] Brian

Sik_the_hedgehog · January 26, 2013, 3:12am

I know how hard this would be, but I’d love for this to become a
standard and maybe eventually get hardware support, so if anybody knows
a good place for me to show this to that might help, please send me an
email!

Hardware support isn’t that far, in fact shaders are mostly there,
you’d just need to allow them to cast new rays and be able to
accumulate the results for all rays.

The problem, as usual, is the raytracing itself… Hardware algorithms
need to be much simpler than software can afford in order to be
possible to implement, so unless there’s an algorithm that’s fast
and extremely simple, we won’t see raytracing in hardware any time
soon (and by this I mean hardwired, much like how triangle
rasterization is hardwired).

Which reminds me: raycasting is essentially 2D raytracing. It could
make for an easy way to experiment with new raytracing algorithms,
since after that it’s only a matter of adding an extra axis, and most
of the time math can be easily expanded to just account for the extra
axis.

Brian_Barnes · January 27, 2013, 2:39am

Sik the hedgehog <sik.the.hedgehog at gmail.com> wrote:

I know how hard this would be, but I’d love for this to become a
standard and maybe eventually get hardware support, so if anybody knows
a good place for me to show this to that might help, please send me an
email!

Hardware support isn’t that far, in fact shaders are mostly there,
you’d just need to allow them to cast new rays and be able to
accumulate the results for all rays.

The problem, as usual, is the raytracing itself… Hardware algorithms
need to be much simpler than software can afford in order to be
possible to implement, so unless there’s an algorithm that’s fast
and extremely simple, we won’t see raytracing in hardware any time
soon (and by this I mean hardwired, much like how triangle
rasterization is hardwired).

Not to take up too much bandwidth as it’s a little OT, but GPUs would be a bad idea. Ray tracing is too loop and branch oriented. Since ray tracing is the perfect example of parallelism (not counting memory contention), a large collection of small regular CPU cores would be better. As processors get faster and more multicore, it’s possible to do it all CPU bound. Hardware ray tracing is very possible, it just might be a very large board; ARM cpus in bulk can be relatively cheap.

Anybody who knows anybody that constructs hardware like this, I’d love to get email from them.

Download one of the versions and try it out – I think you’ll be surprised at just how fast it is on a good machine.

[>] Brian

Sik_the_hedgehog · January 27, 2013, 3:06am

Not to take up too much bandwidth as it’s a little OT, but GPUs would be a
bad idea. Ray tracing is too loop and branch oriented. Since ray tracing
is the perfect example of parallelism (not counting memory contention), a
large collection of small regular CPU cores would be better. As processors
get faster and more multicore, it’s possible to do it all CPU bound.
Hardware ray tracing is very possible, it just might be a very large board;
ARM cpus in bulk can be relatively cheap.

Don’t get fooled, the biggest loop and branching in raytracing is the
raytracer itself - the very thing you want to keep outside the
shader. The shader would take care of processing a ray that already
hit, and to issue new rays as needed (which would be fed back to the
unit that does raytraces). A dedicated raytracer unit wouldn’t be
affected by this probably since it doesn’t need to decode opcodes - in
fact, since the algorithm would be hardwired you could even design it
around this fact.

As for how bounced rays could work, one idea would be to accumulate
the results of the computed shader and then let the bounced rays to
handle the parts that remain. You could impose a maximum of how much
bouncing is allowed, too (to avoid infinite loops - after a given
point you probably got enough to make a convincing image anyway).
You’d probably need to implement antialiasing this way too.

Also as you said itself: raytracing is the perfect example of
parallelism. GPUs specialize exactly in that kind of things. The
problem is that branches are the #1 enemy of parallelism (loops
aren’t if they always execute the same amount of iterations)

(and, CPUs aren’t getting faster, they’re bothering more about adding
more cores these days, and even that is seeing bottlenecks as memory
bandwidth can’t keep up with them)

Download one of the versions and try it out – I think you’ll be surprised
at just how fast it is on a good machine.

My system is the exact opposite of what you said would be the ideal.
Single core, low memory, and also runs Linux (you didn’t provide a
Linux executable).

Brian_Barnes · January 27, 2013, 5:14am

Sik the hedgehog wrote:

Not to take up too much bandwidth as it’s a little OT, but GPUs would be a
bad idea. Ray tracing is too loop and branch oriented. Since ray tracing
is the perfect example of parallelism (not counting memory contention), a
large collection of small regular CPU cores would be better. As processors
get faster and more multicore, it’s possible to do it all CPU bound.
Hardware ray tracing is very possible, it just might be a very large board;
ARM cpus in bulk can be relatively cheap.

Don’t get fooled, the biggest loop and branching in raytracing is the
raytracer itself - the very thing you want to keep outside the
shader. The shader would take care of processing a ray that already
hit, and to issue new rays as needed (which would be fed back to the
unit that does raytraces). A dedicated raytracer unit wouldn’t be
affected by this probably since it doesn’t need to decode opcodes - in
fact, since the algorithm would be hardwired you could even design it
around this fact.

As for how bounced rays could work, one idea would be to accumulate
the results of the computed shader and then let the bounced rays to
handle the parts that remain. You could impose a maximum of how much
bouncing is allowed, too (to avoid infinite loops - after a given
point you probably got enough to make a convincing image anyway).
You’d probably need to implement antialiasing this way too.

Also as you said itself: raytracing is the perfect example of
parallelism. GPUs specialize exactly in that kind of things. The
problem is that branches are the #1 enemy of parallelism (loops
aren’t if they always execute the same amount of iterations)

Branches aren’t the #1 enemy, shared data is

You’re now involving the GPU/CPU is a incredibly high amount of context switches; the only way ray tracing will work properly is if it’s self contained in a massively parallel system, either a card of small CPUs or a computer with CPUs. Handing it back and forth to GPUs/CPUs is asking for trouble. I know people have done work in this, but it’s not an optimal solution, it’s a solution trying to use the hardware that exists already.

Take it from somebody that build his own ray tracer – the vast bulk of time is in collisions, and I do an incredible amount of trickery to reduce those. The time spend in texture lookup – and doing all the rastering effects like normal or spec mapping – is nothing. Not even worth passing of to a GPU.

Note EVERY hit bounces rays, and multiple times. Every light source starts another ray (after culling.) That’s why it’s such a hard nut to crack.

Even more OT, sorry list people. It’s an interesting topic, though.

Download one of the versions and try it out – I think you’ll be surprised
at just how fast it is on a good machine.

My system is the exact opposite of what you said would be the ideal.
Single core, low memory, and also runs Linux (you didn’t provide a
Linux executable).

It actually runs in Linux, I just don’t have a compile

To give you an idea, on a quad core PC, with a 3.x level CPU, on a map that has a quake 2 level of complexity with quake 3 level models, it can run at about 10-15 fps. All live, all with bumps, specs, glow, and other effects.

[>] Brian

Sik_the_hedgehog · January 27, 2013, 6:47am

Just to make it clear before I continue: if we ever pretend to get
this in an even remotely acceptable implementation in hardware it’d
have to be a custom GPU (for lack of a better name) with dedicated
raytracing hardware. So from now on, whenever I say GPU, I mean that
kind of GPU. Current ones simply won’t do the job no matter what.

Branches aren’t the #1 enemy, shared data is

Branches are when parallelism is achieved as SIMD, because it
completely breaks the assumptions under which SIMD works (which is why
in shaders you have to avoid branches like the plague). If you ever
pretend to have raytracing into the many hundreds (thousands?) of
frames per second you will probably end up needing SIMD no matter how
hard you try, either for performance reasons or for cost reasons (you
can only cram so many cores before it becomes too costly).

Anyway, if you consider the raytracing step, you’d need to take into
account that the only thing that would ever be touching the primitive
data is the raycasting unit. As such, the only thing you’d need to
worry about is shared data among rays and nothing else. Furthermore,
at this point of the process the data is read-only, so you don’t even
need to worry about it getting modified, which allows for a lot of
assumptions that can make it even faster.

You’re now involving the GPU/CPU is a incredibly high amount of context
switches; the only way ray tracing will work properly is if it’s self
contained in a massively parallel system, either a card of small CPUs or a
computer with CPUs. Handing it back and forth to GPUs/CPUs is asking for
trouble. I know people have done work in this, but it’s not an optimal
solution, it’s a solution trying to use the hardware that exists already.

Um, no? That’s the worst thing one could ever do, and in fact why
immediate mode was dropped from OpenGL.

The CPU should be limited to just passing a list of primitives (or
list of VBOs, or instance lists, or whatever - you get the idea, same
as we do these days). Once that list is complete everything would run
on the GPU. Yes, this means you need to store such lists in memory in
video hardware, but hey, it needs to be stored somewhere after all,
right?

As you can see, the way I view it, the entire raytracing process would
happen in the video hardware. That’s about as fast as it can get, and
I’d say it’s probably 100% close to optimal.

Take it from somebody that build his own ray tracer – the vast bulk of time
is in collisions, and I do an incredible amount of trickery to reduce those.
The time spend in texture lookup – and doing all the rastering effects
like normal or spec mapping – is nothing. Not even worth passing of to a
GPU.

This is why I talk about a raytracing unit. Basically you have two
things: the raytracing unit takes care of the collisions, while the
shader unit does all those raster effects you talk about. And don’t
get fooled, you’re hardcoding the effects here, in practice you’ll
want to give the same flexibility that pixel shaders have (quirks
unique to raytracing aside, like the lack of a depth buffer).

Raytracing does make some algorithms simpler compared to rasterizing,
though (like shadows and mirrors). Some algorithms are still just as
complex, though. You really don’t want to underestimate how much it
could take given in the hands of an entire development team aiming for
fully detailed photorealism.

Note EVERY hit bounces rays, and multiple times. Every light source starts
another ray (after culling.) That’s why it’s such a hard nut to crack.

I imagine the hardest nut to crack is having to check collisions
against all primitives, not so much the bounced and retraced rays.
Again, more of a reason to have a dedicated raycasting unit that takes
care of all that much faster than a processing unit (CPU or GPU)
could.

Even more OT, sorry list people. It’s an interesting topic, though.

Well, the subject is “Ray Tracing”, and to be fair the thread seemed
to focus more on the raytracing than on SDL for starters. If you want
we can continue this discussion in private, though.

Brian_Barnes · January 27, 2013, 7:04am

Sik the hedgehog wrote:

Just to make it clear before I continue: if we ever pretend to get
this in an even remotely acceptable implementation in hardware it’d
have to be a custom GPU (for lack of a better name) with dedicated
raytracing hardware. So from now on, whenever I say GPU, I mean that
kind of GPU. Current ones simply won’t do the job no matter what.

OK, way off topic now. Look I wrote a fast ray tracer, designed for games. You can download it and run it. It does everything a ray tracer should, and more, like rastering effects. More than just mirrors, but refraction, too. I know what I’m talking about, I know it in and out. You can argue this off list, but I implemented one, and I know what it would take to run one in hardware.

I just asked if anybody knew anybody that might benefit, or anybody potentially involved in hardware, and a game for the people to play on this list that uses SDL, not a lengthy argument about the merits of this or that. Everything on this list doesn’t have to dissolve into that!

[>] Brian

JonnyD · January 27, 2013, 2:35pm

“Everything on this list doesn’t have to dissolve into that! :)”

Oh yes it does! Let me list the reasons…On Sun, Jan 27, 2013 at 2:04 AM, Brian Barnes wrote:

Sik the hedgehog wrote:

Just to make it clear before I continue: if we ever pretend to get
this in an even remotely acceptable implementation in hardware it’d
have to be a custom GPU (for lack of a better name) with dedicated
raytracing hardware. So from now on, whenever I say GPU, I mean that
kind of GPU. Current ones simply won’t do the job no matter what.

OK, way off topic now. Look I wrote a fast ray tracer, designed for
games. You can download it and run it. It does everything a ray tracer
should, and more, like rastering effects. More than just mirrors, but
refraction, too. I know what I’m talking about, I know it in and out. You
can argue this off list, but I implemented one, and I know what it would
take to run one in hardware.

I just asked if anybody knew anybody that might benefit, or anybody
potentially involved in hardware, and a game for the people to play on this
list that uses SDL, not a lengthy argument about the merits of this or
that. Everything on this list doesn’t have to dissolve into that!

[>] Brian

SDL mailing list
SDL at lists.libsdl.org
http://lists.libsdl.org/listinfo.cgi/sdl-libsdl.org

Zoltan_Kocsi · January 28, 2013, 4:23am

OK, way off topic now. Look I wrote a fast ray tracer, designed for
games. You can download it and run it.

You have an URL?

Zoltan

Sik_the_hedgehog · January 28, 2013, 4:29am

In the first e-mail (copying here)
http://www.klinksoftware.com/download/dim3RT_win32.zip
http://www.klinksoftware.com/download/dim3RT_OSX.zip
http://www.klinksoftware.com/download/ray_interface.h

2013/1/28, Zolt?n K?csi :>> OK, way off topic now. Look *I wrote a fast ray tracer, designed for

games*. You can download it and run it.

You have an URL?

Zoltan

SDL mailing list
SDL at lists.libsdl.org
http://lists.libsdl.org/listinfo.cgi/sdl-libsdl.org

Zoltan_Kocsi · January 28, 2013, 4:42am

In the first e-mail (copying here)
http://www.klinksoftware.com/download/dim3RT_win32.zip
http://www.klinksoftware.com/download/dim3RT_OSX.zip
http://www.klinksoftware.com/download/ray_interface.h

Thanks!

ZoltanOn Mon, 28 Jan 2013 01:29:26 -0300 Sik the hedgehog <sik.the.hedgehog at gmail.com> wrote: