pouët.net

Go to bottom

Yo, I heard you guys like to traverse BVHs...

category: gfx [glöplog]
 
How about a new way to do that? :) My algorithm (it's me, Constantine :D) can even beat Smits 1998, since you can reorder nodes right when they are inserted into the array of node indices. Pretty cool, right? :D

youtu.be/akSOxNgX_fY
Yo, I heard you like videos that should be a .txt file (complete with an AI voice that reads said .txt file)
added on the 2026-02-22 23:13:52 by Sesse Sesse
that video should be a .txt file and I should be an orca... and yet...
But.. that's just the basic (hardcoded to a binary tree) implementation that can be found on wikipedia? The fact that you skip traversing child nodes of a non intersecting parent node is certainly not what makes your algorithm unique but the basic principle of why we build BVHs in the first place ... ;)
added on the 2026-02-23 01:02:26 by LJ LJ
@LJ indeed my algorithm looks so basic it feels like everyone does it this way!... except no one did it this way as far as I know 😄 who traverses BVHs exactly like I show it in the video? :) I wanna know people who think and code like me. As far as I know, Smits 1998 and its variations like dcgi.fel.cvut.cz/wp-content/wpallimport-dist/publications/pdf/publications-2011-hapala-sccg-esta-paper.pdf is what people think how state of the art non-recursive traversing looks like.
By the way, thank you for pointing to that wikipedia traversing algorithm @LJ! WHO WROTE THAT?? And how that person came up with it, did he used some old dusty books from the 70s or something? :D

I guess people didn't use this wikipedia algorithm on GPUs because they thought you need a way to append and pop elements to and out of an array dynamically? And my algorithm is like "nah, the stack is just a fixed size array, use a separate index that tracks where the top of the stack is".
God I'm so dumb I just realized why am I bothering inserting -1 everywhere at all when I can just replace the old node with its children and increment the iterator value only by 1, and on decrements not bother to replace the old value with -1 since it will be overwritten later anyway.
Quote:
As far as I know, Smits 1998 and its variations like dcgi.fel.cvut.cz/wp-content/wpallimport-dist/publications/pdf/publications-2011-hapala-sccg-esta-paper.pdf is what people think how state of the art non-recursive traversing looks like.

I think you're missing the point of that paper, its not recursive vs non-recursive (either use a stack, one implicit the other explicit), the point was to come up with a stackless minimal memory utilization solution and seeing how it'd perform on modern GPUs, however the authors found that stack-based traversal still wins.

Quote:
I guess people didn't use this wikipedia algorithm on GPUs because they thought you need a way to append and pop elements to and out of an array dynamically? And my algorithm is like "nah, the stack is just a fixed size array, use a separate index that tracks where the top of the stack is".

"Your algorithm" is simply implementing a stack, the "separate index that tracks where the top of the stack is" is called a stack pointer.
added on the 2026-02-23 17:26:10 by LJ LJ
Quote:
the point was to come up with a stackless minimal memory utilization solution and seeing how it'd perform on modern GPUs, however the authors found that stack-based traversal still wins.


ohhh, I see, thanks, that's cool, I too think the stack version looks more optimal

Quote:
"Your algorithm" is simply implementing a stack


Yeah, apparently :D I came up with it in a morning, thought I was clever about those -1 skips, but then I realized you don't even need those -1 at all, so basically yea it's an explicit stack!
Did it t happen to be ChatGPT or similar that told you that you had invented something novel in the field of BVH traversal? (and this is an honest question, came to my mind since the video was so AI-ish)
There are still three clever nuggets in my algorithm left I think:

1. It capitalizes on the fact that you can overwrite the index of the parent node for its first child and append the second child after it.
2. On node bbox miss, it just shrinks the stack by 1, leaving the node index behind to be overwritten by next possible appends.
3. The algorithm terminates once the stack pointer reaches its value of -1.
@hot multimedia, my algorithm still has unique nuggets I just wrote above that I haven't seen in other people's code so far. If you know code that does the things I listed above -- show me that code, I want to hang out with people that think like me :D
Quote:
my algorithm still has unique nuggets I just wrote above that I haven't seen in other people's code so far. If you know code that does the things I listed above -- show me that code

I think it's the default to simply leave "dead" stack frames lying around to be eventually overridden, the pointer + algorithm is usually enough to guarantee no ill behavior. Could it be that you just haven't seen a lot of code so far? This tutorial code from 11 years ago does exactly this and is based on an article published 15 years ago.

Maybe next time you effortlessly invent a groundbreaking new algorithm while having breakfast you might wanna slow your roll / dial back the hubris a bit before attempting to literally name "the basic default implementation" after yourself.

PS: I like how you list the fact that your algorithm doesn't crash (#3) as a "clever nugget" :P
added on the 2026-02-24 00:44:18 by LJ LJ
PPS: still kudos for figuring it out yourself
added on the 2026-02-24 01:22:50 by LJ LJ
Quote:
and is based on an article published 15 years ago.


whoa, LJ, how do you remember all this stuff?? :D 15 years ago I was a dumb 19 year old kid who only played video games lol

Quote from the article:

[quote]The structure itself, however, is hierarchical - which is another way of saying "recursive". And guess what - CUDA doesn't have recursion. I had to switch to a stack-based recursive implementation for my ray traversals into the BVH. This turned out to be easier than I thought./quote]

Quote:
Would you believe me? It worked the first time. No joke. I spent two hours to change the C++ code into its CUDA equivalent, and it worked the first time I tried. Has only happened once or twice in my programming life...!


He was happy like me figuring this out lol

Although he didn't say how he came up with the stack-based method... maybe he learned that in the university or something? I never went into CS school, dunno what they teach there...
I wonder if there books written about practical GPU ray tracers which mention algorithms like this? I know there is 'ray tracing in one weekend' series, but it's mostly CPU-focused, it uses a lot of code that can't be written in a weak ass GLSL or HLSL...

login

Go to top