Particle Z-sorting

category: code [glöplog]
just add a small random offset to particles if/when they pop. Even better, add it all the time. Jitter too. Makes for a better looking thing.

And yeah for fog, smoke whatnot you may get away without sorting and just additive or multiplicative blending. At least if it doesn't have to look like "light" smoke.
added on the 2009-08-27 13:27:55 by uncle-x
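A minimal sketch of the jitter trick above, assuming CPU-side positions updated per frame; the function name and `amount` parameter are made up for illustration, not anyone's actual code:

```python
import random

def jitter(positions, amount=0.01, seed=None):
    """Add a tiny random offset to every particle each frame.
    The noise masks the one-frame 'pop' you see when two particles
    swap places in the sorted order."""
    rng = random.Random(seed)
    return [(x + rng.uniform(-amount, amount),
             y + rng.uniform(-amount, amount),
             z + rng.uniform(-amount, amount))
            for (x, y, z) in positions]
```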
"talking about maybe 1000-5000 particles multicore systems nowadays"

What does this mean? Are you only adding 1000 to 5000 particles?
added on the 2009-08-27 13:33:35 by texel
Kusma, the experiments were done 3 years ago. I lost the perlin noise example, but I still have the Julia set apparently. It's made of 1.5 million particles; I think this exe is not the final version, as it runs at only 30 fps on my 8600 mobile. I remember it was 50 fps on my old Radeon 9800 (shaders 2.0).

I uploaded the binary here

added on the 2009-08-27 13:36:16 by iq
texel: no, I mean the total amount of moving particles will be something like that, so if I put that load in a thread (async) it won't disturb the main rendering thread, which can still use 100% CPU on one core. The rest of the calculations are already like that. Maybe also use thread affinity on Windows (enumerate the number of cores first) if you like; there's an equivalent in the recent Linux kernels (Linux was pretty late to the affinity party, IIRC).
added on the 2009-08-27 13:37:41 by jaw
it really depends on how precise you need your sorting to be. for 1 million small particles bucketing works well, and can be done fast on the gpu.
added on the 2009-08-27 13:40:49 by smash
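One way to read the bucketing suggestion, as a CPU-side sketch (names and bucket count are illustrative; on a GPU the same quantization would be done in a shader):

```python
def bucket_sort_by_depth(particles, depths, num_buckets=1024):
    """Approximate back-to-front ordering by depth bucketing.

    Instead of a full comparison sort, quantize each particle's
    view-space depth into a fixed number of buckets and emit the
    buckets far-to-near.  O(n), trivially parallel, and the order
    *within* a bucket stays arbitrary -- fine for many small particles.
    """
    zmin, zmax = min(depths), max(depths)
    scale = (num_buckets - 1) / (zmax - zmin) if zmax > zmin else 0.0
    buckets = [[] for _ in range(num_buckets)]
    for p, z in zip(particles, depths):
        b = min(num_buckets - 1, int((z - zmin) * scale))
        buckets[b].append(p)
    out = []
    for b in reversed(buckets):   # far first, for back-to-front blending
        out.extend(b)
    return out
```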
just as a curiosity btw, did anyone ever try to raymarch fog in realtime? :)
wondering now
added on the 2009-08-27 13:43:47 by nystep
isn't 1000-5000 particles a bit too 90's?

Ok i guess it depends on the resolution, too :-)
added on the 2009-08-27 13:43:59 by uncle-x
what about using "Mega Particles"? Don't you avoid the whole per-particle sorting issue with that? (ok, only works for cloud-like things mostly)
added on the 2009-08-27 13:45:28 by arm1n
uncle-x: 4096x4096x32 particles of course ;)
added on the 2009-08-27 13:45:59 by jaw
nicestep: raymarched volumetric lights in frameranger is quite similar to fog..
jar: yes, but it's an epic hack and very hard to make actually look good. :)
added on the 2009-08-27 13:48:50 by smash
iq: nice :)

It works flawlessly as long as you're "outside" the object, but when you go inside, there are some bugs due to field-of-view. I guess the "ideal" solution would be to render all 48(?) solutions and cull the particles in the vertex shader if the configuration is wrong for the given particle. The eye-space direction to the particle should give the correct configuration, not the view-vector itself.

But bah, it looks cool ;)
added on the 2009-08-27 13:59:23 by kusma
bah. raymarched fog is like totally 96 :)
added on the 2009-08-27 14:00:28 by 216
jaw: I guess uncle-x means screen resolution not voxel resolution, because particle rendering is usually fillrate limited due to the blending. That's why some people are rendering the smoke to a smaller resolution buffer and then composite it with the regular colorbuffer (which is not that easy).
added on the 2009-08-27 14:03:24 by iq
kusma, the other day I received a mail describing an improvement to the 48-directions problem:


I believe there is a way to reduce the cost of this from 8 to 6 in 2D, and from 48 to 24 in 3D. The savings should be increasingly dramatic in higher dimensions.

You order 2D space in four directions [ -x, +x, -y, +y ] when three is sufficient: vectors that point to the vertices of an equilateral triangle. Let's call them [ a, b, c ]. It is possible to sort objects in 2D by two of these three directions, and get 3! permutations: [ [a,b], [a,c], [b,a], [b,c], [c,a], [c,b] ] Your four directions require 4!! permutations.

Likewise, in 3D [ -x, +x, -y, +y, -z, +z ] can be replaced by [ a, b, c, d ] which point to the vertices of a regular tetrahedron.

Generally speaking for N dimensions, tetrahedra cost (N+1)! and boxes cost (N*2)!! For 3D: (3+1)! is 24, and (3*2)!! is 48.
added on the 2009-08-27 14:14:57 by iq
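As a rough illustration of using a fixed set of directions like this: pre-sort the particle indices along each direction once, then per frame just pick the pre-computed order whose direction best matches the view. This sketch only covers the "pick the nearest pre-sorted direction" part, not the full permutation scheme from the mail; all names are hypothetical, and it's written in 2D for brevity:

```python
def precompute_orders(positions, directions):
    """For each fixed direction d, store the particle indices sorted
    back-to-front along d (largest dot product = farthest = drawn first)."""
    def dot(p, d):
        return p[0] * d[0] + p[1] * d[1]
    return [sorted(range(len(positions)),
                   key=lambda i, d=d: dot(positions[i], d),
                   reverse=True)
            for d in directions]

def pick_order(view_dir, directions, orders):
    """Per frame: pick the pre-sorted order whose direction is closest
    to the current view direction (largest dot product)."""
    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1]
    best = max(range(len(directions)),
               key=lambda k: dot(view_dir, directions[k]))
    return orders[best]
```

With the three equilateral-triangle directions from the mail, `directions` would be something like `[(1, 0), (-0.5, 0.866), (-0.5, -0.866)]`.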
iq: yeah, I see. I was joking about the fillrate too, though: in some instances, when you force the graphics card to render multiple times to the same pixel, it can't take advantage of the concurrency optimizations, so I've experienced framedrops. And if you use many large particles covering the screen, it's even worse. That's what I thought, anyway... but yes, rendering to maybe half the viewport size and then maybe blurring a little bit might(?) be faster. Definitely a good idea.
added on the 2009-08-27 14:15:23 by jaw
[the (?) should be next to the blurring remark, not the "might", of course it "might"]
added on the 2009-08-27 14:16:24 by jaw
just because it hasn't been said:
...or use additive blending for your particles, in which case no sort is needed. (but no smoke shadowing then)
added on the 2009-08-28 11:45:59 by krabob
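A tiny numeric illustration of why additive blending needs no sort while the classic "over" operator does (values chosen so float addition is exact; `additive`/`over` are made-up names standing in for the GPU blend equations):

```python
def additive(dst, src_colors):
    """GL_ONE, GL_ONE style: dst += src.  Addition is commutative,
    so the draw order of the particles doesn't matter."""
    for c in src_colors:
        dst = dst + c
    return dst

def over(dst, layers):
    """Classic alpha 'over': dst = src*a + dst*(1-a).  Each layer
    attenuates what's behind it, so the draw order matters."""
    for color, alpha in layers:
        dst = color * alpha + dst * (1.0 - alpha)
    return dst
```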
Since particles don't change order much when you have a camera animation, I don't think qsort will be the quickest, but I haven't tried anything else... I feel that some kind of insertion sort could be pretty quick.
added on the 2009-08-28 12:08:33 by thec
thec: insertion sort totally breaks down when you do rapid camera changes :(
added on the 2009-08-28 12:14:52 by kusma
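A sketch of the temporal-coherence idea: keep last frame's index order and re-sort it with insertion sort, which is roughly O(n) on nearly-sorted input but degrades to O(n²) after a hard camera cut, which is exactly the caveat above (names are illustrative; a real implementation would fall back to a full sort on camera jumps):

```python
def insertion_sort_by_key(order, keys):
    """Sort the index list `order` in place, ascending by keys[i].
    Fast (~O(n)) when `order` is already nearly sorted, as it is
    frame-to-frame under a smooth camera; O(n^2) after a hard cut."""
    for j in range(1, len(order)):
        idx = order[j]
        k = keys[idx]
        i = j - 1
        while i >= 0 and keys[order[i]] > k:
            order[i + 1] = order[i]   # shift larger entries right
            i -= 1
        order[i + 1] = idx
    return order
```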
One question, just curiosity:

Is there any big difference in speed between a fast radix sort implementation (you know, the 4- or 5-pass one) on the CPU and the fastest sorting-network implementation on the GPU?

And what kind of GPU do you need to beat the O(n) radix?

Do you have data for, let's say, sorting 10,000, 1,000,000 and 100,000,000 numbers?
added on the 2009-08-28 12:20:12 by texel
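For reference, the "4 or 5 passes" CPU radix usually means: remap the float's bit pattern so that unsigned integer order matches float order, then do four stable 8-bit counting-sort passes. A sketch (not anyone's benchmarked implementation; a tuned C version would use counting arrays rather than Python lists):

```python
import struct

def float_to_sortable_u32(f):
    """Map a float's bit pattern to a u32 whose unsigned order matches
    the float order: flip all bits if negative, else just the sign bit."""
    u = struct.unpack('<I', struct.pack('<f', f))[0]
    return (u ^ 0xFFFFFFFF) if (u & 0x80000000) else (u | 0x80000000)

def radix_sort_floats(values):
    """LSB-first radix sort: 4 stable passes of 8 bits each."""
    keys = [(float_to_sortable_u32(v), v) for v in values]
    for shift in (0, 8, 16, 24):
        buckets = [[] for _ in range(256)]
        for k, v in keys:
            buckets[(k >> shift) & 0xFF].append((k, v))
        keys = [kv for b in buckets for kv in b]
    return [v for _, v in keys]
```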
sort by distance instead of z :)
added on the 2009-08-28 12:20:16 by 216
somehow off-topic, but when it comes to rendering shitloads of grass, try to do the following:

organize your grass in patches. sort them by depth and use those patches for frustum culling, etc.

now modulate the alpha channel of your grass texture with a good average of real noise and sort-of perlin noise and simply render that. no one will notice that your grass is not perfectly sorted, and you can render more than anyone would need.

works very well for me (we used that in ungol, btw)
added on the 2009-08-28 17:14:50 by pro
I could use some more explanation on how to select the correct ordering for the volumetric sort if I go with just 24 permutations. Have people done this already?
added on the 2011-07-21 21:48:28 by jalava
somewhat offtopic: I was hoping someday a truly volumetric effect would replace spamming a GPU with thousands of camera-facing 2D "particles"
added on the 2011-07-21 23:57:37 by QUINTIX

