nVidia RTX innovation or marketing bullshit?
category: code [glöplog]
This should be the talk cupe mentioned: https://www.ea.com/seed/news/hpg-2018-keynote
Should demosceners consider buying an RTX card when renewing their GPU?
Will we see "RTX only" kickass demos/intros soon ?
Will we see "RTX only" kickass demos/intros soon ?
aren't all raymarched 4Ks etc. pretty much "RTX only" already? :P
@Pablo : I am unsure if your answer was serious or not :)
AFAIK RTX will not help directly for raymarching, because it can only calculate intersections between rays and triangles.
What about this: the raymarched scene (which is a function that returns the closest distance to an object) would be converted once to triangles (e.g. using marching cubes). Then raytrace against those triangles to know where the closest collision will be, and start raymarching from that point (for extra performance).
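A rough sketch of that idea (toy CUDA; the analytic proxy_hit() below just stands in for whatever the RT cores would return for the triangle mesh, and all names are made up for illustration):
Code:
#include <cstdio>
#include <cmath>

// Toy scene: a single sphere of radius 1 at the origin.
__host__ __device__ float sdf(float x, float y, float z) {
    return sqrtf(x * x + y * y + z * z) - 1.0f;
}

// Stand-in for "trace against the triangle proxy with the RT cores": here we
// just intersect a slightly inflated bounding sphere analytically and return
// the entry distance, or -1 if the ray misses the proxy entirely.
__host__ __device__ float proxy_hit(float ox, float oy, float oz,
                                    float dx, float dy, float dz) {
    const float r = 1.05f;                 // proxy is a bit bigger than the SDF
    float b = ox * dx + oy * dy + oz * dz;
    float c = ox * ox + oy * oy + oz * oz - r * r;
    float h = b * b - c;
    return (h < 0.0f) ? -1.0f : (-b - sqrtf(h));
}

__global__ void trace(float *out) {
    // One ray from z = 10 looking down the -z axis.
    float ox = 0, oy = 0, oz = 10, dx = 0, dy = 0, dz = -1;
    float t = proxy_hit(ox, oy, oz, dx, dy, dz);   // skip the empty space in one go
    if (t < 0.0f) { *out = -1.0f; return; }
    for (int i = 0; i < 64; ++i) {                 // sphere-trace only the last bit
        float d = sdf(ox + t * dx, oy + t * dy, oz + t * dz);
        if (d < 1e-4f) break;
        t += d;
    }
    *out = t;
}

int main() {
    float *out;
    cudaMallocManaged(&out, sizeof(float));
    trace<<<1, 1>>>(out);
    cudaDeviceSynchronize();
    printf("hit at t = %f (expected ~9)\n", *out);
    cudaFree(out);
    return 0;
}

In a real implementation proxy_hit() would be a DXR/Vulkan ray query against the marching-cubes mesh; whether the saved steps beat the cost of the extra trace is exactly the open question.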
Tigrou: OTOH, many demos already require top-of-the-line GPUs - which I, for one, have no money for.
porocyon: from what I understood, the gains are minimal because the first raymarching steps are already fast enough; it's only when you get close to a surface that the steps start shrinking a lot.
xernobyl: I'm not talking about raymarching, but demos in general.
"We didn't buy 486s to code 386 demos"
❤️🔥
#stayamiga
I'm aware. It just annoys me that the hardware is quite pricey.
marching/tracing is already great for GENERAL purpose GPUs.
Making that more special will quickly end up in diminishing returns.
More promising to me than any tracing/marching-specialized hardware are two paradigm shifts:
1) We may soon have a 3rd primitive data type, an unholy union between integer and float, with roughly 2x the computation time and not (much) lower precision. Compiler upgrades for it may be relatively simple.
2) We may soon more commonly use a different hardware architecture that is a LOT more energy-efficient and a lot better for neural-network-like code.
It is a silicon-based architecture with a 2D grid of 3-to-8-bit gates that behave more like neurons: an array of accumulators that trigger each other (but may only count up to 255 each).
It's more efficient because of better "memory access time" and more parallelism, so it generates less heat and needs less power.
Sadly it needs its very own compiler, comes with all the constraints of being novel and overspecialized, and is only used in server farms, where AI (cooling and power consumption) matters more than ease of access and general-purpose use.
marketing bullshit
Dynamic RTX GI/AO looks much worse than static textures (Nvidia renders the RTX output at 480p in their official demos...), and eats more than 50% of the frame time.
In reality you cannot even get a 720p RTX scene running at 30 fps - if your RTX scene has more than 10k triangles and a single draw pass... then it's dead.
That's why in real usage an RTX scene has all textures baked into a single static texture and bone animation converted to static meshes.
DLSS is even more of a "fake", backwards technology; every game will look like a blurry PS2 mess.
Quote:
Will we see "RTX only" kickass demos/intros soon ?
only if that demo is "sponsored by Nvidia"
Comparing rendering of the same scene with RTX vs SDF raymarching (no RTX), RTX is like 10-20% faster.
A developer can easily make a no-RTX version using SDFs, and it will work everywhere.
@ollj: what you describe looks very similar to tensor cores: the ability to do tons of ADDs (and MULs) very fast, for deep-learning-related tasks (the hardware implementation is probably different from what you describe, btw)
Quote:
Comparing rendering of the same scene with RTX vs SDF raymarching (no RTX), RTX is like 10-20% faster.
A developer can easily make a no-RTX version using SDFs, and it will work everywhere.
Comparing mixing of the same module with GUS vs Sound Blaster (CPU mixing), GUS is like 10-20% faster.
A developer can easily make a no-GUS version using CPU mixing, and it will work everywhere.
Quote:
Dynamic RTX GI/AO looks much worse than static textures
no shit
Has anyone done much actual coding with this yet? If so, how flexible are the APIs?
I was looking at the Metal raytracing APIs earlier, and if the hardware support is there, it's potentially pretty useful for demo stuff. E.g. you can build an acceleration structure from arbitrary data - not just meshes - and you can provide your own intersection functions as well as using the standard triangle intersections. I.e. you could feed it a mesh of your scene plus bounding boxes for a bunch of SDFs, rasterise the mesh and then raymarch the rest while skipping the empty space. And you could trace shadows / reflections of the SDFs from the mesh.
Oh and those tensor cores? You can run style transfer on that.
Seriously, sceners complaining fun new hardware is marketing bullshit? Go do some fun shit on it :D
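Roughly what such a custom bounding-box intersection boils down to, sketched here as plain toy CUDA rather than an actual Metal intersection function (the toy SDF and all names are made up):
Code:
#include <cstdio>
#include <cmath>

// Toy SDF: a unit sphere at (0, 0, -5).
__host__ __device__ float sdf(float x, float y, float z) {
    z += 5.0f;
    return sqrtf(x * x + y * y + z * z) - 1.0f;
}

// Clip the ray against the SDF's AABB (slab test), then sphere-trace only
// inside [t_enter, t_exit]. Returns the hit distance or -1 for a miss.
__host__ __device__ float march_in_box(float ox, float oy, float oz,
                                       float dx, float dy, float dz,
                                       const float lo[3], const float hi[3]) {
    float o[3] = {ox, oy, oz}, d[3] = {dx, dy, dz};
    float t_enter = 0.0f, t_exit = 1e30f;
    for (int a = 0; a < 3; ++a) {                  // slab test per axis
        float inv = 1.0f / d[a];
        float t0 = (lo[a] - o[a]) * inv, t1 = (hi[a] - o[a]) * inv;
        if (t0 > t1) { float tmp = t0; t0 = t1; t1 = tmp; }
        t_enter = fmaxf(t_enter, t0);
        t_exit  = fminf(t_exit, t1);
    }
    if (t_enter > t_exit) return -1.0f;            // ray misses the box entirely
    float t = t_enter;
    for (int i = 0; i < 64 && t < t_exit; ++i) {   // march only inside the box
        float dist = sdf(ox + t * dx, oy + t * dy, oz + t * dz);
        if (dist < 1e-4f) return t;                // report the hit to the traversal
        t += dist;
    }
    return -1.0f;                                  // no hit inside this box
}

int main() {
    const float lo[3] = {-1, -1, -6}, hi[3] = {1, 1, -4};
    float t = march_in_box(0, 0, 0, 0, 0, -1, lo, hi);
    printf("hit at t = %f (expected ~4)\n", t);    // sphere surface is at z = -4
    return 0;
}

The idea is that the traversal calls something like this for every SDF bounding box the ray touches and keeps the closest reported hit, so meshes and SDFs can live in the same acceleration structure.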
Quote:
Go do some fun shit on it :D
sure, when someone throws me some video card with RTX support
@alia: I haven't personally played with it, but I read the following tutorials. They will give you a good insight into how it works:
https://developer.nvidia.com/rtx/raytracing/dxr/DX12-Raytracing-tutorial-Part-1
https://developer.nvidia.com/rtx/raytracing/dxr/DX12-Raytracing-tutorial-Part-2
https://developer.nvidia.com/blog/practical-real-time-ray-tracing-rtx/
This paper explains how to implement hybrid rendering (which uses both rasterization and raytracing): https://media.contentapi.ea.com/content/dam/ea/seed/presentations/2019-ray-tracing-gems-chapter-25-barre-brisebois-et-al.pdf. Most AAA games that support RTX work the same way.
Tensor cores:
There are AFAIK 3 ways to use them:
- DLSS: super-sampling/upscaling powered by DL that looks quite good (e.g. it can upscale output from 2K to 4K)
- Denoiser: since the number of rays you can throw is limited (for performance), you get a noisy result. At the bottom of this page there is an example of what a denoiser can do: https://fabiensanglard.net/revisiting_the_pathtracer/index.html
- To do tons of ADDs and MULs very fast, for example for DL (you need to write custom code; this is similar to CUDA)
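For that third option, this is roughly what the custom code looks like - a minimal (hypothetical) CUDA sketch that multiplies one 16x16 half-precision tile by another on the tensor cores via the WMMA intrinsics (needs sm_70 or newer; a real kernel would of course tile a full matrix):
Code:
#include <cstdio>
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

// One warp computes a 16x16x16 matrix multiply-accumulate on the tensor cores:
// C (float) = A (half) * B (half) + C.
__global__ void tile_mma(const half *a, const half *b, float *c) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);
    wmma::load_matrix_sync(a_frag, a, 16);          // 16 = leading dimension
    wmma::load_matrix_sync(b_frag, b, 16);
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag); // the actual tensor-core op
    wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
}

int main() {
    half *a, *b; float *c;
    cudaMallocManaged(&a, 256 * sizeof(half));
    cudaMallocManaged(&b, 256 * sizeof(half));
    cudaMallocManaged(&c, 256 * sizeof(float));
    for (int i = 0; i < 256; ++i) { a[i] = __float2half(1.0f); b[i] = __float2half(1.0f); }
    tile_mma<<<1, 32>>>(a, b, c);                   // a single warp drives one tile
    cudaDeviceSynchronize();
    printf("c[0] = %f (expected 16)\n", c[0]);      // row of ones . column of ones
    return 0;
}

DLSS and the RT denoisers are (presumably) just very large piles of exactly this kind of op, with Nvidia-trained weights.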
Thanks Tigrou! Looks pretty similar to the Metal tracing setup, although I don't see custom intersection tests there (probably because it's just 1 triangle).
That API tho 😭 It's almost as bad as Vulkan. I think I'd probably give up coding rather than work with APIs like that. For some reference, Apple's sample project is similar in size but renders 9 path-traced Cornell boxes using a mix of meshes and sphere intersections. And the code is much easier to understand, because the API is clear and concise and not full of meaningless crap like D3D12_FEATURE_D3D12_OPTIONS5 and ID3D12GraphicsCommandList4 (how the hell did this ever ship? It looks like machine-generated code o_O)
"MPSRayDataTypeOriginMaskDirectionMaxDistance"
That's long but at least it says what it is :) What does "ID3D12GraphicsCommandList4" tell you?