[Question] Rotozoomer that will run on low specs
category: code [glöplog]
Quote:
I don't have comment up yet, but the main asm code was that, with a bit of unroll to win one reg and trying to use regs as much as possible, collecting sample in al and ah to write once for two pixels, using bp also as extra register.
If I understand correctly, when assembling, NASM will generate 160 repetitions of the block inside the "%rep" directive. So it’s more than just "a bit of unroll" :)
Well, for a 386 this clearly works pretty well. It’s also possible that using "stosw" gives an advantage (compared to my code, where I use "mov" and "add", because on a 486 that’s faster).
Quote:
I was looking at the code of Second Reality where it's smoother.
As for the smoothness in the rotozoomer in Second Reality, it doesn’t really look smooth to me at high zoom levels. Seems like it’s still 8:8. I don’t really understand Psi’s code :)
Lots and lots of code can be removed and in the most extreme case reduced to only do 1 move per pixel -- oldtimers here will remember the 1995 ~ 1996 timeframe when one-movers were the latest tech for texture mapping, but in reality that technique dates back to 1990 ~ 1992 era via Syntax Terror by Delta Force and World of Commodore by Sanity
My suggestion is to stop, take pen + paper or a whiteboard, and work out:
1. a way to decompose the UV of each pixel in terms of the UV contributed by the horizontal deltas and the UV contributed by the vertical deltas.
2. how a UV value corresponds to a memory address to read texture from
Once you have worked that out and eliminated almost all of the per-pixel code, you will be much better prepared to reach enlightenment
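The two steps above can be sketched in C. This is only an illustration under stated assumptions: 8.8 fixed point, a 256x256 texture stored row by row, and hypothetical names (`UV`, `uv_at`, `texel_offset`) that do not come from any code posted in this thread.

```c
#include <stdint.h>

/* 8.8 fixed point: high byte = texel index, low byte = fraction.
   16-bit arithmetic wraps mod 65536, which tiles a 256-texel axis for free. */
typedef struct { uint16_t u, v; } UV;

/* Step 1: the UV of pixel (x, y) is the start UV plus x horizontal steps
   plus y vertical steps. The two contributions are independent. */
UV uv_at(UV start, UV dx, UV dy, unsigned x, unsigned y)
{
    UV r;
    r.u = (uint16_t)(start.u + (uint32_t)x * dx.u + (uint32_t)y * dy.u);
    r.v = (uint16_t)(start.v + (uint32_t)x * dx.v + (uint32_t)y * dy.v);
    return r;
}

/* Step 2 (assumed layout): for a 256x256 row-major texture, the integer
   part of V is the high byte of the offset and the integer part of U is
   the low byte. */
uint16_t texel_offset(UV uv)
{
    return (uint16_t)((uv.v & 0xFF00u) | (uv.u >> 8));
}
```

Because the `x` term does not depend on `y`, the inner loop only ever has to add `dx`; everything involving `dy` can move to once per scanline.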
one-movers, never heard that term before, but if I understand correctly, the idea is to pack U and V together in one register, so that you do one add instead of two for interpolation, combined with a good texture data layout, so that the result maps straight to the address to sample from with very little shifting, ANDing or ORing, or even none at all. I thought about that once but have never tried it so far.
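A rough C sketch of the packing idea as this post describes it, not necessarily what the original one-movers did. The bit layout (V in the upper 16 bits, U in the lower 16, both 8.8 fixed point) and the names `pack_uv`, `step_uv` and `packed_offset` are assumptions made for illustration.

```c
#include <stdint.h>

/* Pack two 8.8 fixed-point coordinates into one 32-bit value:
   bits 31..16 = V, bits 15..0 = U. */
uint32_t pack_uv(uint16_t u, uint16_t v)
{
    return ((uint32_t)v << 16) | u;
}

/* One 32-bit add steps both coordinates at once.
   Caveat: when U wraps past 255.xx, its carry leaks into the lowest
   fraction bit of V (an error of 1/256 texel per wrap). */
uint32_t step_uv(uint32_t uv, uint32_t duv)
{
    return uv + duv;
}

/* Offset into a 256x256 row-major texture: integer V in the high byte,
   integer U in the low byte. */
uint16_t packed_offset(uint32_t uv)
{
    return (uint16_t)(((uv >> 16) & 0xFF00u) | ((uv >> 8) & 0x00FFu));
}
```

With this particular layout the offset still costs a shift and a mask per pixel. The point of choosing the register and texture layout carefully is to shrink that step, or remove it.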
You could also ping Trixter on Discord, he did the rotozoom in 8088 MPH.
Quote:
You could also ping Trixter on Discord, he did the rotozoom in 8088 MPH.
as far as I know, the rotozoomer in "8088 MPH" is nothing special, just an unrolled loop:
Code:
add cx,si
add dx,bp
mov al,ch
mov bh,dh
xlat
stosb
ok, tried to read most of this, but I haven't seen this trick mentioned:
_IF_ you only shrink the texture from 100%, then there is a trick that I believe all (fast) 8-bit rotozoomers use.
the UV addresses have to be calculated for only one "scanline"; all other scanlines are the SAME but with a changing offset per scanline, and that offset is the starting UV for that scanline.
if you try to zoom in with this, it will look ugly: the zoomed-in texels will not look like rotating squares, but like ugly, irregularly sized rectangles. so either you cheat and only zoom in a bit, or you pass through the range above 100% very fast, and then it's not really visible.
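A minimal C sketch of this offset-table trick, assuming a 256x256 texture (64 KB, so 16-bit offsets wrap around inside it), 8.8 fixed-point deltas and a 160-pixel scanline. The names `build_line_offsets` and `draw_line` are made up for illustration.

```c
#include <stdint.h>

enum { WIDTH = 160 };

/* Computed once per frame: the texture offset of every pixel in a
   scanline, relative to a scanline that starts at UV = (0, 0). */
void build_line_offsets(uint16_t offsets[WIDTH], uint16_t du, uint16_t dv)
{
    uint16_t u = 0, v = 0;
    for (int x = 0; x < WIDTH; x++) {
        offsets[x] = (uint16_t)((v & 0xFF00u) | (u >> 8));
        u = (uint16_t)(u + du);
        v = (uint16_t)(v + dv);
    }
}

/* Per scanline: only the start offset changes. The 16-bit sum wraps
   inside the 64 KB texture. The fractional part of the start UV is not
   used here, so texel edges land slightly differently than with a true
   per-pixel interpolator. */
void draw_line(uint8_t *dst, const uint8_t *texture,
               const uint16_t offsets[WIDTH], uint16_t start)
{
    for (int x = 0; x < WIDTH; x++)
        dst[x] = texture[(uint16_t)(start + offsets[x])];
}
```

Dropping the fraction of the start UV is a plausible reason for the irregular texels when zoomed in: the rounding pattern along the line is identical for every scanline instead of shifting with the real sub-texel position.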
Quote:
ok tried to read most of this, but I havent seen this trick mentioned:
the UV addresses have to be calculated for only one "scanline", all other scanlines are the SAME but with a changing offset per scanline, and that offset is the starting UV for that scanline.
In this thread I just gave a link to my source code, where there is a version with this trick.
let's do it again https://files.scene.org/view/resources/code/sources/rotozoom.zip
TAIL_LOW.PAS - calculates offsets for one line and substitutes the values directly into the code.
In the inner loop, in the end, only this is executed:
Code:
mov ah, ds:[bx+1111h];
mov al, ds:[bx+2222h];
shl eax, 16
mov ah, ds:[bx+3333h];
mov al, ds:[bx+4444h];
mov es:[di], eax
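For readers who do not read asm, here is a C rendering of what that unrolled block appears to do. It assumes that 1111h..4444h are placeholders for the per-pixel offsets patched into the code, and that bx holds the start offset of the current scanline. `fetch4` and `store4` are hypothetical names, and the offsets are passed as a table instead of being patched into the instructions.

```c
#include <stdint.h>

/* Fetch four texels and combine them into one 32-bit value, in the same
   order as the asm: the first fetch ends up in the top byte. */
uint32_t fetch4(const uint8_t *texture, uint16_t bx, const uint16_t off[4])
{
    uint32_t eax;
    eax  = (uint32_t)texture[(uint16_t)(bx + off[0])] << 8;  /* mov ah, [bx+1111h] */
    eax |= texture[(uint16_t)(bx + off[1])];                 /* mov al, [bx+2222h] */
    eax <<= 16;                                              /* shl eax, 16        */
    eax |= (uint32_t)texture[(uint16_t)(bx + off[2])] << 8;  /* mov ah, [bx+3333h] */
    eax |= texture[(uint16_t)(bx + off[3])];                 /* mov al, [bx+4444h] */
    return eax;
}

/* Store little-endian, as a 32-bit write does on x86: the last fetch (al)
   becomes the leftmost of the four pixels. */
void store4(uint8_t *dst, uint32_t eax)
{
    dst[0] = (uint8_t)eax;
    dst[1] = (uint8_t)(eax >> 8);
    dst[2] = (uint8_t)(eax >> 16);
    dst[3] = (uint8_t)(eax >> 24);
}
```

Note the byte order: because of the little-endian store, the offset patched in at 4444h belongs to the first pixel of the group and the one at 1111h to the fourth.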
Yeah, and people talked about this right on the first page
Quote:
As for the smoothness in the rotozoomer in Second Reality, it doesn’t really look smooth to me at high zoom levels. Seems like it’s still 8:8. I don’t really understand Psi’s code :)
It pretty much uses every trick mentioned in this thread:
- Renders in 160x100 with doublewide pixels and doubled (quadrupled) scanlines
- Pre-rotates the texture and switches when needed
- Self-modifying code as mentioned above
- Unrolled both for calculating the increments and the rendering
- Uses movs instead of stosws
Also, the movement / "camera" motion is fully keyframed instead of a spline / formula.
Quote:
Self-modifying code as mentioned above
I just rewatched it and hadn't noticed this before. When it zooms in too close, there are no rotated squares like in a normal interpolator, but irregular rectangles, just like Oswald mentioned. It's easier to miss at 160x100.
someone should host a rotozoom golf compo, so we can see what folks can come up with, and whether there are any unknown tricks.
I can offer a 32 byte solution but that also wins "slowest rotozoomer" ;)
Rotastic 32b (2017)
also, it's not full 360 so ...
HellMood: looks nice! even if it's slow, it does not look jerky :)
