[Question] Rotozoomer that will run on low specs

category: code [glöplog]

Quote:

I don't have comment up yet, but the main asm code was that, with a bit of unroll to win one reg and trying to use regs as much as possible, collecting sample in al and ah to write once for two pixels, using bp also as extra register.

If I understand correctly, when compiling, NASM will generate 160 repetitions of the block inside the "%rep" directive. So it’s more than just "a bit of unroll" :)

Well, for a 386 this clearly works pretty well. It’s also possible that using "stosw" gives an advantage (compared to my code, where I use "mov" and "add", because on a 486 that’s faster).

Quote:

I was looking at the code of Second Reality where it's smoother.

As for the smoothness in the rotozoomer in Second Reality, it doesn’t really look smooth to me at high zoom levels. Seems like it’s still 8:8. I don’t really understand Psi’s code :)

added on the 2026-02-20 19:05:32 by bitl

Lots and lots of code can be removed, and in the most extreme case reduced to only 1 move per pixel. Oldtimers here will remember the 1995 ~ 1996 timeframe when one-movers were the latest tech for texture mapping, but in reality that technique dates back to the 1990 ~ 1992 era via Syntax Terror by Delta Force and World of Commodore by Sanity.

My suggestion is to stop, take pen + paper or a whiteboard, and work out:

1. a way to decompose the UV of each pixel in terms of the UV contributed by the horizontal deltas and the UV contributed by the vertical deltas.

2. how a UV value corresponds to a memory address to read texture from

Once you can work that out and eliminate almost all of the per-pixel code, you will be much better prepared to reach enlightenment.

added on the 2026-03-23 09:45:44 by winden
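
A minimal sketch of the two steps winden lists, written in C rather than assembly so it can be checked on any machine. It assumes 8.8 fixed-point coordinates and a 256x256 8-bit texture; the type `uv88` and the functions `uv_at` and `texel_address` are hypothetical names, not taken from any code in this thread. For a pure rotozoom the vertical delta is the horizontal one rotated by 90 degrees, i.e. (dy.u, dy.v) = (-dx.v, dx.u).

```c
#include <stdint.h>

/* Hypothetical 8.8 fixed-point UV pair; names are illustrative only. */
typedef struct { uint16_t u, v; } uv88;

/* Step 1: UV of pixel (x, y) = origin + x * horizontal delta + y * vertical delta.
   The result is reduced modulo 65536, which tiles the texture for free. */
static uv88 uv_at(uv88 origin, uv88 dx, uv88 dy, unsigned x, unsigned y)
{
    uv88 r;
    r.u = (uint16_t)(origin.u + x * dx.u + y * dy.u);
    r.v = (uint16_t)(origin.v + x * dx.v + y * dy.v);
    return r;
}

/* Step 2: in a 256x256 texture the integer parts of V and U are simply
   the high and the low byte of the texel address. */
static uint16_t texel_address(uv88 p)
{
    return (uint16_t)((p.v & 0xFF00u) | (p.u >> 8));
}
```

Because the UV of a pixel is the sum of a per-column term and a per-row term, an inner loop only has to add `dx` once per pixel, and the `y * dy` part once per scanline.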

One-movers, never heard that term before, but if I understand correctly, the idea is to pack U and V together in one register, so that you do one add instead of two for interpolation, combined with a good texture layout, so that the result maps straight to the address to sample from with very little shifting or masking, or even none. I thought about that once but never tried it so far.

added on the 2026-03-23 10:17:57 by Optimus
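
The packing idea Optimus describes can be sketched in C as follows. This is only one possible layout, assumed here for illustration: U as 8.8 in the high word, V as 8.8 in the low word, and a 256x256 texture. The names `pack_uv`, `pack_delta`, `packed_address` and `draw_span` are hypothetical.

```c
#include <stdint.h>

/* Position: U (8.8) in bits 16..31, V (8.8) in bits 0..15. */
static uint32_t pack_uv(uint16_t u, uint16_t v)
{
    return ((uint32_t)u << 16) | v;
}

/* Delta: added as a signed quantity so that a negative dV borrows from the
   U field once here instead of producing a carry on every pixel. */
static uint32_t pack_delta(int16_t du, int16_t dv)
{
    return ((uint32_t)(uint16_t)du << 16) + (uint32_t)(int32_t)dv;
}

/* V integer already sits in bits 8..15; U integer is in bits 24..31. */
static uint16_t packed_address(uint32_t uv)
{
    return (uint16_t)((uv & 0xFF00u) | (uv >> 24));
}

static void draw_span(uint8_t *dst, const uint8_t *tex,
                      uint32_t uv, uint32_t duv, int n)
{
    for (int i = 0; i < n; i++) {
        dst[i] = tex[packed_address(uv)];
        uv += duv;              /* the single add per pixel */
    }
}
```

The price of the single add is that whenever V wraps around the texture edge, U's fraction is off by one unit (1/256 texel); this sketch simply tolerates that.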

You could also ping Trixter on Discord, he did the rotozoom in 8088 MPH.

added on the 2026-03-23 18:02:41 by phoenix

Quote:

You could also ping Trixter on Discord, he did the rotozoom in 8088 MPH.

As far as I know, the rotozoomer in "8088 MPH" is nothing special, just an unrolled loop:

Code:

add  cx,si
add  dx,bp
mov  al,ch
mov  bh,dh
xlat
stosb

added on the 2026-03-23 18:27:56 by bitl
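
For readers who do not read x86 fluently, here is a C model of the six instructions quoted above, one statement per instruction. It assumes BL is 0 and that a 256x256 texture fills the 64 KB segment that DS points to; neither is stated in the post, and the function name `roto_span_8086` is made up.

```c
#include <stdint.h>

/* cx/dx hold U/V as 8.8 fixed point, si/bp the per-pixel deltas. */
static void roto_span_8086(uint8_t *dst, const uint8_t *tex,
                           uint16_t cx, uint16_t dx,
                           uint16_t si, uint16_t bp, int count)
{
    for (int i = 0; i < count; i++) {
        cx = (uint16_t)(cx + si);                /* add  cx,si */
        dx = (uint16_t)(dx + bp);                /* add  dx,bp */
        uint8_t al = (uint8_t)(cx >> 8);         /* mov  al,ch */
        uint16_t bx = (uint16_t)(dx & 0xFF00u);  /* mov  bh,dh (BL assumed 0) */
        al = tex[(uint16_t)(bx + al)];           /* xlat: al = ds:[bx+al] */
        dst[i] = al;                             /* stosb */
    }
}
```

The integer part of V selects the texture row through BH and the integer part of U selects the column through AL, so XLAT performs the whole address calculation and the fetch in one instruction.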

OK, I tried to read most of this, but I haven't seen this trick mentioned:

_IF_ you only shrink the texture from 100%, then there is a trick that I believe all (fast) 8-bit rotozoomers use.

The UV addresses have to be calculated for only one "scanline"; all other scanlines are the SAME but with a changing offset per scanline, and that offset is the starting UV for that scanline.

If you try to zoom in with this, it will look ugly: the zoomed-in texels will not look like rotating squares but like irregularly sized rectangles. Either you cheat and zoom in only a bit, or you pass through the range above 100% very quickly so that it's not really visible.

added on the 2026-03-23 21:17:22 by Oswald
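
A rough C sketch of the trick Oswald describes, assuming 8.8 fixed point, a 256x256 texture and a 320-pixel scanline. `SPAN`, `build_offsets` and `draw_scanline` are hypothetical names. The table holds only integer texel offsets, and each scanline adds the integer part of its own starting UV; the fractional part of the start is dropped, which is presumably where the irregular rectangles at high zoom come from.

```c
#include <stdint.h>

#define SPAN 320   /* assumed scanline width */

/* Integer texel offsets of every pixel relative to the start of its
   scanline, computed once per frame from the 8.8 per-pixel deltas. */
static void build_offsets(uint8_t off_u[SPAN], uint8_t off_v[SPAN],
                          uint16_t du, uint16_t dv)
{
    uint16_t u = 0, v = 0;
    for (int x = 0; x < SPAN; x++) {
        off_u[x] = (uint8_t)(u >> 8);
        off_v[x] = (uint8_t)(v >> 8);
        u = (uint16_t)(u + du);
        v = (uint16_t)(v + dv);
    }
}

/* Every scanline reuses the same table; only the start texel changes. */
static void draw_scanline(uint8_t *dst, const uint8_t *tex,
                          const uint8_t off_u[SPAN], const uint8_t off_v[SPAN],
                          uint16_t start_u, uint16_t start_v)
{
    uint8_t bu = (uint8_t)(start_u >> 8);
    uint8_t bv = (uint8_t)(start_v >> 8);
    for (int x = 0; x < SPAN; x++) {
        uint8_t tu = (uint8_t)(bu + off_u[x]);   /* wraps at 256 texels */
        uint8_t tv = (uint8_t)(bv + off_v[x]);
        dst[x] = tex[((unsigned)tv << 8) | tu];
    }
}
```

In a real implementation the table would typically be baked into unrolled code as constant displacements rather than read from memory.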

Quote:

OK, I tried to read most of this, but I haven't seen this trick mentioned:

The UV addresses have to be calculated for only one "scanline"; all other scanlines are the SAME but with a changing offset per scanline, and that offset is the starting UV for that scanline.

In this thread I just gave a link to my source code, where there is a version with this trick.

let's do it again https://files.scene.org/view/resources/code/sources/rotozoom.zip

TAIL_LOW.PAS - calculates offsets for one line and substitutes the values directly into the code.

In the inner loop, in the end, only this is executed:

Code:

      
      mov ah, ds:[bx+1111h]; 
      mov al, ds:[bx+2222h]; 
      shl eax, 16
      mov ah, ds:[bx+3333h]; 
      mov al, ds:[bx+4444h]; 
      mov es:[di], eax

Yeah, and people talked about this right on the first page

added on the 2026-03-24 10:32:01 by bitl
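
The byte order in that sequence is easy to get backwards, so here is a C model of it. `off[0]` to `off[3]` stand for the patched constants 1111h to 4444h and `base` stands for BX; the function name `fetch4` is hypothetical, and a 64 KB texture segment is assumed.

```c
#include <stdint.h>

/* Returns the 32-bit value that the final "mov es:[di], eax" stores. */
static uint32_t fetch4(const uint8_t *tex, uint16_t base, const uint16_t off[4])
{
    uint32_t eax;
    eax  = (uint32_t)tex[(uint16_t)(base + off[0])] << 8;  /* mov ah, [bx+1111h] */
    eax |= tex[(uint16_t)(base + off[1])];                 /* mov al, [bx+2222h] */
    eax <<= 16;                                            /* shl eax, 16 */
    eax |= (uint32_t)tex[(uint16_t)(base + off[2])] << 8;  /* mov ah, [bx+3333h] */
    eax |= tex[(uint16_t)(base + off[3])];                 /* mov al, [bx+4444h] */
    return eax;
}
```

Since x86 stores the low byte first, the texel fetched last (the 4444h slot) ends up as the leftmost of the four pixels and the one fetched first (the 1111h slot) as the rightmost. Note also that `base + off` is a single 16-bit add, so a carry out of the column byte moves the fetch to the next texture row.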

Quote:

As for the smoothness in the rotozoomer in Second Reality, it doesn’t really look smooth to me at high zoom levels. Seems like it’s still 8:8. I don’t really understand Psi’s code :)

It pretty much uses every trick mentioned in this thread:
- Renders in 160x100 with doublewide pixels and doubled (quadrupled) scanlines
- Pre-rotates the texture and switches when needed
- Self-modifying code as mentioned above
- Unrolled both for calculating the increments and the rendering
- Uses movs instead of stosws
Also, the movement / "camera" motion is fully keyframed instead of a spline / formula.

added on the 2026-03-24 10:53:07 by Gargaj

Quote:

Self-modifying code as mentioned above

I just rewatched it and noticed something I hadn't before: when it zooms in too close, there are no rotated squares like in a normal interpolator, but irregular rectangles, just like Oswald mentioned. It's easier to miss at 160x100.

added on the 2026-03-24 13:40:39 by Optimus

Someone should host a rotozoom golf compo, so we can see what folks can come up with and whether there are any unknown tricks.

added on the 2026-03-24 13:56:12 by rudi

I can offer a 32 byte solution but that also wins "slowest rotozoomer" ;)

Rotastic 32b (2017)

also, it's not full 360 so ...

added on the 2026-03-24 14:52:08 by HellMood

HellMood: looks nice! Even if it's slow, it does not look jerky :)

added on the 2026-03-24 16:28:23 by rudi

Quote:

Yeah, and people talked about this right on the first page

can you quote those posts?

added on the 2026-03-31 01:18:17 by Oswald

Quote:

Quote:
Yeah, and people talked about this right on the first page

can you quote those posts?

Quote:

hfr wrote:
Smaller platforms unroll speed-code for a quarter scanline or so and fix the interpolation errors in between. This precalced code can be reused for all scanlines because the deltas are constant. That should result in 3 instructions for 2 pixels.

In fact, it's about the same thing, although he's talking about a quarter of a line, not a full line.

added on the 2026-03-31 08:20:47 by bitl

With a completely different approach, you could rotate with three shears. This is what Peabrain did for The Loop on 68000. Other notable implementations include Pet's version in Roots 2.0 (w. the Escher texture) on 68020 and Jobbo's variations from the Cosmic Orbs demos on 68000 (e.g. Backslide to Arcanum, although they are not endlessly repeating the texture, as the others). The Amiga's chipset is very helpful to achieve the effect, esp. the blitter and the copper.

Tom Forsyth also wrote a nice general article about the concept.

added on the 2026-03-31 11:23:11 by noname
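
The three-shear idea rests on the fact that a rotation can be written as an x shear, a y shear and another x shear. A small C sketch that applies the three shears to a single point follows; the function name is made up, and the demos mentioned above presumably shear whole rows and columns of pixels instead, which is where a blitter helps. To avoid depending on a trig library, the sine `s` and cosine `c` of the angle are passed in, and tan(angle/2) is derived as s / (1 + c).

```c
/* Rotate (x, y) by the angle with sine s and cosine c using three shears.
   c must not be -1 (a 180 degree turn has no finite tan(angle/2)). */
static void rotate_three_shears(double *x, double *y, double s, double c)
{
    double t = s / (1.0 + c);   /* tan(angle / 2) */
    *x -= t * *y;               /* shear along x */
    *y += s * *x;               /* shear along y */
    *x -= t * *y;               /* shear along x again */
}
```

For example, with s = 0.6 and c = 0.8 the point (1, 0) ends up at (0.8, 0.6), which is what the ordinary rotation matrix gives.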

thanks for the quote!

added on the 2026-03-31 22:18:16 by Oswald

Quote:

In fact, it's about the same thing, although he's talking about a quarter of a line, not a full line.

I was referring to the same "one movers" Winden mentioned above, where you cut all the fixed-point precision and store integer UV offsets directly into the memory-read opcode (self-modifying code) and run that for every scanline.
This costs a lot of precision, though. To fix that, you can take a break every few dozen pixels, check whether you are more than half a texel off, and correct it (that's why I mentioned "a quarter scanline").
We don't do this any more since they invented caches, though.

added on the 2026-04-14 14:01:48 by hfr

...Ah, and that's exactly what bitl was showing above, beautifully unrolled for 32-bit writes.
Sorry for not reading enough backlog ;)

@bitl: does a 486 still benefit from this, btw? Seems like this thread got you a bit in the mood for "Remoded" :)

added on the 2026-04-14 14:08:18 by hfr

Quote:

@bitl: does a 486 still benefit from this, btw? Seems like this thread got you a bit in the mood for "Remoded" :)

Yeah, on a 486 all these tricks still pay off, especially if you manage to hit the cache. If you get cache misses, then nothing really helps :)

Actually, Remoded had been in development long before that topic came up, just a coincidence. But during the discussion here, I did come up with a couple more optimizations for my code :)

added on the 2026-04-14 14:38:30 by bitl

pouët.net
