OpenGL framework for 1k intro
category: code [glöplog]
Thread necromancy one more time:
We opened up our 1k framework, incl. source. Please get it here:
http://laturi.haxor.fi/files/laturi_latest.tar.gz
The instructions could always be better (they are nonexistent), but I'll let the (bad) code do the talking...
I have also been experimenting with a PPM compressor with the latest binaries. Even though the decompressor is only 159 bytes, it is still not very usable for 1k products. Or maybe our products have just been too hand-crafted to play nicely with LZW. Oh well, more experimentation needed.
  
Bumping this because of Crinkler 2.0 release.
Repacking the same intro with Crinkler 2.0, using TINYHEADER and TINYIMPORT, took the size from 1024 bytes down to 897. 127 extra bytes is quite a lot in a 1k, so now I must decide what to spend them on.
Unfortunately I don't think I have time to get something finished for Assembly.
  
Well, that's an interesting development (I expect we are going to get our asses kicked at Assembly).
Also, I think the tiny packer that Crinkler has for 1k is quite similar to what we have now (PPM without hashes), the hint being the quadratic complexity.
Anyway, if someone could tell me how many bytes their Crinkler 2.0 packed 1k intro spends on overhead, code, music and shader, I would appreciate the info.
Our current situation is almost the same as earlier (see the release) minus 15 bytes.
  
ts: It is not actually using PPM, but a PAQ-inspired context mixing scheme similar to the normal Crinkler compressor. A model can use any combination of the previous 5 bytes as context, which gives a total of 2^5 = 32 potential models. In our experience, support for sparse contexts is vital for achieving good compression, especially on code.
The compressor chooses a subset of the 32 models, which can conveniently be represented by a single 32-bit bitmask. This is a significant saving compared to the normal Crinkler approach of a 1-byte descriptor per model.
It also turned out that weighting the model predictions was not really worth the added complexity for 1k, so it just uses a completely flat weighting.
The code/data split doesn't seem to be worth it for 1k either.
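To make the bitmask idea concrete, here is a minimal C sketch. It is not Crinkler's actual code: the bit layout (model index i is itself a 5-bit mask over the previous 5 bytes), the hash step, and the names context_hash and enabled_models are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: model index i in 0..31 is a 5-bit byte mask.
   Bit k of i set means "the byte k+1 positions back is in the context". */
enum { HISTORY = 5, NUM_MODELS = 1 << HISTORY };

/* Hash the context bytes selected by byte_mask at position pos of buf.
   Bytes before the start of the buffer are treated as zero. */
uint32_t context_hash(const uint8_t *buf, size_t pos, unsigned byte_mask)
{
    assert(byte_mask < NUM_MODELS);
    uint32_t h = byte_mask;              /* keeps different models apart */
    for (unsigned k = 0; k < HISTORY; k++) {
        if (byte_mask & (1u << k)) {
            uint8_t b = (pos > k) ? buf[pos - 1 - k] : 0;
            h = (h ^ b) * 16777619u;     /* FNV-style step, an arbitrary choice */
        }
    }
    return h;
}

/* Expand the single 32-bit selection mask into a list of model indices.
   Returns the number of indices written to out. */
unsigned enabled_models(uint32_t selection, unsigned out[NUM_MODELS])
{
    unsigned n = 0;
    for (unsigned i = 0; i < NUM_MODELS; i++)
        if (selection & (UINT32_C(1) << i))
            out[n++] = i;
    return n;
}
```

With this layout, a selection mask of 0x00000005 enables model 0 (empty context) and model 2 (only the byte two positions back), so the whole model set costs 4 bytes no matter how many models are switched on.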
  
The compressor is more or less the same as what we used for untraceable
  
I'm just learning the new 2.0 version and it's a work in progress, but so far the approximate allocation of the final packed bytes looks like this:
Music code+data: 135+24, not including initialization which is in main program
Main program code+data: 128+40
Import&hash etc code+data: 88+27
Shader: 335
Total exe size: 1013
In total that would be 662 bytes actual content + 115 bytes import&hash. And 11 bytes still unused. ;) I guess you still have a bit more "payload" on the Mac?
  
Quote:
It is not actually using PPM, but a PAQ inspired context mixing scheme similar to the normal crinkler compressor. A model can use any combination of the previous 5 bytes as context, which is a total of 32 potential models. In our experience having support for sparse contexts is vital for achieving good compression, especially on code.
The compressor chooses a subset of the 32 models, which can conveniently be represented by a single 32bit bitmask. This is a significant saving compared to the normal crinkler approach of a 1-byte descriptor per model.
Well, using PAQ-like context mixing is easy when you have flat weighting. You can get rid of all that pesky log/exp stuff in stretch and squash.
I experimented with 4-bit masks for the context as well. It turned out to be better to have a full 8 bytes + 1 of history than 4 bytes + 1 using nibbles. So you are getting different results then...
How does your compression fare against gzip/paq8i? What's the length of the decoder? I'm hitting a wall trying to get my decompressor smaller than 143 bytes. (But that includes weights for the models and proper context mixing.)
Quote:
I guess you still have a bit more "payload" on the Mac?
Nope, now we are splitting hairs. These sizes, with rounding errors, are from our 2015 intro (for which we still use gzip, since it produces a smaller executable):
gzip decompress header: 44 bytes
mach-o header: 92 bytes
import + hashes: 136 + 72 bytes
code: 138 bytes
music: 126 bytes
shader: 414 bytes
So we have 678 bytes of actual content out of the 1023 bytes currently used.
Although the compression ratios are probably a bit different, I guess I can say it is not about the platform anymore.
(Mental note: I need to rewrite the import-by-hash, it is from 2010 and not really updated)
ts: As I understand your explanation, it is not exactly the same thing. It is actually somewhere in between the two approaches you mention. It is 5 + 1 bytes of context. Typically around half of the 32 potential models are used for encoding, so using a 4-byte mask to indicate which models to use turns out to be a lot simpler than using a byte or nibble mask per model.
What do you mean by proper context mixing? I'm guessing you are not doing logistic mixing or anything like that? In my experience there is little benefit to this compared to counter space mixing when you use the right discounting and boosting heuristics.
I don't have any numbers at hand, but afair the compression is generally a lot better than gzip and somewhat worse than paq8. I will have to perform some experiments to tell you exactly how it fares against PAQ these days.
  
ts: As a point of reference, I have tried compressing main_shader_1280_720.fsh from the laturi_latest archive. Compressed in isolation, it ends up being 308.24 bytes. Let me know if you have a larger lump of data you would like me to test.
  
Quote:
What do you mean by proper context mixing?
Yes, logistic mixing. I did not get any good results without it. I avoid exp and log operations by making the context weights 2^n and the total weight 2^m as well.
Then it reduces to multiplications, divisions and square roots, and I save a CPU register by keeping the accumulator in the FPU. I think code-wise it is not much larger...
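One possible reading of why power-of-two weights remove exp and log, written out as a short derivation. The symbols o_i and O, and the choice to normalise each weight 2^{n_i} by the total 2^m, are assumptions; the post does not spell out the exact scheme.

```latex
\begin{align*}
\operatorname{stretch}(p) &= \ln\frac{p}{1-p}, \qquad
\operatorname{squash}(x) = \frac{1}{1+e^{-x}} \\
o_i &= \frac{p_i}{1-p_i}, \qquad w_i = \frac{2^{n_i}}{2^{m}} \\
\sum_i w_i \operatorname{stretch}(p_i)
  &= 2^{-m} \ln \prod_i o_i^{\,2^{n_i}} = \ln O,
  \qquad O = \Bigl(\prod_i o_i^{\,2^{n_i}}\Bigr)^{2^{-m}} \\
p &= \operatorname{squash}(\ln O) = \frac{O}{1+O}
\end{align*}
```

Under that reading, each o_i is one division, raising it to the power 2^{n_i} is n_i squarings, the 2^m-th root is m nested square roots, and the final p is one more division, so no exp or log call is needed.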
Quote:
there is little benefit to this compared to counter space mixing when you use the right discounting and boosting heuristics.
Care to share what these are?
Quote:
Compressed in isolation it ends up being 308.24bytes.
I get 312 bytes with my packer (355 bytes with gzip). The ballpark seems to be the same.
ts: It is very interesting that you ended up finding similar results going down that path. I experimented with logistic mixing at some point, but never managed to get worthwhile improvements compared to the counter mixing approach we were already using.
The heuristics are:
Code:
Context update: c[bit]++, c[!bit] = (c[!bit] + 1) >> 1
Add model counters to accumulator: n[i] += c[i] * (c[0]*c[1] ? 1 : boost_factor)
boost_factor is a parameter optimized by the compressor. It is usually 4-10.
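For readers who want to play with these two rules, here is a minimal C sketch of them. The struct layout and the names Counter, counter_update and accumulate are made up for illustration; how the accumulated n[0] and n[1] are turned into a coding probability is not described in the thread and is left out.

```c
#include <assert.h>

/* One counter pair per context: c[0] counts zeros seen, c[1] counts ones. */
typedef struct { unsigned c[2]; } Counter;

/* Context update: increment the counter of the observed bit and
   discount the opposite counter, rounding the halving up. */
void counter_update(Counter *ctr, int bit)
{
    assert(bit == 0 || bit == 1);
    ctr->c[bit]++;
    ctr->c[!bit] = (ctr->c[!bit] + 1) >> 1;
}

/* Add one model's counters to the accumulator n. A context in which
   one of the two counters is still zero gets boosted by boost_factor. */
void accumulate(unsigned n[2], const Counter *ctr, unsigned boost_factor)
{
    unsigned scale = (ctr->c[0] && ctr->c[1]) ? 1 : boost_factor;
    n[0] += ctr->c[0] * scale;
    n[1] += ctr->c[1] * scale;
}
```

Because the halving rounds up, a counter that has reached 1 never drops back to 0, so the boost only applies to contexts that have seen a single bit value so far. For example, after two 1 bits the pair is {0, 2} and contributes {0, 8} with boost_factor 4; after a following 0 bit it becomes {1, 1} and contributes unboosted.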
Well, I finished something that I am happy with, but I won't be able to enter it at Assembly. Good luck to yzi and ts and everyone else in the 1k compo. Sorry I can't compete with you, but maybe next year I can plan it better to have something ready for Assembly.
I guess I will hold on to my 1k and release it at a local party here in Australia in October.
Just want to say thanks again to Mentor and Blueberry, this Crinkler update gave me motivation and inspiration to create another tiny intro.
1021 bytes compression breakdown:

  
(Since I love resurrecting this thread again and again)
I have now made my compressor open source (Simplified BSD).
Stolen ideas but fresh code. :P
Although we have not used it for our 1k stuff yet, our 4k for ASM15 used an early version of it. Now the decompressor is small enough to make it feasible for 1k as well.
http://www.pouet.net/prod.php?which=66926
Go and experiment, compare (and report your results)
  




