CRINKLER - Compressing linker for Windows specialized for 4k intros

Aske Simon Christensen "Blueberry/Loonies"
Rune L. H. Stubbe "Mentor/TBC"

Version 2.3 (July 21, 2020)

Web: http://crinkler.net/
GitHub: https://github.com/runestubbe/Crinkler
Forum: http://www.pouet.net/prod.php?which=18158
Mail: authors@crinkler.net


VERSION HISTORY
---------------

21.07.20: 2.3: Crinkler is now open source under the Zlib license!
               /TINYHEADER size reduced by another 3 bytes.
               Faster hash size optimization, especially with many cores.
               Include function name in unresolved symbol error message.
               Automatically import from msvcrt. No need for lib any more.
               Built-in, minimal console startup code supplying args to main.
               Disable built-in startup and msvcrt import via /NODEFAULTLIB.
               Support more kinds of lib files (from rustc, for instance).
               Don't crash on missing export table (for Wine kernel32.dll).
               Only write one dump file if several threads crash.

15.06.19: 2.2: /TINYHEADER decompression code is 6 bytes smaller.
               Fixed memory size increase of recompressed /TINYHEADER intros.

19.01.19: 2.1a: Fixed width of report to make room for the 32 hex columns.
                /REUSEMODE:WRITE to write the reuse file without reading it.

18.12.18: 2.1: Crinkler executable built for both 32 bit and 64 bit.
               New, slightly different model estimation. 8-12x speedup.
               Optimized section reordering. About 3-4x speedup.
               Optimized and multi-threaded hash size optimization.
               New /COMPMODE:VERYSLOW option for a few extra bytes.
               Changed default compression mode to SLOW.
               Changed default HASHSIZE to 500 and HASHTRIES to 100.
               /REUSE option: Use models and ordering from last run.
               /REUSEMODE:STABLE to quickly iterate when making changes.
               /REUSEMODE:IMPROVE to improve upon previous compression.
               Print output file size in report.
               More compact bits-per-byte color legend in report.
               Choose configuration instead of hiding/showing in report.
               32 column hex view and other adjustments in report.
               Avoid crash if an existing file could not be opened.
               Updated internal function list to Windows version 1809.

28.03.18: 2.0a: Fixed Crinkler crash on recent Windows SDK versions.
                Fixed Crinkler crash on forwards from ole32.dll.
                Corrected horizontal alignment issue in HTML report.
                Support forwarded RVA imports with /TINYIMPORT.
                Fixed spurious import of MessageBox with /TINYIMPORT.
                Print compatibility warning when using /TINYIMPORT.
                Updated internal function list to Windows version 1803.
                Extended description in the manual of /TINYIMPORT.
                Updated download link for the lib file for msvcrt.dll.

28.07.15: 2.0: /TINYHEADER option: smaller decompressor for 1k intros.
               /TINYIMPORT option: smaller import code for 1k intros.
               /EXPORT option to export code and data symbols.
               /SATURATE option to saturate context counters.
               /FALLBACKDLL option for when a DLL is not available.
               /UNALIGNCODE option to set alignment of all code to 1.
               Support for /REPLACEDLL during recompression.
               Consistent size between model estimation and reordering.
               Header size reduced by 2 bytes.
               Print previous size of output file.
               Accept version specifier after /SUBSYSTEM value.
               Switched from Intel OpenMP to MSVC concurrency API.

19.01.13: 1.4: Output EXE files work with recent NVIDIA drivers.
               New zero-section header layout saving around 30-50 bytes.
               Forwarded RVA imports supported via link-time forwarding.
               Dynamic C++ initializers supported.
               Support for producing Large Address Aware executables.
               Crinkler is Large Address Aware, handling larger inputs.
               Report all unresolved symbols and the location of each.
               Better resolving of ambiguous label references in report.
               Various adjustments to textual output.
               /RECOMPRESS overwrites input file by default.

05.03.11: 1.3: Fixed Crinkler crash on some AMD systems.
               Header size reduced by 21 bytes.
               Slightly improved model hash function.
               /OVERRIDEALIGNMENTS option to specify label alignments.
               No limit on the number of calls in call transform.
               Import code and entry point movable by section reordering.
               Fixed bug in handling of files with absolute path.
               Fixed labels in report showing up in the wrong section.
               Crinkler writes .dmp files in case of a crash.

05.09.09: 1.2: Output EXE files are now Windows 7 compatible.
               Output EXE files are no longer Windows 2000 compatible.
               Header size reduced by 16 bytes.
               Non-range import code is (usually) slightly smaller.
               Slightly improved section ordering estimation.
               /RECOMPRESS option to recompress Crinkler-compressed executables, optionally with different parameters.
               /FIX removed, as it is subsumed by /RECOMPRESS.

14.01.09: 1.1a: Fixed /TRUNCATEFLOATS crashing in some cases.
                Improved /ORDERTRIES estimation when call transform is used.
                Sometimes sections were misplaced in the HTML report.
                Various improvements to the HTML report.
                The /FIX option can input and output to the same file.
                Helpful error messages for various unsupported features.
                Prefer a custom entry point to a standard library one.
                New section in the manual about runtime libraries.

12.01.08: 1.1: Support for weak externals (virtual C++ destructors).
               Fixed compatibility with Data Execution Prevention.
               /REPORT option for a colorful HTML compression report.
               /TRUNCATEFLOATS option to mutilate float constants.
               /SAFEIMPORT is now default, disabled with /UNSAFEIMPORT.
               Slightly smaller overhead if range importing is not used.
               Fixed some problems with compressing very small files.
               /VERBOSE:FUNCTIONS removed, as it is subsumed by /REPORT.
               Remaining /VERBOSE options renamed to /PRINT.
               Maximum number of ORDERTRIES increased to 100000.

07.01.07: 1.0a: New /VERBOSE:FUNCTIONS options to sort the functions.
                Various verbose output fixes.
                Various crash fixes.
                A fix to the /FIX Crinkler version recognizer.

27.12.06: 1.0: Output EXE files are now Windows Vista compatible.
               Compression tweak for greatly improved compression ratio.
               Much faster compression.
               Automatically takes advantage of multiple processors.
               Improved Visual Studio 2005 integration.
               /COMPMODE:INSTANT option for very quick compression.
               /ORDERTRIES option to try out different section orderings.
               /SAFEIMPORT option to insert a check for nonexistent DLLs.
               /PROGRESSGUI option for a graphical progress bar.
               /REPLACEDLL option to replace one DLL with another.
               /FIX option to fix compatibility problems of older versions.

09.02.06: 0.4a: Fixed linker crash problem with blank member entries in some library files (such as glut32).
                The /PRIORITY option was not mentioned in the commandline usage help.

18.12.05: 0.4: Changed header and import code to make output EXE files compatible with 64-bit versions of Windows.
               Fixed a bug in the ordinal range import mechanism.
               Added a switch to control the process priority.
               Added a warning for range import of an unused DLL.
               Some more header squeezing.

31.10.05: 0.3: Output EXE files are now Windows 2000 compatible.
               Added a number of verbose options to output useful information about the program being compressed.
               Added an option for transforming function calls to use absolute offsets to improve compression.
               Fixed a bug in the linker regarding identically named sections.
               Fixed a potential crash bug in the linker.
               Various small tweaks and optimizations.

23.07.05: 0.2: Fixed bug in the decompressor.
               Changed the behaviour of the /CRINKLER option.
               Added timing to the progress bars.
               Some updates to the manual and usage description.

21.07.05: 0.1: First release.


BACKGROUND
----------

Ever since the concept of size-limited demo competitions was introduced in the early 1990's (and before that as well), people have been using executable file compressors to reduce the size of their final executables. An executable file compressor is a program that takes as input an executable file and produces a new executable file which has the same behaviour as the original one but is (hopefully) smaller.
The usual technique employed by executable file compressors is to compress the contents of the executable file using some general purpose data compression method and prepend to this compressed data a small piece of code (the decompressor) which decompresses the contents into memory in such a way that it looks to the code as if the original executable file had been loaded into memory in the normal way. The size of the decompressor is usually around a few hundred bytes, depending on the complexity of the compression method. This constitutes an unavoidable overhead in the compressed file, which is particularly evident for small files, such as 4k intros. Furthermore, the header of the Windows EXE file format contains a lot of information that needs to be there at fixed offsets in order for Windows to be able to load the file. The presence of these overheads from the header and decompressor motivated people to look for other means of compressing their 4k intros. Until Crinkler came around, the most popular strategy for compressing 4k intros for Windows was CAB dropping: A few simple transformations are performed on the executable to make it compress better (such as merging sections and setting unused header fields to zero), and the result is compressed using the Cabinet Compression tool included with Windows. The resulting .CAB file is renamed to have .BAT extension, and some commands are inserted into the file such that when the .BAT file is executed, it decompresses the executable to disk (using the Cabinet decompression command), runs the executable and then deletes the executable again. This saves the size of the decompression code (since an external program is used to do the decompression) and some of the size of the header (since the header can be compressed). Various dropping strategies combined with other space-saving hacks people employed on their 4k intros (in particular import by ordinal) caused severe compatibility problems. 
More often than not, people who wanted to run a newly released 4k intro found that it did not work on their own machine. It became customary to include a 'compatible' version in the distribution which was larger than 4k but worked on all machines. For a time, it seemed that the term '4k intro' meant '4k on the compo machine' intro. The main motivation for starting the Crinkler project was the feeling that the existing means available for compressing 4k intros were unsatisfactory. We want 4k intros that are self-contained EXE files. We want 4k intros that are 4 kilobytes in size. Our aim for Crinkler is to be the cleanest, most effective and most compatible executable file compressor for Windows 4k intros. COMPATIBILITY ------------- The goal of Crinkler is for the produced EXE files to be compatible with all widely used Windows versions and configurations. As of version 2.0a, the EXE files produced by Crinkler are, to the best of our knowledge, compatible with Windows XP, Windows Vista, Windows 7, Windows 8 and Windows 10, both 32 bit and 64 bit versions. They are compatible with Data Execution Prevention and with execution hooks that inspect the import or export table of launched executables (graphics drivers are known to do this). It is not a primary goal of Crinkler to anticipate incompatibilities that may arise in the future as a consequence of new Windows versions, graphics drivers or other widespread system changes. Guaranteeing such compatibility would require Crinkler to follow the EXE file format specification to the letter, precluding most of the header hacks that Crinkler utilizes in order to reduce the size overhead of the EXE format as much as possible. Rather, we strive to continually monitor the compatibility situation and release a new, fixed version of Crinkler whenever a situation arises that affects the compatibility severely (such as a new, incompatible version of Windows). 
This has occurred several times already throughout the history of Crinkler. Each new version of Crinkler not only produces executables that are compatible with the current majority of targeted systems. It also includes a way of fixing old Crinkler executables to have the same level of compatibility. See the section on recompression for more details on this feature. This compatibility strategy ensures that intros made using Crinkler will continue to be accessible to their audience, even if the Windows EXE loader changes in an incompatible way that could not be anticipated at the time the intro was produced. INTRODUCTION ------------ Crinkler is a different approach to executable file compression. While an ordinary executable file compressor operates on the executable file produced by the linker from object files, Crinkler replaces the linker by a combined linker and compressor. The result is an EXE file which does not do any kind of dropping. It decompresses into memory like a traditional executable file compressor. Crinkler employs a range of techniques to reduce the size of the resulting EXE file beyond what is usually obtained by using CAB compression: - Having control over the linking step gives much more flexibility in the optimizations and transformations possible on the data before and after compression. - The compression technique used by Crinkler is based on context modelling, which is far superior in compression ratio to the LZ variants used by CAB and most other compressors. The disadvantage of context modelling is that it is extremely slow, but this is of little importance when only 4 kilobytes need to be compressed. It also needs quite a lot of memory for decompression, but this is again not a problem, since the typical 4k intro uses a lot of memory anyway. - The actual compression algorithm performs many passes over the data in order to optimize the internal parameters of the compressor. 
This results in slower compression, but this is usually a reasonable price to pay for the extra bytes gained on the file size. - The contents of the executable are split into two parts - a code part and a data part - and each of these are compressed individually. This leads to better compression, as code and data are usually very different in structure and so do not benefit from being compressed together. - DLL functions are imported by hash code. This is robust to structural changes to the DLL between different versions while being quite compact - only 4 bytes per imported function. For DLLs with fixed relative ordinals (such as opengl32), a special technique, ordinal range import, can be used to further reduce the number of hash codes needed. - Much of the data in the EXE header is actually ignored by the EXE loader. This space is used for some of the decompression code. Using Crinkler is somewhat different from using an ordinary executable file compressor because of the linking step. In the following sections, we describe its use in detail. INSTALLATION ------------ To use as a stand-alone linker, Crinkler does not need any installation. Simply run crinkler.exe from the commandline with appropriate arguments, as described in the next section. However, if you are using Microsoft Visual Studio to develop your intro, the easiest way to use Crinkler is to run it in place of the normal Visual Studio linker. Crinkler has been designed as a drop-in replacement of the Visual Studio linker, supporting the same basic options. All of the options can then be set using the Visual Studio configuration window. Unfortunately, Visual Studio does not (as of this writing) support replacing its linker by a different one. So what you have to do to make Visual Studio use Crinkler for linking is the following: - Copy crinkler.exe to your project directory or to some other directory of your choice and rename it to link.exe. 
If you are using some other linker with a different name, such as the one used with the Intel C++ compiler, call it whatever the name of the linker is. - For Visual Studio 2008 and older, select Tools/Options... and go to Projects and Solutions/VC++ Directories. For Visual Studio 2010 or newer, open a project, select View/Property Manager, expand a project and a configuration, double click on Microsoft.Cpp.Win32.user and go to Common Properties/VC++ Directories. - At the top of the list for Executable files, add the directory where you placed Crinkler named link.exe, or add $(SolutionDir) to make it search in the project directory. - In the Release configuration (or whichever configuration you want to enable compression), under Linker/Command Line/Additional Options, type in /CRINKLER, along with any other Crinkler options you want to set. See the next section for more details on options. Also set Linker/Manifest File/Generate Manifest to No and C/C++/Optimization/Whole Program Optimization to No. If you have Visual Studio installed but want to run Crinkler from the commandline, the easiest way is to use the Visual Studio Command Prompt (available from the Start menu), since this sets up the LIB environment variable correctly. You can read off the value of the environment variables by running the 'set' command in this command prompt. If you are using a different command prompt, you will have to set up the LIB environment variable manually, or use the /LIBPATH option. USAGE ----- The general form of the command line for Crinkler is: CRINKLER [options] [object files] [library files] [@commandfile] When running from within Visual Studio, the object files will be the ones generated from the sources in the project. The library files will be the standard set of Win32 libraries, plus any additional library files specified under Linker/Input/Additional Dependencies. Crinkler automatically links to msvcrt.dll - the Visual Studio 6 runtime library. 
You don't need to specify a library file for this. The following options are compatible with the VS linker and can be set using switches in the Visual Studio configuration window: /SUBSYSTEM:CONSOLE /SUBSYSTEM:WINDOWS (Linker/System/SubSystem) Specify the Windows subsystem to use. If the subsystem is CONSOLE, a console window will be opened when the program starts. The subsystem also determines the name of the default entry point (see /ENTRY). The default subsystem is WINDOWS. /LARGEADDRESSAWARE /LARGEADDRESSAWARE:NO (Linker/System/Enable Large Addresses) Specify whether the executable is able to handle addresses above 2 gigabytes. If this option is enabled, the executable will be able to allocate close to 4 gigabytes of memory. /OUT:[file] (Linker/General/Output File) Specify the name of the resulting executable file. The default name is out.exe. /ENTRY:[symbol] (Linker/Advanced/Entry Point) Specify the entry label in the code. The default entry label is mainCRTStartup for CONSOLE subsystem applications and WinMainCRTStartup for WINDOWS subsystem applications. If you use the CONSOLE subsystem without changing the entry point from its default value and don't define mainCRTStartup yourself, Crinkler will insert a small entry point that will read the commandline and call main with the usual argc and argv parameters. /NODEFAULTLIB (Linker/Input/Ignore All Default Libraries) Disable the automatic linking against msvcrt.dll and the automatic insertion of mainCRTStartup for CONSOLE applications. /LIBPATH:[path] (Linker/General/Additional Library Directories) Add a number of directories (separated by semicolons) to the ones searched for library files. If a library is not found in any of these, the directories mentioned in the LIB environment variable are searched. @commandfile Commandline arguments will be read from the given file, as if they were given directly on the commandline. 
In addition to the above options, a number of options can be given to control the compression process. These can be specified under Linker/Command Line/Additional Options:

/CRINKLER

Enable the Crinkler compressor. If this option is disabled, Crinkler will search through the path for a command with the same name as itself, skipping itself, and pass all arguments on to this command instead. This will normally invoke the Visual Studio linker. If the name of the Crinkler executable is crinkler.exe, this option is enabled by default, otherwise it is disabled by default.

/RECOMPRESS

Decompress a Crinkler-compressed executable and recompress it using the given options. The resulting executable will have the same level of compatibility as one produced directly by the current version of Crinkler. See the section on compatibility for more information on the compatibility of Crinkler-produced executables. When this option is specified, Crinkler takes a single file argument, which must be an EXE file produced by Crinkler 0.4 or newer. See the section on recompression below for a description of the options that can be given to control the recompression process.

/PRIORITY:IDLE
/PRIORITY:BELOWNORMAL
/PRIORITY:NORMAL

Select the process priority at which Crinkler will run while compressing. The default priority is BELOWNORMAL. Use IDLE if you want Crinkler to disturb you as little as possible. Use NORMAL if you don't need your machine for anything else while compressing.

/COMPMODE:INSTANT
/COMPMODE:FAST
/COMPMODE:SLOW
/COMPMODE:VERYSLOW

Choose between four different algorithms for the model estimation. The FAST compression mode performs a very quick estimation, whereas the SLOW mode takes up to some tens of seconds for a typical 4k, but also compresses significantly better. VERYSLOW is about 5-10x slower than SLOW and typically a few bytes better. INSTANT skips model estimation entirely and just uses a fixed set of models and weights.
It also skips section reordering and hash table size optimization. Use INSTANT if you just want to check that your program works in compressed form and don't care about the size. The default compression mode is SLOW.

/SATURATE

The compressor and decompressor use pairs of 8-bit counters to track the distributions of 0 and 1 bits for each context. If your data is very repetitive (contains large blocks of the same pattern of values repeated over and over again), these counters may wrap around, which can sometimes hurt compression of these repetitive areas. This option inserts extra code in the decompression header to keep these counters from wrapping. It is worth trying out if you have large, repetitive regions and see in the compression report that the data in these regions suddenly jumps up from lightest green to slightly darker green for no apparent reason.

/HASHSIZE:[memory size]

Specify the amount of memory the decompressor is allowed to use while decompressing, in megabytes. In general, the more memory the decompressor is allowed to use, the better the compression ratio will be, though only slightly. The memory requirements of the final executable (the size of the executable image when loaded into memory) will be the maximum of this value and the original image size. The memory will not be deallocated until the program terminates, and any heap allocation the program performs will add to this memory usage. The default value is 500, which is usually a good compromise.

/HASHTRIES:[number of retries]

Specify the number of different hash table sizes the compressor will try in order to find one with few collisions. More tries lead to longer compression time but slightly better compression. The default value is 100. Higher values rarely improve the size by more than a few bytes.

/TINYHEADER

Enables an alternative compression algorithm trading off some compression efficiency for an even smaller decompression overhead.
This can be beneficial when targeting extremely small file sizes such as 1kb. The simpler decompressor gathers statistics by repeated linear searches instead of hashing. This results in an O(n^2) decompression time which can become prohibitively slow for files significantly larger than 1kb. The COMPMODE, HASHSIZE, HASHTRIES, REUSE, SATURATE and EXPORT options are ignored when TINYHEADER is enabled.

/TINYIMPORT

Enables a more compact, but less future-proof, function importing scheme which does not require the explicit storage of function name hashes. This is achieved by indiscriminately importing every function from the relevant DLLs. The imported functions are scattered in an import table based on their function name hashes. Intuitively, this embeds the hash code entropy directly into the call instruction.

Crinkler ensures that the import table size and hash function are chosen such that there are no collisions between the functions used by the linked program and other functions which are imported later from the DLLs. This way, the desired function pointers will be intact in the import table. However, Crinkler can only ensure this for functions that it knows about. These include the functions present in the DLLs on the system on which Crinkler is run, plus an internal list consisting of functions from commonly imported DLLs covering most supported Windows versions available at the time of release (Spring Creators Update 2018 version 1803 as of Crinkler 2.0a).

Thus, this import technique is less resilient to changes in future Windows versions, since when functions are added in a future version of the DLL, they may collide with functions used by the program, in which case the program will cease to work. Programs broken this way cannot be fixed by recompression. When using this option, it is strongly recommended to also distribute safe versions using the normal import mechanism.

The UNSAFEIMPORT, FALLBACKDLL and RANGE options are ignored when TINYIMPORT is enabled.
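
The collision condition described for /TINYIMPORT can be illustrated with a small C sketch. It is only a model under stated assumptions: the hash is a stand-in (32-bit FNV-1a), since the manual does not specify Crinkler's hash function, and the helper names table_is_safe and smallest_safe_bits are hypothetical. Every export is written into a table of 2^bits slots in order, and the table counts as safe if no function used by the program is overwritten by an export written after it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in 32-bit hash (FNV-1a). ASSUMPTION: Crinkler's real hash differs. */
static uint32_t name_hash(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Return 1 if name appears in the list of functions used by the program. */
static int is_used(const char *name, const char *const *used, size_t n_used)
{
    for (size_t i = 0; i < n_used; i++)
        if (strcmp(name, used[i]) == 0)
            return 1;
    return 0;
}

/* Exports are written in order into a table of 2^bits slots, each at slot
   (hash & mask), overwriting whatever was there. Returns 1 if no used
   function is overwritten by an export written after it, 0 otherwise. */
int table_is_safe(const char *const *exports, size_t n_exports,
                  const char *const *used, size_t n_used, unsigned bits)
{
    assert(bits < 32);
    uint32_t mask = ((uint32_t)1 << bits) - 1;
    for (size_t i = 0; i < n_exports; i++) {
        if (!is_used(exports[i], used, n_used))
            continue;
        uint32_t slot = name_hash(exports[i]) & mask;
        for (size_t j = i + 1; j < n_exports; j++)
            if ((name_hash(exports[j]) & mask) == slot)
                return 0;
    }
    return 1;
}

/* Smallest safe table size exponent, or -1 if none exists up to max_bits. */
int smallest_safe_bits(const char *const *exports, size_t n_exports,
                       const char *const *used, size_t n_used,
                       unsigned max_bits)
{
    assert(max_bits < 32);
    for (unsigned bits = 0; bits <= max_bits; bits++)
        if (table_is_safe(exports, n_exports, used, n_used, bits))
            return (int)bits;
    return -1;
}
```

The sketch also shows why a future DLL version can break such a program: appending one new export whose slot coincides with a used function turns a previously safe table into an unsafe one, and nothing in the already compressed executable can detect this.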
/ORDERTRIES:[number of retries] Specify the number of section reordering iterations that the linker will try out in search for the ordering that gives the best compression ratio. The default is not to do any reordering. Crinkler starts from a heuristic ordering (the one used when initially estimating models) and incrementally makes small, random changes to the ordering to see if it can find one that compresses better. Specifying this option drastically increases the compression time, since Crinkler has to calculate the compressed size anew on every reordering. Usually, the size does not improve noticeably after a few thousand iterations. /REUSE:[reuse parameter file name] /REUSEMODE:STABLE /REUSEMODE:IMPROVE /REUSEMODE:WRITE /REUSEMODE:OFF After compression, write information about the selected models, the ordering of sections and the optimized hash table size to a text file with the specified name. If the file exists already, use the parameters in the file as input to the compression in a manner dependent on the chosen REUSEMODE: With STABLE (the default), skip all model estimation, section reordering and hash table size optimization and simply use the parameters exactly as in the file. Keep the reuse file as is. This option can be used to try out small changes to the contents of the code and data with a stable compression. Thus, it gives a much more reliable estimation of whether the change was an improvement or not. It is also useful as a way to compress very quickly after the first time with a similar compression ratio. With IMPROVE, only the section ordering from the file is reused, and a normal compression procedure is performed. If section reordering is enabled, it starts from the ordering in the reuse file and tries to optimize the ordering based on that. 
The file is written back only if the final file size is smaller than what the parameters in the reuse file would have given (which is not necessarily the size of the existing file, depending on what changes and operations are performed in the meantime). The option can be used to check whether better parameters can be found than the ones cached in the reuse file. It is also a way to run some extra reordering iterations (if reordering is enabled) to see if this improves compression. For both modes, it can be useful to edit the reuse file by hand to try out parameters manually or to nudge Crinkler in some direction. With WRITE, the reuse file is not read, but is still written after compression, overwriting the file if it exists. This can be conveniently used when reuse is not desired, such that it can be switched on at any time (by changing the reuse mode to STABLE or IMPROVE) without needing another compression run. With OFF, it is as if no reuse file is specified. This is simply a way to disable the option without removing the file from the commandline. If COMPMODE is set to INSTANT, the reuse mode is also considered to be OFF. /RANGE:[DLL name] Import functions from the given DLL (without the .dll suffix) using ordinal range import. Ordinal range import imports the first used function by hash and the rest by ordinal relative to the first one. Ordinal range import is safe to use on DLLs in which the ordinals are fixed relative to each other, such as opengl32 or d3dx9_??. This option can be specified multiple times, for different DLLs. /REPLACEDLL:[oldDLL]=[newDLL] Whenever a function is imported from oldDLL, import it from newDLL instead. DLL replacement is useful when the end user might not have the version of the DLL that you are linking to. A typical use is to replace one version of d3dx9_?? by another. Only use this option if you know that the two DLLs are compatible. When REPLACEDLL and RANGE are used together, RANGE must refer to the new DLL. 
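
The ordinal range import described under /RANGE can be pictured with the following C sketch. It models a DLL as a plain array of names whose index plays the role of the ordinal; the struct toy_dll, the functions find_by_hash and import_range, and the FNV-1a hash are all assumptions made for illustration and do not reflect Crinkler's actual import code or hash function. Only the first function costs a hash; the others are found by their fixed distance from it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in 32-bit hash (FNV-1a). ASSUMPTION: not Crinkler's actual hash. */
static uint32_t name_hash(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Toy model of a DLL: names[i] is the export with ordinal index i. */
struct toy_dll {
    const char *const *names;
    size_t count;
};

/* Ordinal index of the first export whose name hashes to hash, or -1. */
long find_by_hash(const struct toy_dll *dll, uint32_t hash)
{
    for (size_t i = 0; i < dll->count; i++)
        if (name_hash(dll->names[i]) == hash)
            return (long)i;
    return -1;
}

/* Resolve a range: the first function by hash, all others by their ordinal
   offset from the first. Writes ordinal indices to out. Returns 0 on
   success, or -1 if the hash is not found or an offset leaves the table. */
int import_range(const struct toy_dll *dll, uint32_t first_hash,
                 const unsigned *offsets, size_t n, long *out)
{
    assert(dll != NULL && offsets != NULL && out != NULL);
    long base = find_by_hash(dll, first_hash);
    if (base < 0)
        return -1;
    for (size_t i = 0; i < n; i++) {
        size_t ordinal = (size_t)base + offsets[i];
        if (ordinal >= dll->count)
            return -1;
        out[i] = (long)ordinal;
    }
    return 0;
}
```

The offsets are only meaningful if the relative ordinals never change between DLL versions, which is why the option is restricted to DLLs with fixed relative ordinals.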
/FALLBACKDLL:[firstDLL]=[otherDLL] If firstDLL fails to load, try loading otherDLL and import the functions from there instead. For instance, to use d3dcompiler_47 when available but fall back to d3dcompiler_43 otherwise (since the shader compiler in d3dcompiler_47 is much faster), link to d3dcompiler_47 and use: /FALLBACKDLL:d3dcompiler_47=d3dcompiler_43 The FALLBACKDLL option can be used together with REPLACEDLL to specify a primary DLL other than the one your SDK links to. For instance, if you are using the legacy DirectX SDK (which links to d3dcompiler_43) and want to have the above prioritization, use: /REPLACEDLL:d3dcompiler_43=d3dcompiler_47 /FALLBACKDLL:d3dcompiler_47=d3dcompiler_43 Arbitrarily long chains of DLL fallback can be used by specifying the FALLBACKDLL option multiple times, though the chains can of course not be cyclic. /EXPORT:[name] /EXPORT:[name]=[symbol] /EXPORT:[name]=[value] Include an export table into the executable, containing an export with the given name. The first version exports an existing symbol under its existing name. The second version exports an existing symbol under a different name. The third version creates a 32-bit integer with the given value and exports it under the given name. The value can be specified in octal (prefixed with 0), decimal or hexadecimal (prefixed with 0x) format. The first version is compatible with the VS linker, but there is currently no specific field for it in the configuration window. The export table will be compressed along with the other data in the executable and decompressed to the memory address specified in the export table pointer in the PE header. Thus, the exports defined this way are only visible to code inspecting the export table after decompression has taken place. For PE header technical reasons, all exports must be placed earlier in memory than the export table. Thus, only symbols in the code and data sections can be exported. 
If an uninitialized (BSS) symbol is exported, it will be automatically moved to the data section (with a warning). Beware that this will move the whole section containing the symbol, so other symbols might be moved along with it. The EXPORT option can be used to signal to the graphics driver that your program desires to run on the high-performance GPU in a multi- GPU system. This saves the user from having to right-click on the executable and select "Run with graphics processor...". To request high performance on NVIDIA Optimus systems, use: /EXPORT:NvOptimusEnablement=1 To request high performance on AMD PowerXpress/Enduro systems, use: /EXPORT:AmdPowerXpressRequestHighPerformance=1 An arbitrary number of exports can be specified, so the two high performance declarations can be used together if you have space enough to spare. /UNSAFEIMPORT If the executable fails to load some DLL, it will normally pop up a message box with the DLL name. This option disables this check to save a few bytes (usually around 20). With unsafe import, the executable will crash if a needed DLL is not found. /TRANSFORM:CALLS Change the relative jump offsets in all internal call instructions (E8 opcode) into absolute offsets from the start of the code. This usually improves compression, since multiple calls to the same function become identical. The transformation has an overhead of about 20 bytes for the detransformation code, but the net savings on a full 4k can be as large as 50 bytes, depending on the number of calls in your code. /NOINITIALIZERS Disable the inclusion of dynamic C++ initializers. The default is to insert calls to each of the initializers just before the entry point. /TRUNCATEFLOATS:[number of bits] Floating point constants can take up a significant amount of space in an intro, and often much of this space is wasted because the constants have more precision than needed. 
Typically, many bytes can be saved by rounding floating point constants to "nice" values - that is, values where many bits in the mantissa are zero. However, such rounding is cumbersome, especially when the constants are written in decimal notation.

The purpose of the /TRUNCATEFLOATS option is to automate this rounding process. When this option is given, Crinkler tries to identify float and double constants and round them to the number of bits given (between 1 and 64). If no number is given, 64 is assumed.

Typically, object files do not contain any information about what data is floating point constants and what is not (though the file format does support such information). This means that in order to identify floating point constants, Crinkler has to resort to heuristics based on label names. These heuristics are able to recognize constants in code and some variables, but far from all.

You can tell Crinkler explicitly that some variable contains float data and how much it should be truncated by having the variable name (or label) start with tf[n]_ where [n] is the number of bits to truncate the constants to. The number of bits can be omitted, in which case the number of bits given in the argument to /TRUNCATEFLOATS is used. Such variables will still only be truncated if the /TRUNCATEFLOATS option is given.

Example:

    const float tf14_positions[] = { 0.1f, 0.35f, 0.25f };

This will truncate the constants in the table to 14 bits (5 bits of mantissa), resulting in the values 0.099609375, 0.3515625 and 0.25, respectively.

Tip: rather than changing the variable name and all references to it each time you want to change the truncation precision, use a define:

    #define positions tf14_positions

Note that /TRUNCATEFLOATS is an unstable and highly experimental feature. Make sure to test the compressed file to verify that the result is acceptable. Remember to include the musician in this verification process.
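
To make the rounding concrete, here is a minimal C sketch of truncating a 32-bit float to a given number of bits. It is not Crinkler's actual code: the function name truncate_float is made up for illustration, and round-to-nearest is an assumption, chosen because it reproduces the three values quoted for tf14_positions. Double constants are left out of the sketch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of Crinkler): keep the top `bits` bits
   of the 32-bit representation of `value` (sign, exponent, then
   mantissa) and round the rest away to the nearest representable value. */
static float truncate_float(float value, int bits)
{
    uint32_t raw, drop;

    if (bits >= 32) return value;   /* a float has nothing more to drop */
    if (bits < 1) bits = 1;         /* clamp to the smallest allowed value */

    memcpy(&raw, &value, sizeof raw);       /* reinterpret the float as bits */
    drop = 32u - (uint32_t)bits;            /* number of low bits to discard */
    raw += (uint32_t)1 << (drop - 1u);      /* add half a unit: round to nearest */
    raw &= ~(((uint32_t)1 << drop) - 1u);   /* clear the discarded bits */
    memcpy(&value, &raw, sizeof raw);
    return value;
}
```

With 14 bits, truncate_float maps 0.1f, 0.35f and 0.25f to 0.099609375, 0.3515625 and 0.25. A carry from the rounding step may propagate into the exponent, which simply rounds the value up to the next power of two.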
:)

/OVERRIDEALIGNMENTS:[bits of alignment]

It is often possible to improve compression by placing uninitialized variables at addresses divisible by high powers of two, since this will cause all references to these addresses to contain more zeros. The PE file format only supports up to 13 bits of alignment (8192), and some tools do not even expose this support fully (for instance, Nasm only supports alignments up to 64). Usually, much higher alignments are desirable.

Crinkler supports explicit alignment of labels at up to one gigabyte (30 bits). When you specify the /OVERRIDEALIGNMENTS option, Crinkler will look for labels containing the string align[n] where [n] is the number of bits of alignment desired (e.g. 8 for 256-byte alignment). It will then align the section containing that label such that the label address is divisible by 2^[n]. The label does not have to be at the beginning of the section, but there can be at most one explicitly aligned label in each section.

The alignment specifier can optionally include an alignment offset, specified by the string align[n]_[m] where [n] is the number of bits of alignment and [m] is the offset in bytes. This will place the label [m] bytes after an aligned address, i.e. such that the address minus [m] is divisible by 2^[n].

If a numerical argument is given to /OVERRIDEALIGNMENTS, all uninitialized sections which do not contain an explicitly aligned label will be aligned to the given number of bits (if larger than their original alignment). If the option is specified without argument, uninitialized sections which do not contain an explicitly aligned label will be aligned as specified in the object file, as normally.
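
The alignment rule for align[n] and align[n]_[m] labels can be sketched in C as follows. This is only an illustration, not Crinkler's implementation: parse_align_label and place_label are hypothetical names, the parser only looks at the first occurrence of "align" in the label, and the addresses used in the note after the code are made up.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: find "align[n]" or "align[n]_[m]" in a label name.
   Returns 1 and fills in bits/offset on success, 0 if nothing usable is found. */
static int parse_align_label(const char *label, int *bits, uint32_t *offset)
{
    const char *p = strstr(label, "align");
    unsigned m = 0;
    int n = 0;
    int fields;

    if (p == NULL) return 0;
    fields = sscanf(p, "align%d_%u", &n, &m);      /* 1 = bits only, 2 = bits and offset */
    if (fields < 1 || n < 0 || n > 30) return 0;   /* the manual allows up to 30 bits */
    *bits = n;
    *offset = (fields == 2) ? (uint32_t)m : 0u;
    return 1;
}

/* Hypothetical helper: lowest address >= min_addr such that
   (address - offset) is divisible by 2^bits. */
static uint32_t place_label(uint32_t min_addr, int bits, uint32_t offset)
{
    uint32_t mask = ((uint32_t)1 << bits) - 1u;
    uint32_t base = (min_addr - offset + mask) & ~mask;   /* round up to a multiple of 2^bits */
    return base + offset;
}
```

For a label named MusicBuffer_align24, parse_align_label yields 24 bits and offset 0. For align8_4 and an earliest possible address of 0x00420123, place_label returns 0x00420204, since 0x00420204 minus 4 is divisible by 256.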
A convenient way to specify explicit alignments in C++ code is in a header file included by all files in the project, containing definitions like this:

    #define MusicBuffer MusicBuffer_align24

In assembler files, alignments can be specified as local labels:

    MusicBuffer:
    .align24
      ; buffer space here

Explicit alignment can be used on code and data sections as well, except for the section containing the entry point, which will always be 1-byte aligned. The space between the sections will be padded with zero bytes.

/UNALIGNCODE

Force all code sections to use alignment of 1, eliminating all padding between them. This usually improves compression, but can result in slightly lower performance if some functions are called in performance critical loops. The /OVERRIDEALIGNMENTS mechanism has priority over /UNALIGNCODE, so if you want to exempt a few functions from being unaligned, you can specify an explicit alignment for these as described for /OVERRIDEALIGNMENTS.

Finally, Crinkler has a number of options for controlling the output during compression. Just like the other options, these can be specified under Linker/Command Line/Additional Options:

/REPORT:[HTML file name]

Write an HTML file with a detailed, colorful, interactive report on the compression result. The code section will be shown as hex dump and disassembly of the code, and the data section will be shown as hex and ascii dump. All bytes will be colored to show how much that byte was compressed. This report can be useful in determining which parts of the executable take up the most space and which things to change to reduce the size.

/PRINT:LABELS

Print a list of all labels in the program along with uncompressed and compressed sizes for the data between the labels. This is a stripped down version of the information provided by the /REPORT option.

/PRINT:IMPORTS

List all functions imported from DLLs. The functions are grouped by DLL, and functions imported by ordinal range import are grouped into ranges.
/PRINT:MODELS

List the model masks and weights selected by the compressor. This is mostly for internal use.

/PROGRESSGUI

Open a window showing a graphical progress indicator.

An example command line for linking and compressing an intro could look like this (split on multiple lines for readability):

    crinkler.exe /OUT:micropolis.exe /SUBSYSTEM:WINDOWS /RANGE:opengl32
      /COMPMODE:SLOW /ORDERTRIES:1000 /PRINT:IMPORTS /PRINT:LABELS
      kernel32.lib user32.lib gdi32.lib opengl32.lib glu32.lib winmm.lib
      micropolis\startup.obj micropolis\render.obj micropolis\render-asm.obj
      micropolis\sound.obj micropolis\sound-asm.obj

RECOMPRESSION
-------------

A new feature in Crinkler 1.2 is the ability to recompress an already Crinkler-compressed executable. The main purpose for the feature is to patch an executable compressed using an earlier version of Crinkler so that it runs on recent Windows versions. But it can also be used more generally to change some of the compression parameters of a compressed program without performing the whole linking and compression process from scratch and without access to the original object files.

Particularly, if your output executable after a long time spent compressing is just a few bytes too big due to bytes lost to hashing, you can recompress the output executable, specifying a higher value for /HASHSIZE and/or /HASHTRIES, and thus avoid running through the whole compression process again.

Recompression mode is activated by the /RECOMPRESS option. When this option is specified, Crinkler takes a single file argument, which must be an EXE file produced by Crinkler 0.4 or newer. Most options then take on slightly different meanings, as described here.

The /CRINKLER, /PRIORITY, @commandfile and /PROGRESSGUI options work as normally.
The /ENTRY, /LIBPATH, /ORDERTRIES, /RANGE, /FALLBACKDLL, /UNSAFEIMPORT, /TRANSFORM:CALLS, /NOINITIALIZERS, /TRUNCATEFLOATS, /OVERRIDEALIGNMENTS, /UNALIGNCODE, /TINYHEADER and /TINYIMPORT options are ignored, as the parameters specified by these options cannot be changed via recompression. The /PRINT options are also ignored.

The remaining options work as follows:

/SUBSYSTEM:CONSOLE
/SUBSYSTEM:WINDOWS

If this option is given, it specifies the Windows subsystem to use as normally. If it is omitted, the original subsystem will be used.

/LARGEADDRESSAWARE
/LARGEADDRESSAWARE:NO

If this option is given, it specifies large address awareness of the executable as normally. If it is omitted, the original large address awareness will be used.

/OUT:[file]

Specify the name of the resulting executable file. The default is to overwrite the input file.

/COMPMODE:INSTANT
/COMPMODE:FAST
/COMPMODE:SLOW
/COMPMODE:VERYSLOW

If this option is specified, the compression models will be reestimated using the specified compression mode. If the option is omitted, the models used for the original compression will be used for the recompression, and no model estimation will be performed. If the executable was originally produced by Crinkler 1.0 or newer, this will typically yield a compression ratio similar to the original compression.

/SATURATE
/SATURATE:NO

If this option is given, it specifies saturation as normally. If it is omitted, the original saturation mode will be used.

/HASHSIZE:[memory size]

If neither this option nor a compression mode is specified, the original, optimized hash size will be used. Recompression speed will be similar to INSTANT compression mode in this case. If a compression mode is specified but this option is omitted, hash size optimization will be performed using the hash size specified for the original file. If this option is given, hash size optimization takes place normally, using the specified maximum size.
/HASHTRIES:[number of retries]

If hash size optimization takes place, this option specifies the number of tries as normally. Otherwise it is ignored.

/REPLACEDLL:[oldDLL]=[newDLL]

Replaces an original DLL by a new one. Only works if the names of the DLLs are exactly the same length.

/STRIPEXPORTS

This is a recompression specific option which instructs Crinkler to strip away any existing exports from the executable. New exports can be added using the /EXPORT option whether or not the existing exports are stripped away.

/EXPORT:[name]
/EXPORT:[name]=[symbol]
/EXPORT:[name]=[value]

Adds an export to the executable, as normally. The first two versions can only refer to an existing export in the executable that was exported using one of the first two versions in the first place. They can refer to such an export even if existing exports are stripped away using the /STRIPEXPORTS option. If an export already exists with the same name, the new export replaces the existing one.

/REPORT:[HTML file name]

Writes out an HTML file as normally. Since no symbol information is available, this will be a plain disassembly/hex dump without labels or cross-linking.

STANDARD RUNTIME LIBRARIES
--------------------------

Under normal circumstances, the Visual Studio compiler generates code that requires a C runtime library containing standard C functions and various support functions. These functions can either be linked in statically (included into the executable) or dynamically via a runtime DLL. For size-sensitive applications, you should always link dynamically, which is achieved by setting C/C++/Code Generation/Runtime Library to Multi-threaded DLL (/MD).

Note however, that the standard runtime libraries for Visual Studio 2005 or newer will not work with Crinkler-compressed executables, since these runtime libraries require a manifest in the executable, and Crinkler does not support manifests.
Furthermore, these DLLs are not present by default on Windows installations, so you will usually not want your program to be dependent on them.

Unless explicitly disabled by using the /NODEFAULTLIB option, Crinkler automatically links to the Visual Studio 6 runtime library - msvcrt.dll - which is distributed with all Windows versions.

There are a couple of caveats to using an older runtime library than the compiler expects, though. With out-of-the-box compilation options, the Visual Studio compiler generates code that requires some support functions which are only present in newer runtime DLLs. To avoid these dependencies, set the following options under C/C++/Code Generation:

- Basic Runtime Checks: Default
- Buffer Security Check: No (/GS-)

Also, do not use C++ exception handling in your code. And do not use STL classes, since they use exceptions all over.

The best strategy is of course to avoid linking to a runtime DLL at all, assuming you can do without the functions provided by the standard runtime library. This will save the space for importing the runtime DLL. To reduce the dependencies on the standard runtime DLL as much as possible, set the following options:

- C/C++/Optimization/Enable Intrinsic Functions: Yes (/Oi). This will cause several standard functions (mainly math, string and memory functions) to generate inline code rather than a function call.
- C/C++/Code Generation/Floating Point Model: Fast (/fp:fast).
- C/C++/Command Line: Add the option /QIfist. This will cause conversions from floating point to integer to use the FIST instruction rather than calling a conversion function. Note that this changes the semantics of conversions from truncation to round-to-nearest (unless you explicitly change the rounding mode of the FPU). On the other hand, it will also give a considerable speed boost.

RECOMMENDATIONS
---------------

There are a number of things you can do as an intro programmer to boost the compression achieved by Crinkler even further.
This section gives some advice on these.

- Since much of the effectiveness of Crinkler comes from separating code and data into different parts of the file and compressing each part individually, it is important that this separation is possible. Mark your code and data sections as containing code and data, respectively, and do not put both code and data into the same section. See your assembler manual for information about how to do this. For instance, in Nasm, you can write the keyword "text" or "data" after the section name and give sections different names to prevent them from being merged by the assembler.

- Split both your code and your data into as many sections as possible. This gives Crinkler more opportunities to select the ordering of the sections to optimize the compression ratio.

- If you are using OpenGL, try using ordinal range import for opengl32. If you are using Direct3D, try using ordinal range import for d3dx9_??. This may reduce the space needed for function hash codes.

- If you are only importing functions from DLLs which are present on all Windows systems (d3dx9_?? is not), you can "safely" use the /UNSAFEIMPORT option. Run Crinkler with the /PRINT:IMPORTS option to check which DLLs you are importing from.

- Avoid large blocks of data, even if they are all zero. Use uninitialized (bss) sections instead. Crinkler does not cope well with large amounts of data. Be aware that the compressor may use an amount of memory up to about 4000 times the uncompressed code/data size (whichever is largest).

- When you perform detailed size comparisons, always use the SLOW compression mode with plenty of ORDERTRIES and compare the "Ideal compressed total size" values. The INSTANT and FAST modes are only intended for use during testing and to give a rough estimate of the compressed size. Use the /REUSE option when making small changes to achieve stable size comparison.
  Also note that the compression is tuned for the 4k size target, so any size comparisons you perform on smaller files might turn out to behave differently when you get nearer to 4k.

- As a matter of good conduct, do not use TINYIMPORT or UNSAFEIMPORT if you can spare the space, and do not set HASHSIZE higher than you need. In other words, if your final intro is well below the size limit, remove the UNSAFEIMPORT option (if you added it in the first place) and then lower HASHSIZE in order not to waste memory unnecessarily. Also consider adding the high-performance GPU request exports as described under the EXPORT option if your intro could benefit from it.

COMMON PROBLEMS, KNOWN BUGS AND LIMITATIONS
-------------------------------------------

Any DLL that is needed by a program that Crinkler compresses must be available to Crinkler itself. If you get the error message 'Could not open DLL ...', it means that Crinkler needed the DLL but could not find it. You must place it either in the same directory as the Crinkler executable or somewhere in the DLL path, such as C:\WINDOWS\system32. Alternatively, you can use the REPLACEDLL option to replace it by one that is available. If you get this message for msvcr?? DLLs, you have a dependency on the runtime DLL you need to get rid of. See the section on standard libraries.

When running inside Visual Studio, the textual progress bars are not updated correctly, since the Visual Studio console does not flush the output until a newline is reached, even when explicitly flushed by the running program. Use the /PROGRESSGUI option to get a graphical progress bar.

The code for parsing object and library files contains only a minimum of sanity checks. If you pass a corrupt file to Crinkler, it will most likely crash.

The final compressed size must be less than 128k, or Crinkler will fail horribly. You shouldn't use it for such big files anyway.
If Crinkler crashes, it will write two dump files named dump<n>_mini.dmp and dump<n>_full.dmp, where <n> is an integer making the file name unique. These files contain information about the execution state of Crinkler at the time of the crash. When reporting a crash, please include at least the mini dump, or, if possible, both.

ACKNOWLEDGEMENTS
----------------

The compression technique used by Crinkler is much inspired by the PAQ compressor by Matt Mahoney.

The import code is loosely based on the hashed imports code by Peci.

The disassembly feature of the compression report uses the diStorm disassembler library by Gil Dabah.

Many thanks to all the people who have given us comments, bug reports and test material, in particular to Rambo, Kusma, Polaris, Gargaj, Frenetic, Buzzie, Shash, Auld, Minas, Skarab, Dwing, Freak5, Hunta, Snq, Darkblade, Abductee, iq, Las, pirx, Hitchhikr, Gloom, Zephod, coda, KK, XMunkki, KammutierSpule, acidbrain, xTrim, jix, SubV242, w23, ryg, shinmai, Decipher, xtrium, TomasRiker, smoothstep, XT95, NeKoFu, n3Xus, Moerder, merry, RCL, zoom, vampire7, Key-Real, quiller, Seven, and all the ones we have forgotten.

Also thanks to Dwarf, Polygon7 and Gargaj for suggestions for our web design.

Big thanks to Rrrola and TomCat for their valuable suggestions for optimizing the decompression code, and to qkumba for his guidance on the zero-section header and for tracking down the NVIDIA driver issue.

Our special thanks to the many people who have demonstrated the usefulness of Crinkler by using it for their own productions. Keep it going! We greatly appreciate your feedback.