An efficient image format for SDL
Superfluous Returnz is a 2D video game with a very classic “cartoon” style: if we put aside the descriptions of the levels and the sound part, the entire content to be loaded therefore consists of 2D images (the animations being simply successions of images).
In order to avoid excessively long loading times between levels and excessive disk usage, choosing the right image format is critical.
Note: the summary of the benchmark is given at the end of the article.
Obvious options
“Basic” SDL can only read BMP, the “historic” raster image format, released by Microsoft and IBM in 1990. This is probably one of the simplest image formats: it is simply a description of each pixel successively by a certain number of bytes (in general, one per channel, i.e. 4 if an RGB format + the alpha transparency channel is used).
The big advantage of BMP is that it is extremely easy to “decode” for SDL: basically, to send the image to the GPU, you just have to copy the memory block occupied by the file, since the format is already unpacked.
On the other hand, BMP is heavy, very heavy on disk, since the pixels are stored without any compression. So, just to store ONE non-scrolling level background image at HD resolution (1920x1080 pixels, my game's base resolution), the BMP format would require close to 6 MB at 3 bytes per pixel (and about 8 MB with an alpha channel)! This means the game would quickly end up weighing several gigabytes. It can be justified for an AAA game with huge 3D universes, not for an indie 2D game like mine.
Benchmark for game images in BMP format:
- Total memory space: 1 GiB
- Loading time: 0.59 s
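As a rough check of these orders of magnitude, the size of an uncompressed image is simply width x height x bytes per pixel. The helper below is a hypothetical illustration (the name `raw_image_bytes` is mine, it is not an SDL function) and it ignores the small BMP file header:

```c
#include <stddef.h>

/* Size in bytes of the raw pixel data of an uncompressed image.
   Hypothetical helper for illustration only; the file header is ignored. */
size_t raw_image_bytes (size_t width, size_t height, size_t bytes_per_pixel)
{
  return width * height * bytes_per_pixel;
}
```

For a 1920x1080 image, this gives 6,220,800 bytes at 3 bytes per pixel (RGB) and 8,294,400 bytes at 4 bytes per pixel (RGB + alpha).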
Anyway. Thanks to the SDL Image module, we can fortunately use other formats that take less disk space. The most obvious format for cartoons with lots of flat color areas is PNG [1]. The size is drastically reduced compared to BMP (almost 30 times less space occupied!), but when I run my benchmark, I realize that the loading time is considerably longer...
Benchmark for in-game images in PNG format:
- Total memory space: 36 MiB
- Loading time: 3.54 s
At first glance, this seems counter-intuitive: we know that disk accesses are slow, so the simple fact that PNG is almost 30 times lighter than BMP should make it faster to read, right?
With the SDL functions, we can separate the “disk reading” step itself from the “creating the SDL image from the loaded data” step. And then we understand what's going on:
Benchmark for game images in BMP format:
- Total memory space: 1 GiB
- Time to read file from disk: 0.22 s
- Time to create image: 0.36 s
- (Total time: 0.59 s)
Benchmark for in-game images in PNG format:
- Total memory space: 36 MiB
- Time to read file from disk: 0.01 s
- Time to create image: 3.53 s
- (Total time: 3.54 s)
There it is: the huge gain in terms of reading the file from disk is completely crushed by the fact that creating the image now takes ages! And for a good reason: PNG compression is meant to be disk space efficient, but not necessarily time efficient. Decompressing a PNG image takes time, although we usually don't realize it!
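To make the idea of timing the two steps separately more concrete, here is a minimal sketch. My benchmark relies on SDL functions for this; the version below swaps in plain C stdio as a stand-in so that it stays library-independent. The names `read_whole_file` and `now_seconds` are hypothetical, and `timespec_get` assumes a C11 standard library:

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Wall-clock time in seconds (assumes C11 timespec_get is available). */
double now_seconds (void)
{
  struct timespec ts;
  timespec_get (&ts, TIME_UTC);
  return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Reads a whole file into a freshly allocated buffer (the caller frees it).
   Returns NULL on failure; on success stores the byte count in *size. */
unsigned char* read_whole_file (const char* path, size_t* size)
{
  FILE* f = fopen (path, "rb");
  if (!f)
    return NULL;
  if (fseek (f, 0, SEEK_END) != 0) { fclose (f); return NULL; }
  long length = ftell (f);
  if (length < 0) { fclose (f); return NULL; }
  rewind (f);

  unsigned char* buffer = malloc (length > 0 ? (size_t)length : 1);
  if (buffer && fread (buffer, 1, (size_t)length, f) != (size_t)length)
  {
    free (buffer);
    buffer = NULL;
  }
  fclose (f);
  if (buffer)
    *size = (size_t)length;
  return buffer;
}

/* Typical use: measure step 1 (disk), then step 2 (decoding) on the buffer.
     double t0 = now_seconds ();
     unsigned char* data = read_whole_file (path, &size);
     double read_time = now_seconds () - t0;
     ... decode `data` into an image here and time it the same way ... */
```

Keeping the whole file in memory before decoding is what makes the two durations independent of each other.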
In everyday use, you typically load one image at a time and look at it: it doesn't matter then that the loading time is a few milliseconds instead of a few microseconds, you don't see the difference. For a game where you're going to have to load lots of HD images, sometimes animations with many frames per second, it's a different story: the loading time for a level can quickly reach several seconds.
You may think a few seconds isn't so bad, but then again, for a small 2D game like mine, that seems overkill to me: you'd expect levels to load almost instantly.
So let's try to do better.
First, let's note that while PNG is basically a non-destructive compression format (each pixel of the image is exactly the same as that of the same uncompressed image in BMP format), it is nevertheless possible to apply algorithms that slightly modify the image to make it more efficiently compressible by PNG. This is for example what pngquant does, by calculating an optimal reduced color “palette” for your image, even if it means slightly changing some color pixels.
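To illustrate why a reduced palette changes some pixels, here is a small sketch of the final mapping step only: each pixel is replaced by the closest color of a given palette. This is not pngquant's actual algorithm (which also has to choose a good palette in the first place); the type `rgb_color` and the function `nearest_palette_index` are hypothetical names:

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } rgb_color;

/* Index of the palette entry closest to `c`, using the squared Euclidean
   distance in RGB space. Assumes palette_size >= 1. */
size_t nearest_palette_index (rgb_color c, const rgb_color* palette,
                              size_t palette_size)
{
  size_t best = 0;
  long best_distance = -1;
  for (size_t i = 0; i < palette_size; ++i)
  {
    long dr = (long)c.r - palette[i].r;
    long dg = (long)c.g - palette[i].g;
    long db = (long)c.b - palette[i].b;
    long distance = dr * dr + dg * dg + db * db;
    if (best_distance < 0 || distance < best_distance)
    {
      best_distance = distance;
      best = i;
    }
  }
  return best;
}
```

A pixel that is "almost red" ends up stored as the palette's red: the image compresses better, but that pixel is no longer exactly the original one.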
The results are impressive in terms of disk space, it takes up 4.5 times less space than with classic PNGs, or 125 times less space than with BMPs!
The problem, unfortunately, is again the loading time... While it is faster than the classic PNG, it is still twice as slow as the BMP.
Benchmark for in-game PNG images processed with pngquant:
- Total memory space: 8 MiB
- Time to read file from disk: 0.01 s
- Time to create image: 1.22 s
- (Total time: 1.22 s)
Let's also add that these results are obtained by applying a strong quantization with pngquant, which comes at the cost of a color alteration of the images that becomes quite visible. Not great.
Well, it looks like we can forget PNG if we want fast load times. However, using BMP is still out of the question because of the crazy weight of the images. Other formats supported by SDL did not give better results. So what can we do?
LZ4: the fast compressor
If BMP allows for the rapid construction of images but at the cost of huge disk space, then why not compress a BMP to get the best of both worlds? In practice, we might just fall back on the same problem as with PNG: decompression times that are too high and that cancel out the gain in disk space and disk read time.
Then comes the LZ4 format, a compression algorithm Wikipedia tells us is “focused on compression and decompression speed”. Of course, this is done at the cost of efficiency, as the disk usage is not as optimal as 7zip for example, but it can be tried.
Rather than compressing BMPs, I thought it would be even more efficient to directly compress the memory contents of an SDL image: thus, the creation of the image will come down to a pure copy of a memory block: we can hardly do it faster.
SDL Surfaces VS Textures
A technical detail that is important: what we compress is the SDL_Surface structure, which will contain the decompressed image in a format similar to BMP. Technically, to display it, we will convert it to SDL_Texture, which is a format used directly by the graphics card. This last format is dependent on the GPU and the driver you are using, and it is neither documented nor publicly accessible, so you cannot directly compress and store an SDL_Texture.
But we can optimize our SDL_Surface so that the construction of an associated SDL_Texture is fast: indeed, if our SDL_Surface uses a storage format (the order of the different channels and the number of bytes) which is not natively supported by our graphics driver, then we will pay for a conversion.
SDL allows us to know the natively supported formats thanks to the information contained in SDL_RendererInfo. So I experimented with the few platforms available to me:
Platform | Linux Mint | Windows 10 VM | Android | Mac OS |
---|---|---|---|---|
Name | opengl | direct3d | opengles2 | opengl |
SDL_PIXELFORMAT_ARGB8888 | X | X | X | X |
SDL_PIXELFORMAT_ABGR8888 | X | X | X | |
SDL_PIXELFORMAT_RGB888 | X | X | X | |
SDL_PIXELFORMAT_BGR888 | X | X | X | |
SDL_PIXELFORMAT_YV12 | X | X | X | X |
SDL_PIXELFORMAT_IYUV | X | X | X | X |
SDL_PIXELFORMAT_NV12 | X | X | X | |
SDL_PIXELFORMAT_NV21 | X | X | X | |
SDL_PIXELFORMAT_UNKNOWN | X | | | |
So it seems that SDL_PIXELFORMAT_ARGB8888 is the most universally supported RGB+transparency format (an alpha transparency channel + the RGB channels, in that order, each taking 8 bits, so one byte). It is therefore this format that we will compress with LZ4.
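To make the channel layout concrete, here is a minimal sketch of how four 8-bit channels fit in one 32-bit ARGB8888 pixel. The function names are hypothetical (they are not SDL functions). Note that the A, R, G, B order describes the packed 32-bit value, from the highest byte to the lowest; the order of the bytes in memory then depends on the endianness of the machine:

```c
#include <stdint.h>

/* Packs four 8-bit channels into one 32-bit ARGB8888 pixel value:
   alpha in the highest byte, then red, green, and blue in the lowest. */
uint32_t pack_argb8888 (uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
  return ((uint32_t)a << 24) | ((uint32_t)r << 16)
       | ((uint32_t)g << 8)  |  (uint32_t)b;
}

/* Extracts the alpha channel back from a packed ARGB8888 pixel. */
uint8_t argb8888_alpha (uint32_t pixel)
{
  return (uint8_t)(pixel >> 24);
}
```

For example, an opaque pixel with R=0x12, G=0x34, B=0x56 is the value 0xFF123456.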
SDL+LZ4: an optimal image format
The benchmark results are clear: images compressed in LZ4 format give the best loading time, 1.4 times faster than BMP (and 8 times faster than PNG), while maintaining an acceptable size, only 20% larger than PNG (and about 24 times smaller than BMP).
Benchmark for game images in LZ4 format:
- Total memory space: 43 MiB
- Time to read file from disk: 0.01 s
- Time to create image: 0.41 s
- (Total time: 0.42 s)
Great, then! And to further improve the results, we can use an HC (high compression) variant of the LZ4 compression algorithm: this variant is very slow to compress, but it offers even smaller file sizes while maintaining a fast decompression time. As the (slower) compression will only be done once (by me) and the size reduction and decompression will benefit everyone playing, it's worth it!
Benchmark for game images in LZ4HC format:
- Total memory space: 33 MiB
- Time to read file from disk: 0.01 s
- Time to create image: 0.38 s
- (Total time: 0.39 s)
The gain is not huge compared to the classic LZ4, but there is no reason to not take advantage of it. Note that the data is then even lighter than regular PNG!
Implementation
Compression is very simple: we read our image in the input format (BMP, PNG, whatever), create our SDL_Surface and convert it, if necessary, to SDL_PIXELFORMAT_ARGB8888, then compress it.
In practice, we will first write the dimensions of our image (as well as the code of the pixel format, to be able to possibly support formats other than SDL_PIXELFORMAT_ARGB8888). I wrote a function using the same syntax as the functions in SDL Image:
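    #include <stdlib.h>
    #include <SDL.h>
    #include <lz4.h>
    #include <lz4hc.h>

    /* File layout: width (Uint16), height (Uint16), pixel format (Uint32),
       compressed size (int), then the LZ4-compressed pixels.
       Assumes tightly packed rows (pitch == w * BytesPerPixel), dimensions
       that fit in 16 bits, pixel data below LZ4's input size limit, and
       native byte order. Returns 0 on success, -1 on failure. */
    int IMG_SaveLZ4_RW (SDL_Surface* surface, SDL_RWops* dst, int freedst, int hc)
    {
      int result = -1;
      Uint16 width = (Uint16)(surface->w);
      Uint16 height = (Uint16)(surface->h);
      Uint32 surface_format = surface->format->format;
      Uint8 bpp = surface->format->BytesPerPixel;
      Uint32 uncompressed_size = (Uint32)width * height * bpp;
      const char* uncompressed_buffer = (const char*)(surface->pixels);

      /* LZ4_compressBound returns 0 if the input is too large for LZ4. */
      int max_lz4_size = LZ4_compressBound ((int)uncompressed_size);
      char* compressed_buffer = (max_lz4_size > 0 ? malloc (max_lz4_size) : NULL);
      if (compressed_buffer)
      {
        int true_size;
        if (hc)
          true_size = LZ4_compress_HC (uncompressed_buffer, compressed_buffer,
                                       (int)uncompressed_size, max_lz4_size,
                                       LZ4HC_CLEVEL_MAX);
        else
          true_size = LZ4_compress_default (uncompressed_buffer, compressed_buffer,
                                            (int)uncompressed_size, max_lz4_size);

        /* Both LZ4 functions return 0 when compression fails. */
        if (true_size > 0)
        {
          SDL_RWwrite (dst, &width, sizeof(width), 1);
          SDL_RWwrite (dst, &height, sizeof(height), 1);
          SDL_RWwrite (dst, &surface_format, sizeof(surface_format), 1);
          SDL_RWwrite (dst, &true_size, sizeof(true_size), 1);
          SDL_RWwrite (dst, compressed_buffer, 1, true_size);
          result = 0;
        }
        free (compressed_buffer);
      }

      if (freedst)
        SDL_RWclose (dst);
      return result;
    }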
We also give the function that saves directly to a file (and not to an SDL_RWops structure):
    int IMG_SaveLZ4 (SDL_Surface* surface, const char* file, int hc)
    {
      SDL_RWops* dst = SDL_RWFromFile (file, "wb");
      return (dst ? IMG_SaveLZ4_RW (surface, dst, 1, hc) : -1);
    }
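The save function writes a small header (width, height, pixel format, compressed size) using the byte order and the `int` size of the machine it runs on, which is fine as long as files are written and read on similar platforms. If the files had to be shared across architectures, one possible variant, which is my assumption and not what the function above does, would be to encode the header with a fixed little-endian layout. All the names below (`lz4img_header`, `lz4img_encode_header`, `lz4img_decode_header`) are hypothetical:

```c
#include <stdint.h>

#define LZ4IMG_HEADER_SIZE 12 /* 2 + 2 + 4 + 4 bytes */

typedef struct
{
  uint16_t width;
  uint16_t height;
  uint32_t pixel_format;    /* pixel format code, stored as-is */
  uint32_t compressed_size; /* size of the LZ4 payload in bytes */
} lz4img_header;

/* Writes the `bytes` lowest bytes of `value`, least significant first. */
static void put_le (uint8_t* out, uint32_t value, int bytes)
{
  for (int i = 0; i < bytes; ++i)
    out[i] = (uint8_t)(value >> (8 * i));
}

/* Reads `bytes` bytes as a little-endian unsigned integer. */
static uint32_t get_le (const uint8_t* in, int bytes)
{
  uint32_t value = 0;
  for (int i = 0; i < bytes; ++i)
    value |= (uint32_t)in[i] << (8 * i);
  return value;
}

void lz4img_encode_header (const lz4img_header* h,
                           uint8_t out[LZ4IMG_HEADER_SIZE])
{
  put_le (out, h->width, 2);
  put_le (out + 2, h->height, 2);
  put_le (out + 4, h->pixel_format, 4);
  put_le (out + 8, h->compressed_size, 4);
}

lz4img_header lz4img_decode_header (const uint8_t in[LZ4IMG_HEADER_SIZE])
{
  lz4img_header h;
  h.width = (uint16_t)get_le (in, 2);
  h.height = (uint16_t)get_le (in + 2, 2);
  h.pixel_format = get_le (in + 4, 4);
  h.compressed_size = get_le (in + 8, 4);
  return h;
}
```

The 12 encoded bytes could then be written with a single write call, followed by the compressed pixels.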
At the reading/decompressing level, it's again very simple: we read the dimensions of the image and the pixel format, which allows us to allocate the desired SDL_Surface, then we decompress the memory block directly into the memory space allocated by SDL.
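    /* Reads an image written by IMG_SaveLZ4_RW. Returns NULL on failure. */
    SDL_Surface* IMG_LoadLZ4_RW (SDL_RWops* src, int freesrc)
    {
      Uint16 width = 0;
      Uint16 height = 0;
      Uint32 surface_format = 0;
      int compressed_size = 0;
      SDL_Surface* out = NULL;
      char* compressed_buffer = NULL;

      /* Header: dimensions, pixel format, size of the compressed block. */
      int ok = SDL_RWread (src, &width, sizeof(width), 1) == 1
        && SDL_RWread (src, &height, sizeof(height), 1) == 1
        && SDL_RWread (src, &surface_format, sizeof(surface_format), 1) == 1
        && SDL_RWread (src, &compressed_size, sizeof(compressed_size), 1) == 1
        && compressed_size > 0;
      if (ok)
      {
        out = SDL_CreateRGBSurfaceWithFormat (0, width, height, 32, surface_format);
        compressed_buffer = malloc (compressed_size);
        ok = out && compressed_buffer
          && SDL_RWread (src, compressed_buffer, 1, compressed_size)
             == (size_t)compressed_size;
      }
      if (ok)
      {
        /* Decompress directly into the pixel memory allocated by SDL.
           LZ4_decompress_safe returns the number of bytes written,
           or a negative value on error. */
        Uint8 bpp = out->format->BytesPerPixel;
        int uncompressed_size = (int)((Uint32)width * height * bpp);
        ok = LZ4_decompress_safe (compressed_buffer, (char*)(out->pixels),
                                  compressed_size, uncompressed_size)
             == uncompressed_size;
      }
      if (!ok && out)
      {
        SDL_FreeSurface (out);
        out = NULL;
      }
      free (compressed_buffer);
      if (freesrc)
        SDL_RWclose (src);
      return out;
    }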
As before, an additional function loads directly from a file:
    SDL_Surface* IMG_LoadLZ4 (const char* file)
    {
      SDL_RWops* src = SDL_RWFromFile (file, "rb");
      return (src ? IMG_LoadLZ4_RW (src, 1) : NULL);
    }
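One caveat: these functions treat the pixels as a single contiguous block of width x height x bytes-per-pixel bytes. An SDL_Surface also exposes a pitch field (the number of bytes between the starts of two consecutive rows), which can be larger than the useful row size when rows are padded. Supporting such surfaces is an extension that is not part of the code above; a minimal sketch, with the hypothetical name `pack_rows`, would copy the rows into a tightly packed buffer before compression:

```c
#include <stddef.h>
#include <string.h>

/* Copies `height` rows of `row_bytes` useful bytes each, from a source whose
   rows start every `pitch` bytes, into a tightly packed destination.
   `dst` must hold at least height * row_bytes bytes. */
void pack_rows (const unsigned char* src, size_t pitch,
                unsigned char* dst, size_t row_bytes, size_t height)
{
  for (size_t y = 0; y < height; ++y)
    memcpy (dst + y * row_bytes, src + y * pitch, row_bytes);
}
```

When pitch is already equal to the row size, this is just a plain copy and the step can be skipped.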
Benchmark summary
FORMAT | READING TIME (s) | CREATING SURFACE (s) | CREATING TEXTURE (s) | TOTAL (s) | SIZE (MiB) |
---|---|---|---|---|---|
BMP | 0.22 | 0.22 | 0.14 | 0.59 | 1046 |
PNG | 0.01 | 3.15 | 0.38 | 3.54 | 36 |
PNG (quant) | 0.01 | 1.07 | 0.15 | 1.23 | 8 |
LZ4 | 0.01 | 0.25 | 0.16 | 0.42 | 43 |
LZ4 (HC) | 0.01 | 0.22 | 0.16 | 0.39 | 33 |
The LZ4 format compressed with the HC method therefore offers the shortest loading time. In terms of data size, it is beaten only by a quantized version of PNG which involves destructive compression and visible color alteration of images.
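For readers who want to double-check these conclusions, here is a small sketch that encodes the TOTAL and SIZE columns of the summary table and looks for the fastest and the smallest formats. The type and function names are hypothetical and only serve this illustration:

```c
#include <stddef.h>

typedef struct
{
  const char* format;
  double total_seconds; /* TOTAL (s) column */
  double size_mib;      /* SIZE (MiB) column */
} benchmark_row;

/* Values copied from the benchmark summary table. */
static const benchmark_row benchmark[] = {
  { "BMP",         0.59, 1046.0 },
  { "PNG",         3.54,   36.0 },
  { "PNG (quant)", 1.23,    8.0 },
  { "LZ4",         0.42,   43.0 },
  { "LZ4 (HC)",    0.39,   33.0 },
};
static const size_t benchmark_count = sizeof (benchmark) / sizeof (benchmark[0]);

/* Index of the row with the shortest total loading time. */
size_t fastest_format (void)
{
  size_t best = 0;
  for (size_t i = 1; i < benchmark_count; ++i)
    if (benchmark[i].total_seconds < benchmark[best].total_seconds)
      best = i;
  return best;
}

/* Index of the row with the smallest size on disk. */
size_t smallest_format (void)
{
  size_t best = 0;
  for (size_t i = 1; i < benchmark_count; ++i)
    if (benchmark[i].size_mib < benchmark[best].size_mib)
      best = i;
  return best;
}

/* How many times faster row `fast` loads compared to row `slow`. */
double speedup (size_t fast, size_t slow)
{
  return benchmark[slow].total_seconds / benchmark[fast].total_seconds;
}
```

With these figures, LZ4 (HC) comes out as the fastest format (about 1.5 times faster than BMP and about 9 times faster than PNG), and the quantized PNG as the smallest.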
To conclude, in my use case, storing my images as SDL_Surface objects compressed with the LZ4HC algorithm is an optimal choice both in terms of memory and in terms of loading time.
Source code
The encoding/decoding functions to this SDL LZ4 format as well as the files necessary for the benchmark (without the images) are available on this repo.
[1] The JPG format is totally ignored for 3 reasons. First, it was primarily designed for photography and is counterproductive on cartoon-like designs; second, it is destructive and creates highly visible artifacts on cartoon-like drawings, especially around black strokes; third, it doesn't handle transparency, which is needed in my case.