Some notes on duplicate data-uris in CSS

If you are considering using data-uris instead of CSS sprites for your background images, you are probably doing this (at least partly) for performance reasons. Each image you can stuff into a data-uri saves you an HTTP request, after all.

But if you are thinking about performance, you will also be paying attention to the size of your CSS, and you might be concerned about the use of the same background image multiple times, but in different contexts. With image files, referencing the same background image multiple times is lightweight, but with data-uris, the base-64 encoded data takes up a lot more space:

.monkey {
	background-image:url(http://sunpig.com/martin/code/2011/gzipcss/erlenmeyer.png);
}
.fez {
	background-image:url(http://sunpig.com/martin/code/2011/gzipcss/erlenmeyer.png);
}

vs.

.monkey {
	background-image:url("data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAlotsmoredata...");
}
.fez {
	background-image:url("data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAlotsmoredata...");
}

There are, of course, other ways of assigning the same background image to different elements (you can group your selectors, or create a mixin classname for use in your HTML). But if, for whatever reason, you do end up embedding the same base-64-encoded resource multiple times in the same file, provided you are serving your CSS gzipped, you don’t need to worry too much about the extra file size.

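The reason for this is that the DEFLATE algorithm (which powers gzip compression) works largely by eliminating duplicate strings. When the algorithm finds the second instance of a long base-64-encoded string, it can just replace it with a back-reference to the first instance. And because the reference is much shorter than the base-64 block itself, the size increase of the gzipped file is negligible. One caveat: DEFLATE only looks back through a 32 KB window, so the saving applies only when the repeated data starts within 32 KB of the previous copy. That holds comfortably for small images like the one used here, but very large embedded images will not benefit.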
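The effect is easy to reproduce. The following is a minimal sketch in Python rather than a record of how the figures in this post were produced: it assumes pseudo-random bytes as a stand-in for a small PNG (compressed image data is similarly free of internal redundancy), and the `.rule0`, `.rule1`, ... selectors and the helper names `css_with_copies` and `sizes` are made up for illustration. It uses Python's own gzip module, so the numbers will differ a little from what a web server produces.

```python
import base64
import gzip
import random

# Hypothetical stand-in for a small PNG: about 1 KB of pseudo-random bytes.
# Seeded so that repeated runs produce the same stylesheet.
rng = random.Random(2011)
fake_png = bytes(rng.getrandbits(8) for _ in range(1000))
data_uri = "data:image/png;base64," + base64.b64encode(fake_png).decode("ascii")


def css_with_copies(n):
    """Build a stylesheet with n rules that all embed the same data-uri."""
    rules = [
        '.rule%d {\n\tbackground-image:url("%s");\n}\n' % (i, data_uri)
        for i in range(n)
    ]
    return "".join(rules).encode("ascii")


def sizes(n):
    """Return (uncompressed, gzipped) byte counts for n copies."""
    css = css_with_copies(n)
    return len(css), len(gzip.compress(css))


if __name__ == "__main__":
    for n in (1, 2, 3):
        raw, zipped = sizes(n)
        print("%d copies: %5d bytes raw, %5d bytes gzipped" % (n, raw, zipped))
```

The uncompressed size grows by one full copy per rule, while the gzipped size should grow only by the handful of bytes needed to encode the back-references.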

Here’s a short table showing some actual figures. The files 1.css, 2.css, and 3.css contain 1, 2, and 3 copies of the same base-64-encoded image respectively. I used nginx (0.7.67) on my laptop to check the gzipped file sizes.

File     Uncompressed size (bytes)   Gzipped size (bytes)
1.css    1437                        1139
2.css    2871                        1135
3.css    4311                        1156

Note that the file with the data embedded twice comes out even smaller than the version with only one copy. gzip can be a bit strange that way, which is why you should always check the gzipped version of any CSS or JS files you’re fine-tuning for size. Minor optimizations don’t always work out the way you might expect.

Just for clarity: embedding the same resource multiple times is not always a great strategy for re-usable CSS. But when you do it, at least you don’t have to worry about file size bloat.


3 Replies to “Some notes on duplicate data-uris in CSS”

  1. A few comments, seeing that I wrote a performance testing tool for websites not so long ago 🙂

    1. Web browsers cache images and CSS (usually, make sure to send the appropriate headers), so downloading an image file vs. sending the image in-line in CSS only matters the first time the image is requested.
    2. Web browsers tend to keep multiple connections open to servers, somewhere between 4 and 8 currently (depends on the browser/version), and request files over those. Much of the overhead of an additional request lies in establishing the connection, not so much the actual download (for typical image file sizes you’d use in a web page; the image above + response headers could easily be transmitted in a single Ethernet frame). And those requests happen in parallel.
    3. Base64-encoding blows up data by quite a lot: about 33% over the raw binary data, since every 3 input bytes become 4 output characters (blow up that image above by a third and add response headers, and you’re dangerously close to not being able to transmit it in a single Ethernet frame any longer).
    4. Your typical image file is already compressed by image-specific algorithms in a way that the more general DEFLATE algorithm can’t hope to deal with; DEFLATE-ing image files is therefore not efficient, and often generates bigger files than the raw image file to be compressed. Now apply that to the ca. 33% bigger base64-encoded data…
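    Points 3 and 4 can be checked with a few lines of Python. This is only a rough sketch: the 3000 pseudo-random bytes are an assumed stand-in for an already-compressed image, not a real file. Base64 emits 4 characters for every 3 input bytes, so the encoded form is a third larger before any compression. Gzipping the raw bytes should add a little overhead rather than save anything, while gzipping the base64 text should win back much of the encoding overhead, because each base64 character carries only 6 bits of information.

```python
import base64
import gzip
import random

# Hypothetical image payload: incompressible pseudo-random bytes, seeded
# so that repeated runs give the same result.
rng = random.Random(49)
raw = bytes(rng.getrandbits(8) for _ in range(3000))

encoded = base64.b64encode(raw)       # 4 output characters per 3 input bytes
gz_raw = gzip.compress(raw)           # gzip applied to the "image" itself
gz_encoded = gzip.compress(encoded)   # gzip applied to the base64 text

print("raw bytes:           ", len(raw))
print("base64 bytes:        ", len(encoded))
print("gzipped raw bytes:   ", len(gz_raw))
print("gzipped base64 bytes:", len(gz_encoded))
```

    Real images and real servers will give different numbers, so treat this as a way to get a feel for the orders of magnitude rather than as a benchmark.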

    Put the above together, and you’re likely to trade in some small latency on the first request of an image file for potentially substantial bandwidth increases (which in turn affect latency again even in an ideal world, but more so when you’re using a web server that doesn’t deal well with multiple persistent connections); the specifics depend largely on how often your CSS changes relative to your images.

    More specifically, the latency improvement you’re gaining is all about how fast the browser parses & applies CSS: with an in-line image, it can render the image the moment the CSS is applied.

    The upshot? I’d probably try serving your images from a separate webserver, possibly powered by node.js or a similar lightweight server implementation that is great at handling tons of connections. It’ll help with scalability (quite a bit) while hurting latency only on the very first request of an image in comparison with embedding images in CSS.

    None of which suggests that what Martin wrote is wrong. Absolutely not. I just wouldn’t recommend embedding images in CSS in all but the most extreme cases. As always, though, it’s only testing that can tell you whether it’s a good idea or not…

  2. The other big reason for using data-uris instead of external images, which I didn’t touch upon at all, is for maintenance when it comes to image sprites. If you’re creating your sprites by hand, rather than using a tool, you spend a lot of time tweaking background-position offsets, which gets to be a pain in the nuts after a very short while. Using data-uris means that you can treat each image resource separately, just as we used to do before the spriting technique became popular.

    Using multiple data-uris in a single CSS file does not bypass another problem with using sprites, namely that if you change even one image within a large sprite file, a user has to refresh the whole file instead of grabbing it from cache. If you change a single data-uri in the CSS file, the whole cached CSS file is invalidated instead.

    If you intersperse data-uris with your regular CSS, the increase in CSS download and parse time can be significant. Browsers will generally block rendering until the CSS is available. By bulking up your CSS, you make first-time viewers wait longer to see their first content.

    A possible way to balance this would be to create a separate CSS file containing the data-uris, and place the <link> tag for it at the end of the page (or load it with an asynchronous piece of script). This would allow the primary CSS to be loaded just as quickly as before. You still end up eventually loading a single big file with images, but this way it’s a (gzipped) CSS file with multiple data-uris instead of a PNG file with multiple sub-images; the maintenance benefits of not having to worry about background-position offsets are preserved, at the cost of a small increase in file size.

    The most important point that Jens makes is that you should test this stuff for yourself. Sites have different visitor profiles, different performance requirements, and different constraints. You can’t improve what you don’t measure. You have to understand your own case, and understand all the techniques available to you, before you can come up with a good solution.
