2014-03-18

engineering:

Who doesn’t love animated GIFs?

Believe it or not, support for GIFs at Tumblr was a happy accident! When Tumblr put together the code for handling JPEGs, support and GIFs (and PNGs) happened to also work using the same code. Perhaps even more surprising is that the tools used to handle GIFs at Tumblr hadn’t changed much from those early days.

The image above is an original from sukme that could not be posted to Tumblr last June. It also would have failed if he’d tried last Sunday. If you click-through to the original post, you will see a muddy, reduced-saturation mess. All this because our resizer couldn’t handle the original.

I’ve got ninety-nine problems and the GIF is one

There is a lot of misinformation about GIF limits on Tumblr, so let me set the record straight: We don’t count colors or frames or pixels. We only count bytes and seconds. Every image that comes in is scaled to a number of smaller sizes and the smaller your image is, the fewer resizes need to happen, which means less time.

We had two core failure modes in our prior resizer: Some images would take as much as several minutes to convert. This was not directly attributable to color, dimensions, or frame count, but a mysterious mix of all of them. Some images would balloon in size (600KB at 400x400, 27MB at 250x250).

The unpredictability of these failures made our GIF limits feel arbitrary and terrible to the end users. Some have gone so far as to threaten monkey kicks. I don’t want to get kicked by a monkey, so we started working hard late last year to fix it.

A proposed solution

Some of you may have seen this post where the performance of our current converter was compared with a new “mystery” converter. The mystery converter was roughly 1000x faster on the “slapping” GIF and happened to look great, but had quality problems on other images. Those were more fully explored in here a couple of days later.

If you haven’t figured it out yet, the mystery converter is gifsicle.

Getting a better handle on it

To get an unbiased test set, I took a random sample of roughly 90K GIFs that Tumblr users tried to upload, not limiting the corpus only to those that succeeded. These were tested against the current converter, resizing down to the next size we produce. Each resize is given up to 20 seconds to complete in our application, but all resizes must complete in 30 seconds. All resizes must be under 1MB or we will convert the first frame to JPEG and call it a day.

2.6% of my 90K GIFs took longer than 20 seconds to resize. This is an underestimation of how many GIFs would be rejected for time because this is only one of several resizes required. A whopping 17.1% of all GIFs were over 1MB. Even if we bump up to 2MB, the rejection rate is 2.75%. The converter was making over 25% of all resizes larger than the higher-resolution originals! The total rejection rate for my sample set was 4.46% of all original GIFs uploaded.

Using gifsicle is so much faster that our CPU rejection rate drops to 0.00 on my test set. Also, just under 99% of all images were smaller when resized than they were at their original resolution. The size rejection rate was a much lower 0.59%.

Gifsicle problems

As compelling as the performance of gifsicle is, the quality problems are too much to ignore. We played around with the code a bit, but eventually we just got in touch with the author, Dr. Eddie Kohler. The specifics are in this post, but the short version is that Eddie was able to improve quality by adding some more advanced resampling methods as well as palette expansion for small-palette images. This increased our size rejection rate to 0.68% while still keeping us well under our CPU budget.

Proving it

Image processing is all about choices. How do you resample? Do you sharpen? Where in the workflow is gamma correction applied, if at all? The list goes on and on.

As you can imagine from the performance differences, our previous converter and gifsicle take very different approaches to GIF resizing. The output images look different. Sometimes it is slight, sometimes it is significant, but there is no way we could put out a converter that messes up your images, even if it messes them up quickly.

We set up a qualitative study. The goal was simply to prove that we weren’t doing worse than our old converter, not necessarily that we were doing better. This study was opened up to all Tumblr employees, as well as some “randomly selected” outsiders (my friends and family). Participants were presented with one of two questions:

1.) Given an original and 1 resize, decide whether it is ok, unacceptable, or completely broken.

2.) Given an original and 2 resizes (randomly choses which was left and which was right, sometimes they were identical), choose the better image or say there is no difference.

The results were everything I could have hoped for. The “acceptable” test showed that users found gifsicle better at producing acceptable results (87% vs. 84%), but not by a statistically relevant amount (p=0.086) and that gifsicle produced fewer broken GIFs (0.71% vs. 1.38%), but again not enough to say it is definitively better (p=0.106). The “better” test found users preferring gifsicle 37% of the time, the prior converter only 16% of the time, but users also preferred one identical image over the other 27% of the time. Again, it is hard to say that gifsicle is better, but it is clear that it is no worse.

Putting it all together

The development and testing described above took from late October until the beginning of March. Packaging, deployment, and integration took only a couple of weeks!

We aren’t done. There is work underway exploring how we handle JPEGs and PNGs. There are a slew of features that we can go after. This was a big step, a necessary step, but not the end for sure.

We are a community, it takes a village, there’s no “i” in GIF

This project couldn’t have happened without the excellent work of Eddie Kohler in creating, maintaining, and enhancing gifsicle. Tumblr’s Site Reliability Engineering group packaged and helped deploy gifsicle onto hundreds and hundreds of machines in our datacenter. Tumblr’s Security Team vetted the code, both by inspection and by attacking it to make sure we stay safe. This was all for the awesome Tumblr creators, but I have to mention qilme/sukme (same dude, two blogs), reallivingartist, and especially gnumblr for their help in understanding and ultimately attacking this monstrous problem.

Show more