Wow, it’s really been one year since Gravatar joined the Automattic family. Time sure flies when you’re having fun!
Gravatar has come a long way from the service that it was back in October 2007. My wife likes to laugh at me because I’ll pick something up in one room, like a remote control, and move around the house fidgeting with it. Then I’ll absentmindedly leave it in some random place like the bathroom, or the freezer. One year ago we picked up a small avatar service with a great name and an awesome fan base. Now, in an attempt not to leave it in a random location, we’re looking back on the last year (and letting you look with us.)
The service was running version 2.0, and set up on two rented (or colocated, I don’t honestly know) servers. The servers were running at loads of around 20, and could spike to well over 100 (that’s a lot). It was obvious that we needed both some stop-gap fixes and a plan. The first thing we did was throw some caching servers in front of the service: a couple of Varnish servers, as I recall. This dropped the workload of the two boxes considerably, and allowed us to look at Gravatar without bringing the service down. Next we replicated the setup one-for-one on two of our own (more powerful) servers. This gave us a bit more breathing room. And we began to plan.
It was obvious from the very beginning that the service was going to have to handle a constant torrent of requests from the internet. Most of those requests would be for email addresses with no Gravatar, and would come from URLs that could be crafted in an unlimited number of ways. On top of that, we knew that we wanted to make all the paid features free, and expand the size a Gravatar could be from 80 pixels to 512 pixels. So basically, instead of making the undertaking less daunting, we were actively making it a more intense challenge than it had to be. But that was OK with us, because our goals for Gravatar weren’t to make it easier but to make it better. We wanted to make Gravatar the kind of free service that we could use, would want to use, and would be proud to share with the world. I know that last bit sounds like marketing crap, but that’s really what we wanted to do and is really how we look at Gravatar.
Pretty much the next thing we did was port Gravatar’s code from RoR (Ruby on Rails) to PHP. As I mentioned when we announced this change, the reason for this wasn’t about Ruby or Rails. Simply put, we’re a PHP shop, and once rewritten in PHP we have many more great minds that we can easily throw at it than if it were still in RoR. Since we ported it *pretty much* directly from Rails, there are some left-over Rails-isms in Gravatar’s code that you won’t find in, say, WordPress. Shh… Don’t tell Matt.
In the rewriting we tried to tackle the largest scalability problems with the design of the service. You can imagine that for an avatar serving service… storing, searching, and serving avatars is paramount. Gravatar 2.0 (pre-PHP) suffered from some pretty significant inefficiencies in this regard, and I think that a big part of that was limited resources (time and servers). Luckily, we were no longer significantly limited by either of those things.
– warning beginning technical details which may be safely skipped over if you don’t care –
The way that images were stored originally was: a complete image was made for all sizes between 1×1 and 80×80 pixels, a directory made for each rating, and a symlink placed from the rating to the appropriate image (either the user’s image or the default image in case the rating was too high). So that’s 80 images, 5 directories, and 240 symbolic links. The reason for this, I believe, was to attempt to serve the avatar content without any database interaction whatsoever. The files were then archived, uploaded to Amazon S3, and an entry added to Amazon SQS. Finally the SQS entry was retrieved by the serving server, and the file downloaded, extracted, and placed on the filesystem. This is why it took several minutes, once you had uploaded and cropped your image, before you were able to browse the rest of the site again. You can imagine how many files Gravatar was comprised of by the time we got a hold of it! We knew that this would simply NOT work for our new 512×512 pixel avatar sizes. Lastly, there were a couple of directories which had several hundred thousand entries (either files or other directories), and it was nearly impossible to even get a listing of them. So we had a list of things NOT to do. We just needed to figure out what TO do.
So we decided that we would render all our avatars dynamically from the highest quality copy of the image we could manage… down. We would only store one version of the image, though we would store it in multiple places (a local file server for speed, and S3 for redundancy). We would still rely heavily on caching. And we would make as much of the workload asynchronous as possible, so that you don’t have to wait for things to happen after you finish cropping (to do this we employed various techniques and hacks best left for another day and another story).
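For the curious, here is a rough sketch of that idea in Python (not our actual PHP code): keep one master image per avatar, scale it down on request, and cache each rendered size. Everything named here is made up for illustration, including `AvatarRenderer`, the nearest-neighbour `downscale`, the default of 80 pixels, and the in-memory LRU cache standing in for real caching servers.

```python
from collections import OrderedDict

MAX_SIZE = 512      # largest avatar edge mentioned in the post
DEFAULT_SIZE = 80   # assumed default when no size is requested


def clamp_size(requested):
    """Keep the requested edge length within 1..MAX_SIZE."""
    return max(1, min(MAX_SIZE, int(requested)))


def downscale(master, size):
    """Nearest-neighbour resize of a square image stored as a list of rows."""
    src = len(master)
    return [
        [master[y * src // size][x * src // size] for x in range(size)]
        for y in range(size)
    ]


class AvatarRenderer:
    """Stores one master per avatar and renders smaller sizes on demand."""

    def __init__(self, cache_entries=1024):
        self.masters = {}            # avatar_id -> master pixel grid
        self.cache = OrderedDict()   # (avatar_id, size) -> rendered grid
        self.cache_entries = cache_entries
        self.renders = 0             # number of real resize operations

    def store(self, avatar_id, master):
        self.masters[avatar_id] = master
        # Drop renderings of any image this upload replaces.
        for key in [k for k in self.cache if k[0] == avatar_id]:
            del self.cache[key]

    def get(self, avatar_id, requested_size=DEFAULT_SIZE):
        master = self.masters[avatar_id]
        # Only ever scale down: never exceed the master's own size.
        size = min(clamp_size(requested_size), len(master))
        key = (avatar_id, size)
        if key in self.cache:
            self.cache.move_to_end(key)   # mark as recently used
            return self.cache[key]
        rendered = downscale(master, size)
        self.renders += 1
        self.cache[key] = rendered
        if len(self.cache) > self.cache_entries:
            self.cache.popitem(last=False)  # evict least recently used
        return rendered
```

Clamping the size before building the cache key matters: a request for 9999 pixels and a request for 512 pixels land on the same cached entry, which takes some of the sting out of URLs that can be crafted in endless ways.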
– ok this batch of details has been concluded –
So the problems were many, one year ago, and the challenges were fascinating. I recall being overwhelmed by support requests for quite some time. I would get 40 emails on a good day, more on a bad. And believe it or not your emails very much shaped the future of Gravatar. I would group them into specific problems, and always fix at least the largest problem (volume wise) each week. Over time the service has grown quite stable, support requests have gone down to just a handful every day, and things are generally peppier than ever.
We had some bumps tuning our caches… for a while there we accidentally told your web browsers never EVER to re-validate an image. But we got that handled in short order… and things are zipping along quite nicely.
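To show what that kind of fix can look like, here is a small Python sketch (again, not our real code) of response headers that let a browser cache an image for a while and then re-validate it cheaply. The five minute `max-age`, the hash-based `ETag`, and the `respond` helper are all assumptions for the example, not the values Gravatar actually uses.

```python
import hashlib

MAX_AGE_SECONDS = 300  # assumed freshness window, not Gravatar's real value


def cache_headers(image_bytes):
    """Build headers that allow caching but still permit re-validation."""
    etag = '"' + hashlib.sha256(image_bytes).hexdigest() + '"'
    return {
        "Cache-Control": f"public, max-age={MAX_AGE_SECONDS}",
        "ETag": etag,
    }


def respond(image_bytes, if_none_match=None):
    """Return (status, headers, body) for a conditional image request."""
    headers = cache_headers(image_bytes)
    if if_none_match == headers["ETag"]:
        # The browser's copy is still current: send no body.
        return 304, headers, b""
    return 200, headers, image_bytes
```

Once the freshness window runs out, the browser sends back the `ETag` it holds in `If-None-Match`. An unchanged image costs a tiny 304 response, and a freshly uploaded one is picked up on the next check.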
Gravatar now lives on about 20 servers: 2 Database servers, 1 File server, 2 Load balancers, 5 Caching servers, 9 Web servers, and 1 Development server. That combination of servers is handling an average of 7,214 of your requests every second of every day. That’s a whopping 623,293,056 requests daily! 96% of all of those requests are served directly from cache. These days we get around 5,000 uploaded images every day. Even with this staggering increase in the number of requests we’ve been able to make Gravatar faster, and more reliable than it’s ever been.
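If you want to check the math, or see what that 96% means for the machines behind the caches, here is the arithmetic as a few lines of Python. The daily total, the hit ratio, and the nine web servers come from the numbers above; spreading the cache misses evenly across the web servers is just an assumption to get a ballpark figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60

requests_per_day = 623_293_056
cache_hit_ratio = 0.96
web_servers = 9

requests_per_second = requests_per_day / SECONDS_PER_DAY
# Requests that miss the cache and have to be rendered or looked up.
misses_per_second = requests_per_second * (1 - cache_hit_ratio)
# Assumption: misses are spread evenly across the web servers.
misses_per_web_server = misses_per_second / web_servers

print(f"{requests_per_second:.0f} requests/s overall")
print(f"{misses_per_second:.0f} requests/s reach the web servers")
print(f"{misses_per_web_server:.0f} requests/s per web server")
```

In other words, the caches turn several thousand requests a second into a few hundred for the web servers to share.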
So here we are, one year later, looking out over the vast frontier of the internet and contemplating the future of Gravatar. There are a great many things that it could become. We know that we don’t want to lose focus on the core of the project: Serving your avatars (that’s what it’s all about!)
We know that an avatar is “a graphical image that represents a person, as on the Internet,” but it’s also “an embodiment or personification, as of a principle, attitude, or view of life.” And that is exactly where we are headed: Making Gravatar a place where you can do more than just store an image, making it a place that can be your presence online. So we’ll be rolling out more features in the near future to allow you to store more data inside Gravatar and, more importantly, to allow you to use that information in other places on the internet through open standards.
We hope that you’ve had as awesome a time using your Gravatars as we’ve had making it all work. And we look forward to the future — to when your identity doesn’t have to be cemented to a specific site, but is fluid and flexible, and persistent. We hope to see you there!