Project Mercury: A pre-configured Drupal+Varnish EC2 AMI

Josh Koenig


July 15, 2009 - 2:27am

Do you want your Drupal front page to render in less than a second? Do you want your site to be fast for logged-in as well as anonymous users? Do you want to have total confidence in your ability to weather the storms of internet fortune (e.g. links from Digg, Drudge, Slashdot or MSN.com)? If so, then we hope the Mercury project will be of interest to you.

The goal of this project is to make Drupal as fast as possible for as many people as possible. To that end, we are developing a pre-built Amazon Machine Image (AMI) which will allow anyone with an Amazon Web Services account to spin up an EC2 instance and see how all this works in real-time. The ultimate goal is a production-ready release that can be used for deploying real websites.

Today, thanks to this inspiring post from Eric Hammond at Alestic and some excellent feedback from the Drupal community, I'm proud to announce the public availability of an initial Alpha release. Don't use this for production, but if you want to see how these techniques work in action, you can get a working copy with root access for just a little scratch; ten cents an hour to be precise.

For those hungry to get started, the public AMI ID is ami-0722c36e. It's a 32-bit instance, and I've run all my tests on the "small" instance type. You can find it pretty easily by searching for "chapter3" in the AMI list.

This ready-to-run machine image contains the following high-performance options, all configured to work for a harmonious liquid-metal fast WIN:

  • Ubuntu Jaunty base operating system
  • The latest Pressflow Drupal
  • Varnish HTTP acceleration
  • mod_deflate configured to compress pages
  • An up-to-date memcached/libevent and libmemcached install. This includes experimental support for the new libmemcached-based PHP library via this patch to Cacherouter.
  • A basic boot script to update Pressflow from the BZR repository and move disk-intensive operations like MySQL and Varnish storage into /mnt (sketched just below)
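
To give a flavor of what that boot script does, here's a minimal sketch. The paths and branch location are illustrative assumptions, not the exact values baked into the AMI:

    #!/bin/sh
    # Illustrative boot script; paths and the Pressflow branch are assumptions.

    # Pull the latest Pressflow from its Bazaar branch.
    cd /var/www/pressflow && bzr pull

    # Relocate disk-heavy data onto the large ephemeral /mnt volume,
    # leaving symlinks behind so MySQL and Varnish still find their files.
    # (A real script would stop both services before moving their data.)
    for dir in /var/lib/mysql /var/lib/varnish; do
      if [ ! -L "$dir" ]; then
        mv "$dir" /mnt/
        ln -s "/mnt/$(basename "$dir")" "$dir"
      fi
    done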

Since the install comes pre-configured and I didn't have time to do this as a profile, you'll need to use the user #1 credentials I set up. Login: root. Pass: drupal. Change this immediately.

Now, if you want to know more about how this works, see links to the giants whose shoulders we're standing on here, or if you've got a minute to kill while your instance spins up, read on for a full explanation of what we've done in this initial Alpha release.

Pressflow
As our good friends at Four Kitchens say, Pressflow makes Drupal scale. The Pressflow project is important because Drupal core is code-frozen well in advance of most real-world deployments, meaning the kinds of tuning patches and tweaks necessary to make Drupal-powered sites screamin' fast aren't included in the stock download. Pressflow fills that gap by including core patches which are well-tested and necessary for high performance.

The project is maintained by some of the top high-availability minds (we're working on some advanced memcached features, for instance) and is what powers drupal.org itself. The upshot here is that you can trust it, and it doesn't materially differ from stock Drupal in ways that matter for 99% of development.

It's just like Drupal, but with a geared icon to remind you you're a power user.

Varnish
If you run a complex Drupal site and want to stop stressing about traffic spikes and start seeing pageloads in under a second, you need to start treating Apache+PHP as an application server, and not a simple web server. Varnish is a purpose-built HTTP accelerator, carrying traditional reverse-proxy work forward by focusing on the specific application of delivering web content as quickly as possible.

Pressflow allows us to configure Varnish to run "in front" of Apache+PHP+Drupal, handling all anonymous page requests as well as static files. This requires configuring Apache to run on a non-standard HTTP port (in this case we use 8080), and configuring Varnish to respect the headers and cookies Drupal uses to distinguish logged-in from anonymous page requests.
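
To make the mechanics concrete, here's a stripped-down sketch of the kind of VCL involved, in Varnish 2.0-era syntax (newer releases spell the actions return (pass) and return (lookup)). It illustrates the approach, not the exact file shipped on the image:

    backend default {
      .host = "127.0.0.1";
      .port = "8080";    # Apache listens here instead of port 80
    }

    sub vcl_recv {
      # Logged-in users carry a Drupal session cookie; hand them to Apache.
      if (req.http.Cookie ~ "SESS") {
        pass;
      }
      # Strip remaining cookies so anonymous pages and static files
      # can be served straight from the cache.
      remove req.http.Cookie;
      lookup;
    }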

The result is that Apache no longer has to bother itself with CSS, JS, JPEG or even anonymous page requests (except for once, after which Varnish takes over). This gives you the fastest possible anonymous pageview performance. Better than Boost, even.

Drupal, PHP and Apache
Back in more familiar territory, we've configured Drupal's performance settings to maximize the benefits of Varnish, and enabled the must-have APC opcode cache to accelerate Drupal in general.

We also installed and configured Steve Rude's exciting CacheRouter module, which allows different engines to be used for Drupal cache optimization. Since I wanted to try out the newest version of memcached, I built some support for that, as well as a "none" cache option, which we use for page caching because Varnish handles all that for us. Patches for these methods are here and here.
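
The wiring happens in settings.php. Here's a rough sketch of the idea; the include path and array keys vary between CacheRouter versions, so treat the names below as illustrative rather than exact:

    <?php
    // Route Drupal's cache API through CacheRouter (path is illustrative).
    $conf['cache_inc'] = './sites/all/modules/cacherouter/cacherouter.inc';

    $conf['cacherouter'] = array(
      // Default bin: memcached, here via the libmemcached-based extension.
      'default' => array(
        'engine' => 'memcache',
        'server' => array('127.0.0.1:11211'),
        'shared' => TRUE,
      ),
      // Page cache: 'none', because Varnish serves anonymous pages itself.
      'cache_page' => array(
        'engine' => 'none',
      ),
    );
    ?>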

I was previously using APC as the local Drupal cache, which is a little simpler because it doesn't involve running a separate service. That approach is less error-prone, more secure, and, according to the folks at Facebook, as fast as (if not faster than) memcached. More testing is needed to see what the difference is here, if any. The major limitation of APC is that its caches cannot be shared across machines, but since our initial goal is a one-box solution, it's all good.

Finally, our Apache config enables mod_deflate, which is important for end-user page load times. Not only does mod_deflate compress (gzip) all possible content transferred out of the server, which can shave hundreds of milliseconds off real-world page loads, it also allows us to turn off Drupal's own page compression and avoid additional modules like css_compress. This means less work for Drupal and faster response times overall.
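
For reference, the heart of the mod_deflate setup is only a few directives; this sketch abbreviates the MIME type list, and on Ubuntu the module itself is enabled with a2enmod deflate:

    # Compress text-based responses on the way out of Apache.
    AddOutputFilterByType DEFLATE text/html text/plain text/css application/x-javascript

    # Standard recipe for ancient browsers that mishandle gzip.
    BrowserMatch ^Mozilla/4 gzip-only-text/html
    BrowserMatch \bMSIE !no-gzip !gzip-only-text/html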

Testing
To see what all this means, I spun up a Mercury instance, logged in as user #1 (login: root, pass: drupal, change this immediately), and installed devel to generate 500 dummy nodes.

I then set up a local jmeter test script to hammer the /node page. This isn't a good real-world use case, but it's fine for this simple performance benchmark. With 50 threads hammering as hard as they could, I got up to over 2,000 successful requests per minute.

While that test was running, server load was a whopping 0.02.
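
The jmeter script isn't bundled with the image, but if you want a quick-and-dirty approximation of this test, ApacheBench (a different tool than the one I used) gets you most of the way; the hostname below is a placeholder:

    # 50 concurrent clients, 5000 requests total, against the node listing.
    ab -c 50 -n 5000 http://YOUR-INSTANCE.compute-1.amazonaws.com/node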

Indeed, with Varnish handling requests, your operative bottleneck rapidly becomes bandwidth and networking capacity. That's where you want to be though, as these are the simplest and least expensive resources to scale.

Credit Where Credit Is Due

I should acknowledge all the great developers who have made this possible with their trailblazing documentation and code. I am doing virtually no innovation in this process, just integrating a bunch of existing pieces. As they say, we stand on the shoulders of giants.

Roadmap

The next steps for me are to continue working on the EC2 rollout piece, as well as to develop a better test suite so I can easily benchmark configuration changes and tune the stack more finely.

I also plan to integrate a much more fully featured install of Drupal, with a lot of A-list modules included and configured. The goal is to eventually get to a public release that can provide immediate value to folks looking for a Drupal CMS solution in the cloud.

The roadmap is something like:

  • Continued alpha releases (at least one or two) focusing on improving underlying infrastructure and making testing/benchmarking easy.
  • Beta releases focusing on the pre-configured Drupal install and admin experience. Maybe including Apache Solr?
  • A public release (in time for DrupalCon?) that could be used in small-scale production cases. Possibly including a 64-bit version.

Ok then. If you've read this far, what are you waiting for? Go get an Amazon Web Services account and try this thing out. Running it for an afternoon will literally cost you only a dollar. :)

Comments

Damien points me to this Apache module:

http://stderr.net/apache/rpaf/

This will help with the redirection: in the current setup, you can sometimes end up on a port-8080 URL when Varnish issues a redirect.
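
For anyone trying this before it lands in the image, mod_rpaf's basic setup looks something like the following sketch; check the module's docs for your version:

    # Rewrite the client IP (and hostname) on requests proxied by Varnish,
    # so Apache and Drupal see the real visitor instead of 127.0.0.1.
    RPAFenable On
    RPAFsethostname On
    RPAFproxy_ips 127.0.0.1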

While you might be standing on the shoulders of giants, it's great to see all the pieces pulled together into a single, usable installation. Providing deployment simplicity obviously helps make the process faster, easier, and more standardized. Since we don't have to install and configure each part along the way, we hope this will let us continue pushing the entire Drupal (Pressflow) architecture forward for enterprise clients. Thanks & kudos.

--Bill

Woo hoo!

Now I guess I'm on the hook to get to an actual release version!

Is there a way to exclude certain paths (e.g. /cart) from caching, similar to Boost's cache exclusion?

Thanks

Currently the Varnish config respects the Drupal headers, and passes through requests where POST data, GET parameters, or cookies are present.

However, there are a huge number of options in the Varnish Configuration Language (VCL). This install has a very simple vanilla setup, but the sky is the limit in devising your cache/no-cache conditions.
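
For example, excluding a path boils down to a few lines in vcl_recv. A sketch in Varnish 2.0-era syntax, with the paths being whatever your site needs:

    sub vcl_recv {
      # Never cache the cart or checkout pages.
      if (req.url ~ "^/cart" || req.url ~ "^/checkout") {
        pass;
      }
    }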

Since the patches in Pressflow were included in Drupal 7 to set useful cache-control headers, it would probably make sense to expose that through some back-end settings. This should also be possible now in contrib space.

Talked with David about this a bit in IRC, and since Pressflow backports the Drupal 7 header/caching stuff, we can make use of this:

http://api.drupal.org/api/function/drupal_page_is_cacheable/7

which includes the mightily awesome drupal_static function. Essentially, it should be pretty simple to create a small contrib module that gives you the same UI as Boost for declaring pages ineligible for caching. I'll add this to the Project Mercury TODO.
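
A minimal sketch of what such a module could look like; the module name, variable name, and default paths here are hypothetical:

    <?php
    /**
     * Implements hook_boot().
     *
     * Marks configured paths as uncacheable so Pressflow sends cache-control
     * headers telling Varnish not to cache them. 'nocache_paths' is a
     * hypothetical variable name used only for this sketch.
     */
    function nocache_boot() {
      $paths = variable_get('nocache_paths', array('cart', 'checkout'));
      // $_GET['q'] holds the internal Drupal path in Drupal 6/Pressflow.
      $q = isset($_GET['q']) ? $_GET['q'] : '';
      foreach ($paths as $path) {
        if (strpos($q, $path) === 0) {
          drupal_page_is_cacheable(FALSE);
          return;
        }
      }
    }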

Can't wait to see the production release of the image.

Any ETA?

-Anil

I'm hoping to get something "beta" out in the next week. I will need other people to try using it to flush out more edge cases, etc. I'm also going to try standing up more complex things (maybe Open Atrium!) to see what happens with a complex stack of modules, etc. I also know that currently the image doesn't handle mail at all, which will have to be addressed.

There's a lot to do, but my ambitious goal is to have something like a Release Candidate to show off in Paris. ;)

I have seen the new image at the top!

When you say that the current EC2 image doesn't handle mail, do you simply mean that it hasn't been configured with mail services etc., as opposed to mail handling being 'broken' because the caching mechanisms in place cause mail parsing/sending to fail?

In terms of testing, I've got the current image running and will be upgrading a copy of an existing large-ish site to see how it runs. Will report back later today hopefully. Where do you want me to do that? Via this site, or drupal.org?

Best, David

I don't have an MTA up and running.

If you're testing this, that's awesome! I will figure out a good place for handling feedback from more people (maybe a drupal.org project or at least another g.d.o post) but for now just contact me directly. I'm just my firstname (at) chapterthree (dot) com. ;)

I have just spent the last couple of weeks researching Drupal performance optimizations, and this sounds very promising! Great work, Josh! Have you got any process documentation anywhere we could have a look at?

Thanks for the credit there. Please correct the spelling of my name. Don't worry, I get that all the time ...

This isn't even the first time. Very sorry.

I tried Mercury, and it's amazing how quickly you can install high-performance Drupal on EC2. Good job! I found some info at http://dc2009.drupalcon.org/session/backend-drupal-performance-optimizat... which mentions Apache alternatives like lighttpd. Do you plan to try them?

My only problem with Amazon EC2 is the bandwidth cost, which seems expensive. I have been evaluating my options recently, and a $50 plan with 512MB RAM plus 1.5TB of bandwidth seems cheaper.

The 10 cent per hour small AMI would be fine, but there are these additional costs of bandwidth and POST requests.

Will it be enough to copy all modules, themes, and files and import the database from a current site running standard Drupal 6 (not Pressflow)?

Or there is some more complicated way to switch to Pressflow?

Pressflow is a managed version of Drupal 6 with a set of high-value performance patches. I don't know what (if any) database schema changes need to be made. I would holler at the Four Kitchens brain trust and see if they want to put up a page about "upgrading to Pressflow".

In any case, I can't imagine it's extremely difficult. Haven't had cause to try this myself yet, but I'm sure you can do it! :)
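
Off the top of my head (untested, since I haven't done it myself; filenames and paths are illustrative), the swap would look roughly like this:

    # Replace Drupal core with Pressflow while keeping your sites/ directory.
    cd /var/www
    cp -a drupal drupal.bak                     # back up the existing docroot
    tar xzf pressflow-6.x.tar.gz                # unpack a Pressflow release
    cp -a drupal/sites/. pressflow-6.x/sites/   # carry over modules, themes, files, settings.php
    # Point your vhost at the new docroot, then visit update.php as user #1.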

We have a big basket of modules, and those exact steps (+ copying the imagemagick module's .inc into includes) worked when I tried to copy a site. A couple date fields didn't show up at first, but that might be because I hadn't enabled the relevant modules before uploading the backed-up database.

There are no schema changes in Pressflow. Upgrading is a code-only change, making it simpler than even a minor Drupal upgrade. Going back to standard Drupal is just as easy, assuming you're not using any Pressflow-specific APIs or configuration options.

-David (the Pressflow release manager)

Hello guys. I've moved a Drupal 5 site to PressFlow 5 and everything seems to be working correctly.

Now I need to set up PressFlow 5 to use two MySQL servers in a master-slave setup. I've already done the MySQL setup part, and I can confirm that what's inserted or updated on the master is correctly replicated to the slave.

My problem is I don't know how to tell PressFlow 5 to use the master and slave.

I tried updating my settings.php to use all of the suggestions on http://tag1consulting.com/patches/replication but it seems that PressFlow 5 only uses the master db server.

I've enabled logging on both db servers to make sure of what queries are running on each server and I'm sure PressFlow is not using the slave server at all.

Any suggestions?

Thanks.

I'm seeing the same issue...

Alexis posted on G.D.O and we got it figured out here.

Maybe someone can help me... my Varnish doesn't cache things:

     92454    27.00    29.99  Client connections accepted
    217344    57.00    70.50  Client requests received
       382     0.00     0.12  Cache hits
         3     0.00     0.00  Cache hits for pass
      2612     1.00     0.85  Cache misses

My cache misses are much higher than my cache hits. What's wrong here?

I'm running Drupal and an IPB board.

To take advantage of Varnish, you need to run Pressflow, not stock Drupal. There's more information here and here.

Is there a way to convert the EC2 image to an ISO for standard hardware?

Or can you publish a vanilla ISO so those without Amazon can test this stack?

Tom

Thanks for sharing and keep up the good work!

Hello, I'm interested in your great project.
Is there any chance of getting an AMI for EC2 micro?
Or of running it locally using something like VirtualBox or VMware?
