Josh Koenig's blog

For the growing universe of developers turning to Drupal as a solution for mission-critical or highly ambitious applications, the question is less and less "can we build it?" and more and more "how do I scale it?"

For those of you considering attending DrupalCon Copenhagen this August looking to answer those kinds of questions, I humbly submit that in addition to immersing yourself in the inspirational slipstream that is the Drupal community, you come a day early — and get your employer or client to find a little extra budget ;) — to attend the Scalability and Performance Workshop on Monday August 23rd:

Train with the Dream Team

This training is going to cover the practical details of setting up a Drupal server that's ready to handle internet-scale traffic — a top to bottom build with nothing left out or glossed-over — and answer your questions about best practices and processes for building high performance and high scalability sites. I am extremely proud to be helping with this workshop, along with the rest of the "Dream Team" of Matt Westgate, Robert Douglass, David Strauss and Narayan Newton.

This is a group that has built and launched sites that are collectively by now running into the hundreds of millions of pageviews. We've been there. We've made the mistakes. We've vetted the cutting edge technologies. Spending a day with us could save you weeks in getting your next big project up and running.

Not only will the day be packed with real hands-on work, it should also be a lot of fun. We're not quite the Harlem Globetrotters, but we're pretty close. I can personally promise some good laughs and interesting connections to go along with the firehose of know-how.

Hope to see you in Copenhagen!

Previously I've written about setting up APC and memcached on your desktop (or in my case laptop) using Acquia's handy stack installer (aka "DAMP"). This is another quick post in that vein.

The other main caching system I work with on a daily basis is the mind-blowingly-fast Varnish HTTP accelerator. If you haven't checked it out yet, get on board (srsly; good MacOS instructions from Nate@lullabot here). I'm more than happy to maintain the Varnish integration module on drupal.org, but I've been bugged by my inability to take this work with me on airplanes and other offline places.

See, DAMP ships without the socket support needed to "talk" to the Varnish control terminal. I'm not saying this is Acquia's fault. They can't put the kitchen sink in there, and most regular users will never miss the socket extension. Those of us who might, well, we can help ourselves, right?

Right! To get rolling, grab the source and untar it. As of this article, we're working off PHP 5.2.13, but this should be pretty solid advice for any 5.2.x release.

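cd php-5.2.13
# Build the sockets extension as a shared module (sockets.so)
./configure --enable-sockets=shared
make
# Copy the module into DAMP's PHP extension directory
cp modules/sockets.so /Applications/acquia-drupal/php/ext/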

Then add the following to the bottom of your /Applications/acquia-drupal/php/bin/php.ini:

[sockets]
; Sockets are useful!
extension=sockets.so

Restart your DAMP application and hit up your friendly local phpinfo() and you should see a section for sockets. Enjoy talking directly to network interfaces! :)
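
If you'd rather check from the command line than from phpinfo(), something like this should do it. It's a minimal sketch: php_has_module is a made-up helper name, and it assumes the DAMP php binary lives at the path used above and that `php -m` lists loaded modules one per line.

```shell
#!/bin/sh
# php_has_module is a hypothetical helper, not part of DAMP or PHP.
# It asks the given php binary for its module list (php -m) and
# succeeds if the named module appears on a line by itself.
php_has_module() {
    php_bin="$1"
    module="$2"
    "$php_bin" -m 2>/dev/null | grep -qix "$module"
}

# Usage (path assumed from the steps above):
#   php_has_module /Applications/acquia-drupal/php/bin/php sockets \
#       && echo "sockets loaded"
```

The whole-line match (`-x`) keeps a module with a longer name from being mistaken for sockets.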

(UPDATE: Videos are already up less than 48 hours later! Nice job organizers!)

Just a quick blog post to let everyone know that slides are available from the presentation I gave yesterday with Bret Piatt from Rackspace explaining how our PANTHEON on-demand service works. Video should be up soon as well, but for anyone who wants them, we've got slides (update: video!) now:

It's also really humbling to see that we got so many good ratings. I hope those who were looking for "more content/less evangelism" made it to today's BoF, which was also great, and that they'll join us on the Drupal group and check out the source.

Finally, a clarification. At one point I mentioned that there was some code in our system which talked to the Rackspace API which we wouldn't be releasing. That's both not entirely true, and to the extent that it's true, is a matter of choice for us. Let me explain.

Our cloud communication layer runs libcloud, which implements a wonderful common API for many cloud platforms. However, it's just a Python library, so we needed to create a service around that. This hasn't been accepted into libcloud's SVN, but we're working on it, and the code will be 100% open-sourced. Additionally, we have a custom Drupal module which implements this service (currently via an Ubercart hook) to handle the business logic of when and how to launch a server. This is not fit for public release at this point, but we have a roadmap to contribute the service implementation piece as part of the 0.4 revision cycle for Aegir. This will allow savvy users to set up a cloudhub socket, and launch new cloud instances using their Aegir server!

However, much like the site theme, the exact module that runs our management console will not be released. To be clear — since I know there was a bit of confusion here — that's 100% our decision and has nothing to do with Rackspace. ;)

I (heart) the DAMP/Acquia Stack Installer. It makes rapid testing and development easy, and helps me be productive on airplanes even without wifi. In this post I'll explain how I was able to quickly set up an advanced caching testbed on my desktop to investigate some advanced APC and memcache configurations.

Caveat: these instructions are all MacOS centric — apologies in advance to the Redmond-faithful, happy to finally be caught up with all the Ubuntu-ites — but that's my environment and over the years I've found it to be a good one.

First, two easy point-and-clicks.

  1. Download and run the DAMP installer.
  2. Next, since I roll with Pressflow and what I wanted to test was a potential new branch, I grabbed myself a copy of BZR, another cool next-generation version control system with an installer.

Now's where the work starts. I pop open my handy iterm and quickly grab a fresh branch of Pressflow:

mithras-4:pressflow joshk$ cd
mithras-4:~ joshk$ cd Sites/
mithras-4:Sites joshk$ bzr branch lp:pressflow
Branched 78 revision(s).

Then fire up your Acquia Stack, pick "More..." from the little sites drop-down, click the import button, and find the freshly branched codebase. Note: this is a great way to set up/run any codebase locally, not just a Pressflow branch. If you've got an existing site tarball, your own SVN repository, or want to try out a cool Drupal product like Open Atrium, you can use DAMP to easily import the pre-existing Drupal.

Next thing I do is install Drush. Drush: don't leave home without it. Now, there's an important tweak you'll want to use to get Drush and the DAMP installer working in harmony: adding the acquia-stack php path to the drush helper script. Hopefully that issue will be committed soon, but in the meantime it's a very small fix: just add /Applications/acquia-drupal/php/bin/php to the bit where Drush searches for *AMP installations.
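
The idea behind that tweak can be sketched roughly like this. To be clear, this is not Drush's actual helper script: find_php is a made-up name, and /usr/bin/php is just an example of a fallback path. The only path taken from the setup above is the Acquia one.

```shell
#!/bin/sh
# Hypothetical sketch of a "find the right php" search, in the spirit of
# the drush helper script. Not Drush's real code.
# Prints the first candidate that is a regular, executable file.
find_php() {
    for candidate in "$@"; do
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage: list the DAMP binary first so it wins over any system php.
#   php=$(find_php /Applications/acquia-drupal/php/bin/php /usr/bin/php) \
#       && echo "using $php"
```

Order matters here: whichever candidate comes first and exists is the one that gets used, which is why the DAMP path goes at the front.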

Once I had drush set up, I ran drush dl devel memcache-6.x-1.x-dev to grab the latest Devel and Memcache modules, since my eventual aim was to do some performance.module based testing.

And now we come to the fun part. See, while DAMP ships with a lot of commonly-needed libraries like GD and curl, it doesn't come with the php components to support APC or memcached, both of which I need for my tests. Thankfully JAM has my back, and we got support for custom php libraries baked into the latest releases of the DAMP installer. Those copy/paste instructions worked like a charm to install APC, so that was out of the way and it was on to memcached.

My buddy Lynn Bender, superconnector and consigliere of the Austin, TX tech scene, has an important post that addresses one of the big questions as various groups, shops and individuals ramp up in response to Dries' call for more and better Drupal Training. The post was provoked by this comment from one of his friends and readers:

You don't want to get into a situation like CS had a few/several/many years ago, with a big influx of people learning Java because that's where the money was, and we end up with a wealth of lackluster, unmotivated, average Java developers who can "get by" but who aren't ever going to build you anything interesting.

It may seem like hubris, but this is something we could actually have to worry about in the next few years. I've seen more than one RFP cross my radar lately for a big site moving off a proprietary CMS like Vignette. I've also seen a few about a major organization (fortune 500 company, university, etc) looking to adopt Drupal "institutionally" as a go-to web solution. All indications are that we're across the chasm and standing at the end of the early-adopter wave, a.k.a. the beginning of Mainstream Drupal.

That's a good thing. It bodes well for all of us, but it also means we can expect a rush of new participants over the next few years who are primarily (or at least initially) motivated by the market opportunity. Which can be problematic. As Lynn says:

I can't remember from whom I heard this remark first — either Eric Raymond or Tim O'Reilly — it was something like: "It's important not to grow your organization faster than you can propagate its culture." This is a dilemma we Drupalers now face. There is tremendous pressure to grow more qualified Drupal developers and designers. Yet many fear that, in responding to this demand, we may encourage an influx of those for whom Drupal is merely a means to a paycheck.

My own experience with this via the Drupal Dojo (which is making a comeback!) is that the culture can propagate relatively quickly if the right mix of factors is in place:

  • Role Models: people who are new to a community or practice will take their cues from how others behave. We should all keep in mind that when we train people (or just answer questions in IRC) there's always an audience, and that audience is always learning.
  • Pragmatism: one of the big upsides of Drupal's culture is that it's full of norms that are quite practical. "Don't hack core" and "do file an issue" are worth following not just because we say so, but because they will make your life easier.
  • The Gestalt: in addition to being useful, many of Drupal's technical design patterns (open, hookable architecture) are echoed by the surrounding community and culture. Making these connections helps to promote said culture, and seeing the proverbial "big picture" is always helpful in getting people inspired.
  • Personal Projects: the single greatest way for people to be passionate about Drupal is to have them use the software in a context in which they're already passionate. The unfortunate truth is that for a lot of folks, this isn't how they'd describe their day-job. To the extent that we can all make it easy for people to have personal projects that use Drupal too, we'll have a more passionate community of practitioners.

There's a synergistic potential here, and practical usefulness is the key. One of the greatest kudos I ever got was reading one of our team members' personal blog posts and hearing that spending time working on Chapter Three's "Drupal Farm" made things much easier:

building sites used to involve a lot more struggle and hacking to get the last 10% working, but it's really satisfying smooth sailing these days. i love it.

You only love it when it works, but when it does, and it makes you powerful and free, you can really learn to love it a lot. This is why Apple has such a passionate consumer culture around its family of products. Unlocking this experience for more and more newbie developers is a requirement for turning them into contagiously passionate Drupalists.

Lynn's ambition to create a local group (using the "Dojo" name!) that's focused on skill-building and the "practice" of Drupal development is quite exciting. What will be even more exciting is if it bears out as a repeatable pattern.

As the Drupal market grows there will be an inevitable influx of paycheck-seekers — and to be honest at this point we can use all the help we can get — but the more we can propagate The Drupal Way and increase the percentage of developers who are of the excited and passionate variety, the better off we are.

I expect the pending release of Drupal 7 and associated ventures to create another groundswell of interest in the platform. It's going to be exciting not only getting people up to speed, but also making them converts in the process.

The Drupal takeover of the White House has been well documented and discussed, and need not be debated further in this forum. As some of the first Drupalists to attempt to overthrow the government, we were pleased to see this development. We take it as a signal that the time to initiate our master plan has arrived.

Drupal 8.0's Admin Room

Now that we finally have a socialist president (and our strongest back-channel ally in Michelle "the Marxist Harpy" Obama), along with the necessary purges and infrastructure projects, we are finally ready to activate Project Cybersyn 2.0.

Simply put, while others are focusing on low-hanging fruit like Venture Capital, we are jumping the chump. Our plan is to run the entire US (and eventual world) economy using Drupal. We are picking up on ideas first initiated in the Allende era in Chile, where a socialist internet was created using a network of telex machines. Thankfully today we have a real internet, and can leverage the power of open-source tools (aka "communism") in constructing our workers utopia.

Stafford Beer aka 'merlinofchaos'

The first step in this plan was to get Stafford Beer (original Cybersyn architect, pictured at right) to be a contributor to the shadowy "Ctools" project, creating a platform for our ultimate feature-set. By utilizing his principles of cybernetics and the easy modal/ajax framework, we will interconnect a network of Drupal websites which will monitor all aspects of economic production, eventually providing a system for all decisions to be made.

Some features on the early roadmap:

  • RDF versions of all government paperwork.
  • Drag-and-drop death-panels.
  • New core APIs, e.g.: hook_congressional_votes_alter($voteid, $op) where $op is "executive order", "pork spending", "offer of government job", "threat from rahm".

It's not all happy news unfortunately. For instance, much like the current Health Care Reform bill, while the drupal.org redesign has been formally approved, it will not go into effect until 2014. Such are the prices we pay for utopia.

NOTE: the original Project Cybersyn is totally real. Who knows what might have happened if a military coup didn't murder Salvador Allende?

Coming back from an excellent time (and an excellent talk, updated slides here) out at SxSw Interactive, I've been thinking once more as always about what it really takes to get Drupal "to the next level."

Clearly there are multiple fronts we are proceeding along as a community. The amazing development work being done in core and contrib is obviously key, as are the boundary-pushing efforts to integrate more and better infrastructure for enterprise-scale use-cases, as is the continuing drive to test everything on an automated/continuous basis. However, I wanted to throw out yet another thing we should be thinking about as a development community: monitoring.

Now to be clear, I'm not talking about what you can get out of Munin (although that's nice). It's good to know your server load, but what I really care about is my page execution time, my per-bootstrap memory consumption, my most-frequently run (or longest running) queries. Many of the pieces are there in terms of code already written into the devel module and others, but we don't have anything to compare with what other platforms are doing to expose the internal metrics of their application:

RPM screenshot

In addition to being great eye-candy, this kind of monitoring gives people running large sites (or large numbers of sites) the kind of confidence they need to see "at a glance" that things are ok. It also helps engineers like me spot problems and troubleshoot issues without having to add intrusive changes to live environments.
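
To make the per-page numbers mentioned above a bit more concrete, here's a rough shell sketch of the kind of roll-up I have in mind. The input format (one "path milliseconds" pair per line) and the summarize_timings name are made up for the example; you'd have to get your own timing data into that shape first.

```shell
#!/bin/sh
# Hypothetical sketch: summarize per-page execution times.
# Reads lines of "<path> <milliseconds>" on stdin and prints, per path,
# the request count, mean time and max time (ms), sorted by path.
summarize_timings() {
    awk '
        {
            ms = $2 + 0                      # force a numeric value
            count[$1]++
            total[$1] += ms
            if (!($1 in max) || ms > max[$1]) max[$1] = ms
        }
        END {
            for (p in count)
                printf "%s count=%d mean=%.1f max=%d\n",
                       p, count[p], total[p] / count[p], max[p]
        }
    ' | sort
}

# Usage (timings.log is a placeholder file name):
#   summarize_timings < timings.log
```

It's nowhere near a dashboard, but even a summary like this makes slow pages stand out without touching the live site.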

There's always more to be done, and this is something we'll be thinking about long and hard as part of the PANTHEON project in 2010. As Drupal continues to mature as a platform technology, the addition of supporting services like monitoring (and more advanced analytics) will be important paths for development.

(Update: thanks for the editorial support from Adrian, who notes that "Enterprise Drupal needs spellcheck.")

Just a note to all kindly Drupalists and your followers. I'll be appearing at SxSw interactive to talk about Drupal in the Cloud, sporting an updated presentation which includes info on how we're using BZR to create a "cloud platform", where all that's going anyway, plus details about our forthcoming Mercury on-demand service.

I am looking forward to seeing all sorts of great folks in/around the conference. If you're going to be in Austin this coming weekend, drop me a comment and let's coordinate! You can mark the session on your planner right here.

For those unable to attend, there will be some video and other media, and my slides will be posted online as always. See you in the Lone Star state!

Mercury 1.0 Release

After well over 1,000 hours of development and thanks to the help of many legendary open source engineers, Chapter Three is proud to announce the release of PANTHEON Mercury 1.0, the quick-start server environment for those looking for the best in Drupal performance. You can launch Mercury now with our free Amazon Machine Images (AMI) or follow our install instructions yourself using any Ubuntu Jaunty server. Commercially supported hosting options will be announced in the coming weeks.

What is Mercury?

Mercury is a standardized best-practice server configuration (aka "stack") for running your Drupal website that takes the best of the collected community practices, combines them with cutting-edge open-source tools for high-performance hosting, and delivers it all in a complete package. With Mercury, you can launch a speedy new Drupal server in less than five minutes.

In addition to deploying the required technologies to host Drupal (Apache, MySQL, PHP etc), Mercury includes a number of "optional" elements that are rapidly becoming must-haves for any ambitious installation. We give you Varnish for bulletproof protection from the Digg/Drudge effect, Memcached as an application cache to keep your logged-in/admin pages zippy, and Apachesolr to deliver faster and more relevant content search results.
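
One quick way to see whether a page actually went through Varnish is to look at the response headers. This is a minimal sketch, assuming a Varnish setup that adds its usual X-Varnish header to responses it handles. served_by_varnish is a made-up helper name, and example.com stands in for your own site.

```shell
#!/bin/sh
# Hypothetical helper: reads raw HTTP response headers on stdin and
# succeeds if an X-Varnish header is present (case-insensitive).
served_by_varnish() {
    grep -qi '^x-varnish:'
}

# Usage (example.com is a placeholder):
#   curl -sI http://example.com/ | served_by_varnish \
#       && echo "went through Varnish"
```

Note that this only tells you the request passed through Varnish, not whether it was a cache hit.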

Mercury also implements Pressflow, a high-performance variant of Drupal core similar to the version that runs drupal.org itself (handy comparison chart) and provides critical support for high-availability and high-performance requirements.

What's New in 1.0?

Changes to the hosting stack in 1.0 have been restricted to bug-fixes and minor tweaks with one big exception: we've implemented the BCFG2 configuration management system, meaning users who launch today can take advantage of innovations and fixes that land months from now. Lemme lemme upgrade ya!

Stand on the shoulders of giants

The system remains 100% open-source, and there's no "mothership" or lock-in here. Our BCFG2 setup works by pulling configuration from our public Launchpad repository and serving as its own master. You're free to turn it off once the system launches, merge your own changes, or simply continue pulling updates as they become available.

We think this is a game-changer for Drupal hosting. With this innovation, we can offer a system which remains free and open-source, where you keep root access, but where we can also deliver incremental improvements and new services in a completely transparent fashion. Simply put: if you start with this version, expect to stay on the bandwagon as we continue pushing the envelope with best-practice development, deployment and hosting infrastructure.

Already using Mercury?

If you've been braving our beta releases we'd love to learn from your experience. Please take a few minutes to complete this 8 question survey. This feedback will help us understand how people are using Mercury and what we should focus on in the future to improve it.

Standing On The Shoulders of Giants

Our tagline is "Stand on the Shoulders of Giants" and we're serious about it. None of this would be possible without the enormous efforts of literally thousands of brilliant individuals going back to Linus Torvalds, RMS, and beyond. Somewhat more specifically, we'd like to thank:

  • David Strauss of Four Kitchens, who basically wrote the whole 1.0 roadmap into his Scalable Drupal Infrastructure presentation, is responsible for Pressflow, and has been a continuous source of moral support and technical inspiration (and vice-versa) throughout this process.
  • Our open-source comrades from #varnish, #bcfg2 and #libcloud, particularly desai, sojl, phk, bjorn, polvi and pquerna.
  • Greg Chaix and Damien Tournoud, who helped pioneer the use of Varnish with Drupal and provided invaluable initial support and documentation.
  • Dries, of course.
  • Khalid Baheyeldin of 2bits, whose articles comprise a solid corpus of "how to" for high-performance Drupal, as well as Matt Westgate, Robert Douglass, Steve Rude, Narayan Newton, Jeremy Allare and the whole crowd of usual suspects from the high-performance group for the code and knowhow.
  • Our early adopters, for being brave, not listening to the "not ready for production" disclaimers, and running real websites with this thing. Without all the real-world feedback, we'd probably still be too nervous to call it 1.0. ;)

The list could go on for quite a bit longer. Really, it wouldn't be possible without you, so if you've so much as clapped at a presentation, give yourself a pat on the back. Thanks to everyone who helped make this possible!

One of the great things happening this year at DrupalCon North America (Moscone Center, San Francisco, April 19th - 21st; sign up now) is that the community is organizing to offer a number of focused trainings in advance of the conference itself. For those looking to build their Drupal skills and come away with practical knowledge, the pre-conference training sessions offer solid takeaways on a number of subjects.

This is a great time to be training. As our fearless leader Dries has said, the need to develop more talent is more pressing now than ever. It reminds me of the early days of the Drupal Dojo: more ninjas please! Lullabot has clearly been setting the pace here for years, but with Drupal continuing to grow there's simply more demand out there than any one shop can satisfy. Chapter Three has been taking up that challenge by offering regular training events in San Francisco, which has been a good experience for us and our students so far. It's exciting to see others following suit, like our good friends at Zivtech.

Anyway, I'm particularly proud to be helping to organize a session on Drupal performance and scalability. This is a hot topic that I'm happy to see demystified. The more people we can get up to speed on core best-practices, the better we'll be able to drive the next round of Drupal growth, which is going to include a lot of Enterprise-scale projects with intense uptime, scalability and performance requirements.

It's also something people seem to be hungry for; after all, everyone wants their internet to be fast. Last Friday at the Silicon Valley Drupal Users Group I gave a presentation on Drupal in the Cloud, and spent almost two hours doing Q&A about Cloud hosting in general, but really about performance and scalability in particular. It's not a simple thing, but the good news is that with the amount of information-sharing that goes on in the Drupal universe (e.g. via the high performance group), over the past few years some real best-practices are emerging.

If spending a whole day talking to the likes of myself, David Strauss, Matt Westgate, and other luminaries about these issues sounds exciting, sign up now. Space is limited, and I'm pretty sure this one will sell out quick!