Enterprise Drupal Needs Monitoring


Coming back from an excellent time (and an excellent talk, updated slides here) at SxSW Interactive, I've once again been thinking about what it really takes to get Drupal "to the next level."

Clearly there are multiple fronts we are proceeding along as a community. The amazing development work being done in core and contrib is obviously key, as are the boundary-pushing efforts to integrate more and better infrastructure for enterprise-scale use cases, and the continuing drive to test everything on an automated, continuous basis. However, I wanted to throw out yet another thing we should be thinking about as a development community: monitoring.

Now to be clear, I'm not talking about what you can get out of Munin (although that's nice). It's good to know your server load, but what I really care about is my page execution time, my per-bootstrap memory consumption, and my most frequently run (or longest-running) queries. Many of the pieces are already there in code written into the devel module and others, but we don't have anything to compare with what other platforms are doing to expose the internal metrics of their applications:

RPM screenshot

In addition to being great eye-candy, this kind of monitoring gives people running large sites (or large numbers of sites) the kind of confidence they need to see "at a glance" that things are ok. It also helps engineers like me spot problems and troubleshoot issues without having to add intrusive changes to live environments.

There's always more to be done, and this is something we'll be thinking about long and hard as part of the PANTHEON project in 2010. As Drupal continues to mature as a platform technology, the addition of supporting services like monitoring (and more advanced analytics) will be important paths for development.

(Update: thanks for the editorial support from Adrian, who notes that "Enterprise Drupal needs spellcheck.")

Comments

I will be trying Server Density soon for server monitoring. Easy on the eyes and less work than Munin. Paired with Pingdom, this should be enough monitoring for most. Not sure if these services meet your enterprise needs. They are affordable, easy to set up and attractive, which is more than can be said about most other options.

there's definitely a need for it. one of my aims is to move the monitoring out of drupal so that another system can keep an eye on it. i've been working on some custom code to output status values in plain text for parsing by nagios. the idea is to let us monitor things like the last time cron ran, un-indexed solr content, etc.
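
The Nagios side of that idea could look something like the sketch below. It assumes the site prints one `key=value` pair per line; that format and the `cron_last` key (a Unix timestamp) are made up for illustration and are not taken from the commenter's code. The exit codes 0/1/2/3 are the standard Nagios plugin convention, and the thresholds are arbitrary defaults.

```python
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def parse_status(text):
    """Parse 'key=value' lines into a dict, skipping blank or malformed lines."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            values[key.strip()] = value.strip()
    return values


def check_cron(status, now=None, warn_after=3600, crit_after=86400):
    """Return (exit_code, message) based on the age of the last cron run.

    'cron_last' is a hypothetical key holding a Unix timestamp.
    """
    now = time.time() if now is None else now
    try:
        age = int(now - int(status["cron_last"]))
    except (KeyError, ValueError):
        return UNKNOWN, "UNKNOWN - cron_last missing or not a timestamp"
    if age >= crit_after:
        return CRITICAL, f"CRITICAL - cron last ran {age}s ago"
    if age >= warn_after:
        return WARNING, f"WARNING - cron last ran {age}s ago"
    return OK, f"OK - cron last ran {age}s ago"
```

A wrapper script would fetch the status text, print the message, and exit with the returned code. Un-indexed Solr content could be checked the same way with its own key and thresholds.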

I'd love to hear more about what you're doing with that. We're using nagios now with the nagios drupal module but the reporting isn't very granular. Wish I had more time to play around with it.

Yeah, this is an interesting nut to crack. It's a dashboard project, fundamentally. We need cache hit rate from one place, slow queries from another, watchdog from another, etc.

I've thought about this quite a bit as of late too. I think the key is to implement an SNMP MIB for Drupal. Once that is exposed, you can pick your monitoring tool, whether it be Nagios, OpenNMS, Hyperion, or whatever.

Thanks for that Server Density link. Great stuff for the 'No IT department' crowd. They also offer a plugin API that looks dead simple - http://blog.boxedice.com/2010/03/16/server-monitoring-plugins-monitor-an...

Splunk has been awesome for us at Acquia search. We get crazy metrics like the 90th percentile of response times for users using 10 facets on their sites with databases of over 1GB, etc... Some nginx logging and a few splunk configuration tricks and it is rocking.

The only downside is the massive price tag. I really wish they'd make their product (or a subset of it) more acceptable when you've got a lot of logs.

For a site with under 500MB of logs per day though, it is well worth it.

-J
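
For anyone wanting a feel for the percentile metric without Splunk, here is a rough standalone sketch. It assumes each log line ends with the request time in seconds (for example, a custom nginx `log_format` with `$request_time` as its last field); that layout is an assumption for illustration, not the Acquia setup described above. The percentile uses the simple nearest-rank method.

```python
import math


def request_times(lines):
    """Yield the last whitespace-separated field of each line as a float."""
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        try:
            yield float(fields[-1])
        except ValueError:
            continue  # line does not end in a number; skip it


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

Feeding it an open log file, as in `percentile(request_times(open(path)), 90)`, gives the 90th percentile response time. Slicing by facet count or database size, as in the comment above, would need those fields in the log as well.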

Page execution time has been on my radar for a while, not just because it's a usability issue that can easily result in a visitor bouncing from a website, but also because Google is now taking this parameter into account as an important factor in SEO. So not only is usability affected but also the marketing of the website (specifically search engine marketing), which is one of my top concerns when building a site. If a visitor bounces due to a slow response time, it's not a good thing, BUT if the search engines are not even showing the website in the SERPs, it's far worse. A good monitoring solution for some of the issues mentioned in the above article and comments would go a long way.

Hmm ... maybe splunk?

Josh,

Thanks for the screen shot and link to my company's SaaS app monitor, which is called New Relic RPM. We currently offer support for web apps written in Java and Ruby, but PHP is definitely on the long term roadmap. When we address this, we'll definitely pay particular attention to how well the product does in managing Drupal. We'll be sure to get you on the short list for the beta.

We won't be able to start work on this project until later this year at the earliest; we want to make sure we do a stellar job managing every platform we support, rather than adopting a mediocre-but-wide strategy. But I do look forward to broadening our offering to support Drupal and other PHP apps in the future. Please send me your thoughts and feature requests at cirne at newrelic dot com. Cheers,

Lew Cirne
Founder and CEO
New Relic

Hi Josh,

Neat article. While we certainly don't (yet) have strong monitoring support for memory management, our tool Droptor is for monitoring and managing multiple Drupal sites in one place:

http://www.droptor.com/

Its goal is to give you a sense of what is happening on all of your sites in one place.

I think the memory management data is a great idea, and something that is now on our feature request list.

Cheers,
jpe

Hi Josh,

Just wanted to let you know that Droptor 2.0 launched today:
http://www.droptor.com/v2

Droptor now includes memory profiling:
http://www.droptor.com/tour#memory

It certainly isn't as cool as New Relic RPM (yet), but it's a solid start.

Cheers,
jpe
