This is the first in a series of posts about Drupal 8's configuration management system. This system is one of Drupal 8's most eagerly anticipated features, according to a recent survey. The Configuration Management Initiative (CMI) was the first Drupal 8 initiative to be announced, in 2011, and we've learned a lot during thousands of hours of work on the initiative since then. These posts will share what we've learned and provide background on the why and how.
Store data that needs synchronising in the configuration system
Drupal 8 ships with four major storage APIs. If you are writing a module and you need to store information, then where you store that information is important. Which one to use depends on what you are storing.
State is a place to store stuff that is unique to a particular site instance. A good example is the last time that cron has run. This is something that cannot be recalculated and should not be shared across development and production versions of the same site.
Cache is a place to store stuff that is expensive to calculate but can be rebuilt. An example of this is an array containing all the entity type annotation information. To get this information, Drupal needs to read all of the annotated entity classes. Doing this on every request is expensive, so we store it in the cache system. Data that is cached must be rebuildable from code and other data stores if the cache is cleared.
Configuration is a place to store information that you would want to synchronise from development to production. This information is often created during site building and is not typically generated by regular users during normal site operation. The configuration API comes in two flavours - the (simple) Config API and the Configuration Entity API. The key difference is that the Config API covers the singleton use case, where there can be only a single instance of the configuration. A good example would be the site's name. The Configuration Entity API should be used to store multiple sets of configuration - for example node types, views, vocabularies, and fields.
And of course there is the Content Entity API, which should be used for storing content, not configuration. Nodes, taxonomy terms, and users are stored with the Content Entity API. Sometimes the boundaries between content and configuration can be blurred, but that is a topic for another blog post.
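The rules of thumb above can be sketched as a small decision function. This is only an illustration in Python, not part of Drupal: the `choose_storage` function, its parameter names, and the `Storage` enum are all hypothetical and simply encode the questions asked in the preceding paragraphs.

```python
from enum import Enum


class Storage(Enum):
    """The storage APIs discussed above (names are illustrative only)."""
    STATE = "State API"
    CACHE = "Cache API"
    SIMPLE_CONFIG = "Config API"
    CONFIG_ENTITY = "Configuration Entity API"
    CONTENT_ENTITY = "Content Entity API"


def choose_storage(*, rebuildable, site_specific, deployable, multiple=False):
    """Pick a storage API from the rules of thumb in this post.

    Every parameter is a hypothetical yes/no question about the data.
    """
    if rebuildable:
        # Expensive to calculate, but can be rebuilt from code or other stores.
        return Storage.CACHE
    if site_specific:
        # Unique to one site instance; should not travel between environments.
        return Storage.STATE
    if deployable:
        # Created during site building and synchronised from dev to production.
        return Storage.CONFIG_ENTITY if multiple else Storage.SIMPLE_CONFIG
    # Whatever remains is content created during normal site operation.
    return Storage.CONTENT_ENTITY


# The last cron run: cannot be recalculated and is unique to each instance.
print(choose_storage(rebuildable=False, site_specific=True, deployable=False))
# The site name: a single, deployable value.
print(choose_storage(rebuildable=False, site_specific=False, deployable=True))
```

Real decisions are rarely this clean, but asking the questions in this order (can it be rebuilt, is it unique to this instance, should it be deployed) catches most mistakes early.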
As we've migrated Drupal 7 variables to the configuration system, we've had to choose which of these four APIs to use for storing each piece of information, and we have not always made the correct choice initially. For example, we originally put the maintenance mode into configuration. The result was that if you put your site into maintenance mode to do a configuration synchronisation (as is recommended), your site could suddenly leave maintenance mode halfway through! (See the original bug report for discussions of this.)
Further reading: Overview of Configuration (vs. other types of information).
Configuration should not change unexpectedly
Loading configuration and re-saving it immediately should not result in any changes. Put another way, if all the runtime dependencies of the configuration are satisfied (i.e. it works), then regardless of any additional modules being enabled, the configuration should not change.
An interesting case of configuration changing unexpectedly was discovered in a bug with the filter module. The filter module was adding all the disabled text filters to every filter format configuration entity. This meant enabling any module that provides text filters would change existing filter formats if they were re-saved with no configuration changes. This could cause a site owner to think they need to synchronise configuration even when nothing on the site had changed, and would give anyone reviewing a deployment more to review than necessary.
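The load-and-re-save rule can be checked with a simple round trip. The sketch below is a Python model of the problem, not the filter module's actual code: the `load`, `save_buggy`, and `save_stable` functions, the `filter_markdown` filter, and the shape of the data are all assumptions made for illustration.

```python
def load(stored, available_filters):
    """Build the runtime view: every available filter, disabled unless stored."""
    runtime = {"name": stored["name"], "filters": {}}
    for filter_id in sorted(available_filters):
        settings = stored["filters"].get(filter_id, {"status": False})
        runtime["filters"][filter_id] = dict(settings)
    return runtime


def save_buggy(runtime):
    """Mimics the bug: writes every filter, including disabled ones."""
    filters = {fid: dict(s) for fid, s in runtime["filters"].items()}
    return {"name": runtime["name"], "filters": filters}


def save_stable(runtime):
    """Writes only enabled filters, so extra modules leave the output alone."""
    filters = {fid: dict(s) for fid, s in runtime["filters"].items() if s["status"]}
    return {"name": runtime["name"], "filters": filters}


stored = {"name": "Basic HTML", "filters": {"filter_html": {"status": True}}}

# A newly enabled module provides one more text filter (hypothetical name).
available = {"filter_html", "filter_markdown"}

print(save_stable(load(stored, available)) == stored)  # round trip is stable
print(save_buggy(load(stored, available)) == stored)   # round trip adds a filter
```

A check like this, run with different sets of enabled modules, is a cheap way to catch configuration that drifts on save.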
Sort configuration keys in a predictable order that does not change often
When synchronising configuration between your site instances, you can tell what changes are being deployed by looking at diffs between the active and the staged configuration. These diffs need to be easy to read and understand.
For example a configuration entity might contain the following information about some displays:
# An example configuration entity YAML file
display:
  default:
    position: 0
  page_articles:
    position: 2
  page_pages:
    position: 1
It would be tempting to sort the displays in the above configuration by the position property. This is a bad idea: if one user reconfigures the position of a display, then when another user synchronises the configuration and diffs the changes, it will not be clear what has changed because many lines will simply have been reordered, as in the screenshot below:
If the displays are instead sorted by the display identifier, then the only change the user will see is the position keys changing, and so it will be clear what change is being synchronised, as below:
This makes it much more obvious what change is being deployed and improves usability.
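The effect of the two sort orders can be reproduced with a few lines of Python. This is a rough sketch only: `dump` is a hypothetical stand-in for the real YAML export, and `changed_lines` is a naive line-by-line comparison rather than a proper diff tool. The display data matches the example above, with the positions of the two page displays swapped in the staged copy.

```python
def dump(displays, sort_key):
    """Serialise displays to YAML-like lines in the order given by sort_key."""
    lines = ["display:"]
    for display_id, settings in sorted(displays.items(), key=sort_key):
        lines.append(f"  {display_id}:")
        lines.append(f"    position: {settings['position']}")
    return lines


def changed_lines(before, after):
    """Return (old, new) pairs for lines that differ at the same line number."""
    return [(old.strip(), new.strip()) for old, new in zip(before, after) if old != new]


def by_id(item):
    return item[0]


def by_position(item):
    return item[1]["position"]


active = {
    "default": {"position": 0},
    "page_articles": {"position": 2},
    "page_pages": {"position": 1},
}
# Staged copy: a user has swapped the two page displays.
staged = {
    "default": {"position": 0},
    "page_articles": {"position": 1},
    "page_pages": {"position": 2},
}

id_changes = changed_lines(dump(active, by_id), dump(staged, by_id))
position_changes = changed_lines(dump(active, by_position), dump(staged, by_position))
print(id_changes)
print(position_changes)
```

Sorted by identifier, every changed line is a position line. Sorted by position, the position lines look untouched and the display identifiers appear to trade places, so the edited value never shows up in the comparison. With more properties per display, whole blocks would move.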
The Configuration Management Initiative has learnt about more than just configuration
This post has covered some of the technical lessons that we learned whilst working on the configuration management system. If you want to read about some of the other challenges faced by the initiative and the wider community, then I recommend reading heyrocker's post Stay for the community.
If you've been inspired by this post and want to help get the configuration system finished for release, we have a few critical and major issues to fix. Don't hesitate to reach out to me in IRC if you want to work on any of them. My nick is alexpott.