How markup is safe in Drupal 8

Nobody wants a website that can be hacked. Drupal has a great security track record and works hard to ensure that core and contributed modules are safe for everyone to use. One of the most common types of security issue is cross-site scripting (XSS). In Drupal 8 we've made extensive changes to the theme system that reduce XSS vulnerabilities.

HTML banana skin. Credit: alexpott

A little bit of history

In previous versions of Drupal, developers had to be very careful not to cause a security issue by printing variables in templates without considering their source. For example, the Drupal 7 handbook page for handling text in a secure fashion shows the following examples for printing variables in templates:

Bad:

print '<a href="/..." title="' . $title . '">view node</a>';

Good:

print '<a href="/..." title="' . check_plain($title) . '">view node</a>';

The Drupal 7 best practice places multiple demands upon developers and makes maintaining secure code complicated. One problem is that the developer cannot know if the $title variable in the example has already been escaped in other code. If the developer calls check_plain() on something that has already been escaped with check_plain(), it will be double-escaped. Double escaping turns sensible markup like this:

<p>Copyright &copy; 2016</p>

into this:

&lt;p&gt;Copyright &amp;copy; 2016&lt;/p&gt;

Another problem is that the developer might actually want to print markup that is included in the variable, but has no way of knowing whether that markup is safe. And just to complicate matters further, at the time the template is created, a variable might be completely safe for output, but later, a contributed module could be enabled that alters the variable and makes it unsafe to use without check_plain().
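As a rough illustration of the double-escaping problem described above, the following standalone PHP sketch escapes the same string twice. The escape_plain() helper is a hypothetical stand-in for check_plain(), assumed here to be a thin wrapper around htmlspecialchars(), and the sample title is made up for the example.

```php
<?php
// Hypothetical stand-in for Drupal 7's check_plain(), assumed to wrap
// htmlspecialchars() with these arguments.
function escape_plain(string $text): string {
  return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Made-up sample value containing a character that needs escaping.
$title = 'Fish & chips';

// Escaped once: safe to print, and a browser displays "Fish & chips".
$once = escape_plain($title);

// Escaped again by code that cannot tell the value was already escaped.
$twice = escape_plain($once);

echo $once, PHP_EOL;   // Fish &amp; chips
echo $twice, PHP_EOL;  // Fish &amp;amp; chips
```

A browser renders the second value as the literal text "Fish &amp; chips", which is exactly the kind of visible damage double escaping causes.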

We've done a lot of work on the theme system

Drupal 8 has a new template engine called Twig. The work done to integrate Twig with Drupal 8 has brought a number of benefits including:

  • Templates are less code, more markup.
  • We've moved a lot of markup that was previously assembled in module code into templates. In Drupal 8, all the logic needed to theme a site is in these templates (there are no more theme_*() functions in core).
  • Twig is not PHP, so you won't find an unexpected SQL query in a template file killing your site performance.
  • Twig has a sandbox preventing the template from accessing unsafe methods on objects. You can drill down into an object's data (a great feature), but you can't delete a node in a template by doing {{ node.delete }}.
  • Twig comes with the ability to automatically escape text.

It's this last benefit that means developers no longer need to work out which output is unsafe and run it through check_plain() (or the Drupal 8 equivalent, Html::escape()).

How Twig autoescape works in Drupal 8

Take a simple Twig template:

<div>{{ variable }}</div>

If the variable is a string, it will be automatically escaped. If the template has access to a node object, for example, and the title is printed using {{ node.title }}, that will be automatically escaped as well.

However, we don't always want to autoescape variables. For example, the header is printed in the page.html.twig template like so:

<header role="banner">{{ page.header }}</header>

The {{ page.header }} variable is a Markup object created by the render system. It is not escaped because it implements the MarkupInterface.
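The rule described above (escape plain strings, leave objects that are marked as safe markup alone) can be sketched in a few lines of plain PHP. This is not Drupal's or Twig's actual code: SafeMarkupInterface, SafeMarkup and autoescape() are hypothetical names that stand in for MarkupInterface, the render system's Markup object and the template engine's escaping step, and the sample strings are invented.

```php
<?php
// Hypothetical marker interface standing in for Drupal's MarkupInterface.
interface SafeMarkupInterface {
  public function __toString(): string;
}

// Hypothetical value object standing in for the render system's Markup object.
final class SafeMarkup implements SafeMarkupInterface {
  private string $html;

  public function __construct(string $html) {
    $this->html = $html;
  }

  public function __toString(): string {
    return $this->html;
  }
}

// Simplified version of the decision an autoescaping template engine makes
// each time a variable is printed.
function autoescape($value): string {
  if ($value instanceof SafeMarkupInterface) {
    // Already marked as safe markup: print it unchanged.
    return (string) $value;
  }
  // Anything else is treated as plain text and escaped.
  return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// A plain string is escaped.
echo autoescape('<script>alert(1)</script>'), PHP_EOL;
// &lt;script&gt;alert(1)&lt;/script&gt;

// An object implementing the marker interface is passed through.
echo autoescape(new SafeMarkup('<nav>Main menu</nav>')), PHP_EOL;
// <nav>Main menu</nav>
```

The point of the design is that safety travels with the value's type rather than with the developer's memory of where the value came from.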

Suppose you want a form title that contains markup:

$form['#title'] = $this->t('How do <em>you</em> like your coffee?');

The #title property here uses $this->t() to generate translatable markup. It returns a TranslatableMarkup object, which also implements MarkupInterface (just like the output of the render system). This means that the render system will not escape this text and browsers will mark up the "you" with emphasis.

One of the interesting things about translations is that the English version might not need markup, but the translation still might. For example, it is possible the translator might want to use the BDO tag to switch the direction of the language. Since all translations are TranslatableMarkup objects in Drupal 8, developers do not need to worry about whether a translation contains HTML or not. This is a significant improvement over previous versions of Drupal, which did not integrate translatable strings with the render system.

What's next?

Automatic escaping in Drupal 8 is a great feature that makes developing secure modules and themes much simpler. The next post in this series will look into what this means for developing modules and, specifically, how markup should be joined together. In Drupal 8 the following code does not work!

t('Concatenating <em>markup') . ' ' . t('objects</em> does not work.')

Thanks to xjm, jacine and susanmccormick for valuable input to this post.
