Drupal hosting companies like Pantheon or Platform.sh provide tooling fine-tuned for Drupal and eliminate many pain points of cloud-based hosting. But sometimes a client can’t use those services. We would like to share the story of crafting an infrastructure for Drupal on AWS using ElasticBeanstalk, a semi-managed environment that is generic enough for every kind of web application.

Architecture Overview

ElasticBeanstalk web application environment architecture - load-balanced, auto-scaled, managed instances.

This diagram from the documentation of AWS contains almost every argument for considering AWS EB. Major points:

  • Web instances can auto-scale horizontally.
  • DB instance can be configured for high-availability.
  • Security groups have a sane default configuration.
  • The DB instance is managed.
  • EC2 web instances are managed.
  • Deployment tooling is backed by EB CLI.
  • Deployment itself is well tested and supports blue-green deploys.
  • Minimal infrastructure maintenance in the long run.

As we had to go with an AWS solution, the above points were compelling enough for us to try it out. Out of the box, just uploading the source code of Drupal to ElasticBeanstalk isn’t enough. Once we figured out that there’s a sample configuration that tailors EB to Drupal, we felt braver about continuing our experiments.

Web Server Stack

To make our starter kit work with ElasticBeanstalk, we tweaked the webserver and PHP config like this:

option_settings:
  aws:elasticbeanstalk:environment:proxy:
    ProxyServer: apache
  aws:elasticbeanstalk:container:php:phpini:
    document_root: /web/
    max_execution_time: 60
  aws:elasticbeanstalk:application:environment:
    SYNC_DIR: config/sync

We switched to Apache, so every change in .htaccess takes effect immediately. With a Composer-based workflow, the document root is web and the config sync directory is placed outside of the docroot, as shown above. These modifications went into drupal.config.

Drupal Filesystem

As the EC2 web instances are ephemeral, we need persistent storage for Drupal files. Here are all the needed configuration files to have this shared filesystem out of the box.
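
The core idea can be sketched in a few lines of shell: the public files directory inside the docroot becomes a symlink to a directory on the shared mount. This is only an illustration, not necessarily what the linked configuration does; the function name and both paths are assumptions.

```shell
# Sketch: point Drupal's public files directory at a shared mount.
# Both arguments are assumptions; adjust them to your own layout.
link_shared_files() {
  local shared_dir="$1"   # directory on the shared filesystem, e.g. /efs/drupal-files
  local docroot="$2"      # Drupal docroot, e.g. /var/app/current/web
  local files_dir="$docroot/sites/default/files"

  mkdir -p "$shared_dir" "$docroot/sites/default" || return 1

  # A fresh deployment may contain a real (empty) directory. rmdir refuses
  # to delete non-empty directories, so existing files are never lost here.
  if [ -d "$files_dir" ] && [ ! -L "$files_dir" ]; then
    rmdir "$files_dir" || return 1
  fi

  # -n replaces an existing symlink instead of creating a link inside it.
  ln -sfn "$shared_dir" "$files_dir"
}

# Usage: link_shared_files /efs/drupal-files /var/app/current/web
```

Because the function can be called repeatedly without side effects, it is safe to run on every deployment and on every freshly launched instance.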

Cron

Cron jobs are not present out of the box. We can solve that by adding a new file, cron-linux.config, under .ebextensions/. There are two challenges: first, to expose the environment variables properly, then to sudo to the needed user. We created a pull request against the sample repository to document the process for everyone.
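
To illustrate the two challenges, here is a minimal sketch that renders an /etc/cron.d entry. It is not the content of the pull request; the function name, the user, the application directory and the file holding the environment properties are all assumptions.

```shell
# Print an /etc/cron.d entry that runs Drupal cron as an unprivileged user.
# All four arguments are illustrative assumptions, not values provided by EB.
render_drupal_cron() {
  local schedule="$1"   # e.g. "*/15 * * * *"
  local run_as="$2"     # user that owns the application, e.g. webapp
  local app_dir="$3"    # deployed application root, e.g. /var/app/current
  local env_file="$4"   # hypothetical file with KEY=value lines for the environment properties

  cat <<EOF
SHELL=/bin/bash
# cron starts the job as root, sudo switches to $run_as, and "set -a"
# exports everything sourced from $env_file before Drush runs.
$schedule root sudo -u $run_as /bin/bash -c 'set -a; . $env_file; set +a; cd $app_dir && vendor/bin/drush cron'
EOF
}

# Usage (as root, for example from a deployment hook):
#   render_drupal_cron "*/15 * * * *" webapp /var/app/current /opt/drupal/env > /etc/cron.d/drupal
```

The environment file has to be readable by the target user, since it is sourced after the sudo.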

Drush

By default, when SSH-ing to the web instance, the drush command won’t work, as EB is fully generic and doesn’t know about Drupal. With another extension, we can make it work. It looks like this.
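
As a rough sketch of the idea (not the extension itself), a small wrapper on the PATH can delegate to the Composer-installed Drush of the deployed application. The function name and the directories are assumptions, and the wrapper only solves the path problem; the environment properties still have to be available in the SSH session.

```shell
# Install a "drush" wrapper that always runs the project-local Drush.
install_drush_wrapper() {
  local bin_dir="$1"   # a directory on the PATH, e.g. /usr/local/bin
  local app_dir="$2"   # deployed application root, e.g. /var/app/current
  local target="$bin_dir/drush"

  mkdir -p "$bin_dir" || return 1
  # \$@ is escaped so it ends up literally in the wrapper script.
  cat > "$target" <<EOF
#!/bin/bash
# Run the project-local Drush regardless of the current directory.
cd "$app_dir" && exec vendor/bin/drush "\$@"
EOF
  chmod +x "$target"
}

# Usage (as root): install_drush_wrapper /usr/local/bin /var/app/current
```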

Deployment

Shipping code is far from enough to deploy a new version of a Drupal application. You need to take care of database updates, configuration changes, cache clears, perhaps re-indexing your search indices and so on. These are really generic steps that are re-invented again and again in many RoboFiles and shell scripts across the globe, so here’s the variant for EB. On specific client projects where we use EB along with ddev, shipping code with ddev eb deploy turned out to be pretty robust.
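
For readers who have not seen such a script, the generic steps could be sketched like this in shell. It is a simplified illustration rather than the linked variant; the function name and the DRUSH and REINDEX variables are assumptions, and the re-indexing command is only available when the Search API module is installed.

```shell
# DRUSH can be overridden, e.g. DRUSH="echo drush" for a dry run.
DRUSH="${DRUSH:-vendor/bin/drush}"

drupal_post_deploy() {
  # Stop at the first failing step so a broken update is not masked.
  $DRUSH updatedb --yes      || return 1   # pending database updates
  $DRUSH config:import --yes || return 1   # import configuration shipped with the code
  $DRUSH cache:rebuild       || return 1   # clear caches

  # Optional: re-index search (command provided by the Search API module).
  if [ "${REINDEX:-0}" = "1" ]; then
    $DRUSH search-api:index || return 1
  fi
}

# Usage, from the application root: drupal_post_deploy
```

In a load-balanced environment, steps like these should run once per deployment, not once per instance.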

Database Connectivity and Other Environment-Specific Details

Like all environment-dependent data, RDS database credentials are injected via environment properties (the eb CLI asks you for the DB’s user and password while creating the environment). After those properties are set, we are able to access them via $_ENV or getenv(). The settings.php could look like this:

<?php
$databases['default']['default'] = array (
  'database' => getenv('RDS_DB_NAME'),
  'username' => getenv('RDS_USERNAME'),
  'password' => getenv('RDS_PASSWORD'),
  'prefix' => '',
  'host' => getenv('RDS_HOSTNAME'),
  'port' => getenv('RDS_PORT'),
  'namespace' => 'Drupal\\Core\\Database\\Driver\\mysql',
  'driver' => 'mysql',
);

As your local settings.php won’t be suitable as-is, we decided to keep all such customizations in a special file, web/sites/default/settings.aws.php, and we use Robo to put it into the proper place when the deployment actually takes place.
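
One possible way to put such a file into place, sketched in shell instead of Robo, is to append an include to settings.php during the build. The function name is an assumption, and so is the choice of including the file (copying it over settings.php would be another option); the sketch also assumes that settings.php is still writable at that point.

```shell
# Make settings.php load settings.aws.php, exactly once.
activate_aws_settings() {
  local site_dir="$1"   # e.g. web/sites/default
  local settings="$site_dir/settings.php"
  local include_line="require __DIR__ . '/settings.aws.php';"
  local required

  for required in "$settings" "$site_dir/settings.aws.php"; do
    if [ ! -f "$required" ]; then
      echo "missing file: $required" >&2
      return 1
    fi
  done

  # Append the include only if it is not there yet, so repeated builds
  # do not pile up duplicate lines.
  if ! grep -qF "$include_line" "$settings"; then
    printf '\n%s\n' "$include_line" >> "$settings"
  fi
}

# Usage: activate_aws_settings web/sites/default
```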

Similarly, configuring email sending could be another few lines in the same file:

<?php
$config['swiftmailer.transport']['transport'] = 'smtp';
$config['swiftmailer.transport']['smtp_host'] = 'email-smtp.eu-west-1.amazonaws.com';
$config['swiftmailer.transport']['smtp_port'] = '587';
$config['swiftmailer.transport']['smtp_encryption'] = 'tls';
$config['swiftmailer.transport']['smtp_credential_provider'] = 'swiftmailer';
$config['swiftmailer.transport']['smtp_credentials']['swiftmailer']['username'] = getenv('SES_USER');
$config['swiftmailer.transport']['smtp_credentials']['swiftmailer']['password'] = getenv('SES_PASSWORD');

Of course, this implies the configuration of the IAM user and the SES part, but on the Drupal side, it’s simple and it could be environment-independent code. We set the proper environment properties everywhere.

For the record, the RDS_ properties are injected automatically by AWS. Therefore, if you attach a database instance to the ElasticBeanstalk environment upon creation, which is the default choice, the credentials and the hostname of the MySQL database will be available in the environment variables.
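
Since everything depends on these properties, a deployment script can fail early when one of them is missing. A minimal sketch, with a hypothetical helper name (SES_USER and SES_PASSWORD are the custom properties used in the mail settings above):

```shell
# Return non-zero if any of the named environment properties is unset or empty.
require_env() {
  local missing=0 name
  for name in "$@"; do
    # printenv only sees exported variables, which is what PHP will see too.
    if [ -z "$(printenv "$name")" ]; then
      echo "missing environment property: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   require_env RDS_HOSTNAME RDS_PORT RDS_DB_NAME RDS_USERNAME RDS_PASSWORD \
#               SES_USER SES_PASSWORD || exit 1
```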

HTTPS

To properly detect HTTPS behind the load balancer, we have a few things inside that AWS-specific settings file:

<?php
// HTTPS should be configured for every single domain, we then do a redirect.
if (isset($_SERVER['HTTP_CF_VISITOR']) && $_SERVER['HTTP_CF_VISITOR'] === '{"scheme":"https"}') {
  $_SERVER['HTTPS'] = TRUE;
}

if ((!empty($_SERVER['HTTP_X_FORWARDED_PROTO']) && $_SERVER['HTTP_X_FORWARDED_PROTO'] == 'https') || (!empty($_SERVER['HTTP_X_FORWARDED_SSL']) && $_SERVER['HTTP_X_FORWARDED_SSL'] == 'on')) {
  $_SERVER['HTTPS'] = TRUE;
}

if ((!array_key_exists('HTTPS', $_SERVER)) && (PHP_SAPI !== 'cli')) {
  $new_url = $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'];

  header('HTTP/1.1 301 Moved Permanently');
  header('Location: https://'. $new_url);
  exit();
}

// ElasticBeanstalk provides us a load balancer that does terminate HTTPS,
// and EB's EC2 web servers only listen on port 80.
// Certificates are generally managed by Amazon (Certificate Manager).
$settings['reverse_proxy_trusted_headers'] = \Symfony\Component\HttpFoundation\Request::HEADER_X_FORWARDED_ALL;
$settings['reverse_proxy'] = TRUE;
if (isset($_SERVER['REMOTE_ADDR'])) {
  $settings['reverse_proxy_addresses'] = [$_SERVER['REMOTE_ADDR']];
}

This makes it compatible with the internal load balancer and CloudFlare-hosted domain names too. One drawback is that you cannot access your site via plain HTTP (see below: the default hostname for the environment does not have a certificate out of the box). But that might actually be a good thing.

Safe SSH access

The ElasticBeanstalk stack does not have a robust solution for SSH access. You might be tempted to distribute key files among your team or have port 22 temporarily open to the entire world. IT security teams won’t love it, and for a good reason. To deal with that, we use AWS Systems Manager to allow remote Drush command execution and file syncing between local DDEV and remote EB EC2 instances. The process of integrating SSM with EB is well documented at the eb-ssm GitHub repository, where we contributed back a patch to allow non-interactive mode. The final DDEV command looks like this:

#!/bin/bash

EB_ENV=$1
if [[ "$EB_ENV" == "client-live" ]];
then
  BASE_URL="https://live.example.com/"
elif [[ "$EB_ENV" == "client-qa" ]];
then
  BASE_URL="https://qa.example.com/"
else
  BASE_URL="https://test.example.com/"
fi

shift
CMDS=("drush --uri=$BASE_URL" "$@")

python3 .ddev/eb_ssm.py --command="${CMDS[*]}" --non-interactive=True "$EB_ENV"

That’s almost as comfortable as terminus remote:drush used on Pantheon. rsync access is well documented in AWS’ own documentation; after SSM is configured, it’s a trivial next step. Replicating the public filesystem of Drupal from production on your local machine is quite a frequent need for development teams.
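
As a sketch of that next step: once your ~/.ssh/config contains the ProxyCommand entry for instance IDs described in the AWS documentation on SSH connections through Session Manager, pulling the public files is a plain rsync call. The function name, the ec2-user login, the remote path and the instance ID in the usage line are assumptions.

```shell
# Pull Drupal's public files from an EB instance to a local directory.
# RSYNC can be overridden, e.g. RSYNC="echo rsync" for a dry run.
pull_public_files() {
  local instance_id="$1"   # EC2 instance ID of one of the web instances
  local local_dir="$2"     # e.g. web/sites/default/files
  local remote_dir="/var/app/current/web/sites/default/files"

  # The trailing slashes make rsync copy the directory contents,
  # following the symlink if the files live on a shared mount.
  ${RSYNC:-rsync} -az -e ssh "ec2-user@$instance_id:$remote_dir/" "$local_dir/"
}

# Usage: pull_public_files i-0123456789abcdef0 web/sites/default/files
```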

Pain Points

There are some weaknesses of the platform that we discovered along the way.

The very first pain point that we faced is the SSL certificate handling. There is no Let’s Encrypt. If you have a Route 53-hosted domain, the Load Balancer can give you a certificate for free. If you use CloudFlare, then mixed-mode SSL works well. However, we’d really expect out-of-the-box certificates for the auto-generated domain names too. We spent more time with certificates and HTTP config than anticipated.

Another pain point is that a major PHP version upgrade cannot be done in-place. For an application backed with test coverage, we wish ElasticBeanstalk were more flexible here: if automated tests pass under PHP 8, it should allow us to just re-configure an environment to use PHP 8. However, that’s not possible. Instead, we need to spin up a completely new environment, test the PHP 8 upgrade there, do the same for the live environment, and when it’s ready, re-configure our live domain to point to the new environment. Eventually, we should delete the old environment. It’s not great, as it requires us to manually sync the SQL database and the Drupal (public) files.

Icing on the Cake

ElasticBeanstalk allows you to automatically patch the web stack.
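
As an illustration, managed platform updates can be switched on with the AWS CLI through the aws:elasticbeanstalk:managedactions option namespaces. The sketch below is an assumption about how one might script it, not a description of our setup: the function name, the environment name and the maintenance window are made up, and managed updates have prerequisites of their own (such as enhanced health reporting) that are not covered here.

```shell
# Enable weekly managed platform updates for an environment.
# AWS can be overridden, e.g. AWS="echo aws" for a dry run.
enable_managed_updates() {
  local env_name="$1"                 # e.g. client-qa
  local start_time="${2:-Tue:03:00}"  # weekly window start, day:hour:minute in UTC
  local ns="aws:elasticbeanstalk:managedactions"

  ${AWS:-aws} elasticbeanstalk update-environment \
    --environment-name "$env_name" \
    --option-settings \
      "Namespace=${ns},OptionName=ManagedActionsEnabled,Value=true" \
      "Namespace=${ns},OptionName=PreferredStartTime,Value=${start_time}" \
      "Namespace=${ns}:platformupdate,OptionName=UpdateLevel,Value=minor"
}

# Usage: enable_managed_updates client-qa "Sun:02:00"
```

With UpdateLevel set to minor, both patch and minor platform versions are applied; patch would restrict the updates to patch versions only.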

At the very first offline post-covid Drupal meet-up in Budapest, Drupal security updates were discussed amongst many other topics, and many people from various agencies had interesting stories of patching tons of Drupal sites after business hours, with some pizza and some encouragement from the management. Operations should not be about excitement or surprises. EB cannot defend you from the next Drupalgeddon, but you can be protected from the next major Apache / PHP / Linux vulnerability that will hit the headlines in the coming months.

Conclusion

From a bird’s eye view, hosting Drupal on EB is very similar to container-based hosting. But instead of ephemeral containers, there are short-lived, good old virtual private servers, so the devops work should follow similar principles. However, in the long run, it’s a big win that under the layer of Drupal, every single bit of the software and hardware stack is fully managed. The rigid nature of the stack also makes it possible for AWS support to help your devops efforts in a meaningful way. All in all, for the cases where we had to go with an AWS solution, EB has been working for us.

Áron Novák