Performance tests

Last updated on
9 April 2025


For quick setup instructions, jump straight to the Quickstart section below.

Introduction

What is Gander

Gander is an open source automated performance testing framework that has been part of Drupal core since 10.2. It allows us to monitor performance over time and ensure that performance regressions aren't unknowingly reintroduced. Gander isn't limited to core development, so anyone can start using it today!

Learn more about the motivation behind it, its history and origins.

Why should I use it

Performance is important, but often overlooked. Experience shows that many organizations neglect the performance aspects of their projects until problems pile up to the point where they become impossible to ignore. At that stage it is often very difficult to reverse the trend, due to accumulated technical debt and cultural factors, and it can take months or even years of effort to get the performance of a project back on track.

Yet performance has a significant and direct impact on business outcomes. Research shows that even minor performance regressions turn users away, directly resulting in narrower reach and a negative impact on the bottom line.

We already have tools and metrics to reliably measure the end user performance experience. Core Web Vitals are becoming an industry standard, and anyone can check how their website is performing using tools like PageSpeed Insights and Lighthouse. The main disadvantage of these tools is that they are used retroactively. We would prefer a tool that detects performance regressions before they reach end users.

Gander lets us do exactly that. By incorporating it into existing CI/CD pipelines we ensure that it runs and regularly checks the performance of a project during the development cycle. Depending on how it is used, it lets us monitor metrics over time and compare them to a baseline (does change X perform better or worse than the current release, how do releases perform against each other, …), or even fail the automated test suite as soon as one of the important metrics exceeds a threshold that we consider satisfactory.

This allows us to ensure that the performance of the project improves (or at least does not deteriorate) over time, and to fail outright when major issues are about to sneak in unnoticed.

How it can be used

A Gander performance test is an extension of a standard functional test that additionally collects performance metrics during the page lifecycle.

See the Drupal\Tests\PerformanceData class for the most up-to-date list of metrics and Drupal\Tests\PerformanceTestTrait for where they are collected.

Once the metrics are collected, the test sends them to a Grafana dashboard, where we visualize them in a way that gives us insight into the metrics over time (for a given branch, for example) and lets us compare them across features or branches. The dashboard for Drupal core is publicly available for anyone to inspect.

Gander Grafana dashboard

Additionally, we can use the collected performance data to assert on metrics. For example, we can ensure that the number of database queries is exactly what we expect it to be. If we have such a test in place and somebody introduces a change that alters the number of queries during the page build, our test will fail and the change won't sneak in unnoticed. Similar assertions can be made on all other metrics that are collected.

Drupal core already does this, and we started seeing improvements in awareness, performance consideration and prevention of unwanted changes being merged almost immediately after the first such tests became part of Drupal's test suite.

The two ways of using Gander are not mutually exclusive or dependent on each other in any way. We can use either method individually, or both, depending on what suits the needs of the organization.

How to contribute

Gander is entirely open source and relies on a community of volunteers. If you find Gander useful but require additional features or metrics, or you find a bug, we encourage you to collaborate with the community.

Gander is not a single project but a collection of code, configuration, recipes and documentation spread across multiple projects:

Quickstart

By far the simplest and quickest way to get started with Gander is to use the DDEV add-on, which will automatically add all necessary services to your local environment (Grafana, OpenTelemetry collector, Grafana Tempo, Prometheus).

In order to start using the add-on first ensure that all the prerequisites are met: 

If you already have a working local environment that is able to run functional tests, skip this step. Otherwise, this is the easiest way to create one:

  • composer create-project drupal/recommended-project gander
  • cd gander
  • composer require --dev drupal/core-dev
  • ddev config
  • ddev get ddev/ddev-selenium-standalone-chrome

If you are using Gander on Drupal <= 10.2 you will need to use ddev/ddev-selenium-standalone-chrome 1.0.4. Fix the version with --version=1.0.4 when adding the Selenium addon.
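For example, pinning the Selenium add-on to that release looks like this (using the same `ddev get` command as in the steps above):

```shell
ddev get ddev/ddev-selenium-standalone-chrome --version=1.0.4
```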

Now you can start running Gander tests: 

  • ddev get tag1consulting/ddev-gander
  • ddev restart
  • ddev ssh
  • cd web/
  • To run a single test:
    ../vendor/bin/phpunit -c core core/profiles/demo_umami/tests/src/FunctionalJavascript/OpenTelemetryAuthenticatedPerformanceTest.php

  • You might need to run the test a few times for the data to start appearing on the dashboard; usually three times is enough.

  • Check the Grafana dashboard via: http://<project_name>.ddev.site:3000/
  • To run all tests that send data to the dashboard:
    ../vendor/bin/phpunit -c core --group OpenTelemetry

Performance assertions

Performance assertions enable us to put performance requirements into existing tests. This ensures that any performance regressions won’t sneak into the codebase unnoticed.

We recommend adding performance assertions to all tests to ensure that basic metrics such as number of database queries or cache requests will stay in the acceptable range. We also recommend writing tests for any performance regression fixes, which will ensure that they won’t be re-introduced.

In order to start using this functionality you need to be on Drupal core 10.2 or later and be able to run functional JavaScript tests on Drupal.

Gander tests are an extension of functional JavaScript tests (any test class that extends the \Drupal\FunctionalJavascriptTests\WebDriverTestBase base class). To convert such a test into a Gander performance test, extend \Drupal\FunctionalJavascriptTests\PerformanceTestBase instead. Changing the base class alone doesn't alter the test's behavior, and the test should continue to run normally. If you are extending a non-performance test base class that you don't have control over, or you have a custom test base class, you can use Drupal\Tests\PerformanceTestTrait directly; just adapt how it is used in PerformanceTestBase.
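As a sketch, converting an existing functional JavaScript test is mostly a matter of swapping the base class (the class and module names here are hypothetical):

```php
<?php

namespace Drupal\Tests\my_module\FunctionalJavascript;

// Before: use Drupal\FunctionalJavascriptTests\WebDriverTestBase;
use Drupal\FunctionalJavascriptTests\PerformanceTestBase;

// Extending PerformanceTestBase instead of WebDriverTestBase is
// enough; existing test methods keep working unchanged.
class MyPagePerformanceTest extends PerformanceTestBase {

  protected static $modules = ['node'];
  protected $defaultTheme = 'stark';

}
```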

Gander doesn't collect performance metrics for just any request made inside the test. This separation allows us to prepare the test so that we always collect metrics in a predictable environment. Examples of preparation steps are creating users and content, warming the cache, …
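Such preparation can happen at the start of the test method, before any metrics are collected. A minimal sketch (the route and permissions are illustrative):

```php
<?php
// Prepare a predictable environment first; none of these requests
// are measured, because they happen outside collectPerformanceData().
$user = $this->drupalCreateUser(['access content']);
$this->drupalLogin($user);
// Warm caches by visiting the page once before measuring.
$this->drupalGet('node/1');
```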

When we are ready to collect the metrics, we make the test requests inside the PerformanceTestBase::collectPerformanceData() callback:

$performance_data = $this->collectPerformanceData(function () {
  $this->drupalGet('node/1');
});

$performance_data will include the metrics that Gander collected during the requests run inside the closure. We can make standard test assertions on these:

$this->assertSame(41, $performance_data->getQueryCount());
$this->assertSame(81, $performance_data->getCacheGetCount());
$this->assertSame(16, $performance_data->getCacheSetCount());
$this->assertSame(0, $performance_data->getCacheDeleteCount());

$this->assertSame(2, $performance_data->getStylesheetCount());
$this->assertSame(1, $performance_data->getScriptCount());
// or
$this->assertNoJavaScript($performance_data);

…

Common test strategies are to require a certain number of database queries on the page, require only cache gets (and no sets or deletes) on pages with hot caches, require a known number of CSS and/or JavaScript files, …
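Following these strategies, a hot-cache assertion might look like this (the asset counts are illustrative, not canonical):

```php
<?php
$performance_data = $this->collectPerformanceData(function () {
  $this->drupalGet('node/1');
});
// With hot caches we expect cache reads only: no sets, no deletes.
$this->assertSame(0, $performance_data->getCacheSetCount());
$this->assertSame(0, $performance_data->getCacheDeleteCount());
// And a known, fixed set of front-end assets.
$this->assertSame(2, $performance_data->getStylesheetCount());
$this->assertSame(1, $performance_data->getScriptCount());
```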

Check the following core tests for more examples and ideas:

Drupal core tags all performance tests that only assert on metrics (and send no data to the Grafana dashboard) with @group Performance, which makes it convenient to find more examples in the core codebase. 

We recommend only asserting on metrics that produce deterministic results; this prevents false positives. Examples of such metrics are database queries, cache requests, and stylesheet and script counts. If there is a small variation in results, use assertGreaterThanOrEqual() and assertLessThanOrEqual() on the narrowest range possible.
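For a metric with small variation, a narrow-range assertion looks like this (the bounds are illustrative, and $performance_data is assumed to have been collected as above):

```php
<?php
// Allow a tight range instead of an exact count.
$this->assertGreaterThanOrEqual(40, $performance_data->getQueryCount());
$this->assertLessThanOrEqual(42, $performance_data->getQueryCount());
```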

Metrics that naturally vary a bit across runs are not ideal, because it is impossible to assert on a specific number for them. Examples of such metrics are TTFB, LCP and FCP. In theory it is possible to assert on a range (for example: TTFB should be between 250ms and 300ms, LCP should be between 1500ms and 1800ms), but this should be done with caution: too narrow a range will produce false positives, while too wide a range will let real regressions pass unnoticed.

An alternative approach for these metrics is to use Grafana alerts, which is covered in the next chapter.

Monitoring performance metrics

Besides performance assertions Gander allows us to monitor performance metrics across longer periods of time. This is achieved by integrating Gander with OpenTelemetry, which allows us to send collected performance data into visualization tools such as Grafana. While we do use Grafana as a default presentation tool, it is worth mentioning that it is possible to use any other tool that speaks OpenTelemetry.

This functionality allows us to monitor our application’s performance metrics over time and spot any discrepancies quickly, ideally before the regressions end up on production. We suggest running these types of tests on a regular basis and monitoring the dashboard before tagging and deploying new versions of the application.

When discrepancies are identified it is easy to research the causes as Gander also sends detailed traces to the dashboard.

This has already helped us identify and fix numerous performance issues in Drupal core, like this one that shortened every test suite run time by 10%, and a few issues that improved the performance of the log-in process.

Drupal core currently runs this type of test (tagged with @group OpenTelemetry) every two hours for all active branches (10.2.x and 11.x at the time of writing). Results are published on the publicly accessible Grafana dashboard, which uses publicly available Grafana configuration that can be used as a starting point.

In order to start using this functionality you need to be on Drupal core 10.2 or later and be able to run functional JavaScript tests on Drupal. You will also need a Grafana/Tempo/OpenTelemetry stack hosted somewhere. You can self-host it or use Grafana Cloud (the free tier should be sufficient for most use cases). For local testing we suggest using the DDEV add-on that we provide.

In order for metrics and traces to be sent to the dashboard, we need to tell Gander where to find the OpenTelemetry collector endpoint. This is done by setting the OTEL_COLLECTOR environment variable (example from the DDEV add-on: OTEL_COLLECTOR=http://otel-collector:4318/v1/traces).
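For example, in a shell session (the endpoint URL is the one from the DDEV add-on; adjust it for your own collector):

```shell
# Tell Gander where the OpenTelemetry collector listens.
export OTEL_COLLECTOR=http://otel-collector:4318/v1/traces
echo "$OTEL_COLLECTOR"
```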

Tests that collect metrics for the dashboard are very similar to the tests that only assert on performance metrics. We are still using PerformanceTestBase::collectPerformanceData(), we are still preparing the environment outside of it and we can still assert on performance metrics. The only difference is the second argument to the collectPerformanceData function, which is the name of this test on the dashboard:

$this->collectPerformanceData(function () {
  $this->drupalGet('<front>');
}, 'umamiFrontPageColdCache');

In this specific case we are naming the test “umamiFrontPageColdCache”, which will appear on the dashboard like this:

Cold cache performance test results

On the screenshot we see a graph for this specific test over time on the left, and the list of the latest traces on the right. Clicking a given trace opens its detail page, which allows us to analyze it in detail:

Detailed trace view

In order to get the most meaningful results, and to be able to compare different situations, we suggest running tests with both cold and hot caches. We can collect cold-cache metrics by clearing the caches and collecting the metrics immediately afterwards:

$this->drupalGet('user/login');
$this->rebuildAll();
$this->collectPerformanceData(function () {
  $this->drupalGet('<front>');
}, 'umamiFrontPageColdCache');

In order to use hot caches visit the tested page twice before collecting metrics:

// Request the page twice so that asset aggregates and image
// derivatives are definitely cached in the browser cache. The
// first response builds the file and serves it from PHP with
// private, no-store headers. The second request will get the
// file served directly from disk by the browser with cacheable
// headers, so only the third request actually has the files in
// the browser cache.
$this->drupalGet('<front>');
$this->drupalGet('<front>');
$performance_data = $this->collectPerformanceData(function () {
  $this->drupalGet('<front>');
}, 'umamiFrontPageHotCache');

You can also warm just a subset of caches:

$this->rebuildAll();
// Now visit a different page to warm non-route-specific caches.
$this->drupalGet('/user/login');
$this->collectPerformanceData(function () {
  $this->drupalGet('<front>');
}, 'umamiFrontPageCoolCache');

These examples are taken from the Umami OpenTelemetryFrontPagePerformanceTest. Two other examples in Drupal core are the Umami OpenTelemetryAuthenticatedPerformanceTest and OpenTelemetryNodePagePerformanceTest. Other tests added in the future can be found by searching for the @group OpenTelemetry tag.
