Make WordPress Core

Opened 9 months ago

Closed 3 weeks ago

Last modified 10 days ago

#63018 closed enhancement (fixed)

Increase styles_inline_size_limit from 20,000 bytes

Reported by: westonruter's profile westonruter Owned by: westonruter's profile westonruter
Milestone: 6.9 Priority: normal
Severity: normal Version: 5.8
Component: Script Loader Keywords: has-patch
Focuses: css, performance Cc:

Description

As shown in #63007, being able to eliminate external stylesheets in favor of inline styles can have a dramatic improvement to page load time. External stylesheets are render-blocking in the HEAD and the Largest Contentful Paint (LCP) metric for textual elements often shows up having a render delay which corresponds to the time it takes to load external stylesheets. Eliminating render blocking stylesheets can cut LCP in half, from poor to good.

We need to analyze the performance tradeoff between inlining a larger amount of CSS to reduce render blocking versus having this CSS already loaded in cache for subsequent page views. In other words, how much of a performance regression will there be for subsequent page views where the browser cache has been primed when all CSS is external?

This is all a follow-up to what was raised by @sabernhardt in #58519:

We would need metrics for loading the first page plus an identical page—after the browser saves the external stylesheet(s)—to compare against the same pages with inlined stylesheets.

These styles are not minified, so the proposed inlining would print hundreds of lines on every page of the site. I thought the 162 lines for Twenty Seventeen's SVG icons was a lot, but inlining both the blocks and colors-dark would add more than 1,000 lines in that theme. I also do not like the extra page weight when browsers currently cache that for the next page view, though I can be convinced this would improve it overall.

And furthermore in #63007 and :

I also requested a performance metrics comparison for loading a second page (in addition the first) because browsers would already have the external file cached.

Note that the impact to an increase to the size of the HTML in the second page load will be lessened by the prefetch introduced by Speculative Loading in #62503.

Related to this:

  • #63012: Bundled themes: Stylesheets should be minified
  • #58519: Inline styles block styles in bundled themes
  • #63007: Bundled themes: Stylesheets for block themes are missing path data for inlining
  • #43258: Output buffer template rendering and add filter for post-processing (e.g. caching, optimization)

Classic themes generally have to enqueue a lot more styles than they actually need because at wp_head they don't know which are actually going to be used on the page. So with the output buffering in #43258, classic themes could opt-in to should_load_separate_core_block_assets by default, where individual block styles rendered in the footer can be moved to the HEAD when processing the output buffer. See also performance#1834 where this could be experimented on as a Performance Lab plugin.

The first step here is to do the benchmarking. Based on the findings, this ticket can either be closed or moved forward to introducing a patch. The benchmarking should find the ideal number of bytes to inline versus to remain external (in cache).

Change History (27)

#1 @westonruter
9 months ago

  • Owner set to westonruter
  • Status changed from new to accepted

This ticket was mentioned in Slack in #core-performance by westonruter. View the logs.


8 months ago

This ticket was mentioned in Slack in #core-performance by westonruter. View the logs.


2 months ago

#4 @westonruter
2 months ago

This was recently discussed in Slack with @aristath. The initial 20KB limit was extremely conservative, and it is acknowledged to be way too low.

In my research, I also found that on a vanilla WP install with Twenty Twenty-Five active, the Navigation block alone exceeds the 20KB threshold, so there is basically no chance that a block theme can be served without render-blocking stylesheets given the current limit for inlining. By increasing the threshold even to just 30KB, all CSS for the Sample Page can be inlined, and this improves LCP by 36% on an emulated Fast 4G connection and 45% on an emulated Slow 4G connection. I don't want to just bump to 30KB and call it good though, so I'm still wanting to do more digging on finding out a more ideal threshold based on data.

This ticket was mentioned in Slack in #core by welcher. View the logs.


7 weeks ago

This ticket was mentioned in Slack in #core-performance by westonruter. View the logs.


6 weeks ago

#7 @westonruter
5 weeks ago

I'm working on the benchmarking now.

#8 @spacedmonkey
5 weeks ago

@westonruter Do you think we can land this before beta 1?

#9 @westonruter
5 weeks ago

I hope to have the data by tonight that will validate or invalidate the assumption that increasing the inline limit will improve performance.

#10 @westonruter
5 weeks ago

I've gathered some initial research via modifications to the benchmark-web-vitals tool as described here: https://github.com/GoogleChromeLabs/wpp-research/pull/201

I've attached a graph above that depicts the uncached LCP-TTFB versus the cached LCP-TTFB as the styles_inline_size_limit increases from 0 to 100,000 bytes. Here is the raw data:

styles_inline_size_limit uncached cached
0 1473.6 129.1
10000 612.0 134.9
20000 563.8 136.7
30000 422.6 147.5
40000 419.0 167.6
50000 414.4 165.4
60000 409.4 179.7
70000 433.1 200.4
80000 434.5 194.8
90000 316.9 217.3
100000 308.5 219.0

There is 87,650 bytes of inlinable CSS on the page. When it is all inlined, then there are zero render-blocking stylesheets present. I tested on trunk (r61008) with the Twenty Twenty-Five theme active.

This data seems to show that increasing the limit from 20K to 50K makes sense.

I'll go ahead and make that initial increase, and we can further tune during beta based on additional benchmarking and testing.

#11 @westonruter
5 weeks ago

  • Resolution set to fixed
  • Status changed from accepted to closed

In 61013:

Script Loader: Increase styles_inline_size_limit from 20K to 50K.

Benchmarking of initial page loads versus repeat page loads shows there is a greater performance improvement to increase inlining versus the benefits of relying on browser cache.

This new limit is expected to be further refined during beta based on additional testing.

Props westonruter, sabernhardt, poena, aristath, spacedmonkey, adamsilverstein, jonoaldersonwp, peterwilsoncc.
See #63007.
Fixes #63018.

This ticket was mentioned in Slack in #core by westonruter. View the logs.


5 weeks ago

#13 @westonruter
5 weeks ago

I updated the benchmarking script to capture data at 1K step increments to styles_inline_size_limit rather than the 10K step I had used initially (for the sake of expediency). I also added more blocks to the test post so that the total amount of inlinable CSS is over 109KB.

See chart above.

Raw data is available in PR comment. This includes the total amount of inline CSS for the page at each step, as well as the count of external stylesheets.

This data seems to re-affirm that 50KB is a good new default for styles_inline_size_limit.

This ticket was mentioned in Slack in #core-performance by westonruter. View the logs.


5 weeks ago

#15 @westonruter
4 weeks ago

As I just commented on the wpp-research PR, I just realized that I was testing without far-future expiry on the static assets. So the cached results were still involving conditional requests which returned 304 Not Modified. I patched the wordpress-develop environment to ensure far-future expires were present:

  • tools/local-env/default.template

    diff --git a/tools/local-env/default.template b/tools/local-env/default.template
    index 995913fb45..ecce8ea780 100644
    a b server { 
    2525                try_files $uri $uri/ /index.php?$args;
    2626        }
    2727
     28        location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg)$ {
     29                expires 1y;
     30                add_header Cache-Control "public, no-transform";
     31        }
     32
    2833        location ~ \.php$ {
    2934                try_files $uri =404;
    3035                fastcgi_split_path_info ^(.+\.php)(/.+)$;

And I re-ran the results. As seen in increasing-css-inline-size-limit.png, the results are very similar.

#16 @westonruter
4 weeks ago

As I just also commented, I ran the benchmarks for a broadband connection as well. See increasing-css-inline-size-limit-broadband.png.

👉🏻 For broadband as well as with Fast 4G, increasing the styles_inline_size_limit results in a exponential decay (diminishing returns) in the performance improvement for uncached visits whereas for cached visits there is a linear performance degradation.

Last edited 4 weeks ago by westonruter (previous) (diff)

#17 @westonruter
3 weeks ago

Follow up: #64178 (additional inline CSS is pushing the oEmbed discovery links past the 150K peek limit)

Based on re-evaluating the data from my benchmarks above, I suggest we reduce the styles_inline_size_limit to 35K or 40K, instead of bumping the 20K limit all the way to 50K.

Note the graphs above showing the LCP-TTFB for Fast 4G (raw data) and broadband (raw data) from #63018. In the two graphs, the uncached curve starts to flatten out at 40K. The difference between 40K and 50K is not as great as the difference between 30K and 40K.

#19 @westonruter
3 weeks ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

PR to reduce styles_inline_size_limit from 50K to 40K, which is still double the original 20K: https://github.com/WordPress/wordpress-develop/pull/10450

This ticket was mentioned in Slack in #core-performance by westonruter. View the logs.


3 weeks ago

#21 @westonruter
3 weeks ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 61117:

Script Loader: Reduce styles_inline_size_limit down to 40K from 50K.

In [61013] the CSS inline limit was increased from 20K to 50K. However, based on benchmarking the performance improvements of increasing the limit, the relative improvement from 40K to 50K is not as great as from 30K to 40K. The performance improvements start to flatten out beginning at 40K. This 40K is still double the previous limit of 20K. Being more conservative will allow for a more gradual impact to be observed before considering a larger increase. Furthermore, the 50K limit of inline CSS can cause the oEmbed discovery links to be positioned after the first 150K bytes which prevents them from being discovered. See #64178.

Follow-up to [61013].

Props westonruter, mukesh27, szepeviktor.
See #64178.
Fixes #63018.

#22 @westonruter
3 weeks ago

In 61119:

Embeds: Add wp_oembed_add_discovery_links() to run earlier at wp_head priority 4.

This results in the oEmbed discovery links being printed before scripts and styles are printed. This helps ensure they appear within the first 150K bytes of the HTML response, which is required by WP_oEmbed::discover(). With increasing the styles_inline_size_limit from 20K to 40K in #63018, it becomes more probable that the oEmbed discovery links will be pushed out of range.

For backwards compatibility with themes and plugins that disable the oEmbed discovery links by unhooking wp_oembed_add_discovery_links() from running at wp_head priority 10 (even though the oembed_discovery_links filter is available to disable such links), this callback is added a second time to run earlier at priority 4. The first time the function runs, it checks to see if the callback is still hooked at priority 10. If not, the function short circuits. If it is still hooked at priority 10, however, the function then unhooks itself at priority 10 so that it won't run a second time later during the wp_head action, before proceeding with printing the discovery links. A similar back-compat approach was taken previously in [60910]. The back-compat checks are only performed if the function is invoked during the wp_head action, allowing the function to be called "idempotently" elsewhere.

Developed in https://github.com/WordPress/wordpress-develop/pull/10449

Follow-up to [61118], [61117].

Props westonruter, swissspidy.
See #64186, #63018.
Fixes #64178.

#23 @westonruter
10 days ago

Aside: AMP originally had an inline CSS limit of 50KB and it was increased to 75KB. (Nevertheless, AMP page views were typically first time visits.)

Note: See TracTickets for help on using tickets.