add twisted gateway server #9834

thrau · 2023-12-10T23:05:42Z

Motivation

We've had several performance and concurrency issues related to hypercorn in the past.

Shutdown of localstack blocking due to Hypercorn workers
h11 keep-alive is slowing down HTTP clients #6557
@bentsku has reported issues in some benchmarks where hypercorn sometimes blocks on the 1000th request (which seems to have something to do with the number of workers), resulting in timeouts and errors
we've seen in add option to serve gateway through werkzeug #10171 that sync io can outperform hypercorn (likely due to the sync/async bridge code)

Twisted is one of the few frameworks that satisfy our peculiar requirements of HTTP/2 and WebSocket support, while also not relying on asyncio. This PR adds a way to serve our Gateway through twisted, thereby completely forgoing our asgi/wsgi bridge. Overall we have seen that this leads to around a 2x throughput improvement on average, or more in some cases (@bentsku did some amazing benchmarks already!).

A few peculiarities:

Since twisted uses its own TLS implementation, we cannot use our DuplexSocket implementation. I implemented this as a custom twisted protocol.
Retaining header casing over their WSGI abstractions ended up being a bit hacky
camelCase is weird, i used camelCase where i extended twisted constructs to be consistent
twisted doesn't seem to have a good notion of resource cleanup, so shutting down the reactor (the main event loop) doesn't seem to properly shut down things like thread pools that serve requests
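
To illustrate the first point: serving HTTP and HTTPS on the same port generally comes down to peeking at the first bytes of a new connection and checking whether they look like a TLS handshake record. The following is a minimal, framework-free sketch of that check only. `looks_like_tls` and `pick_protocol` are hypothetical names, and this is not the custom twisted protocol from this PR.

```python
# Hypothetical sketch: decide between TLS and plain HTTP from the first bytes
# of a connection. Not the actual protocol implementation from this PR.

TLS_HANDSHAKE_RECORD = 0x16  # TLS record content type "handshake"
TLS_MAJOR_VERSION = 0x03     # SSL 3.0 and all TLS versions use major version 3


def looks_like_tls(first_bytes: bytes) -> bool:
    """Return True if the peeked bytes look like the start of a TLS handshake record."""
    if len(first_bytes) < 2:
        return False
    return first_bytes[0] == TLS_HANDSHAKE_RECORD and first_bytes[1] == TLS_MAJOR_VERSION


def pick_protocol(first_bytes: bytes) -> str:
    """Map the peeked bytes to the protocol that should handle the connection."""
    return "https" if looks_like_tls(first_bytes) else "http"
```

A real implementation would then hand the connection, including the already-read bytes, to either a TLS-wrapping or a plain HTTP handler.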

Changes

LocalStack now supports GATEWAY_SERVER=twisted
Connection: close header is now only forced when GATEWAY_SERVER == "hypercorn"
the image is now around 10MB (uncompressed) larger due to the twisted dependency
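
As a rough illustration of the first two changes, the sketch below reads `GATEWAY_SERVER` and forces the `Connection: close` header only for hypercorn. The function names are hypothetical and not taken from the LocalStack code base; the fallback to `"hypercorn"` mirrors the default this PR ends with.

```python
# Hypothetical sketch of the behaviour described above; function names are
# illustrative and not taken from the LocalStack code base.
import os


def selected_gateway_server(env=None) -> str:
    """Read GATEWAY_SERVER from the given mapping (os.environ by default)."""
    env = os.environ if env is None else env
    return env.get("GATEWAY_SERVER", "hypercorn")


def apply_connection_policy(headers: dict, gateway_server: str) -> dict:
    """Return a copy of the headers, forcing `Connection: close` only for hypercorn."""
    result = dict(headers)
    if gateway_server == "hypercorn":
        result["Connection"] = "close"
    return result
```

With any other server the headers pass through untouched, which lets clients reuse connections.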

Benchmarks

Here are some selected benchmarks that @bentsku ran. More can be found in notion

SNS + SQS (only the publish call, without receiving the messages from SQS)

Twisted:   1000 requests took 4.9747 s = 4.9747 ms/op = 201.02 req/sec = 201.02 item/sec
Hypercorn: 1000 requests took 6.9016 s = 6.9016 ms/op = 144.89 req/sec = 144.89 item/sec

1000 DDB BatchGetItem 25

Twisted:   1000 requests took 2.3743 s = 2.3743 ms/op = 421.17 req/sec = 10529.34 item/sec
Hypercorn: 1000 requests took 3.2308 s = 3.2308 ms/op = 309.52 req/sec = 7737.98 item/sec

S3 Query objects: concurrency of 1, obj. size of 1000 (massive speed up due to the removal of the close header, allowing connection reuse)
```
Twisted:   Average: 1.54 MiB/s, 1620.03 obj/s
Hypercorn: Average: 0.21 MiB/s, 219.67 obj/s
```
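
To put the numbers above side by side, here is a small sketch that computes the relative throughput of twisted over hypercorn for each run. The figures are copied from the output above; the dictionary keys and the `speedup` helper are illustrative names.

```python
# Throughput figures copied from the benchmark output above.
RESULTS = {
    "SNS publish (req/sec)": (201.02, 144.89),
    "DDB BatchGetItem (req/sec)": (421.17, 309.52),
    "S3 query objects (obj/s)": (1620.03, 219.67),
}


def speedup(twisted: float, hypercorn: float) -> float:
    """Throughput of twisted relative to hypercorn."""
    return twisted / hypercorn


SPEEDUPS = {name: speedup(t, h) for name, (t, h) in RESULTS.items()}

for name, factor in SPEEDUPS.items():
    print(f"{name}: {factor:.2f}x")
```

For these three runs the ratios come out at roughly 1.4x, 1.4x and 7.4x respectively.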

TODO

What's left to do:

Duplex SSL socket
~~Websockets~~ going to do websockets in a second iteration
get serverless tests to work
deal with dependency twisted being 30 MB https://github.com/localstack/twisted-distribution (it's now 9MB)
clean up and document PR
merge fix functhread daemon status and tmp_thread cleanup #10404 and rebase
set default GATEWAY_SERVER back to hypercorn after review

github-actions · 2024-03-02T20:35:58Z

S3 Image Test Results (AMD64 / ARM64)

2 files 2 suites 3m 17s ⏱️
393 tests 342 ✅ 51 💤 0 ❌
786 runs 684 ✅ 102 💤 0 ❌

Results for commit 1e99f51.

♻️ This comment has been updated with latest results.

coveralls · 2024-03-02T21:13:23Z

coverage: 85.887% (+0.006%) from 85.881%
when pulling ec1f1db on twisted
into 87f747f on master.

github-actions · 2024-03-02T21:28:32Z

LocalStack Community integration with Pro

2 files ±0 2 suites ±0 1h 33m 0s ⏱️ + 5m 36s
2 692 tests - 1 2 436 ✅ - 1 256 💤 ±0 0 ❌ ±0
2 694 runs - 1 2 436 ✅ - 1 258 💤 ±0 0 ❌ ±0

Results for commit ec1f1db. ± Comparison against base commit 87f747f.

This pull request removes 2 and adds 1 tests. Note that renamed tests count towards both.

tests.aws.services.s3.test_s3.TestS3PresignedUrl ‑ test_s3_get_response_case_sensitive_headers[False]
tests.aws.services.s3.test_s3.TestS3PresignedUrl ‑ test_s3_get_response_case_sensitive_headers[True]

tests.aws.services.s3.test_s3.TestS3PresignedUrl ‑ test_s3_get_response_case_sensitive_headers

♻️ This comment has been updated with latest results.


bentsku

LGTM! This looks nice, and the gain in throughput looks really good. Also, from my benchmarks when using concurrency, running as many requests as we could with 25 client threads, Hypercorn would time out for 9 requests and slow down everything; this issue is not present with Twisted, boosting the throughput by x2. With no concurrency (client thread = 1), the throughput is x8 for S3 (!!!) because of the close header fix I believe?

Anyway, this looks pretty good, thanks a lot for digging into this and all those fixes to make it work with our Gateway.

I just have a general comment about it, as I was looking into how to add a new server implementation myself:
For Hypercorn, it seems we have all of the actual implementation in 2 places: aws.serving.hypercorn, and http.hypercorn.

Also, for example in our test_gateway.py, we directly use HypercornServer. For Cloudfront, we still use it as well, and for the EC2 metadata instance as well.

Should we centralize it somehow? It starts to feel a bit spread out to me and I'm a bit confused about it. Maybe I'm lacking context on it?

I think most of the implementation of the twisted server is in serving.twisted, but for werkzeug it's split between aws.serving.werkzeug and directly in aws.serving.edge.
And for Hypercorn it's in http.hypercorn.

Maybe something to do in a follow up PR? What do you think?

localstack/aws/handlers/legacy.py

localstack/aws/serving/twisted.py

tests/aws/services/s3/test_s3.py

thrau · 2024-03-06T23:38:27Z

thanks for the review ben!

> Maybe something to do in a follow up PR? What do you think?

you raised some great points that i noticed before as well

localstack-agnostic serving code should probably be separated and moved to rolo
GatewayServer inheriting from HypercornServer is limiting us in replacing the underlying server technology. the GatewayServer should probably have the parameterization that currently serve_gateway has.
we also still run a lot of tests against hypercorn servers, again because the server tech is hardcoded in many areas

definitely something for one or more follow-up PRs!
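
A rough sketch of what the second point could look like: the gateway server owns a backend selected by name instead of inheriting from one concrete server class. Everything here is hypothetical (`ServerBackend`, `RecordingBackend`, `BACKENDS`, `PluggableGatewayServer`); none of it mirrors the real LocalStack or rolo API.

```python
# Hypothetical sketch: none of these classes mirror the real LocalStack or rolo API.
from typing import Callable, Dict, List, Protocol


class ServerBackend(Protocol):
    """Minimal lifecycle interface a server technology would have to provide."""

    def start(self) -> None: ...

    def shutdown(self) -> None: ...


class RecordingBackend:
    """Stand-in backend that only records lifecycle calls."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.events: List[str] = []

    def start(self) -> None:
        self.events.append("start")

    def shutdown(self) -> None:
        self.events.append("shutdown")


# Registry of backend factories, keyed by the value of a setting like GATEWAY_SERVER.
BACKENDS: Dict[str, Callable[[], RecordingBackend]] = {
    "hypercorn": lambda: RecordingBackend("hypercorn"),
    "twisted": lambda: RecordingBackend("twisted"),
}


class PluggableGatewayServer:
    """Delegates to a backend chosen at construction time (composition, not inheritance)."""

    def __init__(self, backend: str = "hypercorn") -> None:
        self.backend = BACKENDS[backend]()

    def start(self) -> None:
        self.backend.start()

    def shutdown(self) -> None:
        self.backend.shutdown()
```

Tests could then be parameterized over the keys of the registry rather than hardcoding one server technology.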

alexrashed

Wow, this is really an awesome change! The server is well abstracted, and the performance tests are promising a huge performance bump! 📈 💯
This is already looking great, and is imho ready to get into master as an opt-in feature with the current limitations. Especially the websocket integration is currently not being used anyways (but will be super valuable f.e. for the next iteration of API GW). And the discussion above is a perfect opportunity to push rolo even further and increase the separation between localstack and its serving infrastructure.

The coverage of this code is great when considering that this is replacing the one server which handles all connections. 🚀
It seems that there might be a small issue left with a logged KeyError on startup when being integrated with LocalStack Pro (see comment).
Given that we had issues with these integration tests recently (not properly executing the tests against the correct target), I wonder if there might be some test issues in downstream projects? Are there test runs against this PR?

Besides that, I only had two questions on small changes in the tests (concerning the header casing and the connection handling), just to make sure that all assumptions are clear.

tests/aws/services/opensearch/test_opensearch.py

...ons/asl/component/state/state_execution/state_task/service/state_task_service_api_gateway.py

localstack/aws/serving/twisted.py

thrau added this to the Playground milestone Dec 11, 2023

simonrw force-pushed the master branch from b099174 to d48ada8 Compare January 25, 2024 14:56

thrau mentioned this pull request Feb 6, 2024

add option to serve gateway through werkzeug #10171

Merged

thrau mentioned this pull request Feb 24, 2024

add code to copy read stream data into a SpooledTemporaryFile #8101

Closed

thrau force-pushed the twisted branch from 9060400 to 91f57a6 Compare March 2, 2024 20:29

thrau added the semver: minor Non-breaking changes which can be included in minor releases, but not in patch releases label Mar 4, 2024

thrau force-pushed the twisted branch from 5211694 to 052405d Compare March 6, 2024 00:29

thrau modified the milestones: Playground, 3.3 Mar 6, 2024

thrau force-pushed the twisted branch from 54fe282 to c498df1 Compare March 6, 2024 14:46

thrau marked this pull request as ready for review March 6, 2024 15:38

thrau requested review from MEPalma, alexrashed, bentsku, dominikschubert, macnev2013 and silv-io as code owners March 6, 2024 15:38

thrau force-pushed the twisted branch from c498df1 to 04102b2 Compare March 6, 2024 18:00

thrau added 9 commits March 6, 2024 19:01

add twisted dependency

10cb813

add twisted web server

909cd5c

implement HTTP(S) multiplexing

7361330

implement header casing retention

7fbb4a5

set twisted to default server

5356bf0

fix WSGI url scheme

ab26745

avoid setting server header

354a4d6

make chunked requests work

11b4180

fix test_s3_get_response_case_sensitive_headers

17b0128

thrau added 12 commits March 6, 2024 19:01

add h2 protocol negotiation for non-ssl

21415d5

Revert "avoid setting server header"

da0e0ea

This reverts commit 5ada168.

fix SSL context creation

8e5fa4e

fix server header removal in apigw

eed7840

fix opensearch test that uses a plain http client

d802d76

disable websocket integration tests for non-hypercorn servers

3584674

fix http connection closing

a8fa55b

add more docs

0f55d88

another attempt at fixing connection handling

c9d068b

add more graceful thread pool shutdown

ffc0432

add twisted-localstack dependency

69cbeb6

tweak logging

4ae4fa4

thrau force-pushed the twisted branch from 04102b2 to 4ae4fa4 Compare March 6, 2024 18:01

bentsku approved these changes Mar 6, 2024

thrau added 2 commits March 6, 2024 22:28

simplify TLS Multiplexer implementation

eca704a

fix typos

ec1f1db

alexrashed approved these changes Mar 7, 2024

tests/aws/services/opensearch/test_opensearch.py

...ons/asl/component/state/state_execution/state_task/service/state_task_service_api_gateway.py

localstack/aws/serving/twisted.py

thrau added 2 commits March 7, 2024 12:54

code review

b69c8e6

set default gateway server back to hypercorn

1e99f51

thrau merged commit d7331c5 into master Mar 7, 2024

thrau deleted the twisted branch March 7, 2024 12:11

This was referenced Mar 10, 2024

refactor websockets to allow different implementations localstack/rolo#7

Merged

add twisted support localstack/rolo#8

Merged

use rolo twisted gateway integration #10428

Merged

bentsku mentioned this pull request Mar 19, 2024

make twisted default HTTP server in S3 image #10491

Merged

alexrashed mentioned this pull request Apr 22, 2024

switch default gateway server from hypercorn to twisted #10703

Merged

2 tasks
