Skip to content

Conversation

@thrau
Copy link
Member

@thrau thrau commented Dec 10, 2023

Motivation

We've had several performance and concurrency issues related to hypercorn in the past.

Twisted is one of the few frameworks that satisfies our peculiar requirements of HTTP/2 and WebSocket support, while also not relying on asyncio. This PR adds a way to serve our Gateway through twisted, thereby completely foregoing our asgi/wsgi bridge. Overall we have seen that this leads on average to around 2X throughput improvement, or more in some cases (@bentsku did some amazing benchmarks already!).

A few peculiarities:

  • Since twisted uses its own TLS implementation, we cannot use our DuplexSocket implementation. I implemented this as a custom twisted protocol.
  • Retaining header casing over their WSGI abstractions ended up being a bit hacky
  • camelCase is weird, i used camelCase where i extended twisted constructs to be consistent
  • twisted doesn't seem to have a good notion of resource cleanup, so shutting down the reactor (the main event loop) doesn't seem to properly shut down things like thread pools that serve requests

Changes

  • LocalStack now supports GATEWAY_SERVER=twisted
  • Connection: close header is now only forced when GATEWAY_SERVER == "hypercorn"
  • the image is around ~10MB (uncompressed) larger now due to the twisted dependency

Benchmarks

Here are some selected benchmarks that @bentsku ran. More can be found in notion

  • SNS + SQS: Only publish call, not received the messages from SQS)
    Twisted:   1000 requests took 4.9747 s = 4.9747 ms/op = 201.02 req/sec = 201.02 item/sec
    Hypercorn: 1000 requests took 6.9016 s = 6.9016 ms/op = 144.89 req/sec = 144.89 item/sec
    
  • 1000 DDB BatchGetItem 25
    Twisted:   1000 requests took 2.3743 s = 2.3743 ms/op = 421.17 req/sec = 10529.34 item/sec
    Hypercorn: 1000 requests took 3.2308 s = 3.2308 ms/op = 309.52 req/sec = 7737.98 item/sec
    
  • S3 Query objects: concurrency of 1, obj. size of 1000 (massive speed up due to the removal of the close header, allowing connection reuse)
    Twisted:   Average: 1.54 MiB/s, 1620.03 obj/s
    Hypercorn: Average: 0.21 MiB/s, 219.67 obj/s
    

TODO

What's left to do:

@github-actions
Copy link

github-actions bot commented Mar 2, 2024

S3 Image Test Results (AMD64 / ARM64)

  2 files    2 suites   3m 17s ⏱️
393 tests 342 ✅  51 💤 0 ❌
786 runs  684 ✅ 102 💤 0 ❌

Results for commit 1e99f51.

♻️ This comment has been updated with latest results.

@coveralls
Copy link

coveralls commented Mar 2, 2024

Coverage Status

coverage: 85.887% (+0.006%) from 85.881%
when pulling ec1f1db on twisted
into 87f747f on master.

@github-actions
Copy link

github-actions bot commented Mar 2, 2024

LocalStack Community integration with Pro

    2 files  ±0      2 suites  ±0   1h 33m 0s ⏱️ + 5m 36s
2 692 tests  - 1  2 436 ✅  - 1  256 💤 ±0  0 ❌ ±0 
2 694 runs   - 1  2 436 ✅  - 1  258 💤 ±0  0 ❌ ±0 

Results for commit ec1f1db. ± Comparison against base commit 87f747f.

This pull request removes 2 and adds 1 tests. Note that renamed tests count towards both.
tests.aws.services.s3.test_s3.TestS3PresignedUrl ‑ test_s3_get_response_case_sensitive_headers[False]
tests.aws.services.s3.test_s3.TestS3PresignedUrl ‑ test_s3_get_response_case_sensitive_headers[True]
tests.aws.services.s3.test_s3.TestS3PresignedUrl ‑ test_s3_get_response_case_sensitive_headers

♻️ This comment has been updated with latest results.

@thrau thrau added the semver: minor Non-breaking changes which can be included in minor releases, but not in patch releases label Mar 4, 2024
@thrau thrau modified the milestones: Playground, 3.3 Mar 6, 2024
@thrau thrau marked this pull request as ready for review March 6, 2024 15:38
Copy link
Contributor

@bentsku bentsku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! This looks nice, and the gain in throughput looks really good. Also, from my benchmarks when using concurrency, running as much requests as we could we 25 client threads, Hypercorn would timeout for 9 requests and slow down everything; this issue is not present with Twisted, boosting the throughput by x2. With no concurrency (client thread = 1), the throughput is x8 for S3 (!!!) because of the close header fix I believe?

Anyway, this looks pretty good, thanks a lot for digging into this and all those fix to make it work with our Gateway.

I just have a general comment about it, as I was looking into how to add a new server implementation myself:
For Hypercorn, it seems we have all of the actual implementation in 2 places: aws.serving.hypercorn, and http.hypercorn.

Also, for example in our test_gateway.py, we directly use HypercornServer. For Cloudfront, we still use it as well, and for the EC2 metadata instance as well.

Should we centralize it somehow? It starts to feel a bit spread out to me and I'm a bit confused about it. Maybe I'm lacking context on it?

I think most of the implementation of the twisted server is in serving.twisted, but for werkzeug it's split between aws.serving.werkzeug and directly in aws.serving.edge.
And for Hypercorn it's in http.hypercorn.

Maybe something to do in a follow up PR? What do you think?

@thrau
Copy link
Member Author

thrau commented Mar 6, 2024

thanks for the review ben!

Maybe something to do in a follow up PR? What do you think?

you raised some great points that i noticed before as well

  • localstack-agnostic serving code should probably be separated and moved to rolo
  • GatewayServer inheriting from HypercornServer is limiting us in replacing the underlying server technology. the GatewayServer should probably have the parameterization that currently serve_gateway has.
  • we also still run a lot of tests against hypercorn servers, again because the server tech is hardcoded in many areas

definitely something for one or more follow-up PRs!

Copy link
Member

@alexrashed alexrashed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this is really an awesome change! The server is well abstracted, and the performance tests are promising a huge performance bump! 📈 💯
This is already looking great, and is imho ready to get into master as an opt-in feature with the current limitations. Especially the websocket integration is currently not being used anyways (but will be super valuable f.e. for the next iteration of API GW). And the discussion above is a perfect opportunity to push rolo even further and increase the separation between localstack and its serving infrastructure.

The coverage of this code is great when considering that this is replacing the one server which handles all connections. 🚀
It seems that there might a small issue left with a logged KeyError on startup when being integrated with LocalStack Pro (see comment).
Given that we had issues with these integration tests recently (not properly executing the tests against the correct target), I wonder if there might be some test issues in downstream projects? Are there test runs against this PR?

Besides that, I only had two questions on small changes in the tests (concerning the header casing and the connection handling), just to make sure that all assumptions are clear.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

semver: minor Non-breaking changes which can be included in minor releases, but not in patch releases

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants