Skip to content

Conversation

@joe4dev
Copy link
Member

@joe4dev joe4dev commented Apr 3, 2024

Motivation

Event filtering is insufficiently tested and implemented in too many places in LocalStack 😢
Before we can unify all existing implementations, we need a robust AWS-validated test suite.

Existing implementations in LocalStack:

  1. [localstack] test_event_pattern in the EventBridge provider implements a simplistic check whether a key is present in ~1 LOC.
  2. [localstack] filter_event (inner method!) in the EventBridge provider implements the Amazon EventBridge event patterns in ~200 LOC. The implementation is confusing (e.g., operators are implemented multiple times), hard to maintain, and has several refactoring TODOs.
  3. [moto-ext] _does_event_match within the EventPattern models of moto-ext implements the same Amazon EventBridge event patterns in ~50 LOC. The implementation is short and concise (e.g., mapping operators to functions < => lt) but incomplete. LocalStack patches this implementation in the EventBridge provider. Hence, it is not used within LocalStack.
  4. [localstack-ext] filter_event in the Pipes utils is a copy/pasted implementation from EventBridge. It was temporarily copied because the EventBridge implementation was not re-usable but will be unified within this PR. It implements Amazon EventBridge Pipes filtering, which appears to be the same implementation as for EventBridge.
  5. [localstack] filter_stream_record in the Lambda Event Source Mapping (ESM) utils is ~110 LOC. It implements the Lambda event filtering, which appears to be the same implementation as for Pipes and EventBridge. We know from other sources that ESM and Pipes use the same infrastructure at AWS.
  6. [localstack] _evaluate_nested_filter_policy_on_dict within the "SubscriptionFilter" of SNS is ~130 LOC. It implements Amazon SNS subscription filter policies, which appear very similar to the EventBridge implementation. However, this implementation is not identical (e.g., does not support the wildcard operator).
  7. [moto-ext] _compute_body_checks within the SNS utils in moto-ext implements Amazon SNS subscription filter policies is ~340 LOC. In addition to the matching functionality, moto-ext also implements the filter validation using a complexity calculation in _validate_filter_policy. These extra ~170 LOC also cover several exception cases. According to @bentsku, our moto-ext implementation also got backported to upstream moto.

AWS recently (2024-02-01) open-sourced https://github.com/aws/event-ruler. This rule engine implements the above filtering patterns. It is written in Java in ~6k LOC. Their blog post mentions that the internals are built on top of finite state machines, which can be updated dynamically with new rules.

Changes

  • Make test_event_pattern in the event provider use our EventBridge filtering implementation
  • Add a suite of ~70 aws-validated tests based on the documentation of the filtering patterns. The tests mainly based on the original Amazon EventBridge event patterns but also inspired from the open source event ruler and discovered edge cases. The test suite has a good initial operator coverage but misses further edge cases and exception cases.
  • Extract the implementation from the EventBridge provider into a separate utils file
  • Refactor (e.g., filter_event => matches_event) and fix several issues in the EventBridge filter_events method and its helpers (e.g., improved handling of numeric operators and improved list intersections handling)
  • Refactor (e.g., filter_stream_record => does_match_event) and fix several issues (e.g., operator casing, numeric conditions, raising exceptions)

Testing

Within localstack/services/events/provider.py, test_event_pattern can be used to invoke different implementation within LocalStack.

Discussion

Raising exceptions instead of silently ignoring the error case and returning True/False is a behavior change that can help us to make issues more visible. Background processes should handle such exceptions better (e.g., The SQS ESM implementation just drops the entire batch upon unhandled exceptions). At AWS, more exceptions are probably caught earlier because the rule engine can validate rules before actually doing a matching.

@joe4dev joe4dev added the semver: patch Non-breaking changes which can be included in patch releases label Apr 3, 2024
@joe4dev joe4dev self-assigned this Apr 3, 2024
@joe4dev joe4dev force-pushed the add-event-filtering-test-suite branch from 22b67cf to bb6614a Compare April 4, 2024 08:33
@github-actions
Copy link

github-actions bot commented Apr 4, 2024

LocalStack Community integration with Pro

    2 files      2 suites   1h 31m 18s ⏱️
2 831 tests 2 547 ✅ 284 💤 0 ❌
2 833 runs  2 547 ✅ 286 💤 0 ❌

Results for commit bb6614a.

@joe4dev joe4dev marked this pull request as ready for review April 4, 2024 09:34
# and exception handling.
@pytest.mark.skipif(is_v2_provider(), reason="V2 provider does not support this feature yet")
@pytest.mark.parametrize(
"request_template,label", request_template_tuples, ids=[t[1] for t in request_template_tuples]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably debatable whether it's a good idea to parameterize tests based on directory contents. It makes it very easy to add new test cases but add a small delay to test collection because it needs to list the files in the directory 🤷

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine for tests like this :)

"test2": [{"anything-but": "test2"}],
"test3": [{"anything-but": ["test3", "test33"]}],
"test4": [{"anything-but": {"prefix": "test4"}}],
"ip": [{"cidr": "10.102.1.0/24"}],
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That IP check never worked but it got silently ignored 😬 (just double-checked by running the test suite against the old implementations before the refactorings and fixes)

@joe4dev joe4dev requested a review from bentsku April 4, 2024 16:38
Copy link
Member

@dfangl dfangl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good test and a nice refacotring, looking forward to this being unified all over the place :) I also prefer the exceptions over implicit failing/succeeding of the match.

# and exception handling.
@pytest.mark.skipif(is_v2_provider(), reason="V2 provider does not support this feature yet")
@pytest.mark.parametrize(
"request_template,label", request_template_tuples, ids=[t[1] for t in request_template_tuples]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine for tests like this :)

event = request_template["Event"]
event_pattern = request_template["EventPattern"]

if label.endswith("_EXC"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was your reasoning here for encoding this in the file name, instead of having another field in the json5 file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. It kinda evolved because I wanted to have an explicit way to express the intent of the test. The surprise effect helps to write better test cases. It could as well be a field in the the json5 file though.

Not really a strong reason, some thoughts:

  • The intention of the test is easy to see in the file explorer.
  • It currently matches the TestEventPattern API. Maybe, it would be easier to copy/paste and test in the AWS console.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially built a small templating engine supporting variables but noticed we can simplify and don't care about any dynamic values 🤦 So the entire request "templating" part comes from there 😅

Comment on lines +250 to +251
# EventBridge apparently converts some fields, for example: Source=>source, DetailType=>detail-type
# but the actual pattern matching is case-sensitive by key!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fun

)

# Validate the test intention: The _NEG suffix indicates negative tests (i.e., a pattern not matching the event)
if label.endswith("_NEG"):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be useful in the future to snapshot the result together with a hash of the request because currently, it's possible to modify the test case after being AWS-validated; kinda proper testing police ;)

@joe4dev joe4dev merged commit 389d942 into master Apr 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

semver: patch Non-breaking changes which can be included in patch releases

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants