15 changes: 11 additions & 4 deletions docs/worker/README.md
@@ -64,7 +64,14 @@ Major cloud offerings (like AWS lambda) offer a variety of lambda
triggers, such as HTTP requests, queue messages, cron, DB/S3 triggers,
etc.

-This has not been a focus (so far) of OpenLambda. The only trigger is
-an HTTP request. Thus, all the event code is in the
-github.com/open-lambda/open-lambda/ol/worker/server package. Requests
-to http(s)://WORKER_ADDR:PORT/run/LAMBDA_NAME invoke lambdas.
+OpenLambda currently supports two types of triggers: **HTTP requests**
+and **Kafka messages**. The event code is in the
+github.com/open-lambda/open-lambda/ol/worker/event package.
+
+**HTTP triggers:** Requests to http(s)://WORKER_ADDR:PORT/run/LAMBDA_NAME
+invoke lambdas directly.
+
+**Kafka triggers:** Lambdas can be configured to consume from Kafka
+topics. The worker runs Kafka consumers that poll for messages and
+invoke the corresponding lambda function automatically. See
+[kafka-triggers.md](kafka-triggers.md) for details.
136 changes: 136 additions & 0 deletions docs/worker/kafka-triggers.md
@@ -0,0 +1,136 @@
# Kafka Triggers

Lambdas can be configured to automatically consume messages from Kafka
topics. When a message arrives, the worker invokes the lambda with the
message payload as the request body.

## Configuration

Add a `kafka` section under `triggers` in your lambda's `ol.yaml`
(see [lambda configuration](lambda-config.md) for the full `ol.yaml`
reference):

```yaml
triggers:
  kafka:
    - bootstrap_servers:
        - "localhost:9092"
      topics:
        - "my-topic"
      auto_offset_reset: "latest"  # or "earliest"
```

The consumer group ID is automatically set to `lambda-<name>` based on
the lambda name and cannot be overridden. Because each lambda gets its
own group ID, lambdas consume from Kafka independently of one another.
Even if multiple lambdas subscribe to the same topic, each one receives
its own copy of every message, and offset tracking is maintained
separately per lambda.

## How it works

1. When the worker starts in `lambda` mode, it creates a `KafkaManager`
alongside the `LambdaServer`.
2. Kafka triggers are registered via the `/kafka/register/<lambda-name>`
HTTP endpoint (POST to register, DELETE to unregister).
3. For each trigger entry, the manager creates a `LambdaKafkaConsumer`
backed by a [franz-go](https://github.com/twmb/franz-go) (`kgo`)
client.
4. Each consumer runs a polling loop that fetches messages with a
1-second timeout. On receiving a message, it builds a synthetic HTTP
POST request and invokes the lambda directly through the
`LambdaManager`.

## Request format

When a Kafka message triggers a lambda, the worker builds a synthetic
HTTP POST request with the Kafka message value as the body and the
following headers:

| Header | Description |
| ------------------- | ---------------------------------------- |
| `Content-Type` | `application/json` |
| `X-Kafka-Topic` | The topic the message was read from. |
| `X-Kafka-Partition` | The partition number. |
| `X-Kafka-Offset` | The message offset within the partition. |
| `X-Kafka-Group-Id` | The consumer group ID. |
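
For local testing, you can approximate the synthetic request yourself by sending the same headers to the worker's `/run/` endpoint. A minimal sketch, assuming a worker listening at `localhost:5000` (a hypothetical address/port) and a lambda named `my-lambda`; the header values are illustrative only:

```python
import json
import urllib.request

# Mimic the synthetic POST the worker builds from a Kafka record.
# Address, lambda name, and header values here are assumptions.
req = urllib.request.Request(
    "http://localhost:5000/run/my-lambda",
    data=json.dumps({"hello": "kafka"}).encode(),
    headers={
        "Content-Type": "application/json",
        "X-Kafka-Topic": "my-topic",
        "X-Kafka-Partition": "0",
        "X-Kafka-Offset": "42",
        "X-Kafka-Group-Id": "lambda-my-lambda",
    },
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode())
```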

### Accessing Kafka metadata in your handler

The default handler type (`def f(event)`) only receives the JSON-parsed
request body as a dict. It does **not** have access to HTTP headers,
so the Kafka metadata headers listed above will not be available.

To access Kafka metadata headers, use a **WSGI** or **ASGI** entry
point (see [lambda configuration](lambda-config.md) for how to
configure these).

## Example lambdas

Complete working examples are available in the
[examples/](../../examples/) directory:

- [kafka-basic](../../examples/kafka-basic/) — Simple `f(event)` handler
that processes the Kafka message body.
- [kafka-metadata](../../examples/kafka-metadata/) — Flask WSGI handler
that accesses Kafka metadata headers (topic, partition, offset, group
ID) alongside the message body.

### Simple handler (body only)

The default `f(event)` handler receives the Kafka message body as a
parsed dict, but cannot access headers
([full example](../../examples/kafka-basic/)):

```python
def f(event):
    # event is the JSON-parsed Kafka message value
    print(f"Received message: {event}")
    return {"status": "ok"}
```
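
To exercise this handler end to end, publish a JSON message to the configured topic; any Kafka client works. A sketch using kafka-python, an assumed client library that is not part of the example itself:

```python
import json
from kafka import KafkaProducer  # pip install kafka-python (assumed dependency)

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
)

# The worker's consumer for this lambda picks the message up and calls f(event).
producer.send("my-topic", {"hello": "kafka"})
producer.flush()
```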

### WSGI handler (body + headers)

A WSGI handler can access Kafka metadata via the `environ` dict.
HTTP headers are available with an `HTTP_` prefix, uppercased, and
with dashes replaced by underscores
([full example](../../examples/kafka-metadata/)):

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/", methods=["POST"])
def handle():
    topic = request.headers.get("X-Kafka-Topic", "unknown")
    partition = request.headers.get("X-Kafka-Partition", "unknown")
    offset = request.headers.get("X-Kafka-Offset", "unknown")
    group_id = request.headers.get("X-Kafka-Group-Id", "unknown")

    body = request.get_json()

    print(f"topic={topic} partition={partition} offset={offset} group={group_id}")
    print(f"body={body}")

    return {"status": "ok"}
```
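
Equivalently, a framework-free WSGI callable sees these headers in the `environ` dict under the `HTTP_`-prefixed names described above. A minimal sketch (no Flask required):

```python
import json

def app(environ, start_response):
    # Kafka metadata headers arrive with the standard WSGI HTTP_ prefix.
    topic = environ.get("HTTP_X_KAFKA_TOPIC", "unknown")
    offset = environ.get("HTTP_X_KAFKA_OFFSET", "unknown")

    # Read the Kafka message value from the request body.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = json.loads(environ["wsgi.input"].read(length) or b"{}")

    start_response("200 OK", [("Content-Type", "application/json")])
    return [json.dumps({"status": "ok", "topic": topic, "offset": offset}).encode()]
```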

## Management API

The worker exposes an HTTP endpoint for managing Kafka consumers at
runtime:

- **`POST /kafka/register/<lambda-name>`** — Reads the lambda's
`ol.yaml` config from the registry and starts consumers for all
configured Kafka triggers. Any existing consumers for that lambda are
cleaned up first.
- **`DELETE /kafka/register/<lambda-name>`** — Stops and removes all
Kafka consumers for the given lambda.
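
A usage sketch, assuming a worker at `localhost:5000` (hypothetical address/port) and a lambda named `my-lambda`:

```python
import urllib.request

WORKER = "http://localhost:5000"  # assumed worker address

# Start consumers for every Kafka trigger declared in the lambda's ol.yaml.
urllib.request.urlopen(urllib.request.Request(
    f"{WORKER}/kafka/register/my-lambda", method="POST"))

# Later: stop and remove the lambda's Kafka consumers.
urllib.request.urlopen(urllib.request.Request(
    f"{WORKER}/kafka/register/my-lambda", method="DELETE"))
```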

## Shutdown

When the worker receives a shutdown signal (SIGTERM/SIGINT), the Kafka
manager is cleaned up before the lambda server. Each consumer's polling
loop is stopped via its stop channel, and the underlying `kgo` client is
closed.
33 changes: 32 additions & 1 deletion docs/worker/lambda-config.md
@@ -13,6 +13,11 @@ triggers:
  http:
    - method: PUT
    - method: PATCH
  kafka:
    - bootstrap_servers:
        - "localhost:9092"
      topics:
        - "my-topic"

environment:
  MY_ENV_VAR1: "value1"
@@ -22,7 +27,7 @@ environment:
## 3. Configuration Options

### a. Triggers
-OpenLambda only supports HTTP trigger for now, but future development plans include supporting other trigger types.
+OpenLambda currently supports HTTP and Kafka triggers.

#### HTTP Triggers
Defines which HTTP methods can be used to invoke the lambda.
@@ -36,6 +41,32 @@ triggers:
Example:
```yaml
triggers:
  http:
    - method: GET
    - method: POST
```
In this case, the lambda accepts GET and POST requests.

#### Kafka Triggers
Defines Kafka topics the lambda should consume from. When a message
arrives on a configured topic, the lambda is invoked with the message
as the request body.

Example:
```yaml
triggers:
  kafka:
    - bootstrap_servers:
        - "localhost:9092"
      topics:
        - "my-topic"
      auto_offset_reset: "latest"
```

| Field | Type | Required | Description |
| ------------------- | ---------- | -------- | ------------------------------------------------------------------------------------------- |
| `bootstrap_servers` | `[]string` | Yes | List of Kafka broker addresses. |
| `topics` | `[]string` | Yes | Topics this lambda should consume from. |
| `auto_offset_reset` | `string` | No | Where to start reading if no committed offset exists. `"latest"` (default) or `"earliest"`. |

A lambda can define multiple Kafka trigger entries. Each entry creates a
separate consumer. For more details on Kafka triggers, including how to access Kafka
metadata headers in your handler, see [kafka-triggers.md](kafka-triggers.md).

### b. Environment Variables
Defines environment variables that will be available to the lambda function at runtime.

4 changes: 4 additions & 0 deletions examples/kafka-basic/f.py
@@ -0,0 +1,4 @@
def f(event):
    # event is the JSON-parsed Kafka message value
    print(f"Received message: {event}")
    return {"status": "ok"}
9 changes: 9 additions & 0 deletions examples/kafka-basic/ol.yaml
@@ -0,0 +1,9 @@
triggers:
  http:
    - method: POST
  kafka:
    - bootstrap_servers:
        - "localhost:9092"
      topics:
        - "my-topic"
      auto_offset_reset: "latest"
17 changes: 17 additions & 0 deletions examples/kafka-metadata/f.py
@@ -0,0 +1,17 @@
from flask import Flask, request

app = Flask(__name__)

@app.route("/", methods=["POST"])
def handle():
    topic = request.headers.get("X-Kafka-Topic", "unknown")
    partition = request.headers.get("X-Kafka-Partition", "unknown")
    offset = request.headers.get("X-Kafka-Offset", "unknown")
    group_id = request.headers.get("X-Kafka-Group-Id", "unknown")

    body = request.get_json()

    print(f"topic={topic} partition={partition} offset={offset} group={group_id}")
    print(f"body={body}")

    return {"status": "ok"}
10 changes: 10 additions & 0 deletions examples/kafka-metadata/ol.yaml
@@ -0,0 +1,10 @@
triggers:
  http:
    - method: POST
  kafka:
    - bootstrap_servers:
        - "localhost:9092"
      topics:
        - "my-topic"
      auto_offset_reset: "earliest"

2 changes: 2 additions & 0 deletions examples/kafka-metadata/requirements.in
@@ -0,0 +1,2 @@
flask==2.3.2
werkzeug==3.0.3
24 changes: 24 additions & 0 deletions examples/kafka-metadata/requirements.txt
@@ -0,0 +1,24 @@
#
# This file is autogenerated by pip-compile with Python 3.13
# by the following command:
#
# pip-compile requirements.in
#
blinker==1.6.2
    # via flask
click==8.1.7
    # via flask
flask==2.3.2
    # via -r requirements.in
itsdangerous==2.1.2
    # via flask
jinja2==3.1.4
    # via flask
markupsafe==2.1.3
    # via
    #   jinja2
    #   werkzeug
werkzeug==3.0.3
    # via
    #   -r requirements.in
    #   flask