AWS Lambda | AWS Compute Blog

Improving throughput of serverless streaming workloads for Kafka

Event-driven applications often need to process data in real-time. When you use AWS Lambda to process records from Apache Kafka topics, you frequently encounter two typical requirements: you need to process very high volumes of records in close to real-time, and you want your consumers to have the ability to scale rapidly to handle traffic spikes. Achieving both necessitates understanding how Lambda consumes Kafka streams, where the potential bottlenecks are, and how to optimize configurations for high throughput and best performance.

Serverless strategies for streaming LLM responses

Modern generative AI applications often need to stream large language model (LLM) outputs to users in real-time. Instead of waiting for a complete response, streaming delivers partial results as they become available, which significantly improves the user experience for chat interfaces and long-running AI tasks. This post compares three serverless approaches to handle Amazon Bedrock LLM streaming on Amazon Web Services (AWS), which helps you choose the best fit for your application.

Building multi-tenant SaaS applications with AWS Lambda’s new tenant isolation mode

Today, AWS is announcing tenant isolation for AWS Lambda, enabling you to process function invocations in separate execution environments for each end-user or tenant invoking your Lambda function. This capability simplifies building secure multi-tenant SaaS applications by managing tenant-level compute environment isolation and request routing, allowing you to focus on core business logic rather than implementing tenant-aware compute environment isolation.

Building responsive APIs with Amazon API Gateway response streaming

Today, AWS announced support for response streaming in Amazon API Gateway to significantly improve the responsiveness of your REST APIs by progressively streaming response payloads back to the client. With this new capability, you can use streamed responses to enhance user experience when building LLM-driven applications (such as AI agents and chatbots), improve time-to-first-byte (TTFB) performance for web and mobile applications, stream large files, and perform long-running operations while reporting incremental progress using protocols such as server-sent events (SSE).

Python 3.14 runtime now available in AWS Lambda

AWS Lambda now supports Python 3.14 as both a managed runtime and container base image. Python is a popular language for building serverless applications. Developers can now take advantage of new features and enhancements when creating serverless applications on Lambda.

Building serverless applications with Rust on AWS Lambda

Today, AWS Lambda is promoting Rust support from Experimental to Generally Available. This means you can now use Rust to build business-critical serverless applications, backed by AWS Support and the Lambda availability SLA.

AWS Lambda now supports Java 25

You can now develop AWS Lambda functions using Java 25 either as a managed runtime or using the container base image. This blog post highlights notable Java language features, Java Lambda runtime updates, and how you can use the new Java 25 runtime in your serverless applications.

AWS Lambda networking over IPv6

This post examines the benefits of transitioning Lambda functions to IPv6, provides practical guidance for implementing dual-stack support in your Lambda environment, and considerations for maintaining compatibility with existing systems during migration.

Introducing AWS Lambda event source mapping tools in the AWS Serverless MCP Server

Modern serverless applications increasingly rely on event-driven architectures, where AWS Lambda functions process events from various sources like Amazon Kinesis, Amazon DynamoDB Streams, Amazon Simple Queue Service (Amazon SQS), Amazon Managed Streaming for Apache Kafka (Amazon MSK), and self-managed Apache Kafka. Although event source mappings (ESM) offer a powerful mechanism for integrating AWS Lambda with […]

Deploying AI models for inference with AWS Lambda using zip packaging

Users usually package their function code as container images when using machine learning (ML) models that are larger than 250 MB, which is the Lambda deployment package size limit for zip files. In this post, we demonstrate an approach that downloads ML models directly from Amazon S3 into your function’s memory so that you can continue packaging your function code using zip files.

AWS Compute Blog

Category: AWS Lambda