Support parallel writing to online store #2421

@npow

Is your feature request related to a problem? Please describe.
Writing to the online store (DynamoDB) currently uses a single thread and takes extremely long when writing billions of records (on the order of 12+ hours).

Describe the solution you'd like
The ability to parallelize writes to the online store across multiple threads, and ideally across multiple nodes.
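
To illustrate the kind of behavior being requested, here is a minimal sketch of threaded batch writing on a single node. It is not Feast's implementation: `parallel_write`, `chunked`, and the `write_batch` callable are hypothetical names, and `write_batch` stands in for whatever actually sends one batch to DynamoDB. The default batch size of 25 matches the per-request item limit of DynamoDB's `BatchWriteItem`. Rows are consumed lazily and the number of queued batches is capped, which is an assumption about how to keep memory bounded.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def chunked(rows: Iterable, size: int) -> Iterator[List]:
    """Lazily yield lists of at most `size` items from `rows`."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def _write_and_count(write_batch: Callable[[List], None], batch: List) -> int:
    """Run the (hypothetical) writer on one batch and report its size."""
    write_batch(batch)
    return len(batch)


def parallel_write(
    rows: Iterable,
    write_batch: Callable[[List], None],
    batch_size: int = 25,
    max_workers: int = 8,
    max_in_flight: int = 32,
) -> int:
    """Write `rows` in batches from a thread pool; return the rows written.

    At most `max_in_flight` batches are queued or running at once, so the
    full input never has to be materialized in memory.
    """
    written = 0
    pending = set()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for batch in chunked(rows, batch_size):
            if len(pending) >= max_in_flight:
                # Block until at least one batch finishes before queuing more.
                done, pending = wait(pending, return_when=FIRST_COMPLETED)
                # result() re-raises any exception from the worker thread.
                written += sum(f.result() for f in done)
            pending.add(pool.submit(_write_and_count, write_batch, batch))
        # Drain whatever is still running.
        written += sum(f.result() for f in pending)
    return written
```

With this shape, `write_batch` could wrap a DynamoDB client call, and the same function could run on several nodes if each node is given a disjoint slice of the input.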

Describe alternatives you've considered
I tried using multiprocessing.Pool to parallelize writes on a single node with 96 cores, but hit memory limits.
Historically, I have used AWS Glue to ingest data from S3 into DynamoDB (using DynamoDB as a sink), and it works quite well.

Additional context
N/A
