avroio

package
v2.72.0
Published: Mar 24, 2026 License: Apache-2.0, BSD-3-Clause, MIT Imports: 11 Imported by: 5

Documentation

Overview

Package avroio contains transforms for reading and writing avro files.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Read

func Read(s beam.Scope, glob string, t reflect.Type) beam.PCollection

Read reads the set of files matching glob and returns their records as a PCollection<elem>, decoded according to the Avro schema embedded in each file. To decode records into a struct, pass reflect.TypeOf(YourType{}) for a type whose fields carry JSON tags. To receive each record as a raw JSON string instead, pass reflect.TypeOf("").
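
As a rough illustration of the two element types, the sketch below uses encoding/json as a stand-in for the decoding step. It does not call avroio or build a pipeline, and the Tweet type, its fields, and the decodeTweet helper are hypothetical. It shows how JSON tags tie struct fields to record field names, and what the same record looks like when kept as a raw JSON string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Tweet is a hypothetical record type. The JSON tags name the fields of the
// record as they appear in the file's schema.
type Tweet struct {
	Stamp int64  `json:"timestamp"`
	Tweet string `json:"tweet"`
	User  string `json:"username"`
}

// decodeTweet maps one JSON-encoded record onto Tweet using its JSON tags.
// It is a stand-in for illustration, not the decoding avroio performs.
func decodeTweet(record string) (Tweet, error) {
	var tw Tweet
	err := json.Unmarshal([]byte(record), &tw)
	return tw, err
}

func main() {
	// With reflect.TypeOf(""), each element would stay a raw JSON string
	// shaped like this one.
	raw := `{"timestamp": 1366150681, "tweet": "hello", "username": "miguno"}`

	// With reflect.TypeOf(Tweet{}), each element would be a populated struct.
	tw, err := decodeTweet(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s at %d: %s\n", tw.User, tw.Stamp, tw.Tweet)
}
```

In a pipeline, the corresponding calls would be Read(s, glob, reflect.TypeOf(Tweet{})) for typed elements or Read(s, glob, reflect.TypeOf("")) for raw JSON strings.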

func Write

func Write(s beam.Scope, prefix, schema string, col beam.PCollection, opts ...WriteOption)

Write writes a PCollection<string> to AVRO files. Each element must be a JSON string that conforms to the supplied AVRO schema. The process fails if an element does not match the schema.

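Parameters:

prefix: File path prefix (e.g., "gs://bucket/output")
schema: AVRO schema as JSON string
col: PCollection<string> of JSON records to write
opts: Optional settings, applied through WriteOption values:
    suffix: File extension (e.g., ".avro"), set with WithSuffix
    numShards: Number of output files (0 or 1 for single file), set with WithNumShards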

Files are named as: <prefix>-<shard>-of-<numShards><suffix>
Example: output-00000-of-00010.avro
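
The naming pattern can be sketched as a small formatting function. This is only an illustration: shardName is a hypothetical helper, and the five-digit zero padding is inferred from the documented example rather than taken from the package source.

```go
package main

import "fmt"

// shardName builds an output file name following the documented pattern
// <prefix>-<shard>-of-<numShards><suffix>. The five-digit zero padding is an
// assumption based on the documented example name.
func shardName(prefix string, shard, numShards int, suffix string) string {
	return fmt.Sprintf("%s-%05d-of-%05d%s", prefix, shard, numShards, suffix)
}

func main() {
	// Print the names of all shards for a three-shard output.
	for shard := 0; shard < 3; shard++ {
		fmt.Println(shardName("output", shard, 3, ".avro"))
	}
}
```

Shard indexes start at zero, so a ten-shard write would span output-00000-of-00010.avro through output-00009-of-00010.avro under this assumption.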

Examples:

Write(s, "gs://bucket/output", schema, col)                                    // output-00000-of-00001.avro (defaults)
Write(s, "gs://bucket/output", schema, col, WithSuffix(".avro"))               // output-00000-of-00001.avro (explicit)
Write(s, "gs://bucket/output", schema, col, WithNumShards(10))                 // output-00000-of-00010.avro (10 shards)
Write(s, "gs://bucket/output", schema, col, WithSuffix(".avro"), WithNumShards(10)) // full control

Types

type WriteOption added in v2.71.0

type WriteOption func(*writeConfig)

func WithNumShards added in v2.71.0

func WithNumShards(numShards int) WriteOption

WithNumShards sets the number of output shards (default: 1).

func WithSuffix added in v2.71.0

func WithSuffix(suffix string) WriteOption

WithSuffix sets the file suffix (default: ".avro").
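
WriteOption follows Go's functional options pattern. The standalone sketch below re-creates that pattern to show how defaults and options might combine. The fields of writeConfig and the newWriteConfig helper are assumptions made for illustration, since the real type is unexported and its layout is not documented here.

```go
package main

import "fmt"

// writeConfig stands in for the unexported type named in WriteOption's
// signature. Its fields are assumed for this sketch.
type writeConfig struct {
	suffix    string
	numShards int
}

// WriteOption mutates a writeConfig, matching the documented signature.
type WriteOption func(*writeConfig)

// WithSuffix returns an option that overrides the file suffix.
func WithSuffix(suffix string) WriteOption {
	return func(c *writeConfig) { c.suffix = suffix }
}

// WithNumShards returns an option that overrides the shard count.
func WithNumShards(numShards int) WriteOption {
	return func(c *writeConfig) { c.numShards = numShards }
}

// newWriteConfig is a hypothetical helper: it starts from the documented
// defaults (".avro", 1 shard) and applies the options in order.
func newWriteConfig(opts ...WriteOption) writeConfig {
	c := writeConfig{suffix: ".avro", numShards: 1}
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	fmt.Printf("%+v\n", newWriteConfig())
	fmt.Printf("%+v\n", newWriteConfig(WithNumShards(10)))
}
```

Because options are applied in order, a later option in this sketch overrides an earlier one that sets the same field, and omitted options leave the defaults in place.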
