7,909 questions
-1
votes
0
answers
34
views
Flink egress messages occasionally never get to Kafka [closed]
after coming across these KIPs:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-939%3A+Support+Participation+in+2PC.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-890%3A+Transactions+...
-2
votes
0
answers
40
views
Flink RocksDB - corrupted state - runtime exception [closed]
Setup: We have a Flink (v1.18.1) job deployed over 5 task managers. State is stored in RocksDB (on local SSD drives), with incremental checkpoints taken every minute.
Serde: PojoType
State format: ...
1
vote
0
answers
63
views
Flink 1.20: kafka sink job fails with endofinput exception when restoring from savepoint
I encountered a Kafka sink exception when starting from a savepoint. The message is below:
java.lang.IllegalStateException: Received element after endOfInput: Record @ (undef) : org.apache.flink.table.data....
0
votes
0
answers
57
views
Flink: Could not initialize class org.apache.flink.runtime.util.HadoopUtils (Error when creating catalog with iceberg type in Flink)
I am trying to run a very simple Flink (Java) job that:
Creates an Iceberg JDBC catalog backed by PostgreSQL
Sets the Iceberg warehouse to the Hadoop FileSystem
The job is built successfully with ...
Best practices
0
votes
2
replies
35
views
Which classes must be POJOs/serializable in Apache Flink? When should I use env.registerType()?
I'm maintaining a Flink application and I'm confused about which classes need to be POJOs (serializable) for Flink to keep state compatible between different versions of the app.
What I ...
1
vote
1
answer
50
views
Flink: Channels vs. Gates
The Flink docs mention channels and gates.
I am having difficulties inferring what a channel and what a gate is and how they differ. Are these merely logical abstractions or is there also a one-to-one ...
1
vote
2
answers
42
views
Does Flink strictly enforce CPU requests with fine-grained resource management?
Flink allows defining CPU core requirements using fine-grained resource management.
Is this CPU request strictly enforced, or is it best effort?
Example:
A task manager has 4 CPU ...
-3
votes
1
answer
159
views
Flink Job Manager Direct Buffer Memory gets exhausted when checkpointing enabled
Issue:
Flink application throws Thread 'jobmanager-io-thread-25' produced an uncaught exception. java.lang.OutOfMemoryError: Direct buffer memory and terminates after running for 2-3 days.
No matter ...
Advice
0
votes
0
replies
87
views
Flink Iceberg job loses authentication with REST catalog (Keycloak OAuth2) after short time
I’m running a Flink DataStream job that reads events from a Kafka topic and writes them into an Apache Iceberg table using the REST catalog (Lakekeeper).
Authentication to the REST catalog is ...
0
votes
2
answers
113
views
Is it safe to mutate, emit, and snapshot the same operator-state instance in Apache Flink?
I'm building a single global Topology object in a non-keyed ProcessFunction with parallelism = 1. I keep it as a local mutable object and update it for every input event using topology.apply(GnmiEvent)...
-1
votes
1
answer
47
views
Does Flink throughput decrease proportionally with the number of side outputs?
I have a Flink job with multiple downstream operators I want to route tuples to based on a condition. Side outputs are advertised for this use case in the Flink documentation.
However, when sending ...
0
votes
1
answer
56
views
How to emit keyed records for a compacted topic (SimpleStringSchema ClassCastException)?
I'm upgrading a PyFlink job to 2.0 and want to write to a Kafka compacted topic using the new KafkaSink. The stream produces (key, value) tuples (key is a string, value is a JSON payload). I configure ...
Advice
0
votes
1
replies
54
views
Flink - Async IO Threads required
We are using Flink's AsyncIO function with Futures to make external gRPC calls. Currently, we have set the async capacity to 1, and we are using a blocking stub to make those calls. For each event, we ...
0
votes
1
answer
70
views
Flink State Processor API fails to write a new savepoint because "All uid's/hashes must be unique"?
Flink Version: 1.17.1
There is a KeyedBroadcastProcessFunction in my project like this:
public class MyOperator extends KeyedBroadcastProcessFunction<..> {
private MapState<String, String&...
0
votes
1
answer
49
views
Are Flink's timer service & Collector thread-safe?
Question
Async operations and Future callbacks were added when the State API was upgraded to v2. Is it thread-safe to call the Timer service & Collector from such a callback?
Example
final var ...
0
votes
1
answer
66
views
Does an uninitialized ValueState occupy memory in checkpoints in Flink?
I'm using a ValueState with TTL and I want to understand the difference (if any) in the checkpointed state size/memory between two scenarios:
First scenario
I create/obtain the ValueState but never ...
0
votes
0
answers
74
views
Flink autoscaler not working properly after refactoring jobs to use a single task
We recently refactored all of our Flink jobs to use a single vertex with no splitting. Since the change, the Flink autoscaler is having issues calculating when to scale up or down. We aren't seeing any ...
1
vote
2
answers
63
views
Flink GenericType List serialization
In my Flink app, I found this log:
Field EnrichmentInfo#groupIds will be processed as GenericType. Please read the Flink documentation on "Data Types & Serialization" for details of the ...
0
votes
1
answer
51
views
Flink dynamic sink routing with Confluent Schema Registry
I have an Apache Flink app that is using a KafkaSink with a setTopicSelector
KafkaSink<T>> sink =
KafkaSink.<T>>builder()
.setBootstrapServers(sink_brokers)
...
0
votes
0
answers
57
views
Getting (java.sql.SQLRecoverableException: IO Error: Socket read interrupted, Authentication lapse 0 ms.) when using Kafka sink with Apache Flink
I am currently running a small application that periodically polls data from my DB and then puts it in a Kafka topic. While running the application code independently, when I comment out my Kafka sink, ...
0
votes
0
answers
69
views
Why is the error "package org.apache.flink.api.common.eventtime does not exist" produced?
I am compiling a Java project that is managed with Maven.
The Java version is 17.0.16
I included the following dependency inside the dependencies section
(i.e. ...)
<dependency>
&...
0
votes
0
answers
59
views
S3 filesystem connector can authenticate when used as a source but not as a sink
I'm using Flink 1.20.2. My Flink job reads a table from a Parquet file in S3, which had data for many tenants. It gets the list of distinct tenant IDs, then inserts those to new Parquet files, 1 ...
1
vote
0
answers
57
views
AvroRowDeserializationSchema and AvroRowSerializationSchema not working in PyFlink 2.1.0
I am trying to use AvroRowSerializationSchema with PyFlink 2.1.0, but I keep getting a Py4JError saying that the class does not exist in the JVM, even though I have the right JARs.
Environment:
...
0
votes
1
answer
33
views
Apache Flink: I/O writing thread encountered an error: segment has been freed
We have a Flink job (batch mode) that runs on AWS KDA (ver: 1.20.0), where its logical operators look like: FileSource -> map() -> AssignTimestamps() -> filter() -> keyBy -> ...
0
votes
0
answers
28
views
Running SQL in a test jar run in org.apache.flink.test.util.MiniClusterWithClientResource minicluster
I have an application that:
Streams data from Kafka
Inserts the received data into the Flink Table API
Performs a join on tables and emits an event
StreamExecutionEnvironment and StreamTableEnvironment are used ...
-4
votes
1
answer
96
views
Questions about Apache Flink internals, BATCH execution mode
We recently experimented with Flink, in particular BATCH execution mode, to set up an ETL job processing a bounded data set. It works quite well but I'd like to get some clarifications about my ...
0
votes
1
answer
39
views
Flink on YARN: ContainerLocalizer failed to download file locally
I submit a Flink job to Hadoop YARN using Flink application mode. Everything is normal on the client side, but the app master starts failing on the NodeManager, with the following logs.
...
3
votes
1
answer
92
views
Flink and Kafka configuration for aggregated count by topic
We want to track all visits by country. So, our click tracker will send a payload containing the country to its corresponding topic (1 country maps to 1 topic) where one visit, no matter the page, so long ...
0
votes
0
answers
67
views
Flink CDC + Hudi isn't working as expected; log says state is cleared
I'm using Flink CDC + Apache Hudi in Flink to sync data from MySQL to AWS S3. My Flink job looks like:
parallelism = 1
env = StreamExecutionEnvironment.get_execution_environment(config)
...
0
votes
1
answer
52
views
High CPU usage from RowData serialization in Flink Table API despite ObjectReuse optimization
I have a Table API pipeline that does a 1-minute Tumbling Count aggregation over a set of 15 columns. FlameGraph shows that most of the CPU (~40%) goes into serializing each row, despite using ...
0
votes
2
answers
76
views
Efficiently iterating over ordered keys in flink MapState
I am building a Flink application roughly modeled after Flink's demo fraud detection application, where events come into my system out-of-order, are keyed by some criteria, and then are stored in a ...
0
votes
1
answer
52
views
How to add a new column to a Flink SQL job so that it can restore from an existing savepoint or checkpoint
I have 2 tables, both using the Kafka connector to read data. I join these sources and write the data to another Kafka topic. We checkpoint every 10 minutes, so when the job restarts, we use execution.savepoint.path ...
0
votes
0
answers
36
views
How do I run a PyFlink program on a remote cluster without packaging it
The documentation shows a way to create a RemoteExecutionEnvironment in Java:
public static void main(String[] args) throws Exception {
ExecutionEnvironment env = ExecutionEnvironment
....
0
votes
0
answers
90
views
Apache Flink FileSink compaction extremely slow with many hot buckets/paths
I have a Flink ETL job that reads from ~13 Kafka topics and writes data into HDFS using a FileSink with compaction enabled.
Right now, we have around 40 different output paths (buckets), and roughly ...
0
votes
0
answers
65
views
Azure Event Hubs + Apache Flink + Cassandra – how to handle downtime without losing events (one Replay Hub vs multiple Replay Hubs)?
We use Azure Event Hubs (Kafka API) with Apache Flink consumers, and a shared Cassandra DB as the sink.
There are 7 Event Hubs (one per application) → each has its own Flink consumer writing to the ...
0
votes
1
answer
132
views
Flink ConfluentRegistryAvroSerializationSchema not respecting registryConfigs
When I use the KafkaRecordSerializationSchema in Apache Flink with settings for schema registry serialization, the registryConfigs settings are not taken into account.
Settings like
auto.register....
0
votes
1
answer
71
views
Flink SQL with mini-batch seems to trigger only on checkpoint
I have the following config set for my job
'table.exec.sink.upsert-materialize': 'NONE',
'table.exec.mini-batch.enabled': true,
'table.exec.mini-batch.allow-latency'...
0
votes
0
answers
76
views
Apache Flink SinkFunction Python code throws exception
Apache Flink 2.1 does not support MongoDB Python connectors, so I wrote sample Python code using SinkFunction.
from pyflink.datastream import StreamExecutionEnvironment
from pyflink....
0
votes
0
answers
24
views
Apache Flink: KafkaSource with idleness misbehaves when connected to a broadcast stream
When a KafkaSource connected to a broadcast stream is configured with idleness, the downstream watermark is abnormal.
My question is how to get a correct watermark for use in windows.
Here's an example.
KafkaSource ...
0
votes
0
answers
31
views
Kafka - Flink -> Complex Event Processing
I have a topic in Kafka with the following messages:
{"time":"2025-07-31T17:25:31.483425Z","EventID":4624,"ComputerName":"workstation"}
{"time&...
2
votes
1
answer
362
views
When to use Flink vs Temporal (Durable Execution Engine)
So, I'm getting started on researching Apache Flink and Temporal to understand if they can be integrated into my current stack. So far, what I understand is both Flink and Temporal are used to replay ...
2
votes
1
answer
145
views
Debezium + Flink Oracle CDC - "db history topic or its content is fully or partially missing" for some tables
I am using Flink with Debezium to consume CDC changes from Oracle DB tables via LogMiner.
For some tables, everything works fine. For example, the following table works without issues:
CREATE TABLE ...
0
votes
0
answers
71
views
Flink SQL Job: com.starrocks.data.load.stream.exception.StreamLoadFailException: Could not get load state because
I'm encountering a Flink job failure and would appreciate any input on what might be misconfigured:
2025‑07‑28 17:30:52
org.apache.flink.runtime.JobException: Recovery is suppressed by ...
2
votes
2
answers
130
views
Set log levels per class with AWS Managed Apache Flink
This question is specific to AWS Managed Flink (1.19).
I know how to control logging in most Java apps,
but everything I know is failing in this case.
I have a Java application running in Amazon ...
0
votes
1
answer
48
views
Flink's buffer debloating mechanism
I would like to better understand how buffer debloating works in Apache Flink.
Assume that:
I have a Flink job structured like a pipeline (A -> B -> C -> D -> E)
aligned ...
0
votes
0
answers
33
views
Compacted files are written to the lowest datetime bucket among source part files instead of original part file system time
I’m working with Apache Flink 1.16 on an ETL job that reads data from Kafka and writes the output to HDFS in Parquet format. I’m using a FileSink with BulkFormat (ParquetAvroWriters), a ...
0
votes
1
answer
65
views
How to limit single file size when using Flink batch mode to write Parquet
I was using Flink in batch mode to read data from one source and then write it directly to the file system in Parquet format.
The code looked like this:
hudi_source_ddl = f"""
...
0
votes
0
answers
90
views
The library registration references a different set of library BLOBs than previous registrations for this job
Context: Flink Deployment, Application cluster mode
Use case: Flink job to read data from Kafka, transform it, and produce to Kafka
Deployment: On a k8s cluster using the k8s Flink operator
To start with ...
0
votes
0
answers
52
views
The DB history topic or its content is fully or partially missing
Flink CDC Oracle Task Fails with DebeziumException:
The db history topic is missing
I have set up Flink and used it to synchronize Oracle data to Doris. Now I’m testing multiple Oracle ...
0
votes
1
answer
33
views
How to Log State Access (get/put) in Flink SQL Join Operators with Operator Metadata?
I'm using Flink SQL to join data from multiple Kafka topics. Sometimes the resulting join is non-trivial to debug, and I want to log the state access involved in the join — specifically, I’d like to ...