I am trying to understand some behavior that I find confusing. We are using Spring Boot with the @KafkaListener approach for our consumers. The configuration has auto commit set to true with the default interval of 5 seconds.
As per the docs, with Kafka auto commit enabled, the commit happens automatically every 5 seconds in the background.
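For reference, the consumer settings amount to roughly the following. This is only a sketch: the keys are the standard Kafka consumer property names, the broker address and group id are placeholders, and the class and method names are made up for illustration rather than copied from our code.

```java
import java.util.HashMap;
import java.util.Map;

class ConsumerSettings {
    /** Consumer properties matching the setup described above (illustrative, not our real config). */
    static Map<String, Object> autoCommitProps() {
        Map<String, Object> props = new HashMap<>();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("group.id", "example-group");           // placeholder consumer group
        props.put("enable.auto.commit", true);            // auto commit switched on
        props.put("auto.commit.interval.ms", 5000);       // the 5 second default, stated explicitly
        return props;
    }
}
```

In a Spring Boot application the same two settings are normally expressed as spring.kafka.consumer.enable-auto-commit and spring.kafka.consumer.auto-commit-interval in the application properties.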
We have a consumer that polls for messages and processes each polled batch in about 100 milliseconds, so there are about 50 polls within one 5 second interval. When things are working fine, I don't see any issues. There are a couple of scenarios that I am trying to understand:
- The consumer has been polling for, say, 4.5 seconds since the last commit, and the pod crashes in the middle of processing a polled batch, before the 5 second auto commit interval is reached. In theory this should reprocess all the messages consumed since the last commit 4.5 seconds ago, but only the polled batch that did not finish successfully is reprocessed. Does that mean some incremental commit happens in the background, and only the current unfinished batch is excluded from the committed offset?
- Say I have a polled batch that takes longer than the auto commit interval. To simulate this I added an explicit sleep of 10 seconds in the listener. Per the docs, the batch should be auto committed once the interval has passed, so a failure later in processing should not cause it to be redelivered. However, when I tried to reproduce this by force killing the pod, the messages were reprocessed/republished as part of the next poll. As in the first scenario, is the background process perhaps leaving out the current batch because it has not completed yet?
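To make my assumption concrete, here is a toy model in plain Java with no Kafka client involved. It assumes that the auto commit is triggered from inside poll() once the interval has elapsed, rather than by an independent timer, and that it covers only records handed out by earlier polls. It only tries to reproduce the second scenario, and AutoCommitModel and resumeOffsetAfterCrash are hypothetical names, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the hypothesis: auto commit runs inside poll(), not on its own timer. */
class AutoCommitModel {
    private final long intervalMs;
    private final int batchSize;
    private long position = 0;   // next offset poll() will hand out
    private long committed = 0;  // offset a restarted consumer would resume from
    private long nextCommitAtMs; // earliest time the next auto commit may run

    AutoCommitModel(long intervalMs, int batchSize, long nowMs) {
        this.intervalMs = intervalMs;
        this.batchSize = batchSize;
        this.nextCommitAtMs = nowMs + intervalMs;
    }

    /** Returns the next batch; if the interval has elapsed, first commits what earlier polls returned. */
    List<Long> poll(long nowMs) {
        if (nowMs >= nextCommitAtMs) {
            committed = position; // the batch about to be returned is not included
            nextCommitAtMs = nowMs + intervalMs;
        }
        List<Long> batch = new ArrayList<>();
        for (int i = 0; i < batchSize; i++) {
            batch.add(position++);
        }
        return batch;
    }

    long committed() {
        return committed;
    }

    /** Runs poll/process cycles until the crash time and returns the offset to resume from. */
    static long resumeOffsetAfterCrash(long intervalMs, int batchSize, long processMs, long crashAtMs) {
        AutoCommitModel consumer = new AutoCommitModel(intervalMs, batchSize, 0);
        long now = 0;
        while (true) {
            consumer.poll(now);
            if (now + processMs > crashAtMs) {
                return consumer.committed(); // killed while the listener is still working on this batch
            }
            now += processMs; // listener finished the batch, loop back to poll()
        }
    }

    public static void main(String[] args) {
        // Batches of 10 records, listener sleeps 10 s per batch, pod killed 8 s in (first batch).
        System.out.println(resumeOffsetAfterCrash(5_000, 10, 10_000, 8_000));  // prints 0
        // Same listener, pod killed 12 s in (second batch).
        System.out.println(resumeOffsetAfterCrash(5_000, 10, 10_000, 12_000)); // prints 10
    }
}
```

Under that assumption, killing the pod during the 10 second sleep leaves the committed offset at the start of the in-flight batch even though more than 5 seconds have passed, which matches what I observed.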
Is there an explanation for this behavior? My assumption seems to make sense, but I want to be sure that is what is actually happening. Thanks in advance!