™
Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting, Kafka
Tutorial
Working with Kafka Producers
Kafka Producer Advanced
Working with producers in Java
Details and advanced topics
Objectives Create Producer
❖ Cover advanced topics regarding Java Kafka
Producers
❖ Custom Serializers
❖ Custom Partitioners
❖ Batching
❖ Compression
❖ Retries and Timeouts
Kafka Producer
❖ Kafka client that publishes records to Kafka cluster
❖ Thread safe
❖ Producer has a pool of buffers that hold to-be-sent
records
❖ A background I/O thread turns records into request
bytes and transmits the requests to Kafka
❖ Close the producer so it does not leak resources
Kafka Producer Send, Acks and
Buffers
❖ send() method is asynchronous
❖ adds the record to an output buffer and returns right away
❖ buffer is used to batch records for efficient I/O and compression
❖ acks config controls Producer record durability. The "all" setting
ensures a full commit of the record; it is the most durable and the slowest
setting
❖ Producer can retry failed requests
❖ Producer has buffers of unsent records per topic partition (sized at
batch.size)
Kafka Producer: Buffering and
batching
❖ Buffered records are sent immediately, as fast as the broker can
keep up (limited by max.in.flight.requests.per.connection)
❖ To reduce the request count, set linger.ms > 0
❖ the producer waits up to linger.ms before sending, or until the batch fills up,
whichever comes first
❖ Under heavy load batches fill before linger.ms elapses; under light load the delay
builds larger batches, which raises broker I/O throughput and improves compression
❖ buffer.memory controls total memory available to producer for buffering
❖ If records are sent faster than they can be transmitted to Kafka, the buffer
fills up and additional send calls block. If a call blocks longer than
max.block.ms, the Producer throws a TimeoutException
Producer Acks
❖ Producer Config property acks
❖ (default all since Kafka 3.0; 1 in earlier releases)
❖ Number of write acknowledgments the partition leader must
receive before the write request is deemed complete
❖ Controls the durability of records the Producer sends
❖ Can be all (-1), none (0), or leader (1)
Acks 0 (NONE)
❖ acks=0
❖ Producer does not wait for any ack from broker at all
❖ Records added to the socket buffer are considered sent
❖ No guarantee of durability
❖ Record offset returned is set to -1 (unknown)
❖ Records are lost if the leader is down
❖ Use Case: maybe log aggregation
Acks 1 (LEADER)
❖ acks=1
❖ Partition leader writes the record to its local log but responds
without waiting for followers to confirm the write
❖ If leader fails right after sending ack, record could be lost
❖ Followers might not have replicated the record yet
❖ Record loss is rare but possible
❖ Use Case: log aggregation
Acks -1 (ALL)
❖ acks=all or acks=-1
❖ Leader gets write confirmation from full set of ISRs before
sending ack to producer
❖ Guarantees the record will not be lost as long as one ISR remains alive
❖ Strongest available guarantee
❖ Even stronger with broker (or topic) setting min.insync.replicas
(specifies the minimum number of ISRs that must acknowledge
a write)
❖ Most use cases will use this and set min.insync.replicas >
1
KafkaProducer config Acks
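A minimal sketch of setting acks, assuming plain string config keys and a placeholder broker at localhost:9092. The class and method names are hypothetical; the resulting Properties would be passed to the KafkaProducer constructor.

```java
import java.util.Properties;

class AcksConfigSketch {
    // Builds producer settings with the requested acks level.
    static Properties producerProps(String acks) {
        Properties props = new Properties();
        // Assumption: a single local broker; replace with your bootstrap list.
        props.setProperty("bootstrap.servers", "localhost:9092");
        // "all" (same as "-1"), "1" (leader only), or "0" (no ack).
        props.setProperty("acks", acks);
        return props;
    }

    public static void main(String[] args) {
        System.out.println(producerProps("all"));
    }
}
```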
Producer Buffer Memory Size
❖ Producer config property: buffer.memory
❖ default 32MB
❖ Total memory (bytes) producer can use to buffer records
to be sent to broker
❖ Producer blocks up to max.block.ms if buffer.memory
is exceeded
❖ if records are still produced faster than the broker can receive
them once that time is up, an exception is thrown
Batching by Size
❖ Producer config property: batch.size
❖ Default 16K
❖ Producer batches records
❖ fewer requests for multiple records sent to same partition
❖ Improved IO throughput and performance on both producer and server
❖ If record is larger than the batch size, it will not be batched
❖ Producer sends requests containing multiple batches
❖ batch per partition
❖ A small batch size reduces throughput and performance. If the batch size is too big,
memory allocated for the batch is wasted
Batching by Time and Size - 1
❖ Producer config property: linger.ms
❖ Default 0
❖ Producer groups together any records that arrive before
they can be sent into a batch
❖ good if records arrive faster than they can be sent out
❖ Producer can reduce requests count even under
moderate load using linger.ms
Batching by Time and Size - 2
❖ linger.ms adds a delay to wait for more records to build up so
larger batches are sent
❖ better broker throughput at the cost of producer latency
❖ If the producer has batch.size or more bytes of records for a
partition, the batch is sent right away
❖ If the producer has less than batch.size but the linger.ms interval
has passed, then the records for that partition are sent
❖ Increase to improve throughput of Brokers and reduce broker
load (common improvement)
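The size-or-time rule can be written as a single predicate. This is a simplified model for illustration only, not the producer's internal code; the class and method names are hypothetical.

```java
class BatchReadySketch {
    // A batch for one partition is sent when it is full
    // or when it has waited at least linger.ms.
    static boolean readyToSend(int batchBytes, int batchSize,
                               long waitedMs, long lingerMs) {
        boolean full = batchBytes >= batchSize;
        boolean lingered = waitedMs >= lingerMs;
        return full || lingered;
    }

    public static void main(String[] args) {
        // Half-full batch, 10 ms into a 50 ms linger: keep waiting.
        System.out.println(readyToSend(8_192, 16_384, 10, 50));
    }
}
```

With linger.ms set to 0 the second condition always holds, so records go out as soon as the sender thread can take them.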
Compressing Batches
❖ Producer config property: compression.type
❖ Default none
❖ Producer compresses request data
❖ By default producer does not compress
❖ Can be set to none, gzip, snappy, or lz4 (zstd is also accepted since Kafka 2.1)
❖ Compression is by batch
❖ improves with larger batch sizes
❖ End to end compression possible if Broker config "compression.type" set to
producer. Compressed data from the producer is written to the log and sent to consumers by the broker as is
Batching and Compression
Example
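A minimal sketch of a batching and compression setup. The config keys are the ones covered above; the values (64 KB batches, 100 ms linger, snappy) and the class name are assumptions chosen for illustration.

```java
import java.util.Properties;

class BatchingConfigSketch {
    // Settings that favor larger, compressed batches over low latency.
    static Properties batchingProps() {
        Properties props = new Properties();
        // Assumed value: four times the 16K default batch size.
        props.setProperty("batch.size", String.valueOf(64 * 1024));
        // Assumed value: wait up to 100 ms for a batch to fill.
        props.setProperty("linger.ms", "100");
        // Compression is applied per batch, so bigger batches compress better.
        props.setProperty("compression.type", "snappy");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(batchingProps());
    }
}
```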
Custom Serializers
❖ You don’t have to use built in serializers
❖ You can write your own
❖ Just need to be able to convert to/from a byte[]
❖ Serializers work for keys and values
❖ value.serializer and key.serializer
Custom Serializers Config
Custom Serializer
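A dependency-free sketch of the serialization step, assuming StockPrice renders itself as JSON and the bytes are UTF-8. In a real producer the serialize method would live in a class implementing Kafka's Serializer<StockPrice> interface and be registered through value.serializer; the class layout and JSON field order here are assumptions.

```java
import java.nio.charset.StandardCharsets;

class StockPriceSerializerSketch {
    // Minimal stand-in for the StockPrice domain object.
    static final class StockPrice {
        final String name;
        final int dollars;
        final int cents;

        StockPrice(String name, int dollars, int cents) {
            this.name = name;
            this.dollars = dollars;
            this.cents = cents;
        }

        String toJson() {
            return "{\"dollars\": " + dollars + ", \"cents\": " + cents
                    + ", \"name\": \"" + name + "\"}";
        }
    }

    // Same shape as Serializer.serialize(topic, data): object in, bytes out.
    static byte[] serialize(String topic, StockPrice data) {
        return data == null ? null : data.toJson().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = serialize("stock-prices", new StockPrice("UBER", 737, 78));
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```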
StockPrice
Broker Follower Write Timeout
❖ Producer config property: request.timeout.ms
❖ Default 30 seconds (30,000 ms)
❖ Maximum time broker waits for confirmation from
followers to meet Producer acknowledgment
requirements for acks=all
❖ Measure of broker to broker latency of request
❖ 30 seconds is high; a long processing time is indicative of
problems
Producer Request Timeout
❖ Producer config property: request.timeout.ms
❖ Default 30 seconds (30,000 ms)
❖ Maximum time producer waits for a request to the
broker to complete
❖ Measure of producer to broker latency of request
❖ 30 seconds is very high; a long request time is an
indicator that brokers can't handle the load
Producer Retries
❖ Producer config property: retries
❖ Default 0 in older clients (2147483647 since Kafka 2.1)
❖ Retry count if Producer does not get ack from Broker
❖ only if the failed send is deemed a transient error (API)
❖ as if your producer code resent the record after a failed attempt
❖ timeouts are retried; retry.backoff.ms (default 100 ms) is the time
to wait after a failure before retrying
Retry, Timeout, Back-off
Example
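A minimal sketch of the retry, timeout, and back-off settings together. The keys are the ones covered above; the values and the class name are assumptions for illustration.

```java
import java.util.Properties;

class RetryConfigSketch {
    // Retry transient failures a few times, pausing between attempts.
    static Properties retryProps() {
        Properties props = new Properties();
        // Assumed value: give up after three retries.
        props.setProperty("retries", "3");
        // Assumed value: wait one second before each retry.
        props.setProperty("retry.backoff.ms", "1000");
        // Assumed value: treat a request as failed after 15 seconds.
        props.setProperty("request.timeout.ms", "15000");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(retryProps());
    }
}
```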
Producer Partitioning
❖ Producer config property: partitioner.class
❖ org.apache.kafka.clients.producer.internals.DefaultPartitioner
❖ Partitioner class implements Partitioner interface
❖ Default Partitioner partitions using hash of key if record
has key
❖ Default Partitioner uses round-robin if record
has no key (sticky partitioning since Kafka 2.4)
Configuring Partitioner
StockPricePartitioner
StockPricePartitioner
partition()
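One possible partitioning rule, sketched without Kafka dependencies: a configurable set of "important" stocks gets the last partition to itself and every other key is hashed across the remaining partitions. This rule, the class name, and the use of String.hashCode are assumptions; a real implementation would sit inside Partitioner.partition() and read the partition count from the cluster metadata.

```java
import java.util.Set;

class StockPricePartitionSketch {
    // Returns the partition index for a record key.
    static int partition(String key, int partitionCount, Set<String> importantStocks) {
        if (partitionCount == 1) {
            return 0; // nothing to choose from
        }
        if (importantStocks.contains(key)) {
            return partitionCount - 1; // reserved partition
        }
        // floorMod keeps the result non-negative for negative hash codes.
        return Math.floorMod(key.hashCode(), partitionCount - 1);
    }

    public static void main(String[] args) {
        System.out.println(partition("IBM", 3, Set.of("IBM")));
    }
}
```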
StockPricePartitioner
Producer Interception
❖ Producer config property: interceptor.classes
❖ empty by default (you can pass a comma-delimited list)
❖ interceptors implement the ProducerInterceptor interface
❖ intercept records before the producer sends them to the broker and after acks
❖ you could mutate records
KafkaProducer - Interceptor
Config
KafkaProducer
ProducerInterceptor
ProducerInterceptor onSend
ic=stock-prices2 key=UBER value=StockPrice{dollars=737, cents=78, name='
Output
ProducerInterceptor onAck
onAck topic=stock-prices2, part=0, offset=18360
Output
ProducerInterceptor the rest
KafkaProducer send() Method
❖ Two forms of send, with callback and with no callback; both return a Future
❖ Asynchronously sends a record to a topic
❖ Callback gets invoked when send has been acknowledged.
❖ send is asynchronous and returns right away as soon as the record has been added to
the send buffer
❖ Lets you send many records at once without blocking for a response from the Kafka broker
❖ Result of send is a RecordMetadata
❖ record partition, record offset, record timestamp
❖ Callbacks for records sent to same partition are executed in order
KafkaProducer send()
Exceptions
❖ InterruptException - If the thread is interrupted while
blocked (API)
❖ SerializationException - If key or value are not valid
objects given configured serializers (API)
❖ TimeoutException - If time taken for fetching metadata or
allocating memory exceeds max.block.ms, or getting acks
from Broker exceeds timeout.ms, etc. (API)
❖ KafkaException - If Kafka error occurs not in public API.
(API)
Using send method
KafkaProducer flush() method
❖ flush() method sends all buffered records now (even if
linger.ms > 0)
❖ blocks until requests complete
❖ Useful when consuming from some input system and
pushing data into Kafka
❖ flush() ensures all previously sent messages have been
sent
❖ you could mark progress as such at completion of flush
KafkaProducer close()
❖ close() closes producer
❖ frees resources (threads and buffers) associated with producer
❖ Two forms of method
❖ both block until all previously sent requests complete or duration
passed in as args is exceeded
❖ close with no params equivalent to close(Long.MAX_VALUE,
TimeUnit.MILLISECONDS).
❖ If producer is unable to complete all requests before the timeout
expires, all unsent requests fail, and this method fails
Orderly shutdown using close
Wait for clean close
KafkaProducer partitionsFor()
method
❖ partitionsFor(topic) returns metadata for partitions
❖ public List<PartitionInfo> partitionsFor(String topic)
❖ Get partition metadata for the given topic
❖ Producers that do their own partitioning would use this
❖ for custom partitioning
❖ PartitionInfo(String topic, int partition, Node leader,
Node[] replicas, Node[] inSyncReplicas)
❖ Node(int id, String host, int port, optional String rack)
KafkaProducer metrics()
method
❖ metrics() method gets a map of metrics
❖ public Map<MetricName,? extends Metric> metrics()
❖ Get the full set of producer metrics
MetricName(
String name,
String group,
String description,
Map<String,String> tags
)
Metrics producer.metrics()
❖ Call producer.metrics()
❖ Prints out metrics to log
Metrics producer.metrics()
output
Metric producer-metrics, record-queue-time-max, 508.0,
The maximum time in ms record batches spent in the record accumulator.
17:09:22.721 [pool-1-thread-9] INFO c.c.k.p.MetricsProducerReporter -
Metric producer-node-metrics, request-rate, 0.025031289111389236,
The average number of requests sent per second.
17:09:22.721 [pool-1-thread-9] INFO c.c.k.p.MetricsProducerReporter -
Metric producer-metrics, records-per-request-avg, 205.55263157894737,
The average number of records per request.
17:09:22.722 [pool-1-thread-9] INFO c.c.k.p.MetricsProducerReporter -
Metric producer-metrics, record-size-avg, 71.02631578947368,
The average record size
17:09:22.722 [pool-1-thread-9] INFO c.c.k.p.MetricsProducerReporter -
Metric producer-node-metrics, request-size-max, 56.0,
The maximum size of any request sent in the window.
17:09:22.723 [pool-1-thread-9] INFO c.c.k.p.MetricsProducerReporter -
Metric producer-metrics, request-size-max, 12058.0,
The maximum size of any request sent in the window.
17:09:22.723 [pool-1-thread-9] INFO c.c.k.p.MetricsProducerReporter -
Metric producer-metrics, compression-rate-avg, 0.41441360272859273,
The average compression rate of record batches.
Metrics via JMX
Lab StockPrice Producer Java Example
StockPrice App to demo Advanced
Producer
❖ StockPrice - holds a stock price; has a name, dollars, and cents
❖ StockPriceKafkaProducer - Configures and creates
KafkaProducer<String, StockPrice>, StockSender list,
ThreadPool (ExecutorService), starts StockSender runnable into
thread pool
❖ StockAppConstants - holds topic and broker list
❖ StockPriceSerializer - can serialize a StockPrice into byte[]
❖ StockSender - generates somewhat random stock prices for a
given StockPrice name, Runnable, 1 thread per StockSender
❖ Shows using KafkaProducer from many threads
StockPrice domain object
❖ has name
❖ dollars
❖ cents
❖ converts itself to JSON
StockPriceKafkaProducer
❖ Import classes and setup logger
❖ Create createProducer method to create KafkaProducer
instance
❖ Create setupBootstrapAndSerializers to initialize bootstrap
servers, client id, key serializer and custom serializer
(StockPriceSerializer)
❖ Write main() method - creates producer, creates StockSender
list passing each instance a producer, creates a thread pool so
every stock sender gets its own thread, runs each StockSender in
its own thread
StockPriceKafkaProducer imports, createProducer
❖ Import classes and setup logger
❖ createProducer used to create a KafkaProducer
❖ createProducer() calls setupBootstrapAndSerializers()
Configure Producer Bootstrap and
Serializer
❖ Create setupBootstrapAndSerializers to initialize bootstrap
servers, client id, key serializer and custom serializer
(StockPriceSerializer)
❖ StockPriceSerializer will serialize StockPrice into bytes
StockPriceKafkaProducer.main()
❖ main method - creates producer,
❖ create StockSender list passing each instance a producer
❖ creates a thread pool (executorService)
❖ every StockSender runs in its own thread
StockAppConstants
❖ topic name for Producer example
❖ List of bootstrap servers
StockPriceKafkaProducer.getStockSenderList
StockPriceSerializer
❖ Converts StockPrice into byte array
StockSender
❖ Generates random stock prices for a given StockPrice
name,
❖ StockSender is Runnable
❖ 1 thread per StockSender
❖ Shows using KafkaProducer from many threads
❖ Delays random time between delayMin and delayMax,
❖ then sends random StockPrice between stockPriceHigh
and stockPriceLow
StockSender imports,
Runnable
❖ Imports Kafka Producer, ProducerRecord, RecordMetadata, StockPrice
❖ Implements Runnable, can be submitted to ExecutorService
StockSender constructor
❖ takes a topic, high & low stockPrice, producer, delay min & max
StockSender run()
❖ In loop, creates random record, send record, waits random
time
StockSender
createRandomRecord
❖ createRandomRecord uses randomIntBetween
❖ creates StockPrice and then wraps StockPrice in ProducerRecord
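A dependency-free sketch of the random-price step, assuming randomIntBetween is inclusive at both ends and that cents range over 0 to 99. The ProducerRecord wrapping is left out so the example runs with the JDK alone; the class name and the int[] return shape are hypothetical.

```java
import java.util.Random;

class RandomPriceSketch {
    private static final Random RANDOM = new Random();

    // Uniform random int in [low, high], both ends inclusive.
    static int randomIntBetween(int low, int high) {
        return low + RANDOM.nextInt(high - low + 1);
    }

    // Returns {dollars, cents} with dollars between the low and high bounds.
    static int[] randomPrice(int lowDollars, int highDollars) {
        int dollars = randomIntBetween(lowDollars, highDollars);
        int cents = randomIntBetween(0, 99);
        return new int[] {dollars, cents};
    }

    public static void main(String[] args) {
        int[] price = randomPrice(20, 25);
        System.out.println(price[0] + "." + String.format("%02d", price[1]));
    }
}
```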
StockSender
displayRecordMetaData
❖ Every 100 records displayRecordMetaData gets called
❖ Prints out record info, and recordMetadata info:
❖ key, JSON value, topic, partition, offset, time
❖ uses Future from call to producer.send()
Run it
❖ Run ZooKeeper
❖ Run three Brokers
❖ run create-topic.sh
❖ Run StockPriceKafkaProducer
Run scripts
❖ run ZooKeeper from ~/kafka-training
❖ use bin/create-topic.sh to create topic
❖ use bin/delete-topic.sh to delete topic
❖ use bin/start-1st-server.sh to run Kafka Broker 0
❖ use bin/start-2nd-server.sh to run Kafka Broker 1
❖ use bin/start-3rd-server.sh to run Kafka Broker 2
Config is under directory called config
server-0.properties is for Kafka Broker 0
server-1.properties is for Kafka Broker 1
server-2.properties is for Kafka Broker 2
Run All 3 Brokers
Run create-topic.sh script
Name of the topic is stock-prices
Three partitions
Replication factor of three
Run StockPriceKafkaProducer
❖ Run StockPriceKafkaProducer from the IDE
❖ You should see log messages from StockSender(s)
with StockPrice name, JSON value, partition, offset,
and time
Lab Adding an orderly shutdown
Using flush and close
Shutdown Producer nicely
❖ Handle ctrl-C shutdown from Java
❖ Shutdown thread pool and wait
❖ Flush producer to send any outstanding batches if using
batches (producer.flush())
❖ Close Producer (producer.close()) and wait
Nice Shutdown
producer.close()
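The shutdown order (stop the thread pool, flush, then close) can be sketched with the JDK alone. Here flush and close are passed in as Runnables, for example producer::flush and producer::close; the class name, method names, and the five-second wait are assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

class OrderlyShutdownSketch {
    // Stops the sender threads first, then flushes and closes the producer.
    // Returns true if the pool drained within waitMs.
    static boolean shutdown(ExecutorService pool, Runnable flush,
                            Runnable close, long waitMs) {
        pool.shutdown(); // no new tasks; running tasks may finish
        boolean drained;
        try {
            drained = pool.awaitTermination(waitMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            drained = false;
        }
        flush.run(); // push out any batches still buffered
        close.run(); // release threads and buffers
        return drained;
    }

    // Runs the same sequence when the JVM receives CTRL-C.
    static void installHook(ExecutorService pool, Runnable flush, Runnable close) {
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> shutdown(pool, flush, close, 5_000)));
    }
}
```

Tasks that loop forever will not end on shutdown() alone; they need an interrupt (shutdownNow()) or a stop flag of their own.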
Restart Producer then shut it
down
❖ Add shutdown hook
❖ Start StockPriceKafkaProducer
❖ Now stop it (CTRL-C or hit stop button in IDE)
Lab Configuring
Producer Durability
Set default acks to all
❖ Set acks to all (the default since Kafka 3.0; earlier clients default to 1)
❖ This means that all in-sync replicas (ISRs) have to respond for
producer write to go through
Note Kafka Broker Config
❖ Broker config min.insync.replicas: at least this many in-sync replicas (ISRs) have to respond for
producer to get ack
❖ NOTE: We have three brokers in this lab and min.insync.replicas is set to 3, so all three have to be up
Run it. Run Servers. Run Producer. Kill 1
Broker
❖ If not already, startup ZooKeeper
❖ Startup three Kafka brokers
❖ using scripts described earlier
❖ From the IDE run StockPriceKafkaProducer
❖ From the terminal kill one of the Kafka Brokers
❖ Now look at the logs for the StockPriceKafkaProducer, you should see
❖ Caused by:
org.apache.kafka.common.errors.NotEnoughReplicasException
❖ Messages are rejected since there are fewer in-sync replicas than
required.
What happens when we shut one
down?
Why did the send fail?
❖ ProducerConfig.ACKS_CONFIG (acks config for
producer) was set to “all”
❖ Expects leader to only give successful ack after all
followers ack the send
❖ Broker Config min.insync.replicas set to 3
❖ At least three in-sync replicas must respond before
send is considered successful
Run it. Run Servers. Run Producer. Kill 1
Broker
❖ If not already, startup ZooKeeper
❖ Ensure all three Kafka Brokers are running; start any that are not
❖ Change StockPriceKafkaProducer acks config to 1
props.put(ProducerConfig.ACKS_CONFIG, "1"); (leader
sends ack after write to log)
❖ From the IDE run StockPriceKafkaProducer
❖ From the terminal kill one of the Kafka Brokers
❖ StockPriceKafkaProducer runs normally
What happens when we shut down a broker with acks "1"
this time?
❖ Nothing happens!
❖ It continues to work because only the leader has to ack its write
Which type of application would you only want acks set to 1?
Why did the send not fail for
acks 1?
❖ ProducerConfig.ACKS_CONFIG (acks config for
producer) was set to “1”
❖ Expects leader to only give successful ack after it
writes to its log
❖ Replicas still get replication but leader does not wait
for replication
❖ Broker Config min.insync.replicas is still set to 3
❖ This config only gets looked at if acks=“all”
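The difference between the two lab runs comes down to one check, modeled here in plain Java. This is a simplified illustration of the rule, not broker code; the class and method names are hypothetical, and it assumes a partition leader is available.

```java
class AckCheckSketch {
    // Would a write be accepted, given the producer's acks setting,
    // the current ISR count, and the configured min.insync.replicas?
    static boolean writeAccepted(String acks, int inSyncReplicas, int minInsyncReplicas) {
        boolean needsIsrQuorum = acks.equals("all") || acks.equals("-1");
        if (needsIsrQuorum) {
            // Fewer ISRs than required: NotEnoughReplicasException.
            return inSyncReplicas >= minInsyncReplicas;
        }
        // acks=0 or acks=1: min.insync.replicas is not consulted.
        return true;
    }

    public static void main(String[] args) {
        // Three brokers, one killed, min.insync.replicas=3.
        System.out.println(writeAccepted("all", 2, 3)); // rejected
        System.out.println(writeAccepted("1", 2, 3));   // accepted
    }
}
```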
Try describe topics before and
after
❖ Try this last one again
❖ Stop a server while producer is running
❖ Run describe-topics.sh (shown above)
❖ Rerun server you stopped
❖ Run describe-topics.sh again
Describe-Topics After Each
Stop/Start
All 3 brokers running
1 broker down. Leader 1 has 2 partitions
All 3 brokers running. Look at Leader 1
All 3 brokers running after a few minutes
Review Question
❖ How would you describe the above?
❖ How many servers are likely running out of the three?
❖ Would the producer still run with acks=all? Why or Why not?
❖ Would the producer still run with acks=1? Why or Why not?
❖ Would the producer still run with acks=0? Why or Why not?
❖ Which broker is the leader of partition 1?
Retry with acks = 0
❖ Run the last example again (servers, and producer)
❖ Run all three brokers then take one away
❖ Then take another broker away
❖ Run describe-topics
❖ Take all of the brokers down and continue to run the producer
❖ What do you think happens?
❖ When you are done, change acks back to acks=all
Using Kafka built-in Producer Metrics
Adding Producer Metrics and Replication Verification
Objectives of lab
❖ Setup Kafka Producer Metrics
❖ Use replication verification command line tool
❖ Change min.insync.replicas for broker and observe
metrics and replication verification
❖ Change min.insync.replicas for topic and observe
metrics and replication verification
Create Producer Metrics
Monitor
❖ Create a class called MetricsProducerReporter that is
Runnable
❖ Pass it a Kafka Producer
❖ Call producer.metrics() every 10 seconds in a while
loop from run method, and print out the MetricName
and Metric value
❖ Submit MetricsProducerReporter to the
ExecutorService in the main method of
StockPriceKafkaProducer
Create MetricsProducerReporter
(Runnable)
❖ Implements Runnable takes a producer
Create MetricsProducerReporter
(Runnable)
❖ Call producer.metrics() every 10 seconds in a while loop from run method, and print out
MetricName and Metric value
Create MetricsProducerReporter
(Runnable)
❖ Increase thread pool size by 1 to fit metrics reporting
❖ Submit instance of MetricsProducerReporter to the
ExecutorService in the main method of StockPriceKafkaProducer
(and pass new instance a producer)
Run it. Run Servers. Run
Producer.
❖ If not already, startup ZooKeeper
❖ Startup three Kafka brokers
❖ using scripts described earlier
❖ From the IDE run StockPriceKafkaProducer (ensure
acks are set to all first)
❖ Observe metrics which print out every ten seconds
Look at the output
15:40:43.858 [pool-1-thread-1] INFO c.c.k.p.MetricsProducerReporter -
Metric producer-node-metrics, outgoing-byte-rate, 1.8410309773473144,
15:40:43.858 [pool-1-thread-1] INFO c.c.k.p.MetricsProducerReporter -
Metric producer-topic-metrics, record-send-rate, 975.3229844767151,
15:40:43.858 [pool-1-thread-1] INFO c.c.k.p.MetricsProducerReporter -
Metric producer-node-metrics, request-rate, 0.040021611670301965,
The average number of requests sent per second.
15:40:43.858 [pool-1-thread-1] INFO c.c.k.p.MetricsProducerReporter -
Metric producer-node-metrics, incoming-byte-rate, 7.304382629577747,
Improve output with some
Java love
❖ Keep a set of only the metrics we want
Improve output: Filter metrics
❖ Use Java 8 Stream to filter and sort metrics
❖ Get rid of metric values that are NaN, Infinite numbers and 0s
❖ Sort map by converting it to TreeMap<String, MetricPair>
❖ MetricPair is helper class that has a Metric and a MetricName
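The filter-and-sort step can be sketched with the JDK alone. This version works on a plain Map<String, Double> of metric name to value rather than Kafka's MetricName and Metric types, which is an assumption made to keep it self-contained; the class name is hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class MetricsFilterSketch {
    // Drops NaN, infinite, and zero values, then sorts by metric name.
    static Map<String, Double> filterAndSort(Map<String, Double> raw) {
        return raw.entrySet().stream()
                .filter(e -> {
                    double v = e.getValue();
                    return !Double.isNaN(v) && !Double.isInfinite(v) && v != 0.0;
                })
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (first, second) -> first, // keys are unique; merge is unused
                        TreeMap::new));           // TreeMap keeps keys sorted
    }

    public static void main(String[] args) {
        System.out.println(filterAndSort(Map.of(
                "record-size-avg", 71.0,
                "record-error-rate", 0.0,
                "request-latency-avg", Double.NaN)));
    }
}
```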
Improve output: Pretty Print
❖ Give a nice format so we can read metrics easily
❖ Give some space and some easy indicators to find in log
Pretty Print Metrics Output
Validate Partition Replication
Checks lag every 5 seconds for stock-price
Run it. Run Servers. Run Producer. Kill
Brokers
❖ If not already, startup ZooKeeper and three Kafka
brokers
❖ Run StockPriceKafkaProducer (ensure acks are set to
all first)
❖ Start and stop different Kafka Brokers while
StockPriceKafkaProducer runs; observe metrics and
changes. Run replication verification in one
terminal and check topic stats in another
Observe Partitions Getting
Behind
Recover
Replication verification and describe-topics output of the 2nd broker recovering
Run it. Run Servers. Run Producer. Kill
Brokers
❖ Stop all Kafka Brokers (Kafka servers)
❖ Change min.insync.replicas=3 to
min.insync.replicas=2
❖ config files for broker are in lab directory under config
❖ Startup ZooKeeper if needed and three Kafka brokers
❖ Run StockPriceKafkaProducer (ensure acks are set to
all first)
❖ Start and stop different Kafka Brokers while
StockPriceKafkaProducer runs,
❖ Observe metrics, observe changes
❖ Run replication verification in one terminal and check
topic stats with describe-topics.sh in another
terminal
Expected outcome
❖ Producer will work even if one broker goes down.
❖ Producer will not work if two brokers go down because
min.insync.replicas=2: two in-sync replicas, counting the leader, have to be up
Since the Producer can run with 1 broker down, the replication lag can get
really far behind. When you start up the "failed" broker, it catches up
really fast.
Change min.insync.replicas back
❖ Shutdown all brokers
❖ Change min.insync.replicas back to 3 (min.insync.replicas=3)
❖ In the broker config for the servers
❖ Do this for all of the servers
❖ Start ZooKeeper if needed
❖ Start brokers back up
Modify bin/create-topic.sh
❖ Modify bin/create-topic.sh
❖ add --config min.insync.replicas=2
❖ Add this as param to kafka-topics.sh
❖ Run delete-topic.sh
❖ Run create-topic.sh
Recreate Topic
❖ Run delete-topic.sh
❖ Run create-topic.sh
Run it. Run Servers. Run Producer. Kill Brokers
❖ Stop all Kafka Brokers (Kafka servers)
❖ Start up ZooKeeper if needed and three Kafka brokers
❖ Run StockPriceKafkaProducer (ensure acks is set to all
first)
❖ Start and stop different Kafka Brokers while
StockPriceKafkaProducer runs
❖ Observe metrics, observe changes
❖ Run replication verification in one terminal and check topic
stats with describe-topics.sh in another terminal
Expected Results
❖ The min.insync.replicas on the Topic config overrides
the min.insync.replicas on the Broker config
❖ In this setup, you can survive a single node failure
but not two (output below is recovery)
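The precedence rule stated above can be pictured as a simple lookup with a fallback. This is a hypothetical model, not how the broker is implemented: EffectiveConfig and resolve are invented names, and the two maps stand in for the topic-level and broker-level settings.

```java
import java.util.Map;

/** Hypothetical model of a topic-level setting taking precedence over the broker default. */
final class EffectiveConfig {

    /** Returns the topic override if present, otherwise the broker-level value. */
    static String resolve(String key, Map<String, String> topicOverrides, Map<String, String> brokerDefaults) {
        String topicValue = topicOverrides.get(key);
        return topicValue != null ? topicValue : brokerDefaults.get(key);
    }

    public static void main(String[] args) {
        Map<String, String> broker = Map.of("min.insync.replicas", "3");
        Map<String, String> topic = Map.of("min.insync.replicas", "2");
        // Topic created with --config min.insync.replicas=2: the override wins.
        System.out.println(resolve("min.insync.replicas", topic, broker));
        // Topic without an override: falls back to the broker value.
        System.out.println(resolve("min.insync.replicas", Map.of(), broker));
    }
}
```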
Lab Batching Records
Objectives
❖ Disable batching and observe metrics
❖ Enable batching and observe metrics
❖ Increase batch size and linger and observe metrics
❖ Run consumer to see batch sizes change
❖ Enable compression, observe results
SimpleStockPriceConsumer
❖ We added a SimpleStockPriceConsumer to consume
StockPrices and display batch lengths for poll()
❖ We will cover it only briefly, not in detail, since this is a
Producer lab, not a Consumer lab. :)
❖ Run this while you are running the
StockPriceKafkaProducer
❖ While SimpleStockPriceConsumer runs, try various batch and
linger configs and observe the Producer metrics and the
StockPriceKafkaProducer output
SimpleStockPriceConsumer
❖ Similar to other Consumer examples so far
❖ Subscribes to stock-prices topic
❖ Has custom deserializer
SimpleStockPriceConsumer.runConsumer
❖ Drains topic; Creates map of current stocks; Calls
displayRecordsStatsAndStocks()
SimpleStockPriceConsumer.display…(
❖ Prints out size of each partition read and total record count
❖ Prints out each stock at its current price
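The consumer steps described above (building a map of current stocks, then reporting per-partition sizes and the total record count) can be sketched without any Kafka types. PollStats and PolledRecord are hypothetical stand-ins for illustration; the lab's actual consumer code is not shown in this text.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Stand-alone sketch; PolledRecord is a hypothetical stand-in for one consumed record. */
final class PollStats {

    static final class PolledRecord {
        final int partition;
        final String stockName;
        final int dollars;
        final int cents;

        PolledRecord(int partition, String stockName, int dollars, int cents) {
            this.partition = partition;
            this.stockName = stockName;
            this.dollars = dollars;
            this.cents = cents;
        }
    }

    /** Number of records read from each partition in one poll. */
    static Map<Integer, Integer> countByPartition(List<PolledRecord> batch) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (PolledRecord r : batch) {
            counts.merge(r.partition, 1, Integer::sum);
        }
        return counts;
    }

    /** Latest record per stock name; later records in the batch overwrite earlier ones. */
    static Map<String, PolledRecord> currentStocks(List<PolledRecord> batch) {
        Map<String, PolledRecord> current = new TreeMap<>();
        for (PolledRecord r : batch) {
            current.put(r.stockName, r);
        }
        return current;
    }

    public static void main(String[] args) {
        List<PolledRecord> batch = List.of(
                new PolledRecord(0, "IBM", 100, 10),
                new PolledRecord(1, "UBER", 737, 78),
                new PolledRecord(0, "IBM", 101, 25));
        System.out.println("Total records: " + batch.size());
        countByPartition(batch).forEach((p, n) -> System.out.println("partition " + p + ": " + n));
        currentStocks(batch).forEach((name, r) ->
                System.out.println(name + " " + r.dollars + "." + String.format("%02d", r.cents)));
    }
}
```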
StockDeserializer
Disable batching
❖ Start by disabling batching
❖ This turns batching off
❖ Run this
❖ Check Consumer and stats
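The slide's code is not reproduced in this text, so the following is only a minimal sketch of one way to turn batching off. It assumes the producer is configured through a java.util.Properties object and uses the literal key batch.size (the value behind ProducerConfig.BATCH_SIZE_CONFIG); a batch size of zero disables batching. NoBatchingConfig is a hypothetical helper name.

```java
import java.util.Properties;

/** Sketch only: producer settings with batching disabled. */
final class NoBatchingConfig {

    static Properties noBatching() {
        Properties props = new Properties();
        // A batch size of 0 bytes disables batching entirely.
        props.setProperty("batch.size", "0");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(noBatching().getProperty("batch.size")); // 0
    }
}
```

In the lab these properties would be merged into the ones already passed to the KafkaProducer constructor.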
Metrics No Batch
Records per poll averages around
Batch Size is 80
Set batching to 16K
❖ Set the batch size to 16K
❖ Run this
❖ Check Consumer and stats
Set batching to 16K Results
Consumer Records per poll averages around 7.5
Batch Size is now 136.02
About 70% larger batches than with no batching (136.02 vs. 80)
Look how much the request queue time shrunk!
[Metrics screenshots: 16K Batch Size vs. No Batch]
Look at the record-send-rate: 200% faster!
Set batching to 16K and linger to 10ms
❖ Set the batch size to 16K and linger to 10ms
❖ Run this
❖ Check Consumer and Stats
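As a hedged illustration of this step, the sketch below builds the two settings involved, batch.size (bytes) and linger.ms (milliseconds), from parameters so the later experiments can reuse it. BatchingConfig and batching are hypothetical names, and "16K" is assumed to mean 16 * 1024 bytes.

```java
import java.util.Properties;

/** Sketch only: builds producer settings for one batching experiment. */
final class BatchingConfig {

    /** batch.size is in bytes; linger.ms is in milliseconds. */
    static Properties batching(int batchSizeBytes, long lingerMs) {
        if (batchSizeBytes < 0 || lingerMs < 0) {
            throw new IllegalArgumentException("batch size and linger must be non-negative");
        }
        Properties props = new Properties();
        props.setProperty("batch.size", Integer.toString(batchSizeBytes));
        props.setProperty("linger.ms", Long.toString(lingerMs));
        return props;
    }

    public static void main(String[] args) {
        // 16K batch and 10 ms linger, the values used in this step.
        Properties props = batching(16 * 1024, 10);
        System.out.println(props.getProperty("batch.size")); // 16384
        System.out.println(props.getProperty("linger.ms"));  // 10
    }
}
```

The same helper covers the later runs, for example batching(64 * 1024, 1000) for 64K with a 1 second linger.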
Results batch size 16K linger 10ms
Consumer Records per poll averages around 17
Batch Size is now 796
About 5.85 times the batch size of the 16K run (796 vs. 136.02)
[Metrics screenshots: 16K and 10ms linger vs. 16K vs. No Batch]
The record-send-rate went down but is still higher than at the start
Set batching to 64K and linger to 1 second
❖ Set the batch size to 64K and linger to 1 second
❖ Run this
❖ Check Consumer and Stats
Results batch size 64K linger 1s
Consumer Records per poll averages around 500
Batch Size is now 40K
Record Queue Time is very high :(
[Metrics screenshots: 64K/1s vs. 16K/10ms vs. 16K vs. No Batch]
Look at the network-io-rate
Set batching to 64K and linger to 100 ms
❖ Set the batch size to 64K and linger to 100 ms
❖ Run this
❖ Check Consumer and Stats
Results batch size 64K linger 100ms
[Metrics screenshots: 16K, 16K/10ms, 64K/1s, 64K/100ms]
64K batch size with 100ms linger has the highest record-send-rate!
Turn on compression snappy, 50ms, 64K
❖ Enable compression
❖ Set the batch size to 64K and linger to 50 ms
❖ Run this
❖ Check Consumer and Stats
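A minimal sketch of the settings for this step, assuming the producer is configured through java.util.Properties. The key compression.type and the codec names come from the earlier Compressing Batches slide; CompressionConfig and compressed are hypothetical names, and the codec check is only an illustration.

```java
import java.util.Properties;
import java.util.Set;

/** Sketch only: producer settings with compression plus batching. */
final class CompressionConfig {

    // Codecs listed in this deck; newer Kafka versions may accept others.
    private static final Set<String> CODECS = Set.of("none", "gzip", "snappy", "lz4");

    static Properties compressed(String codec, int batchSizeBytes, long lingerMs) {
        if (!CODECS.contains(codec)) {
            throw new IllegalArgumentException("unknown codec: " + codec);
        }
        Properties props = new Properties();
        props.setProperty("compression.type", codec);
        props.setProperty("batch.size", Integer.toString(batchSizeBytes));
        props.setProperty("linger.ms", Long.toString(lingerMs));
        return props;
    }

    public static void main(String[] args) {
        // Snappy, 64K batches, 50 ms linger, as in this step.
        Properties props = compressed("snappy", 64 * 1024, 50);
        System.out.println(props.getProperty("compression.type")); // snappy
        System.out.println(props.getProperty("batch.size"));       // 65536
    }
}
```

Compression is applied per batch, so it tends to pay off more as batches grow, which is why it is paired with a 64K batch size here.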
Results Snappy, batch size 64K, linger 50ms
[Metrics screenshots: 64K/100ms vs. Snappy 64K/50ms]
Snappy 64K/50ms has the highest record-send-rate and
1/2 the queue time!
Lab Adding Retries and Timeouts
Objectives
❖ Set up timeouts
❖ Set up retries
❖ Set up retry back off
❖ Set in-flight messages to 1 so retries don’t store
records out of order
Run it. Run Servers. Run Producer. Kill Brokers
❖ Start up ZooKeeper if needed and three Kafka brokers
❖ Modify StockPriceKafkaProducer to configure retry,
timeouts, in-flight message count and retry back off
❖ Run StockPriceKafkaProducer
❖ Start and stop any two different Kafka Brokers while
StockPriceKafkaProducer runs
❖ Notice retry messages in log of StockPriceKafkaProducer
Modify StockPriceKafkaProducer
❖ to configure retry, timeouts, in-flight message count and retry back off
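The slide's code is not reproduced in this text. As a sketch of what such a configuration might look like, the block below sets the four properties named above using their literal keys. The specific values (3 retries, 15 second request timeout, 1 second back off) are illustrative assumptions, not the lab's values, and RetryConfig is a hypothetical helper name.

```java
import java.util.Properties;

/** Sketch only: retry, timeout, back off and in-flight settings for a producer. */
final class RetryConfig {

    static Properties retrySettings() {
        Properties props = new Properties();
        // Retry transient send failures up to 3 times (illustrative value).
        props.setProperty("retries", "3");
        // Give up on a single request after 15 seconds (illustrative value).
        props.setProperty("request.timeout.ms", "15000");
        // Wait 1 second after a failure before retrying (illustrative value).
        props.setProperty("retry.backoff.ms", "1000");
        // Only one unacknowledged request per connection, so a retry cannot reorder records.
        props.setProperty("max.in.flight.requests.per.connection", "1");
        return props;
    }

    public static void main(String[] args) {
        retrySettings().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```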
Expected output after 2 broker shutdown
Run all.
Kill any two servers.
Look for retry messages.
Restart them and see that it recovers.
Also use replica verification to see when the broker catches up.
WARN Inflight Message Count
❖ MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION
"max.in.flight.requests.per.connection"
❖ max number of unacknowledged requests client sends on a single
connection before blocking
❖ If > 1 and a send fails, then there is a risk of message
re-ordering on the partition during the retry attempt
❖ Depends on use, but for StockPrices re-ordering is not good: you should pick
retries > 0 or inflight > 1 but not both. Avoid duplicates. :)
❖ June 2017 release might fix this with sequence from producer
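The re-ordering risk can be made concrete with a toy model. The code below is not Kafka code and makes strong simplifying assumptions: batches for one partition are sent in windows of up to maxInFlight, a failed batch fails exactly once, and it is retried at the front of the next window. InFlightReorderDemo and logOrder are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of one partition, NOT Kafka code: shows how a retry can reorder batches. */
final class InFlightReorderDemo {

    /**
     * Returns the order in which batches 0..batchCount-1 land in the partition log.
     * Batches listed in failOnFirstAttempt fail once and succeed when retried.
     */
    static List<Integer> logOrder(int batchCount, int maxInFlight, Set<Integer> failOnFirstAttempt) {
        if (maxInFlight < 1) {
            throw new IllegalArgumentException("maxInFlight must be at least 1");
        }
        List<Integer> log = new ArrayList<>();
        Deque<Integer> pending = new ArrayDeque<>();
        for (int i = 0; i < batchCount; i++) {
            pending.add(i);
        }
        Set<Integer> alreadyFailed = new HashSet<>();
        while (!pending.isEmpty()) {
            // Send up to maxInFlight batches without waiting for acknowledgments.
            List<Integer> window = new ArrayList<>();
            while (window.size() < maxInFlight && !pending.isEmpty()) {
                window.add(pending.poll());
            }
            List<Integer> retry = new ArrayList<>();
            for (int batch : window) {
                if (failOnFirstAttempt.contains(batch) && alreadyFailed.add(batch)) {
                    retry.add(batch);   // first attempt failed, must be resent
                } else {
                    log.add(batch);     // appended to the partition log
                }
            }
            // Failed batches are retried before any batch that has not been sent yet.
            for (int i = retry.size() - 1; i >= 0; i--) {
                pending.addFirst(retry.get(i));
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(logOrder(3, 2, Set.of(0))); // [1, 0, 2]: batch 1 overtook batch 0
        System.out.println(logOrder(3, 1, Set.of(0))); // [0, 1, 2]: order preserved
    }
}
```

With two requests in flight, batch 1 is already written when batch 0 is retried, so the log ends up out of order; with one request in flight the retry happens before anything else is sent.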
Lab Write ProducerInterceptor
Objectives
❖ Setup an interceptor for request sends
❖ Create ProducerInterceptor
❖ Implement onSend
❖ Implement onAcknowledge
Producer Interception
❖ Producer config property: interceptor.classes
❖ default is empty (you can pass a comma-delimited list)
❖ interceptors implement the ProducerInterceptor interface
❖ intercept records before the producer sends them to the broker and after acks
❖ you could mutate records
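Since interceptor.classes takes a comma-delimited list, a short sketch of building that value may help. It assumes the producer is configured through java.util.Properties; InterceptorConfig is a hypothetical helper, and the two class names in main are made-up placeholders, not classes from the lab.

```java
import java.util.List;
import java.util.Properties;

/** Sketch only: sets interceptor.classes from a list of class names. */
final class InterceptorConfig {

    static Properties withInterceptors(List<String> interceptorClassNames) {
        Properties props = new Properties();
        // The property value is a comma-delimited list of fully qualified class names.
        props.setProperty("interceptor.classes", String.join(",", interceptorClassNames));
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical class names, for illustration only.
        Properties props = withInterceptors(List.of(
                "com.example.StockProducerInterceptor",
                "com.example.AuditInterceptor"));
        System.out.println(props.getProperty("interceptor.classes"));
    }
}
```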
KafkaProducer - Interceptor Config
KafkaProducer
ProducerInterceptor
ProducerInterceptor onSend
ic=stock-prices2 key=UBER value=StockPrice{dollars=737, cents=78, name='
Output
ProducerInterceptor onAck
onAck topic=stock-prices2, part=0, offset=18360
Output
ProducerInterceptor the rest
Run it. Run Servers. Run Producer.
❖ Start up ZooKeeper if needed
❖ Start or restart Kafka brokers
❖ Run StockPriceKafkaProducer
❖ Look for log messages from ProducerInterceptor in output
ProducerInterceptor Output
Lab Write Custom Partitioner
Objectives
❖ Create StockPricePartitioner
❖ Implements interface Partitioner
❖ Implement partition() method
❖ Implement configure() method with importantStocks
property
❖ Configure new Partitioner in Producer config with
property ProducerConfig.PARTITIONER_CLASS_CONFIG
❖ Pass config property importantStocks
Producer Partitioning
❖ Producer config property: partitioner.class
❖ org.apache.kafka.clients.producer.internals.DefaultPartitioner
❖ Partitioner class implements Partitioner interface
❖ partition() method takes topic, key, value, and cluster
❖ returns partition number for record
StockPricePartitioner
StockPricePartitioner.configure()
❖ Implement configure() method
❖ with importantStocks property
❖ importantStocks get added to importantStocks HashSet
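A rough sketch of the parsing that configure() might do is shown below as a stand-alone function, so it runs without the Kafka client library. It takes the same Map<String, ?> shape that configure() receives. The comma-separated value format is an assumption (the property format is not stated here), and ImportantStocksConfig is a hypothetical name.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Sketch only: extracts the importantStocks property into a HashSet. */
final class ImportantStocksConfig {

    /** Assumes the property value is a comma-separated list such as "IBM,UBER". */
    static Set<String> parseImportantStocks(Map<String, ?> configs) {
        Set<String> importantStocks = new HashSet<>();
        Object value = configs.get("importantStocks");
        if (value != null) {
            for (String name : value.toString().split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty()) {
                    importantStocks.add(trimmed);
                }
            }
        }
        return importantStocks;
    }

    public static void main(String[] args) {
        Set<String> stocks = parseImportantStocks(Map.of("importantStocks", "IBM, UBER"));
        System.out.println(stocks.contains("IBM") + " " + stocks.contains("UBER"));
    }
}
```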
StockPricePartitioner partition()
❖ IMPORTANT STOCK: If stockName is in the
importantStocks HashSet, then put it in partitionNum
= (partitionCount - 1) (last partition)
❖ REGULAR STOCK: Otherwise, if stockName is not in importantStocks,
use the absolute value of the hash of the stockName
modulo (partitionCount - 1) as the partition to send the record to
❖ partitionNum = abs(stockName.hashCode()) % (partitionCount - 1)
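The two rules above can be written as a pure function, shown here as a sketch that runs without the Kafka client library. StockPartitionLogic and partitionFor are hypothetical names. Two deliberate deviations are assumptions of this sketch: it requires at least two partitions (otherwise partitionCount - 1 is zero), and it takes the absolute value after the modulo, which gives the same result as the formula above for every hash except Integer.MIN_VALUE, where Math.abs would stay negative.

```java
import java.util.Set;

/** Sketch only: the partition choice described above, without Kafka types. */
final class StockPartitionLogic {

    static int partitionFor(String stockName, int partitionCount, Set<String> importantStocks) {
        if (partitionCount < 2) {
            throw new IllegalArgumentException("need at least 2 partitions");
        }
        if (importantStocks.contains(stockName)) {
            // Important stocks get the last partition to themselves.
            return partitionCount - 1;
        }
        // Regular stocks are spread over the remaining partitions 0..partitionCount-2.
        return Math.abs(stockName.hashCode() % (partitionCount - 1));
    }

    public static void main(String[] args) {
        Set<String> important = Set.of("IBM");
        System.out.println(partitionFor("IBM", 3, important));  // 2 (last partition)
        System.out.println(partitionFor("UBER", 3, important)); // 0 or 1
    }
}
```

In the real partitioner the partition count would come from the cluster metadata passed to partition().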
StockPricePartitioner
partition()
Producer Config: Configuring Partitioner
❖ Configure new Partitioner in Producer config with property
ProducerConfig.PARTITIONER_CLASS_CONFIG
❖ Pass config property importantStocks
❖ importantStocks are the ones that go into the priority queue
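As a sketch of this configuration step, the block below sets partitioner.class (the literal key behind ProducerConfig.PARTITIONER_CLASS_CONFIG) together with the custom importantStocks property. The package name com.example, the helper PartitionerConfig, and the comma-separated stock list are assumptions for illustration.

```java
import java.util.Properties;

/** Sketch only: registers a custom partitioner and passes it a custom property. */
final class PartitionerConfig {

    static Properties withPartitioner(String partitionerClassName, String importantStocks) {
        Properties props = new Properties();
        props.setProperty("partitioner.class", partitionerClassName);
        // Custom property, read by the partitioner in its configure() method.
        props.setProperty("importantStocks", importantStocks);
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical package name and stock list, for illustration only.
        Properties props = withPartitioner("com.example.StockPricePartitioner", "IBM,UBER");
        System.out.println(props.getProperty("partitioner.class"));
        System.out.println(props.getProperty("importantStocks"));
    }
}
```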
Review of lab work
❖ You implemented custom ProducerSerializer
❖ You tested failover configuring broker/topic min.insync.replicas, and acks
❖ You implemented batching and compression and used metrics to see how
it was or was not working
❖ You implemented retries and timeouts, and tested that they worked
❖ You set up max inflight messages and retry back off
❖ You implemented a ProducerInterceptor
❖ You implemented a custom partitioner to implement a priority queue for
important stocks

Kafka Tutorial: Advanced Producers

  • 1.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Working with Kafka Producers Kafka Producer Advanced Working with producers in Java Details and advanced topics
  • 2.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Objectives Create Producer ❖ Cover advanced topics regarding Java Kafka Consumers ❖ Custom Serializers ❖ Custom Partitioners ❖ Batching ❖ Compression ❖ Retries and Timeouts 2
  • 3.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Kafka Producer ❖ Kafka client that publishes records to Kafka cluster ❖ Thread safe ❖ Producer has pool of buffer that holds to-be-sent records ❖ background I/O threads turning records into request bytes and transmit requests to Kafka ❖ Close producer so producer will not leak resources
  • 4.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Kafka Producer Send, Acks and Buffers ❖ send() method is asynchronous ❖ adds the record to output buffer and return right away ❖ buffer used to batch records for efficiency IO and compression ❖ acks config controls Producer record durability. ”all" setting ensures full commit of record, and is most durable and least fast setting ❖ Producer can retry failed requests ❖ Producer has buffers of unsent records per topic partition (sized at batch.size)
  • 5.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Kafka Producer: Buffering and batching ❖ Kafka Producer buffers are available to send immediately as fast as broker can keep up (limited by inflight max.in.flight.requests.per.connection) ❖ To reduce requests count, set linger.ms > 0 ❖ wait up to linger.ms before sending or until batch fills up whichever comes first ❖ Under heavy load linger.ms not met, under light producer load used to increase broker IO throughput and increase compression ❖ buffer.memory controls total memory available to producer for buffering ❖ If records sent faster than they can be transmitted to Kafka then this buffer gets exceeded then additional send calls block. If period blocks (max.block.ms) after then Producer throws a TimeoutException
  • 6.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Producer Acks ❖ Producer Config property acks ❖ (default all) ❖ Write Acknowledgment received count required from partition leader before write request deemed complete ❖ Controls Producer sent records durability ❖ Can be all (-1), none (0), or leader (1)
  • 7.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Acks 0 (NONE) ❖ acks=0 ❖ Producer does not wait for any ack from broker at all ❖ Records added to the socket buffer are considered sent ❖ No guarantees of durability - maybe ❖ Record Offset returned is set to -1 (unknown) ❖ Record loss if leader is down ❖ Use Case: maybe log aggregation
  • 8.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Acks 1 (LEADER) ❖ acks=1 ❖ Partition leader wrote record to its local log but responds without followers confirmed writes ❖ If leader fails right after sending ack, record could be lost ❖ Followers might have not replicated the record ❖ Record loss is rare but possible ❖ Use Case: log aggregation
  • 9.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Acks -1 (ALL) ❖ acks=all or acks=-1 ❖ Leader gets write confirmation from full set of ISRs before sending ack to producer ❖ Guarantees record not be lost as long as one ISR remains alive ❖ Strongest available guarantee ❖ Even stronger with broker setting min.insync.replicas (specifies the minimum number of ISRs that must acknowledge a write) ❖ Most Use Cases will use this and set a min.insync.replicas > 1
  • 10.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial KafkaProducer config Acks
  • 11.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Producer Buffer Memory Size ❖ Producer config property: buffer.memory ❖ default 32MB ❖ Total memory (bytes) producer can use to buffer records to be sent to broker ❖ Producer blocks up to max.block.ms if buffer.memory is exceeded ❖ if it is sending faster than the broker can receive, exception is thrown
  • 12.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Batching by Size ❖ Producer config property: batch.size ❖ Default 16K ❖ Producer batch records ❖ fewer requests for multiple records sent to same partition ❖ Improved IO throughput and performance on both producer and server ❖ If record is larger than the batch size, it will not be batched ❖ Producer sends requests containing multiple batches ❖ batch per partition ❖ Small batch size reduce throughput and performance. If batch size is too big, memory allocated for batch is wasted
  • 13.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Batching by Time and Size - 1 ❖ Producer config property: linger.ms ❖ Default 0 ❖ Producer groups together any records that arrive before they can be sent into a batch ❖ good if records arrive faster than they can be sent out ❖ Producer can reduce requests count even under moderate load using linger.ms
  • 14.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Batching by Time and Size - 2 ❖ linger.ms adds delay to wait for more records to build up so larger batches are sent ❖ good brokers throughput at cost of producer latency ❖ If producer gets records who size is batch.size or more for a broker’s leader partitions, then it is sent right away ❖ If Producers gets less than batch.size but linger.ms interval has passed, then records for that partition are sent ❖ Increase to improve throughput of Brokers and reduce broker load (common improvement)
  • 15.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Compressing Batches ❖ Producer config property: compression.type ❖ Default 0 ❖ Producer compresses request data ❖ By default producer does not compress ❖ Can be set to none, gzip, snappy, or lz4 ❖ Compression is by batch ❖ improves with larger batch sizes ❖ End to end compression possible if Broker config “compression.type” set to producer. Compressed data from producer sent to log and consumer by broker
  • 16.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Batching and Compression Example
  • 17.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Custom Serializers ❖ You don’t have to use built in serializers ❖ You can write your own ❖ Just need to be able to convert to/fro a byte[] ❖ Serializers work for keys and values ❖ value.serializer and key.serializer
  • 18.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Custom Serializers Config
  • 19.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Custom Serializer
  • 20.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial StockPrice
  • 21.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Broker Follower Write Timeout ❖ Producer config property: request.timeout.ms ❖ Default 30 seconds (30,000 ms) ❖ Maximum time broker waits for confirmation from followers to meet Producer acknowledgment requirements for ack=all ❖ Measure of broker to broker latency of request ❖ 30 seconds is high, long process time is indicative of problems
  • 22.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Producer Request Timeout ❖ Producer config property: request.timeout.ms ❖ Default 30 seconds (30,000 ms) ❖ Maximum time producer waits for request to complete to broker ❖ Measure of producer to broker latency of request ❖ 30 seconds is very high, long request time is an indicator that brokers can’t handle load
  • 23.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Producer Retries ❖ Producer config property: retries ❖ Default 0 ❖ Retry count if Producer does not get ack from Broker ❖ only if record send fail deemed a transient error (API) ❖ as if your producer code resent record on failed attempt ❖ timeouts are retried, retry.backoff.ms (default to 100 ms) to wait after failure before retry
  • 24.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Retry, Timeout, Back-off Example
  • 25.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Producer Partitioning ❖ Producer config property: partitioner.class ❖ org.apache.kafka.clients.producer.internals.DefaultPartitioner ❖ Partitioner class implements Partitioner interface ❖ Default Partitioner partitions using hash of key if record has key ❖ Default Partitioner partitions uses round-robin if record has no key
  • 26.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Configuring Partitioner
  • 27.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial StockPricePartitioner
  • 28.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial StockPricePartitioner partition()
  • 29.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial StockPricePartitioner
  • 30.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Producer Interception ❖ Producer config property: interceptor.classes ❖ empty (you can pass an comma delimited list) ❖ interceptors implementing ProducerInterceptor interface ❖ intercept records producer sent to broker and after acks ❖ you could mutate records
  • 31.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial KafkaProducer - Interceptor Config
  • 32.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial KafkaProducer ProducerInterceptor
  • 33.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial ProducerInterceptor onSend ic=stock-prices2 key=UBER value=StockPrice{dollars=737, cents=78, name=' Output
  • 34.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial ProducerInterceptor onAck onAck topic=stock-prices2, part=0, offset=18360 Output
  • 35.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial ProducerInterceptor the rest
  • 36.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial KafkaProducer send() Method ❖ Two forms of send with callback and with no callback both return Future ❖ Asynchronously sends a record to a topic ❖ Callback gets invoked when send has been acknowledged. ❖ send is asynchronous and return right away as soon as record has added to send buffer ❖ Sending many records at once without blocking for response from Kafka broker ❖ Result of send is a RecordMetadata ❖ record partition, record offset, record timestamp ❖ Callbacks for records sent to same partition are executed in order
  • 37.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial KafkaProducer send() Exceptions ❖ InterruptException - If the thread is interrupted while blocked (API) ❖ SerializationException - If key or value are not valid objects given configured serializers (API) ❖ TimeoutException - If time taken for fetching metadata or allocating memory exceeds max.block.ms, or getting acks from Broker exceed timeout.ms, etc. (API) ❖ KafkaException - If Kafka error occurs not in public API. (API)
  • 38.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Using send method
  • 39.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial KafkaProducer flush() method ❖ flush() method sends all buffered records now (even if linger.ms > 0) ❖ blocks until requests complete ❖ Useful when consuming from some input system and pushing data into Kafka ❖ flush() ensures all previously sent messages have been sent ❖ you could mark progress as such at completion of flush
  • 40.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial KafkaProducer close() ❖ close() closes producer ❖ frees resources (threads and buffers) associated with producer ❖ Two forms of method ❖ both block until all previously sent requests complete or duration passed in as args is exceeded ❖ close with no params equivalent to close(Long.MAX_VALUE, TimeUnit.MILLISECONDS). ❖ If producer is unable to complete all requests before the timeout expires, all unsent requests fail, and this method fails
  • 41.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Orderly shutdown using close
  • 42.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Wait for clean close
  • 43.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial KafkaProducer partitionsFor() method ❖ partitionsFor(topic) returns meta data for partitions ❖ public List<PartitionInfo> partitionsFor(String topic) ❖ Get partition metadata for give topic ❖ Produce that do their own partitioning would use this ❖ for custom partitioning ❖ PartitionInfo(String topic, int partition, Node leader, Node[] replicas, Node[] inSyncReplicas) ❖ Node(int id, String host, int port, optional String rack)
  • 44.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial KafkaProducer metrics() method ❖ metrics() method get map of metrics ❖ public Map<MetricName,? extends Metric> metrics() ❖ Get the full set of producer metrics MetricName( String name, String group, String description, Map<String,String> tags )
  • 45.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Metrics producer.metrics() ❖ Call producer.metrics() ❖ Prints out metrics to log
  • 46.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Metrics producer.metrics() output Metric producer-metrics, record-queue-time-max, 508.0, The maximum time in ms record batches spent in the record accumulator. 17:09:22.721 [pool-1-thread-9] INFO c.c.k.p.MetricsProducerReporter - Metric producer-node-metrics, request-rate, 0.025031289111389236, The average number of requests sent per second. 17:09:22.721 [pool-1-thread-9] INFO c.c.k.p.MetricsProducerReporter - Metric producer-metrics, records-per-request-avg, 205.55263157894737, The average number of records per request. 17:09:22.722 [pool-1-thread-9] INFO c.c.k.p.MetricsProducerReporter - Metric producer-metrics, record-size-avg, 71.02631578947368, The average record size 17:09:22.722 [pool-1-thread-9] INFO c.c.k.p.MetricsProducerReporter - Metric producer-node-metrics, request-size-max, 56.0, The maximum size of any request sent in the window. 17:09:22.723 [pool-1-thread-9] INFO c.c.k.p.MetricsProducerReporter - Metric producer-metrics, request-size-max, 12058.0, The maximum size of any request sent in the window. 17:09:22.723 [pool-1-thread-9] INFO c.c.k.p.MetricsProducerReporter - Metric producer-metrics, compression-rate-avg, 0.41441360272859273, The average compression rate of record batches.
  • 47.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Metrics via JMX
  • 48.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial StockPrice Producer Java Example Lab StockPrice Producer Java Example
  • 49.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial StockPrice App to demo Advanced Producer ❖ StockPrice - holds a stock price has a name, dollar, and cents ❖ StockPriceKafkaProducer - Configures and creates KafkaProducer<String, StockPrice>, StockSender list, ThreadPool (ExecutorService), starts StockSender runnable into thread pool ❖ StockAppConstants - holds topic and broker list ❖ StockPriceSerializer - can serialize a StockPrice into byte[] ❖ StockSender - generates somewhat random stock prices for a given StockPrice name, Runnable, 1 thread per StockSender ❖ Shows using KafkaProducer from many threads
  • 50.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial StockPrice domain object ❖ has name ❖ dollars ❖ cents ❖ converts itself to JSON
  • 51.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial StockPriceKafkaProducer ❖ Import classes and setup logger ❖ Create createProducer method to create KafkaProducer instance ❖ Create setupBootstrapAndSerializers to initialize bootstrap servers, client id, key serializer and custom serializer (StockPriceSerializer) ❖ Write main() method - creates producer, create StockSender list passing each instance a producer, creates a thread pool so every stock sender gets it own thread, runs each stockSender in its own thread
  • 52.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial StockPriceKafkaProducer imports, createProducer ❖ Import classes and setup logger ❖ createProducer used to create a KafkaProducer ❖ createProducer() calls setupBoostrapAnd Serializers()
  • 53.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial Configure Producer Bootstrap and Serializer ❖ Create setupBootstrapAndSerializers to initialize bootstrap servers, client id, key serializer and custom serializer (StockPriceSerializer) ❖ StockPriceSerializer will serialize StockPrice into bytes
  • 54.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial StockPriceKafkaProducer.mai n() ❖ main method - creates producer, ❖ create StockSender list passing each instance a producer ❖ creates a thread pool (executorService) ❖ every StockSender runs in its own thread
  • 55.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial StockAppConstants ❖ topic name for Producer example ❖ List of bootstrap servers
  • 56.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial StockPriceKafkaProducer.getStockSen derList
  • 57.
    ™ Cassandra / KafkaSupport in EC2/AWS. Kafka Training, Kafka Consulting, Kafka Tutorial StockPriceSerializer ❖ Converts StockPrice into byte array
  • 58.
    StockSender
    ❖ Generates random stock prices for a given StockPrice name
    ❖ StockSender is Runnable
    ❖ 1 thread per StockSender
    ❖ Shows using KafkaProducer from many threads
    ❖ Delays a random time between delayMin and delayMax,
    ❖ then sends a random StockPrice between stockPriceLow and stockPriceHigh
  • 59.
    StockSender imports, Runnable
    ❖ Imports Kafka Producer, ProducerRecord, RecordMetadata, StockPrice
    ❖ Implements Runnable, can be submitted to ExecutorService
  • 60.
    StockSender constructor
    ❖ takes a topic, high & low stockPrice, producer, delay min & max
  • 61.
    StockSender run()
    ❖ In a loop: creates a random record, sends the record, waits a random time
  • 62.
    StockSender createRandomRecord
    ❖ createRandomRecord uses randomIntBetween
    ❖ creates StockPrice and then wraps StockPrice in ProducerRecord
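One way randomIntBetween and the random price could be written, as a stdlib-only sketch. The (min, max) argument order, the inclusive bounds, and the randomPrice helper are assumptions; the lab's own method may differ. Wrapping the result in a StockPrice and a ProducerRecord is left out so the block runs without the Kafka client.

```java
import java.util.concurrent.ThreadLocalRandom;

class RandomPriceSketch {
    // Random int in [min, max], both ends inclusive. Assumes max < Integer.MAX_VALUE.
    static int randomIntBetween(int min, int max) {
        return ThreadLocalRandom.current().nextInt(min, max + 1);
    }

    // Hypothetical helper: returns {dollars, cents} with dollars in [lowDollars, highDollars].
    static int[] randomPrice(int lowDollars, int highDollars) {
        int dollars = randomIntBetween(lowDollars, highDollars);
        int cents = randomIntBetween(0, 99);
        return new int[] {dollars, cents};
    }
}
```

ThreadLocalRandom avoids contention when many StockSender threads draw numbers at once.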
  • 63.
    StockSender displayRecordMetaData
    ❖ Every 100 records displayRecordMetaData gets called
    ❖ Prints out record info and RecordMetadata info:
    ❖ key, JSON value, topic, partition, offset, time
    ❖ uses Future from call to producer.send()
  • 64.
    Run it
    ❖ Run ZooKeeper
    ❖ Run three Brokers
    ❖ Run create-topic.sh
    ❖ Run StockPriceKafkaProducer
  • 65.
    Run scripts
    ❖ run ZooKeeper from ~/kafka-training
    ❖ use bin/create-topic.sh to create topic
    ❖ use bin/delete-topic.sh to delete topic
    ❖ use bin/start-1st-server.sh to run Kafka Broker 0
    ❖ use bin/start-2nd-server.sh to run Kafka Broker 1
    ❖ use bin/start-3rd-server.sh to run Kafka Broker 2
    Config is under directory called config
    server-0.properties is for Kafka Broker 0
    server-1.properties is for Kafka Broker 1
    server-2.properties is for Kafka Broker 2
  • 66.
    Run All 3 Brokers
  • 67.
    Run create-topic.sh script
    Name of the topic is stock-prices
    Three partitions
    Replication factor of three
  • 68.
    Run StockPriceKafkaProducer
    ❖ Run StockPriceKafkaProducer from the IDE
    ❖ You should see log messages from StockSender(s) with StockPrice name, JSON value, partition, offset, and time
  • 69.
    Lab: Using flush and close
    Adding an orderly shutdown with flush and close
  • 70.
    Shutdown Producer nicely
    ❖ Handle ctrl-C shutdown from Java
    ❖ Shut down thread pool and wait
    ❖ Flush producer to send any outstanding batches if using batches (producer.flush())
    ❖ Close Producer (producer.close()) and wait
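The shutdown order above can be sketched as follows. To keep the block runnable without the Kafka client, the producer is represented by a small hypothetical interface, FlushableProducer, exposing only flush() and close(); in the lab the real KafkaProducer would take its place. The 200 ms wait is an arbitrary illustrative value.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

class OrderlyShutdownSketch {
    // Hypothetical stand-in for the two KafkaProducer calls the slide names.
    interface FlushableProducer {
        void flush();
        void close();
    }

    // Stops the sender threads first, then flushes and closes the producer.
    static void shutdown(ExecutorService executorService, FlushableProducer producer) {
        executorService.shutdown(); // accept no new tasks
        try {
            if (!executorService.awaitTermination(200, TimeUnit.MILLISECONDS)) {
                executorService.shutdownNow(); // interrupt senders that are still running
            }
        } catch (InterruptedException e) {
            executorService.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        producer.flush(); // send any batches still buffered
        producer.close(); // release producer resources
    }

    // Registers the sequence above to run when the JVM exits (for example on ctrl-C).
    static void addShutdownHook(ExecutorService executorService, FlushableProducer producer) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> shutdown(executorService, producer)));
    }
}
```

Stopping the thread pool before flushing means no StockSender can add new records after the final flush.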
  • 71.
    Nice Shutdown: producer.close()
  • 72.
    Restart Producer then shut it down
    ❖ Add shutdown hook
    ❖ Start StockPriceKafkaProducer
    ❖ Now stop it (CTRL-C or hit stop button in IDE)
  • 73.
    Lab: Configuring Producer Durability
  • 74.
    Set default acks to all
    ❖ Set acks to all (this is the producer default in Kafka 3.0 and later; older clients default to acks=1)
    ❖ This means that all in-sync replicas (ISRs) have to respond for the producer write to go through
  • 75.
    Note Kafka Broker Config
    ❖ Broker config min.insync.replicas (set to 3 in this lab): at least this many in-sync replicas (ISRs) have to respond for the producer to get an ack
    ❖ NOTE: We have three brokers in this lab, so all three have to be up
  • 76.
    Run it. Run Servers. Run Producer. Kill 1 Broker
    ❖ If not already running, start up ZooKeeper
    ❖ Start up three Kafka brokers
    ❖ using scripts described earlier
    ❖ From the IDE run StockPriceKafkaProducer
    ❖ From the terminal kill one of the Kafka Brokers
    ❖ Now look at the logs for the StockPriceKafkaProducer, you should see
    ❖ Caused by: org.apache.kafka.common.errors.NotEnoughReplicasException
    ❖ Messages are rejected since there are fewer in-sync replicas than required.
  • 77.
    What happens when we shut one down?
  • 78.
    Why did the send fail?
    ❖ ProducerConfig.ACKS_CONFIG (acks config for producer) was set to "all"
    ❖ Expects leader to give a successful ack only after all in-sync followers ack the send
    ❖ Broker Config min.insync.replicas set to 3
    ❖ At least three in-sync replicas must respond before send is considered successful
  • 79.
    Run it. Run Servers. Run Producer. Kill 1 Broker
    ❖ If not already running, start up ZooKeeper
    ❖ Ensure all three Kafka Brokers are running
    ❖ Change StockPriceKafkaProducer acks config to 1: props.put(ProducerConfig.ACKS_CONFIG, "1"); (leader sends ack after write to log)
    ❖ From the IDE run StockPriceKafkaProducer
    ❖ From the terminal kill one of the Kafka Brokers
    ❖ StockPriceKafkaProducer runs normally
  • 80.
    What happens when we shut down a broker with acks "1" this time?
    ❖ Nothing happens!
    ❖ It continues to work because only the leader has to ack its write
    For which type of application would you want acks set to only 1?
  • 81.
    Why did the send not fail for acks 1?
    ❖ ProducerConfig.ACKS_CONFIG (acks config for producer) was set to "1"
    ❖ Expects leader to give a successful ack as soon as it writes to its log
    ❖ Followers still replicate the record, but the leader does not wait for replication
    ❖ Broker Config min.insync.replicas is still set to 3
    ❖ This config only gets looked at if acks="all"
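The interaction between acks and min.insync.replicas can be captured in a few lines. This is a simplified model for reasoning about the lab, not broker code: writeAccepted is a hypothetical helper, and it only covers acks=all (also spelled -1) and acks=1.

```java
class AcksModelSketch {
    // Simplified model: would the leader accept a produce request?
    // inSyncReplicas counts the leader itself.
    static boolean writeAccepted(String acks, int inSyncReplicas, int minInsyncReplicas) {
        if (inSyncReplicas < 1) {
            return false; // no leader available for the partition
        }
        if ("all".equals(acks) || "-1".equals(acks)) {
            // With acks=all the write is rejected (NotEnoughReplicasException)
            // when fewer replicas are in sync than min.insync.replicas requires.
            return inSyncReplicas >= minInsyncReplicas;
        }
        // With acks=1 min.insync.replicas is not consulted; a live leader is enough.
        return true;
    }

    public static void main(String[] args) {
        // Three brokers, one killed, min.insync.replicas=3, as in the lab.
        System.out.println("acks=all: " + writeAccepted("all", 2, 3));
        System.out.println("acks=1:   " + writeAccepted("1", 2, 3));
    }
}
```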
  • 82.
    Try describe topics before and after
    ❖ Try this last one again
    ❖ Stop a server while producer is running
    ❖ Run describe-topics.sh (shown above)
    ❖ Rerun server you stopped
    ❖ Run describe-topics.sh again
  • 83.
    Describe-Topics After Each Stop/Start
    All 3 brokers running
    1 broker down. Leader 1 has 2 partitions
    All 3 brokers running. Look at Leader 1
    All 3 brokers running after a few minutes
  • 84.
    Review Question
    ❖ How would you describe the above?
    ❖ How many servers are likely running out of the three?
    ❖ Would the producer still run with acks=all? Why or why not?
    ❖ Would the producer still run with acks=1? Why or why not?
    ❖ Would the producer still run with acks=0? Why or why not?
    ❖ Which broker is the leader of partition 1?
  • 85.
    Retry with acks = 0
    ❖ Run the last example again (servers and producer)
    ❖ Run all three brokers then take one away
    ❖ Then take another broker away
    ❖ Run describe-topics
    ❖ Take all of the brokers down and continue to run the producer
    ❖ What do you think happens?
    ❖ When you are done, change acks back to acks=all
  • 86.
    Using Kafka built-in Producer Metrics
    Adding Producer Metrics and Replication Verification
  • 87.
    Objectives of lab
    ❖ Set up Kafka Producer Metrics
    ❖ Use replication verification command line tool
    ❖ Change min.insync.replicas for broker and observe metrics and replication verification
    ❖ Change min.insync.replicas for topic and observe metrics and replication verification
  • 88.
    Create Producer Metrics Monitor
    ❖ Create a class called MetricsProducerReporter that is Runnable
    ❖ Pass it a Kafka Producer
    ❖ Call producer.metrics() every 10 seconds in a while loop from run method, and print out the MetricName and Metric value
    ❖ Submit MetricsProducerReporter to the ExecutorService in the main method of StockPriceKafkaProducer
  • 89.
    Create MetricsProducerReporter (Runnable)
    ❖ Implements Runnable, takes a producer
  • 90.
    Create MetricsProducerReporter (Runnable)
    ❖ Call producer.metrics() every 10 seconds in a while loop from run method, and print out MetricName and Metric value
  • 91.
    Create MetricsProducerReporter (Runnable)
    ❖ Increase thread pool size by 1 to fit metrics reporting
    ❖ Submit instance of MetricsProducerReporter to the ExecutorService in the main method of StockPriceKafkaProducer (and pass new instance a producer)
  • 92.
    Run it. Run Servers. Run Producer.
    ❖ If not already running, start up ZooKeeper
    ❖ Start up three Kafka brokers
    ❖ using scripts described earlier
    ❖ From the IDE run StockPriceKafkaProducer (ensure acks are set to all first)
    ❖ Observe metrics which print out every ten seconds
  • 93.
    Look at the output
    15:40:43.858 [pool-1-thread-1] INFO c.c.k.p.MetricsProducerReporter - Metric producer-node-metrics, outgoing-byte-rate, 1.8410309773473144,
    15:40:43.858 [pool-1-thread-1] INFO c.c.k.p.MetricsProducerReporter - Metric producer-topic-metrics, record-send-rate, 975.3229844767151,
    15:40:43.858 [pool-1-thread-1] INFO c.c.k.p.MetricsProducerReporter - Metric producer-node-metrics, request-rate, 0.040021611670301965, The average number of requests sent per second.
    15:40:43.858 [pool-1-thread-1] INFO c.c.k.p.MetricsProducerReporter - Metric producer-node-metrics, incoming-byte-rate, 7.304382629577747,
  • 94.
    Improve output with some Java love
    ❖ Keep a set of only the metrics we want
  • 95.
    Improve output: Filter metrics
    ❖ Use Java 8 Stream to filter and sort metrics
    ❖ Get rid of metric values that are NaN, infinite, or 0
    ❖ Sort the map by converting it to TreeMap<String, MetricPair>
    ❖ MetricPair is a helper class that has a Metric and a MetricName
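The filter-and-sort step might look like the sketch below. To stay runnable without the Kafka client it works on a plain Map from "group:name" strings to double values instead of the MetricName and Metric objects returned by producer.metrics(), and it sorts into a TreeMap<String, Double> rather than the lab's TreeMap<String, MetricPair>. The key format and the filterAndSort name are assumptions.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;

class MetricsFilterSketch {
    // Keeps only wanted metrics with a usable value and returns them sorted by key.
    static TreeMap<String, Double> filterAndSort(Map<String, Double> metricValues, Set<String> wanted) {
        return metricValues.entrySet().stream()
                // keep only the metric names we care about
                .filter(e -> wanted.contains(nameOf(e.getKey())))
                // drop NaN, infinite and zero values
                .filter(e -> !e.getValue().isNaN() && !e.getValue().isInfinite() && e.getValue() != 0.0)
                // TreeMap keeps the entries sorted by "group:name"
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (a, b) -> a, TreeMap::new));
    }

    // Extracts the metric name from an assumed "group:name" key.
    static String nameOf(String key) {
        return key.substring(key.indexOf(':') + 1);
    }
}
```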
  • 96.
    Improve output: Pretty Print
    ❖ Give a nice format so we can read metrics easily
    ❖ Give some space and some easy indicators to find in log
  • 97.
    Pretty Print Metrics Output
  • 98.
    Validate Partition Replication
    Checks lag every 5 seconds for stock-prices
  • 99.
    Run it. Run Servers. Run Producer. Kill Brokers
    ❖ If not already running, start up ZooKeeper and three Kafka brokers
    ❖ Run StockPriceKafkaProducer (ensure acks are set to all first)
    ❖ Start and stop different Kafka Brokers while StockPriceKafkaProducer runs
    ❖ Observe metrics, observe changes
    ❖ Run replication verification in one terminal and check topic stats in another
  • 100.
    Observe Partitions Getting Behind
  • 101.
    Recover
    Replication Verification
    Describe Topics
    Output of 2nd Broker Recovering
  • 102.
    Run it. Run Servers. Run Producer. Kill Brokers
    ❖ Stop all Kafka Brokers (Kafka servers)
    ❖ Change min.insync.replicas=3 to min.insync.replicas=2
    ❖ config files for broker are in lab directory under config
    ❖ Start up ZooKeeper if needed and three Kafka brokers
    ❖ Run StockPriceKafkaProducer (ensure acks are set to all first)
    ❖ Start and stop different Kafka Brokers while StockPriceKafkaProducer runs
    ❖ Observe metrics, observe changes
    ❖ Run replication verification in one terminal and check topic stats with describe-topics.sh in another terminal
  • 103.
    Expected outcome
    ❖ Producer will work even if one broker goes down.
    ❖ Producer will not work if two brokers go down: with min.insync.replicas=2, at least two in-sync replicas (the leader counts as one) have to be up
    Since the Producer can run with one broker down, that broker's replicas can fall really far behind. When you start up the "failed" broker, it catches up really fast.
  • 104.
    Change min.insync.replicas back
    ❖ Shut down all brokers
    ❖ Change back to min.insync.replicas=3
    ❖ Broker config for servers
    ❖ Do this for all of the servers
    ❖ Start ZooKeeper if needed
    ❖ Start brokers back up
  • 105.
    Modify bin/create-topic.sh
    ❖ Modify bin/create-topic.sh
    ❖ add --config min.insync.replicas=2
    ❖ Add this as param to kafka-topics.sh
    ❖ Run delete-topic.sh
    ❖ Run create-topic.sh
  • 106.
    Recreate Topic
    ❖ Run delete-topic.sh
    ❖ Run create-topic.sh
  • 107.
    Run it. Run Servers. Run Producer. Kill Brokers
    ❖ Stop all Kafka Brokers (Kafka servers)
    ❖ Start up ZooKeeper if needed and three Kafka brokers
    ❖ Run StockPriceKafkaProducer (ensure acks are set to all first)
    ❖ Start and stop different Kafka Brokers while StockPriceKafkaProducer runs
    ❖ Observe metrics, observe changes
    ❖ Run replication verification in one terminal and check topic stats with describe-topics.sh in another terminal
  • 108.
    Expected Results
    ❖ The min.insync.replicas on the Topic config overrides the min.insync.replicas on the Broker config
    ❖ In this setup, you can survive a single node failure but not two (output below is recovery)
  • 109.
    Lab: Batching Records
  • 110.
    Objectives
    ❖ Disable batching and observe metrics
    ❖ Enable batching and observe metrics
    ❖ Increase batch size and linger and observe metrics
    ❖ Run consumer to see batch sizes change
    ❖ Enable compression, observe results
  • 111.
    SimpleStockPriceConsumer
    ❖ We added a SimpleStockPriceConsumer to consume StockPrices and display batch lengths for poll()
    ❖ We will cover it only briefly since this is a Producer lab, not a Consumer lab. :)
    ❖ Run this while you are running the StockPriceKafkaProducer
    ❖ While you are running SimpleStockPriceConsumer with various batch and linger configs, observe the Producer metrics and the StockPriceKafkaProducer output
  • 112.
    SimpleStockPriceConsumer
    ❖ Similar to other Consumer examples so far
    ❖ Subscribes to stock-prices topic
    ❖ Has custom deserializer (StockDeserializer)
  • 113.
    SimpleStockPriceConsumer.runConsumer
    ❖ Drains topic; creates map of current stocks; calls displayRecordsStatsAndStocks()
  • 114.
    SimpleStockPriceConsumer.display…(
    ❖ Prints out size of each partition read and total record count
    ❖ Prints out each stock at its current price
  • 115.
    StockDeserializer
  • 116.
    Disable batching
    ❖ Start by disabling batching
    ❖ This turns batching off
    ❖ Run this
    ❖ Check Consumer and stats
  • 117.
    Metrics No Batch
    Records per poll averages around …
    Batch Size is 80
  • 118.
    Set batching to 16K
    ❖ Set the batch size to 16K
    ❖ Run this
    ❖ Check Consumer and stats
  • 119.
    Set batching to 16K Results
    Consumer: records per poll averages around 7.5
    Batch Size is now 136.02, 59% more batching
    Look how much the request queue time shrunk!
    Look at the record-send-rate: 200% faster!
    Panels compared: No Batch, 16K Batch Size
  • 120.
    Set batching to 16K and linger to 10ms
    ❖ Set the batch size to 16K and linger to 10ms
    ❖ Run this
    ❖ Check Consumer and Stats
  • 121.
    Results: batch size 16K, linger 10ms
    Consumer: records per poll averages around 17
    Batch Size is now 796, about 5.85 times the 136.02 seen with 16K batching alone
    The record-send-rate went down but is still higher than at the start
    Panels compared: No Batch, 16K, 16K and 10ms linger
  • 122.
    Set batching to 64K and linger to 1 second
    ❖ Set the batch size to 64K and linger to 1 second
    ❖ Run this
    ❖ Check Consumer and Stats
  • 123.
    Results: batch size 64K, linger 1s
    Consumer: records per poll averages around 500
    Batch Size is now 40K
    Record Queue Time is very high :(
    Look at the network-io-rate
    Panels compared: No Batch, 16K, 16K/10ms, 64K/1s
  • 124.
    Set batching to 64K and linger to 100 ms
    ❖ Set the batch size to 64K and linger to 100 ms
    ❖ Run this
    ❖ Check Consumer and Stats
  • 125.
    Results: batch size 64K, linger 100ms
    64K batch size with 100ms linger has the highest record-send-rate!
    Panels compared: 16K, 16K/10ms, 64K/1s, 64K/100ms
  • 126.
    Turn on compression snappy, 50ms, 64K
    ❖ Enable compression
    ❖ Set the batch size to 64K and linger to 50 ms
    ❖ Run this
    ❖ Check Consumer and Stats
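The settings tried across this lab can be summarized in one stdlib-only sketch. The keys batch.size, linger.ms and compression.type are the standard producer property names, and a batch.size of 0 disables batching; the batchingConfig helper itself is hypothetical, and values are passed as strings, which the producer config parser accepts.

```java
import java.util.Properties;

class BatchingConfigSketch {
    // Builds the batching-related producer settings for one lab run.
    static Properties batchingConfig(int batchSizeBytes, int lingerMs, String compression) {
        Properties props = new Properties();
        props.put("batch.size", Integer.toString(batchSizeBytes)); // 0 turns batching off
        props.put("linger.ms", Integer.toString(lingerMs));        // max wait for a batch to fill
        props.put("compression.type", compression);                // e.g. "none" or "snappy"
        return props;
    }

    public static void main(String[] args) {
        System.out.println(batchingConfig(0, 0, "none"));            // batching disabled
        System.out.println(batchingConfig(16 * 1024, 0, "none"));    // 16K batches
        System.out.println(batchingConfig(16 * 1024, 10, "none"));   // 16K, 10 ms linger
        System.out.println(batchingConfig(64 * 1024, 1000, "none")); // 64K, 1 s linger
        System.out.println(batchingConfig(64 * 1024, 100, "none"));  // 64K, 100 ms linger
        System.out.println(batchingConfig(64 * 1024, 50, "snappy")); // snappy, 64K, 50 ms linger
    }
}
```

Merging one of these Properties objects into the producer config before each run keeps the experiments easy to compare.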
  • 127.
    Results: Snappy compression, batch size 64K, linger 50ms
    Snappy 64K/50ms has the highest record-send-rate and 1/2 the queue time!
    Panels compared: 64K/100ms, Snappy 64K/50ms
  • 128.
    Lab: Adding Retries and Timeouts
  • 129.
    Objectives
    ❖ Set up timeouts
    ❖ Set up retries
    ❖ Set up retry back off
    ❖ Set in-flight messages to 1 so retries don't store records out of order
  • 130.
    Run it. Run Servers. Run Producer. Kill Brokers
    ❖ Start up ZooKeeper if needed and three Kafka brokers
    ❖ Modify StockPriceKafkaProducer to configure retry, timeouts, in-flight message count and retry back off
    ❖ Run StockPriceKafkaProducer
    ❖ Start and stop any two different Kafka Brokers while StockPriceKafkaProducer runs
    ❖ Notice retry messages in log of StockPriceKafkaProducer
  • 131.
    Modify StockPriceKafkaProducer
    ❖ to configure retry, timeouts, in-flight message count and retry back off
  • 132.
    Expected output after 2 broker shutdown
    Run all. Kill any two servers. Look for retry messages.
    Restart them and see that it recovers
    Also use replica verification to see when broker catches up
  • 133.
    WARN Inflight Message Count
    ❖ MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION "max.in.flight.requests.per.connection"
    ❖ max number of unacknowledged requests the client sends on a single connection before blocking
    ❖ If this is > 1 and sends fail, there is a risk of message re-ordering on the partition during retry attempts
    ❖ Depends on use, but for StockPrices re-ordering is not good: pick retries > 0 or in-flight > 1, but not both. Avoid duplicates. :)
    ❖ June 2017 release might fix this with sequence numbers from the producer
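The retry-related settings and the re-ordering rule above can be sketched as follows. The property keys are the standard producer config names; the numeric values are illustrative only, mayReorderOnRetry is a hypothetical helper, and the check assumes idempotence is not enabled on the producer.

```java
import java.util.Properties;

class RetryConfigSketch {
    // Illustrative retry and timeout settings; the values are not taken from the lab.
    static Properties retryConfig() {
        Properties props = new Properties();
        props.put("retries", "3");                               // resend a failed request up to 3 times
        props.put("retry.backoff.ms", "1000");                   // wait between retry attempts
        props.put("request.timeout.ms", "15000");                // how long to wait for a broker response
        props.put("max.in.flight.requests.per.connection", "1"); // one unacknowledged request at a time
        return props;
    }

    // True when a retried request could land behind a later one on the same partition.
    // The fallback values are used only when a key is missing from props.
    static boolean mayReorderOnRetry(Properties props) {
        int retries = Integer.parseInt(props.getProperty("retries", "0"));
        int inFlight = Integer.parseInt(props.getProperty("max.in.flight.requests.per.connection", "5"));
        return retries > 0 && inFlight > 1;
    }

    public static void main(String[] args) {
        System.out.println("may reorder: " + mayReorderOnRetry(retryConfig()));
    }
}
```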
  • 134.
    Lab: Write ProducerInterceptor
  • 135.
    Objectives
    ❖ Set up an interceptor for request sends
    ❖ Create ProducerInterceptor
    ❖ Implement onSend
    ❖ Implement onAcknowledge
  • 136.
    Producer Interception
    ❖ Producer config property: interceptor.classes
    ❖ empty by default (you can pass a comma-delimited list)
    ❖ interceptors implement the ProducerInterceptor interface
    ❖ intercept records before the producer sends them to the broker and after acks
    ❖ you could mutate records
  • 137.
    KafkaProducer - Interceptor Config
  • 138.
    KafkaProducer ProducerInterceptor
  • 139.
    ProducerInterceptor onSend
    Output: ic=stock-prices2 key=UBER value=StockPrice{dollars=737, cents=78, name='
  • 140.
    ProducerInterceptor onAck
    Output: onAck topic=stock-prices2, part=0, offset=18360
  • 141.
    ProducerInterceptor the rest
  • 142.
    Run it. Run Servers. Run Producer.
    ❖ Start up ZooKeeper if needed
    ❖ Start or restart Kafka brokers
    ❖ Run StockPriceKafkaProducer
    ❖ Look for log message from ProducerInterceptor in output
  • 143.
    ProducerInterceptor Output
  • 144.
    Lab: Write Custom Partitioner
  • 145.
    Objectives
    ❖ Create StockPricePartitioner
    ❖ Implements interface Partitioner
    ❖ Implement partition() method
    ❖ Implement configure() method with importantStocks property
    ❖ Configure new Partitioner in Producer config with property ProducerConfig.PARTITIONER_CLASS_CONFIG
    ❖ Pass config property importantStocks
  • 146.
    Producer Partitioning
    ❖ Producer config property: partitioner.class
    ❖ org.apache.kafka.clients.producer.internals.DefaultPartitioner
    ❖ Partitioner class implements Partitioner interface
    ❖ partition() method takes topic, key, value, and cluster
    ❖ returns partition number for record
  • 147.
    StockPricePartitioner
  • 148.
    StockPricePartitioner.configure()
    ❖ Implement configure() method
    ❖ with importantStocks property
    ❖ importantStocks get added to importantStocks HashSet
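A sketch of how configure() might read the importantStocks property. It assumes the property is passed as a comma-delimited string, which the slides do not state, and readImportantStocks is a hypothetical static helper that takes the same Map<String, ?> argument as a Partitioner's configure method so the block runs without the Kafka client.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ImportantStocksConfigSketch {
    // Reads the assumed comma-delimited "importantStocks" property into a HashSet.
    static Set<String> readImportantStocks(Map<String, ?> configs) {
        Set<String> importantStocks = new HashSet<>();
        Object value = configs.get("importantStocks");
        if (value != null) {
            for (String part : value.toString().split(",")) {
                String stock = part.trim();
                if (!stock.isEmpty()) {
                    importantStocks.add(stock); // ignore blanks such as a trailing comma
                }
            }
        }
        return importantStocks;
    }
}
```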
  • 149.
    StockPricePartitioner partition()
    ❖ IMPORTANT STOCK: If stockName is in the importantStocks HashSet, put it in the last partition: partitionNum = (partitionCount - 1)
    ❖ REGULAR STOCK: Otherwise use the absolute value of the hash of the stockName modulo (partitionCount - 1) as the partition for the record
    ❖ partitionNum = abs(stockName.hashCode()) % (partitionCount - 1)
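The rule above, written as a stdlib-only function rather than a full Partitioner implementation. Two details are assumptions of this sketch: it takes abs after the modulo, Math.abs(hash % n), which gives the same result as the slide's formula for every hash except Integer.MIN_VALUE (where abs alone would stay negative), and it rejects topics with fewer than two partitions, since the regular-stock formula would divide by zero.

```java
import java.util.Set;

class StockPartitionSketch {
    // Important stocks get the last partition; all others share the remaining ones.
    static int partition(String stockName, Set<String> importantStocks, int partitionCount) {
        if (partitionCount < 2) {
            throw new IllegalArgumentException("need at least 2 partitions");
        }
        if (importantStocks.contains(stockName)) {
            return partitionCount - 1; // reserved "priority" partition
        }
        // The remainder lies in (-(n), n), so taking abs afterwards is always safe.
        return Math.abs(stockName.hashCode() % (partitionCount - 1));
    }
}
```

In the lab the partition count would come from the Cluster argument of partition(), and the stock name from the record key.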
  • 150.
    StockPricePartitioner partition()
  • 151.
    Producer Config: Configuring Partitioner
    ❖ Configure new Partitioner in Producer config with property ProducerConfig.PARTITIONER_CLASS_CONFIG
    ❖ Pass config property importantStocks
    ❖ importantStocks are the ones that go into the priority queue
  • 152.
    Review of lab work
    ❖ You implemented a custom Serializer (StockPriceSerializer)
    ❖ You tested failover by configuring broker/topic min.insync.replicas and acks
    ❖ You implemented batching and compression and used metrics to see how it was or was not working
    ❖ You implemented retries and timeouts, and tested that they worked
    ❖ You set up max in-flight messages and retry back off
    ❖ You implemented a ProducerInterceptor
    ❖ You implemented a custom partitioner to implement a priority queue for important stocks

Editor's Notes

  • #25 Mention why IN_Flight is set to 1.