topics and partitions:
https://codingharbour.com/apache-kafka/the-introduction-to-kafka-topics-and-partitions/
Partitions serve several purposes in Kafka.
From a Kafka broker’s point of view, partitions allow a single topic to be distributed over multiple servers. That way a topic can store more data than a single server could hold. For example, if you needed to store 10 TB of data in a topic and you had 3 brokers, one option would be to create a topic with one partition and store all 10 TB on one broker. Another option would be to create a topic with 3 partitions and spread the 10 TB of data over all the brokers.
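As a rough illustration of the 10 TB arithmetic, here is a minimal Python sketch. It assumes data is spread evenly across partitions and that partitions are placed on brokers round-robin. The function name data_per_broker and the broker ids are made up for the example; real Kafka placement and uneven key distribution will give different numbers.

```python
from collections import defaultdict


def data_per_broker(total_tb, num_partitions, broker_ids):
    """Return {broker_id: TB stored}.

    Assumes evenly sized partitions placed round-robin on brokers,
    which is a simplification and not Kafka's actual placement logic.
    """
    per_partition = total_tb / num_partitions
    load = defaultdict(float)
    for p in range(num_partitions):
        broker = broker_ids[p % len(broker_ids)]
        load[broker] += per_partition
    return dict(load)


# One partition: the whole topic lives on a single broker.
one_partition = data_per_broker(10, 1, [0, 1, 2])
# Three partitions: each broker holds roughly a third of the data.
three_partitions = data_per_broker(10, 3, [0, 1, 2])

print(one_partition)
print(three_partitions)
```

With one partition only broker 0 holds data; with three partitions each broker holds about a third of the 10 TB.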
replication across brokers:
https://stackoverflow.com/questions/44787552/in-kafka-is-each-message-replicated-across-all-partitions-of-a-topic#:~:text=Each%20message%20goes%20into%20a,is%20replicated%20across%20those%20brokers.
Replication does not occur across partitions. Each message goes into a single partition of the topic, no matter how many partitions the topic has.
If you have set the replication factor for the topic to a number larger than 1 (assuming you have multiple brokers running in the cluster), then each partition of the topic is replicated on that many brokers.
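To make the distinction concrete, here is a minimal sketch in Python. It is not Kafka's actual code. It assumes a keyed message is mapped to one partition by hashing its key (zlib.crc32 is only a stand-in for the hash Kafka really uses) and that the replicas of a partition sit on consecutive brokers. pick_partition and replica_brokers are hypothetical helper names.

```python
import zlib


def pick_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to exactly one partition (stand-in hash, not Kafka's)."""
    return zlib.crc32(key) % num_partitions


def replica_brokers(partition, broker_ids, replication_factor):
    """Brokers holding a copy of the partition, assuming consecutive placement."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    start = partition % len(broker_ids)
    return [broker_ids[(start + i) % len(broker_ids)]
            for i in range(replication_factor)]


brokers = [0, 1, 2]
# The message lands in one partition only, however many partitions exist.
partition = pick_partition(b"order-42", num_partitions=3)
# That one partition is then copied to `replication_factor` brokers.
copies = replica_brokers(partition, brokers, replication_factor=2)
print(f"message goes to partition {partition}, stored on brokers {copies}")
```

The message is written to a single partition; the replication factor only decides how many brokers keep a copy of that partition.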
consumer groups:
Consumer group
A consumer group is a group of consumers (I guess you didn’t see this coming?) that share the same group id. When a topic is consumed by consumers in the same group, every record will be delivered to only one consumer. As the official documentation states: “If all the consumer instances have the same consumer group, then the records will effectively be load-balanced over the consumer instances.”
This way you can ensure parallel processing of records from a topic and be sure that your consumers won’t be stepping on each other’s toes.
How does Kafka achieve this?
Each topic consists of one or more partitions. When a new consumer is started, it joins a consumer group (this happens under the hood), and Kafka then ensures that each partition is consumed by only one consumer from that group.
So, if you have a topic with two partitions and only one consumer in a group, that consumer would consume records from both partitions.
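The assignment rule can be sketched as follows. This is a simplified round-robin model in Python, not Kafka's actual assignor; assign_partitions and the consumer names are made up for the example.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer in the group (round-robin)."""
    if not consumers:
        raise ValueError("a consumer group needs at least one consumer")
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment


# Two partitions, one consumer: it reads both partitions.
print(assign_partitions([0, 1], ["consumer-a"]))
# {'consumer-a': [0, 1]}

# Two partitions, two consumers: one partition each.
print(assign_partitions([0, 1], ["consumer-a", "consumer-b"]))
# {'consumer-a': [0], 'consumer-b': [1]}

# Two partitions, three consumers: the third consumer sits idle.
print(assign_partitions([0, 1], ["consumer-a", "consumer-b", "consumer-c"]))
# {'consumer-a': [0], 'consumer-b': [1], 'consumer-c': []}
```

The last case shows why adding more consumers than partitions to a group does not add parallelism: a partition is never split between two consumers of the same group.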
