

Kafka® is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.

First, let's review some basic messaging terminology:

Kafka maintains feeds of messages in categories called topics.
We'll call processes that publish messages to a Kafka topic producers.
We'll call processes that subscribe to topics and process the feed of published messages consumers.
Kafka is run as a cluster composed of one or more servers, each of which is called a broker.

So, at a high level, producers send messages over the network to the Kafka cluster, which in turn serves them up to consumers.

Each partition is an ordered, immutable sequence of messages that is continually appended to, in other words a commit log. The messages in the partitions are each assigned a sequential id number called the offset that uniquely identifies each message within the partition.

The Kafka cluster retains all published messages, whether or not they have been consumed, for a configurable period of time. For example, if the log retention is set to two days, then for the two days after a message is published it is available for consumption, after which it will be discarded to free up space. Kafka's performance is effectively constant with respect to data size, so retaining lots of data is not a problem.
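To make the ideas of offsets and time-based retention concrete, here is a minimal in-memory sketch of a single partition. It is a toy model for illustration only, not how Kafka stores data: the class name `PartitionLog`, its methods, and the `retention_seconds` parameter are all hypothetical and do not correspond to any Kafka API or configuration key.

```python
import time
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy model of one partition: an append-only sequence of messages.

    Hypothetical illustration; none of these names exist in Kafka.
    """
    retention_seconds: float
    base_offset: int = 0  # offset of the oldest message still retained
    _entries: list = field(default_factory=list)  # (timestamp, payload) pairs

    def append(self, payload, now=None):
        """Append a message and return the sequential offset assigned to it."""
        now = time.time() if now is None else now
        self._entries.append((now, payload))
        return self.base_offset + len(self._entries) - 1

    def read(self, offset):
        """Return the payload at `offset`. Reading never removes a message."""
        index = offset - self.base_offset
        if index < 0 or index >= len(self._entries):
            raise IndexError(f"offset {offset} is not retained")
        return self._entries[index][1]

    def expire(self, now=None):
        """Discard messages older than the retention period, consumed or not.

        Returns the number of messages discarded.
        """
        now = time.time() if now is None else now
        dropped = 0
        while self._entries and now - self._entries[0][0] > self.retention_seconds:
            self._entries.pop(0)
            dropped += 1
        # Offsets of surviving messages never change; only the base moves up.
        self.base_offset += dropped
        return dropped
```

In this sketch, a message appended at time 0 with a two-day `retention_seconds` stays readable through `read` no matter how many times it has been consumed, and only disappears once `expire` is called with a `now` more than two days later. Offsets of the remaining messages are unchanged by expiry, which mirrors the idea that an offset uniquely identifies a message within its partition.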
