What is Apache Kafka?
Kafka is designed for distributed systems. It is mainly used to move data into and out of systems such as Hadoop through a messaging system. A messaging system transfers data from one application to another, so the applications can focus on the data itself rather than on how it is transferred; it is based on message queuing. There are two types of messaging system:
1. Point to Point
2. Publish-Subscribe
Apache Kafka is a publish-subscribe messaging system and is suitable for both online and offline message consumption. Kafka persists messages on disk and replicates them within the cluster to prevent data loss. Kafka runs on top of ZooKeeper, which it uses for coordination.
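The difference between the two messaging types can be sketched with a small in-memory model. This is only an illustration and does not use Kafka itself; the class names PointToPointQueue and PubSubTopic are hypothetical and chosen for this example.

```python
from collections import deque


class PointToPointQueue:
    """Toy point-to-point queue: each message is consumed by exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Taking the message removes it, so no other receiver can see it.
        return self._messages.popleft() if self._messages else None


class PubSubTopic:
    """Toy publish-subscribe topic: every subscriber gets its own copy."""

    def __init__(self):
        self._inboxes = {}

    def subscribe(self, name):
        self._inboxes[name] = []

    def publish(self, message):
        # Deliver a copy of the message to every current subscriber.
        for inbox in self._inboxes.values():
            inbox.append(message)

    def messages_for(self, name):
        return list(self._inboxes[name])


queue = PointToPointQueue()
queue.send("order-1")
first = queue.receive()   # the single receiver that gets the message
second = queue.receive()  # nothing is left for anyone else

topic = PubSubTopic()
topic.subscribe("billing")
topic.subscribe("analytics")
topic.publish("order-1")
```

In the queue, the second receive finds nothing because the first one already took the message. In the topic, both the billing and the analytics subscriber see order-1.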
Architecture and Components of Apache Kafka:
Apache Kafka has the following six components:
1. Broker
2. Producer
3. Consumer
4. ZooKeeper
5. Kafka Cluster
6. Follower
Brokers store the published data and share the load of the cluster between them. A Kafka cluster typically has multiple brokers. Kafka uses ZooKeeper to keep track of broker state. A single broker can handle a large number of messages without data loss.
The producer sends data to the brokers. When a new broker is started, producers automatically begin sending messages to it. Depending on its acknowledgment setting (acks), the producer can send messages without waiting for broker acknowledgments. The messages it sends are stored by the broker in segment files.
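To illustrate how messages end up in segment files, here is a minimal sketch of an append-only log that rolls over to a new segment when the current one is full. It is a toy model, not Kafka's storage code: the class name SegmentedLog and the per-segment message limit are assumptions made for this example, and a real broker rolls segments based on their size and age rather than on a message count.

```python
class SegmentedLog:
    """Toy append-only log that rolls to a new segment when the active one is full."""

    def __init__(self, max_messages_per_segment):
        self.max_messages_per_segment = max_messages_per_segment
        self.segments = [[]]   # each inner list stands in for one segment file
        self.next_offset = 0   # offset that the next appended message will get

    def append(self, message):
        # Roll to a fresh segment once the active one has reached its limit.
        if len(self.segments[-1]) >= self.max_messages_per_segment:
            self.segments.append([])
        self.segments[-1].append(message)
        offset = self.next_offset
        self.next_offset += 1
        return offset


log = SegmentedLog(max_messages_per_segment=2)
offsets = [log.append(m) for m in ["a", "b", "c", "d", "e"]]
```

With a limit of two messages per segment, the five messages are spread over three segments, and each message keeps a unique, increasing offset.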
The main purpose of a consumer is to read the messages that the brokers hold for the topics it subscribes to. A consumer can read a large number of messages per second. Each published message is delivered to every subscribed consumer group, and within a group it is handled by one consumer.
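Delivery to consumer groups can be sketched as follows. This is a simplified in-memory model, not the Kafka client API: the class GroupedTopic is hypothetical, and it picks the consumer inside a group round-robin per message, whereas Kafka assigns partitions to the consumers of a group.

```python
class GroupedTopic:
    """Toy topic: every group sees each message, one consumer per group handles it."""

    def __init__(self):
        self._groups = {}    # group name -> list of consumer names
        self._next = {}      # group name -> index of the next consumer to use
        self.delivered = {}  # consumer name -> messages it has received

    def join(self, group, consumer):
        self._groups.setdefault(group, []).append(consumer)
        self._next.setdefault(group, 0)
        self.delivered[consumer] = []

    def publish(self, message):
        for group, members in self._groups.items():
            # Pick exactly one member of this group for this message.
            index = self._next[group] % len(members)
            self.delivered[members[index]].append(message)
            self._next[group] = index + 1


topic = GroupedTopic()
topic.join("billing", "billing-1")
topic.join("billing", "billing-2")
topic.join("audit", "audit-1")

for message in ["m1", "m2", "m3"]:
    topic.publish(message)
```

The audit group has a single consumer, so audit-1 receives all three messages. The billing group splits the same three messages between its two consumers.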
ZooKeeper manages and coordinates the Kafka brokers and keeps metadata about brokers, producers and consumers. It detects when a broker fails and notifies the producers and consumers of the failure. On receiving the notification, a producer or consumer decides how to proceed, for example by coordinating its work with another broker.
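The notification flow described above can be sketched with a simple callback registry. This is a toy model and not the ZooKeeper API: ToyCoordinator, ToyClient and the rule of switching to the first live broker are assumptions made for this example.

```python
class ToyCoordinator:
    """Toy stand-in for the coordination service: tracks live brokers, notifies watchers."""

    def __init__(self, brokers):
        self.live_brokers = set(brokers)
        self._watchers = []

    def watch(self, callback):
        self._watchers.append(callback)

    def broker_failed(self, broker):
        self.live_brokers.discard(broker)
        # Tell every registered client which broker failed and which are still alive.
        for callback in self._watchers:
            callback(broker, sorted(self.live_brokers))


class ToyClient:
    """Toy producer or consumer that is attached to one broker at a time."""

    def __init__(self, name, broker):
        self.name = name
        self.broker = broker

    def on_broker_failed(self, failed, live):
        # React only if the failed broker is the one this client was using.
        if self.broker == failed and live:
            self.broker = live[0]


coordinator = ToyCoordinator(["broker-1", "broker-2", "broker-3"])
producer = ToyClient("producer", "broker-1")
consumer = ToyClient("consumer", "broker-2")
coordinator.watch(producer.on_broker_failed)
coordinator.watch(consumer.on_broker_failed)

coordinator.broker_failed("broker-1")
```

After broker-1 fails, the producer moves to another live broker, while the consumer stays where it was because its broker is unaffected.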
A Kafka cluster consists of multiple brokers, which together maintain the published data.
A follower acts like a consumer: it pulls messages from the leader and updates its own copy of the data. The follower follows the leader's instructions. If the leader fails, one of the followers automatically becomes the new leader.
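The leader and follower behaviour can be sketched with a small in-memory model. This is an illustration only: the classes Replica and ReplicaSet are hypothetical, and promoting the follower with the longest log is a simplification of how Kafka actually chooses a new leader.

```python
class Replica:
    """Toy replica holding its own copy of the log."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def fetch_from(self, leader):
        # Like a consumer, the follower pulls only what it has not copied yet.
        self.log.extend(leader.log[len(self.log):])


class ReplicaSet:
    """Toy group of replicas with one leader and the rest as followers."""

    def __init__(self, names):
        replicas = [Replica(name) for name in names]
        self.leader = replicas[0]
        self.followers = replicas[1:]

    def write(self, message):
        # All writes go to the leader first.
        self.leader.log.append(message)

    def sync(self):
        for follower in self.followers:
            follower.fetch_from(self.leader)

    def fail_leader(self):
        # Assumption for this sketch: promote the most up-to-date follower.
        new_leader = max(self.followers, key=lambda replica: len(replica.log))
        self.followers.remove(new_leader)
        self.leader = new_leader


replicas = ReplicaSet(["replica-1", "replica-2", "replica-3"])
replicas.write("a")
replicas.write("b")
replicas.sync()
replicas.fail_leader()
```

After the original leader fails, replica-2 takes over with the full log it copied, and replica-3 remains a follower.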