Types of Nodes in Hadoop

1. NameNode:

The NameNode is the master node of HDFS; it is the node that every DataNode sends heartbeats to. It keeps the filesystem metadata in RAM for quick access and tracks where each file's blocks live across the Hadoop cluster. If the NameNode fails, the whole of HDFS becomes inaccessible, so the NameNode is critical to HDFS. The NameNode monitors the health of the DataNodes through their heartbeats, but it never reads or writes file data itself; it only tracks information about files, such as where each file is stored in the cluster, when it was last accessed, and which user is accessing it at any given time. There are two types of NameNode: the primary NameNode described here and the Secondary NameNode described next.
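In a real deployment, the directory where the NameNode persists its metadata (the fsimage and edit log) is configured in hdfs-site.xml. A minimal sketch, where the path /data/hdfs/namenode is only an example value:

```xml
<!-- hdfs-site.xml: where the NameNode keeps its fsimage and edit log -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/data/hdfs/namenode</value>
</property>
```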

2. Secondary NameNode:

The Secondary NameNode assists the primary NameNode by periodically merging the namespace image (fsimage) with the edit log. It is not a hot standby: it does not take over when the NameNode fails, but its merged checkpoint can be used to restart a failed NameNode. Because it holds a full copy of the namespace while merging, it needs a large amount of memory, so it normally runs on a different machine from the NameNode. In effect, the Secondary NameNode is a checkpointing service for the NameNode.
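How often the merge happens is controlled in hdfs-site.xml. A sketch with the commonly cited defaults (a checkpoint every hour, or sooner if enough transactions accumulate):

```xml
<!-- hdfs-site.xml: checkpoint schedule for the Secondary NameNode -->
<property>
  <name>dfs.namenode.checkpoint.period</name>
  <value>3600</value> <!-- seconds between checkpoints -->
</property>
<property>
  <name>dfs.namenode.checkpoint.txns</name>
  <value>1000000</value> <!-- or checkpoint after this many uncheckpointed transactions -->
</property>
```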

3. DataNode:

The DataNode stores the actual data of HDFS and is also called a slave. Because HDFS replicates each block across several DataNodes, the failure of a single DataNode does not lose any of the data stored on it. DataNodes are configured with plenty of disk space, since they hold the actual data. They perform read and write operations as clients request them, and they create, delete, and replicate blocks according to the NameNode's instructions.
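The disks a DataNode stores blocks on, and the replication factor that makes single-node failure survivable, are both set in hdfs-site.xml. A sketch with example paths:

```xml
<!-- hdfs-site.xml: local disks where a DataNode stores blocks (example paths) -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/disk1/hdfs/data,/disk2/hdfs/data</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>3</value> <!-- each block is kept on 3 DataNodes -->
</property>
```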

4. Checkpoint Node:

The Checkpoint Node was designed mainly to address the drawbacks of the Secondary NameNode. It maintains a local checkpoint directory with the same structure as the NameNode's own storage directory. To create a checkpoint of the NameNode's namespace, it downloads the current fsimage and edits files from the NameNode, merges them locally into a new image, and then uploads that new image back to the NameNode.
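The merge step can be illustrated with a small sketch: treat the fsimage as a snapshot of the namespace and the edits file as a list of operations to replay on top of it. The dict-based "fsimage" and the edit tuples below are illustrative only, not Hadoop's actual on-disk format:

```python
# Toy model of a checkpoint: replay the edit log onto the fsimage snapshot.
def merge_checkpoint(fsimage, edits):
    """Return a new namespace image with all edits applied."""
    image = dict(fsimage)             # copy the snapshot
    for op, path in edits:
        if op == "create":
            image[path] = {}          # register a new (empty) file entry
        elif op == "delete":
            image.pop(path, None)     # drop the entry if present
    return image

fsimage = {"/a.txt": {}, "/b.txt": {}}
edits = [("create", "/c.txt"), ("delete", "/b.txt")]

new_image = merge_checkpoint(fsimage, edits)
print(sorted(new_image))  # the merged namespace: ['/a.txt', '/c.txt']
```

The merged image is what the Checkpoint Node would upload back to the NameNode, replacing the old fsimage so the edit log can be truncated.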

5. Backup Node:

The Backup Node provides the same checkpointing function as the Checkpoint Node, but it also interacts with the NameNode continuously, receiving an online stream of filesystem edits. Because of this stream, the Backup Node keeps an up-to-date copy of the namespace in its own main memory. To create a checkpoint it therefore only has to save that in-memory namespace to disk; it does not need to download fsimage and edits from the NameNode.
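A Backup Node is told where to listen via hdfs-site.xml; a sketch with an example hostname (the port numbers shown are the conventional defaults):

```xml
<!-- hdfs-site.xml: RPC and HTTP addresses for the Backup Node (example host) -->
<property>
  <name>dfs.namenode.backup.address</name>
  <value>backupnode.example.com:50100</value>
</property>
<property>
  <name>dfs.namenode.backup.http-address</name>
  <value>backupnode.example.com:50105</value>
</property>
```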

6. Job Tracker Node:

The JobTracker manages MapReduce jobs and usually runs on its own node. It receives MapReduce job requests from clients, talks to the NameNode to locate the input data, and then chooses the best TaskTrackers to execute the tasks, giving each one slots in which to run them. It monitors all the TaskTrackers and reports job status back to the client. If the JobTracker fails, no MapReduce jobs can be executed and all running jobs halt, so the JobTracker is critical for MapReduce.
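In classic (MRv1) Hadoop, clients and TaskTrackers find the JobTracker through mapred-site.xml; a sketch with an example hostname:

```xml
<!-- mapred-site.xml (classic MRv1): where to reach the JobTracker (example host) -->
<property>
  <name>mapred.job.tracker</name>
  <value>jobtracker.example.com:8021</value>
</property>
```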

7. Task Tracker Node:

TaskTrackers run on the DataNodes. The JobTracker assigns mapper and reducer tasks to TaskTrackers for execution. If a TaskTracker fails, the JobTracker reassigns its tasks to another node, so the MapReduce job still completes successfully.
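The failover behaviour described above can be sketched as follows. The scheduler, node names, and round-robin placement are invented for illustration; the real JobTracker's scheduling is far more involved:

```python
# Toy model: when a TaskTracker dies, its tasks are handed to the live trackers.
def reassign_tasks(assignments, failed, live):
    """Move every task owned by `failed` onto the `live` trackers, round-robin."""
    new = {t: list(tasks) for t, tasks in assignments.items() if t != failed}
    orphans = assignments.get(failed, [])
    for i, task in enumerate(orphans):
        target = live[i % len(live)]      # simple round-robin placement
        new.setdefault(target, []).append(task)
    return new

assignments = {"tt1": ["map-0", "map-1"], "tt2": ["reduce-0"]}
result = reassign_tasks(assignments, failed="tt1", live=["tt2"])
print(result)  # {'tt2': ['reduce-0', 'map-0', 'map-1']}
```

The key point the sketch captures is that no task is lost: everything the failed tracker owned ends up queued on a surviving node, which is why a single TaskTracker failure does not fail the job.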