Understanding the Basics of Hadoop Frameworks

Hadoop – Hadoop is an open-source framework written in Java. It is not a database itself; it is used to store and process very large amounts of data across clusters of machines. The Hadoop ecosystem includes many frameworks for processing that data; the most common ones are discussed below.
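At the heart of Hadoop's processing model is MapReduce. The following is a toy, single-machine Python sketch of the same map, shuffle, and reduce phases that a real Hadoop job runs in parallel across a cluster:

```python
# Toy sketch of the MapReduce model: word count over a few lines of text.
# A real Hadoop job runs these same phases distributed across many nodes.
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in the input.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group values by key, as Hadoop does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["hadoop stores data", "hadoop processes data"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)  # {'hadoop': 2, 'stores': 1, 'data': 2, 'processes': 1}
```

The frameworks below mostly exist to let you express this kind of computation at a higher level, without hand-writing map and reduce functions in Java.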

Hive – Hive is a data-warehousing framework built on top of Hadoop. It allows querying data with an SQL-like language called HiveQL, so analysts can avoid writing complex MapReduce programs by hand. Hive organizes data into tables, much like a relational database.
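A minimal HiveQL sketch (the table name and columns here are hypothetical examples, not from the original text):

```sql
-- Define a table over delimited files already sitting in HDFS.
CREATE TABLE IF NOT EXISTS page_views (
  user_id   STRING,
  url       STRING,
  view_time TIMESTAMP
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- Familiar SQL instead of a hand-written MapReduce job;
-- Hive compiles this query into MapReduce (or Tez) tasks.
SELECT url, COUNT(*) AS views
FROM page_views
GROUP BY url
ORDER BY views DESC
LIMIT 10;
```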

Pig – Pig is a data-transformation framework used to analyse and transform large data sets. It has its own scripting language called Pig Latin, which lets you express MapReduce jobs without writing any Java code.
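A short Pig Latin sketch of the same word-count-style aggregation (the input path and field names are hypothetical):

```pig
-- Load raw tab-separated data from HDFS.
views = LOAD '/data/page_views' AS (user_id:chararray, url:chararray);

-- Group and count without writing any Java MapReduce code;
-- Pig translates this script into MapReduce (or Tez) jobs.
by_url = GROUP views BY url;
counts = FOREACH by_url GENERATE group AS url, COUNT(views) AS views;

STORE counts INTO '/data/view_counts';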

Flume – Flume is a data-ingestion framework used to move large volumes of data from one place to another. It typically collects log and event data from online servers and delivers it into HDFS. Flume is written in Java and can stream data into HDFS directly.
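A Flume agent is configured as a source, a channel, and a sink. The sketch below (agent name, log path, and HDFS path are hypothetical) tails a web-server log and delivers the events into HDFS:

```properties
# One agent: an exec source feeding an HDFS sink through a memory channel.
agent1.sources  = weblog
agent1.channels = mem
agent1.sinks    = hdfs-out

# Source: follow a web-server access log as it grows.
agent1.sources.weblog.type     = exec
agent1.sources.weblog.command  = tail -F /var/log/httpd/access_log
agent1.sources.weblog.channels = mem

# Channel: buffer events in memory between source and sink.
agent1.channels.mem.type = memory

# Sink: write the events into HDFS.
agent1.sinks.hdfs-out.type      = hdfs
agent1.sinks.hdfs-out.hdfs.path = /flume/weblogs
agent1.sinks.hdfs-out.channel   = mem
```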

Drill – Drill is a distributed SQL query engine that can run standard SQL over both relational and NoSQL data sources. Because it does not require schemas to be defined up front, you can start writing SQL queries against raw data within minutes.
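For example, Drill can query a raw JSON file in place through its `dfs` file-system storage plugin, with no table definition first (the file path and fields here are hypothetical):

```sql
-- No CREATE TABLE, no schema: Drill infers the structure from the data.
SELECT t.name, t.city
FROM dfs.`/data/users.json` AS t
WHERE t.city = 'Chennai'
LIMIT 10;
```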

Kafka – Kafka is a messaging framework commonly used alongside Hadoop. It acts as a queuing system, for example when paired with Storm for stream processing. Kafka handles both online and offline consumers, and persists messages to disk so that data is not lost.
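A quick sketch of Kafka's command-line tools, assuming a broker is already running on `localhost:9092` (flags as in recent Kafka releases; the topic name is hypothetical):

```shell
# Create a topic to hold the messages.
bin/kafka-topics.sh --create --topic weblogs \
  --bootstrap-server localhost:9092

# Producer: type messages on stdin; each line is appended to the topic's log on disk.
bin/kafka-console-producer.sh --topic weblogs \
  --bootstrap-server localhost:9092

# Consumer: read the topic from the beginning, even if the messages
# were produced while this consumer was offline.
bin/kafka-console-consumer.sh --topic weblogs --from-beginning \
  --bootstrap-server localhost:9092
```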

Tez – Tez is a framework for building data-processing applications modelled as a DAG (Directed Acyclic Graph) of tasks. It can run MapReduce-style workloads and makes Hive queries and Pig scripts run much faster than plain MapReduce.
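In practice you rarely program Tez directly; Hive and Pig use it as an execution engine. A common sketch (the query itself is a hypothetical example):

```sql
-- In a Hive session, switch the execution engine from MapReduce to Tez.
SET hive.execution.engine=tez;

-- The same HiveQL now runs as a single Tez DAG instead of chained MapReduce jobs.
SELECT url, COUNT(*) AS views
FROM page_views
GROUP BY url;
```

Similarly, a Pig script can be run on Tez with `pig -x tez script.pig`.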

Zeppelin – Zeppelin is a web-based notebook for interactive data analytics. It makes data visualization as easy as drag and drop, and it works with Hive, Spark (all of Spark's supported languages), Markdown, and more.
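A Zeppelin note is made of paragraphs, each starting with an interpreter prefix such as `%md` or `%sql`. A sketch of two paragraphs (the query and table are hypothetical; the result can be switched between a table and charts in the UI):

```
%md
## Daily traffic report

%sql
SELECT view_date, COUNT(*) AS hits
FROM page_views
GROUP BY view_date
```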