What Is Hive?





author: Bharat (sree ram)
contact : 04042026071
__________________________________________________________
WHAT IS HIVE?
Hive is one of the important components of the Hadoop ecosystem;
it lets you process and analyze data stored in HDFS files.
Hive is also called the data warehouse environment of the Hadoop framework.
The language used in Hive is HQL (Hive Query Language), which is similar to the SQL of an RDBMS.
However, there are many differences between Hive and an RDBMS.
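Hive is built for batch processing (bulk data processing), not for row-level operations. A query that targets one row (ex: select * from sales where prid='909') still runs as a batch job that scans the table's files rather than as a fast random read, and older versions cannot insert a single row (ex: insert into sales values(......) ).
Older versions of HQL also have no DML statements to update or delete rows, but the data of Hive tables can still be updated or deleted by indirect methods, such as rewriting the table with only the rows you want to keep. Since Hive 0.14, INSERT ... VALUES, UPDATE and DELETE are available, the latter two only on transactional (ACID) tables.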
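The indirect method can be pictured as rewriting the whole table instead of touching one row. The sketch below is only a toy model in Python: the table is a plain list of rows, and `overwrite_table`, `keep` and `change` are hypothetical names chosen for illustration, not Hive APIs. The HQL in the comments shows the kind of statement the idea corresponds to.

```python
# Toy model: a Hive table is represented by a list of rows (dicts).
# overwrite_table, keep and change are hypothetical names for illustration.

def overwrite_table(rows, keep, change=None):
    """Return the new full contents of the table after a rewrite.

    keep(row)   -> True if the row should survive the rewrite
    change(row) -> optional function returning the rewritten row
    """
    new_rows = []
    for row in rows:
        if not keep(row):
            continue  # dropping the row is the indirect "delete"
        new_rows.append(change(row) if change else row)
    return new_rows


sales = [
    {"prid": "909", "amount": 10},
    {"prid": "101", "amount": 25},
    {"prid": "909", "amount": 5},
]

# Indirect delete, comparable in spirit to:
#   INSERT OVERWRITE TABLE sales SELECT * FROM sales WHERE prid <> '909';
after_delete = overwrite_table(sales, keep=lambda r: r["prid"] != "909")

# Indirect update: every row is written again, one of them with a new value.
after_update = overwrite_table(
    sales,
    keep=lambda r: True,
    change=lambda r: {**r, "amount": r["amount"] * 2} if r["prid"] == "101" else r,
)
```

In both cases every surviving row is written again, which is why this approach suits bulk changes and not frequent single-row edits.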
Hive runs on top of HDFS and MapReduce.
Hive storage is HDFS:
 This means that when you create a table in Hive, a table directory is created in HDFS.
 If you load a file into a Hive table, the file is placed in that backend HDFS directory: a file from the local file system is copied there, while a file that is already in HDFS is moved there.
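To make the table-to-directory mapping concrete, here is a small Python sketch. It assumes the common default warehouse location `/user/hive/warehouse` (the `hive.metastore.warehouse.dir` setting, which your cluster may override) and a managed table; the function names are hypothetical helpers, not part of Hive.

```python
import posixpath

# Assumption: the common default of hive.metastore.warehouse.dir.
DEFAULT_WAREHOUSE_DIR = "/user/hive/warehouse"


def table_directory(table, database="default", warehouse_dir=DEFAULT_WAREHOUSE_DIR):
    """Return the HDFS directory a managed table would typically map to."""
    table = table.lower()        # Hive treats table names as case-insensitive
    database = database.lower()
    if database == "default":
        return posixpath.join(warehouse_dir, table)
    # Tables of other databases usually live under "<database>.db".
    return posixpath.join(warehouse_dir, database + ".db", table)


def loaded_file_path(source_file, table, database="default"):
    """Return where a loaded file would end up: inside the table directory."""
    file_name = posixpath.basename(source_file)
    return posixpath.join(table_directory(table, database), file_name)
```

For example, `loaded_file_path("/home/user/sales.txt", "sales")` gives `/user/hive/warehouse/sales/sales.txt` under these assumptions.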
Hive execution model is MapReduce:
 This means that when you submit an HQL statement, Hive converts it into one or more MapReduce jobs and submits them to the Hadoop cluster, where the map and reduce tasks run in JVMs. Hadoop therefore executes the HQL statement in MapReduce style.
As a result, a developer or analyst can easily process or analyze the data using HQL statements, without writing complex Java programs.
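To see what "MapReduce style" means for a query, take a statement like `SELECT prid, SUM(amount) FROM sales GROUP BY prid`. The Python sketch below is a simplified, single-machine imitation of the map, shuffle and reduce phases such a query could be broken into; it is not the code Hive generates, and the function names and sample rows are made up for illustration.

```python
from collections import defaultdict


def map_phase(rows):
    """Mapper: emit one (key, value) pair per input row, here (prid, amount)."""
    for row in rows:
        yield row["prid"], row["amount"]


def shuffle_phase(pairs):
    """Shuffle: bring together all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reducer: aggregate the values of each key, here with SUM."""
    return {key: sum(values) for key, values in groups.items()}


def run_query(rows):
    """Imitates: SELECT prid, SUM(amount) FROM sales GROUP BY prid."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


sales = [
    {"prid": "909", "amount": 10},
    {"prid": "101", "amount": 25},
    {"prid": "909", "amount": 5},
]
totals = run_query(sales)
```

On a real cluster the mappers and reducers run in parallel on many machines, which is what makes this model suitable for bulk data.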
Hive is especially good for ad hoc reporting and analytics.
However, SQL or HQL is not the solution for every analytics situation, because some analyses require custom functionality that is not available in Hive's built-in functions.
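This custom functionality can be developed as Hive UDFs (user-defined functions).
Custom logic for Hive can be developed in the following languages. Native UDFs are written in Java; code in the other languages is typically plugged in as an external script that Hive streams rows through (the TRANSFORM clause):
    --> Java
    --> Python
    --> C++
    --> Ruby
    --> R (statistical programming)
UDFs must be registered in Hive; after that they can be called any number of times.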
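As an illustration of custom logic written outside Java, here is a minimal sketch of a Python script of the kind Hive can call through its TRANSFORM clause. Hive sends each row to the script as a tab-separated line on standard input and reads the result from standard output. The masking rule, the function names and the file name `mask_id.py` are assumptions made up for this example.

```python
import io
import sys


def mask_id(value):
    """Hypothetical custom logic: keep the last two characters, mask the rest."""
    if len(value) <= 2:
        return value
    return "*" * (len(value) - 2) + value[-2:]


def transform(lines):
    """Apply mask_id to the first column of each tab-separated input row."""
    for line in lines:
        columns = line.rstrip("\n").split("\t")
        columns[0] = mask_id(columns[0])
        yield "\t".join(columns)


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Entry point for streaming use: read rows from stdin, write to stdout."""
    for out_line in transform(stdin):
        stdout.write(out_line + "\n")


# Local dry run with in-memory data instead of a Hive job.
sample_input = io.StringIO("909\t10\n101\t25\n")
sample_output = io.StringIO()
main(sample_input, sample_output)

# In Hive, a script file that ends with a plain main() call could be used
# roughly like this (file and column names are assumptions):
#   ADD FILE mask_id.py;
#   SELECT TRANSFORM (prid, amount) USING 'python mask_id.py'
#          AS (prid, amount) FROM sales;
```

Keeping the row logic in `transform` and the input/output handling in `main` makes it possible to test the script on a few sample lines before running it on the cluster.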
   Author:
   Bharat Ram  

