Hive Tables



Author: Bharat (Sree Ram)
Contact: 04042026071
___________________________________________________________________________
Hive Tables:
                Hive tables are classified into two types:
- Inner tables
- External tables
Inner Tables:
                Whenever an inner table is created, a directory for the table is created in Hive's default warehouse location. If a file is loaded into the table, the file is copied into the table's directory.
Ex: hive> create table samp(str string);
Now in HDFS the following directory will be created:
     /user/hive/warehouse/samp
If you load a file:
                Ex: hive> load data local inpath 'file1.txt' into table samp;
In HDFS:
                /user/hive/warehouse/samp/file1.txt
                In the above path, samp is the table directory and file1.txt is the file that was loaded into the table. When you select rows from the table, Hive reads data from all files in the table directory.
             Ex: hive> select * from samp;
Now Hive will read data from the /user/hive/warehouse/samp/file1.txt file.
                If you drop an inner table, the table directory is also deleted from HDFS. That means when the table is dropped, you lose both the metadata and the data.
                                Ex: hive> drop table samp;

                                From HDFS, the /user/hive/warehouse/samp/ directory will be deleted.
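The deletion can be checked from the HDFS command line. A sketch (assuming the default warehouse location used above; output format may vary by Hadoop version):

```shell
# Before the drop, the table directory exists and holds the loaded file
hdfs dfs -ls /user/hive/warehouse/samp

# After "drop table samp;", listing the same path fails:
hdfs dfs -ls /user/hive/warehouse/samp
# ls: `/user/hive/warehouse/samp': No such file or directory
```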
External Tables:
- The table uses a custom location.
- When the table is dropped, the backend HDFS table directory is not deleted.
That means, if the table is dropped, only the metadata (the table in Hive) is dropped, but the data remains safely available in the HDFS directory.
Ex:
     hive> create external table sample(str string) location '/user/mydir';
Now in HDFS, /user/mydir will be created for the table.

If you load a file into the table:
                hive> load data local inpath 'samp.txt' into table sample;

In HDFS: /user/mydir/samp.txt

If you select rows from the sample table:
                hive> select * from sample;
Now Hive will read data from all files in the following directory:
                /user/mydir
If the sample table is dropped:
                hive> drop table sample;
Now from Hive, the sample table will be deleted.
But the backend HDFS directory is still available:
                /user/mydir/samp.txt
So the data can still be reused by Hive or other ecosystems of Hadoop.
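Because the directory survives the drop, a new external table can simply be pointed at the same location to reuse the data. A sketch (the table name sample2 is hypothetical):

```sql
-- Re-attach a new external table to the surviving directory
create external table sample2(str string) location '/user/mydir';

-- The rows of samp.txt are visible again, without reloading the file
select * from sample2;
```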

The Summary
Inner tables: 1. Use the default warehouse location.
                          2. If the table is dropped, data and metadata are lost.

External tables: 1. Use a custom HDFS location.
                              2. If the table is dropped, only the metadata is lost.
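To check which type an existing table is, Hive reports the table type in its extended description. A sketch (output abbreviated to the relevant row):

```sql
describe formatted samp;
-- ...
-- Table Type:    MANAGED_TABLE

describe formatted sample;
-- ...
-- Table Type:    EXTERNAL_TABLE
```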
