Monday, August 18, 2014

Hive - Create Database and table in Hive

Hive is a Hadoop component that provides a SQL-like interface to data stored in HDFS and adds data warehousing facilities on top of it.
The Hive service breaks HQL (HiveQL) statements down into MapReduce jobs and executes them across a Hadoop cluster. Anyone with a SQL or relational database background will find this material familiar. As with any database management system (DBMS), you can run Hive queries in several ways, for example interactively from the Hive shell or from a script file.
Create Database syntax:

CREATE DATABASE IF NOT EXISTS <dbname>
COMMENT 'Holds all db tables'
LOCATION '/lib/warehouse/sample'
WITH DBPROPERTIES ('Use' = 'Demos', 'SchemaInfo' = 'db schema information');

·         The IF NOT EXISTS clause is useful for scripts that should create a database on the fly, only when it does not already exist.
·         LOCATION overrides the default directory in which the database is stored.
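
Once a database exists, it can be inspected and selected from the same session. The statements below are a minimal sketch; the database name sample_db is a hypothetical stand-in for <dbname> and does not come from the syntax above.

```sql
-- sample_db is a hypothetical name used only for illustration.
CREATE DATABASE IF NOT EXISTS sample_db
COMMENT 'Holds all db tables';

-- List the databases known to the metastore.
SHOW DATABASES;

-- Show the comment, location and any DBPROPERTIES of the database.
DESCRIBE DATABASE EXTENDED sample_db;

-- Make it the current database for the statements that follow.
USE sample_db;
```

Running the CREATE DATABASE statement a second time should be harmless, because IF NOT EXISTS skips creation when the database is already there.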

Create Table syntax:


CREATE TABLE employees (
name STRING,
salary FLOAT,
subordinates ARRAY<STRING>,
deductions MAP<STRING, FLOAT>,
addr STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>)
COMMENT 'Description of the table'
PARTITIONED BY (country STRING, state STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '\002'
MAP KEYS TERMINATED BY '\003'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/hive/warehouse/mydb.db/employees'
TBLPROPERTIES ('creator'='me', 'created_at'='2012-01-02');

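·         STRING, FLOAT, ARRAY, MAP and STRUCT are some of the data types Hive supports.
·         A STRUCT groups named fields, each with its own type; here addr holds street, city, state and zip.
·         deductions is a MAP, so each entry is a key-value pair; here the keys are STRING and the values are FLOAT.
·         subordinates is an ARRAY<STRING>, so every item in it is a string.
·         When fields are terminated by ',', each row is stored as comma-separated text, similar to a CSV file.
·         Each row is terminated by a newline.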
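
To show how the complex columns and partitions are used, here is a short sketch against the employees table. The input path, the partition values and the map key 'Federal Taxes' are assumptions made for illustration; they are not part of the table definition.

```sql
-- A data row in the text file could look like this, where ^B stands for
-- '\002' and ^C for '\003' (partition columns are not stored in the file):
-- John Doe,100000.0,Mary Smith^BTodd Jones,Federal Taxes^C0.2,1 Michigan Ave.^BChicago^BIL^B60600

-- Load a local file into one partition (hypothetical path and values).
LOAD DATA LOCAL INPATH '/tmp/employees_us_ca.txt'
INTO TABLE employees
PARTITION (country = 'US', state = 'CA');

-- Array items are read by index, map values by key, struct fields with a dot.
SELECT name,
       subordinates[0]             AS first_subordinate,
       deductions['Federal Taxes'] AS federal_tax,
       addr.city                   AS city
FROM employees
WHERE country = 'US' AND state = 'CA';
```

Filtering on the partition columns country and state lets Hive read only the matching partition directories instead of scanning the whole table.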


