Hive 基本操作

xiaoxiao2021-02-28  81

目录 1.显示地展示当前使用的数据库 2.建表 3.导入本地文件 4.数据库多用户共享表 5.创建数据库 5.1仓库数据库判断数据库是否存在 5.2.显示现在有的数据库 5.3条件查询数据库 5.4创建指定存放文件位置 数据库 5.5创建数据库时 添加注释信息 5.6创建数据库 添加扩展信息 5.7使用数据库  5.8删除数据库  5.9当数据库存在表时,先要删除表 再能删除数据库   5.10修改数据库 6.1创建表 6.2创建表-复制表结构 6.3显示某个数据库中的表 6.4显示指定筛选条件 表名 6.5显示表扩展信息 6.6指定显示某个字段的信息 7.1External Tables -- 外部表 7.2复制表结构仓库外部表 7.3Partitioned, Managed Tables --分区表 7.3.1建议安全措施 7.3.2查看现有分区 7.3.3查看分区详细 分区键 7.3.3通过 DESC 显示分区键 7.3.4从文件读入 分区表 7.4.1External Partitioned Tables 外部分区表 7.4.1.1.先创建外部表结构 7.4.2.为外部表增加指定分区 7.4.3.把数据表复制外部表目录结构中 7.4.3显示 表分区信息 1.显示地展示当前使用的数据库 :hive> set hive.cli.print.current.db=true;  2.建表:create table t_order(id int,age int) row format delimited fields terminated by ','; 3.导入本地文件:load data local inpath '/home/hadoop/hivedata/data' into table t_order; 4.数据库多用户共享表(解决不同用户不能共享数据库):mysqld_safe --skip-grant-tables 注意:Hive 不支持行级插入,更新,删除。也不支持事务 5.创建数据库 hive> CREATE DATABASE financials;   5.1仓库数据库判断数据库是否存在  hive> CREATE DATABASE IF NOT EXISTS financials;   5.2.显示现在有的数据库 hive> SHOW DATABASES;                                           hive> CREATE DATABASE human_resources;                                           hive> SHOW DATABASES;   5.3条件查询数据库  hive> SHOW DATABASES LIKE 'h.*';   5.4创建指定存放文件位置 数据库   hive> CREATE DATABASE financials       > LOCATION '/my/preferred/directory';   5.5创建数据库时 添加注释信息 hive> CREATE DATABASE financials       > COMMENT 'Holds all financial tables';   hive> DESCRIBE DATABASE financials;   financials   Holds all financial tables     hdfs://master-server/user/hive/warehouse/financials.db   5.6创建数据库 添加扩展信息 hive> CREATE DATABASE financials       > WITH DBPROPERTIES ('creator' = 'Mark Moneybags', 'date' = '2012-01-02');   hive> DESCRIBE DATABASE financials;   financials   hdfs://master-server/user/hive/warehouse/financials.db   hive> DESCRIBE DATABASE EXTENDED financials;   financials   hdfs://master-server/user/hive/warehouse/financials.db    {date=2012-01-02, creator=Mark Moneybags);   5.7使用数据库  hive> USE financials;   5.8删除数据库 hive> DROP DATABASE IF EXISTS financials;   5.9当数据库存在表时,先要删除表 再能删除数据库  hive> DROP DATABASE IF EXISTS financials CASCADE;   5.10修改数据库 hive> ALTER DATABASE financials SET DBPROPERTIES ('edited-by' = 'Joe Dba');           There is no way to delete or “unset” a DBPROPERTY  没有方法删除或重置 DBPROPERTY  6.1创建表 CREATE TABLE IF NOT EXISTS mydb.employees (     name         STRING COMMENT 'Employee name',     salary       FLOAT  COMMENT 'Employee salary',     subordinates ARRAY<STRING> COMMENT 'Names of subordinates',     deductions   MAP<STRING, FLOAT>                  COMMENT 'Keys are deductions names, values are percentages     address      STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>                  COMMENT 'Home address')   COMMENT 'Description of the table'   TBLPROPERTIES ('creator'='me', 'created_at'='2012-01-02 10:00:00', ...)   LOCATION '/user/hive/warehouse/mydb.db/employees';  
6.2创建表-复制表结构 CREATE TABLE IF NOT EXISTS mydb.employees2   LIKE mydb.employees;  
6.3显示某个数据库中的表 hive> USE mydb;   hive> SHOW TABLES;    
6.4显示指定筛选条件 表名 hive> USE mydb;   hive> SHOW TABLES 'empl.*';   employees  
6.5显示表扩展信息 hive> DESCRIBE EXTENDED mydb.employees;   name    string  Employee name   salary  float   Employee salary   subordinates    array<string>   Names of subordinates   deductions      map<string,float> Keys are deductions names, values are percentages   address struct<street:string,city:string,state:string,zip:int>  Home address   Detailed Table Information      Table(tableName:employees, dbName:mydb, owner:me,   ...   location:hdfs://master-server/user/hive/warehouse/mydb.db/employees,   parameters:{creator=me, created_at='2012-01-02 10:00:00',               last_modified_user=me, last_modified_time=1337544510,               comment:Description of the table, ...}, ...)  
6.6指定显示某个字段的信息 hive> DESCRIBE mydb.employees.salary;   salary  float   Employee salary  
7.1External Tables -- 外部表 外部表,删除表不删除数据 CREATE EXTERNAL TABLE IF NOT EXISTS stocks (     exchange        STRING,     symbol          STRING,     ymd             STRING,     price_open      FLOAT,     price_high      FLOAT,     price_low       FLOAT,     price_close     FLOAT,     volume          INT,     price_adj_close FLOAT)   ROW FORMAT DELIMITED FIELDS TERMINATED BY ','   LOCATION '/data/stocks';  
7.2复制表结构仓库外部表 CREATE EXTERNAL TABLE IF NOT EXISTS mydb.employees3   LIKE mydb.employees   LOCATION '/path/to/data';  
7.3Partitioned, Managed Tables --分区表 CREATE TABLE employees (     name         STRING,     salary       FLOAT,     subordinates ARRAY<STRING>,     deductions   MAP<STRING, FLOAT>,     address      STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>   )   PARTITIONED BY (country STRING, state STRING);   However, Hive will now create subdirectories reflecting the partitioning structure. For example: ...   .../employees/country=CA/state=AB   .../employees/country=CA/state=BC   ...   .../employees/country=US/state=AL   .../employees/country=US/state=AK   ...  
7.3.1建议安全措施 把HIVE 设置成“严格”模式,禁止分区表的查询没有 一个WHERE子句 hive> set hive.mapred.mode=strict;   hive> SELECT e.name, e.salary FROM employees e LIMIT 100;   FAILED: Error in semantic analysis: No partition predicate found for    Alias "e" Table "employees"   hive> set hive.mapred.mode=nonstrict;   hive> SELECT e.name, e.salary FROM employees e LIMIT 100;  
7.3.2查看现有分区 hive> SHOW PARTITIONS employees;   ...   Country=CA/state=AB   country=CA/state=BC   ...   country=US/state=AL   country=US/state=AK  
7.3.3查看分区详细 分区键 hive> SHOW PARTITIONS employees PARTITION(country='US');   country=US/state=AL   country=US/state=AK   ...   hive> SHOW PARTITIONS employees PARTITION(country='US', state='AK');   country=US/state=AK  
7.3.3通过 DESC 显示分区键 hive> DESCRIBE EXTENDED employees;   name         string,   salary       float,   ...   address      struct<...>,   country      string,   state        string   Detailed Table Information...   partitionKeys:[FieldSchema(name:country, type:string, comment:null),   FieldSchema(name:state, type:string, comment:null)],   ...  
7.3.4从文件读入 分区表 LOAD DATA LOCAL INPATH '${env:HOME}/california-employees'   INTO TABLE employees   PARTITION (country = 'US', state = 'CA');  
7.4.1External Partitioned Tables 外部分区表 1.先创建外部表结构 CREATE EXTERNAL TABLE IF NOT EXISTS log_messages (     hms             INT,     severity        STRING,     server          STRING,     process_id      INT,     message         STRING)   PARTITIONED BY (year INT, month INT, day INT)   ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';   7.4.2.为外部表增加指定分区 ALTER TABLE log_messages ADD PARTITION(year = 2012, month = 1, day = 2)   LOCATION 'hdfs://master_server/data/log_messages/2012/01/02';   7.4.3.把数据表复制外部表目录结构中 Copy the data for the partition being moved to S3. For example, you can use the Hadoop distcp command: hadoop distcp /data/log_messages/2011/12/02 s3n://ourbucket/logs/2011/12/02   •Alter the table to point the partition to the S3 location: ALTER TABLE log_messages PARTITION(year = 2011, month = 12, day = 2)   SET LOCATION 's3n://ourbucket/logs/2011/01/02'; •Remove the HDFS copy of the partition using the hadoop fs -rmr command: hadoop fs -rmr /data/log_messages/2011/01/02   7.4.3显示 表分区信息 hive> SHOW PARTITIONS log_messages;   ...   year=2011/month=12/day=31   year=2012/month=1/day=1   year=2012/month=1/day=2   hive> DESCRIBE EXTENDED log_messages;   ...   message         string,   year            int,   month           int,   day             int   Detailed Table Information...   partitionKeys:[FieldSchema(name:year, type:int, comment:null),   FieldSchema(name:month, type:int, comment:null),   FieldSchema(name:day, type:int, comment:null)],   ...   hive> DESCRIBE EXTENDED log_messages PARTITION (year=2012, month=1, day=2);   ...   location:s3n://ourbucket/logs/2011/01/02,   ...   8.1Customizing Table Storage Formats -- 表存储格式(表操作) CREATE TABLE employees (     name         STRING,     salary       FLOAT,     subordinates ARRAY<STRING>,     deductions   MAP<STRING, FLOAT>,     address      STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>   )   ROW FORMAT DELIMITED   FIELDS TERMINATED BY '\001'   COLLECTION ITEMS TERMINATED BY '\002'   MAP KEYS TERMINATED BY '\003'   LINES TERMINATED BY '\n'   STORED AS TEXTFILE;   Dropping Tables -- 删除表 DROP TABLE IF EXISTS employees;   For external tables, the metadata is deleted but the data is not. Alter Table --修改表结构 ALTER TABLE modifies  table metadata  only. The  data  for  the  table  is untouched. It’s up to you to ensure that any modifications are consistent with the actual data. Renaming a Table -- 修改表名 ALTER TABLE log_messages RENAME TO logmsgs;   Adding, Modifying, and Dropping a Table Partition -- 增加,修改,删除 表分区 ALTER TABLE log_messages ADD IF NOT EXISTS   PARTITION (year = 2011, month = 1, day = 1) LOCATION '/logs/2011/01/01'   PARTITION (year = 2011, month = 1, day = 2) LOCATION '/logs/2011/01/02'   PARTITION (year = 2011, month = 1, day = 3) LOCATION '/logs/2011/01/03'   ALTER TABLE log_messages PARTITION(year = 2011, month = 12, day = 2)   SET LOCATION 's3n://ourbucket/logs/2011/01/02';   ALTER TABLE log_messages DROP IF EXISTS PARTITION(year = 2011, month = 12, day = 2);   Changing Columns --修改列 ALTER TABLE log_messages   CHANGE COLUMN hms hours_minutes_seconds INT   COMMENT 'The hours, minutes, and seconds part of the timestamp'   AFTER severity;   Adding Columns --增加列 ALTER TABLE log_messages ADD COLUMNS (    app_name   STRING COMMENT 'Application name',    session_id LONG   COMMENT 'The current session id');   Deleting or Replacing Columns --删除 替换列 ALTER TABLE log_messages REPLACE COLUMNS (    hours_mins_secs INT    COMMENT 'hour, minute, seconds from timestamp',    severity        STRING COMMENT 'The message severity'    message         STRING COMMENT 'The rest of the message');   This statement effectively renames the original hms column and removes the server and process_id columns from the original schema definition. As for all ALTER statements, only the table metadata is changed. Alter Table Properties --修改表属性 ALTER TABLE log_messages SET TBLPROPERTIES (    'notes' = 'The process id is no longer captured; this column is always NULL');   Alter Storage Properties --修改存储属性 ALTER TABLE log_messages   PARTITION(year = 2012, month = 1, day = 1)   SET FILEFORMAT SEQUENCEFILE;   You can specify a new SerDe along with SerDe properties or change the properties for the existing SerDe. The following example specifies that a table will use a Java class named com.example.JSONSerDe to process a file of JSON-encoded records ALTER TABLE table_using_JSON_storage   SET SERDE 'com.example.JSONSerDe'   WITH SERDEPROPERTIES (    'prop1' = 'value1',    'prop2' = 'value2');   来源: http://blog.csdn.net/iquicksandi/article/details/8522691
转载请注明原文地址: https://www.6miu.com/read-78001.html

最新回复(0)