Migrating data between clusters

xiaoxiao, 2021-02-28

Scenario: data on the old cluster needs to be migrated to the new cluster.

hadoop distcp [option] hdfs://master_ip:8020/hive/warehouse/xxx.db/tab_name hdfs://master_ip:8020/hive/warehouse/xxx.db/tab_name

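To see what can go in option, run hadoop distcp with no arguments and press Enter; it prints the help text, so the options are not explained here.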

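master_ip: the IP of the cluster's master. distcp takes the source first and the destination second, so the first URI uses the old cluster's master and the second uses the new cluster's master.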

tab_name: the name of the table to migrate.
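As a rough sketch of how the placeholders above fit together, the helper below only assembles and prints the command so it can be reviewed before running. The function name build_distcp_cmd and the addresses 10.0.0.1 and 10.0.0.2 are made up for illustration, and the -update and -p options are one possible choice rather than something the scenario requires.

```shell
# Hypothetical helper: print a distcp command for one Hive table.
# Arguments: old master IP, new master IP, port, database name, table name.
build_distcp_cmd() {
  # Warehouse path layout taken from the command shown earlier.
  path="/hive/warehouse/$4.db/$5"
  # -update copies only files that are missing or differ at the destination;
  # -p preserves file status such as permissions and ownership.
  printf 'hadoop distcp -update -p hdfs://%s:%s%s hdfs://%s:%s%s\n' \
    "$1" "$3" "$path" "$2" "$3" "$path"
}

# Placeholder addresses; substitute the real masters of the old and new cluster.
build_distcp_cmd 10.0.0.1 10.0.0.2 8020 aidemo ac_ref
```

With -update, distcp syncs the contents of the source directory into the destination directory; without it, an already existing destination directory would typically end up with the source directory nested inside it, so it is worth checking the destination listing after the copy.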

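Make sure the paths are correct. If you do not know a table's path, run desc formatted db_name.tab_name; the Location field is the correct path. In the example below, replace the host part test01 with master_ip:port.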

For example:

hive> desc formatted aidemo.ac_ref;
OK
# col_name              data_type               comment

pkg_name                string
label                   string

# Detailed Table Information
Database:               aidemo
Owner:                  hchou
CreateTime:             Wed Jun 07 15:34:35 CST 2017
LastAccessTime:         UNKNOWN
Protect Mode:           None
Retention:              0
Location:               hdfs://test01/hive/warehouse/aidemo.db/ac_ref
Table Type:             MANAGED_TABLE
Table Parameters:
        transient_lastDdlTime   1496820875

# Storage Information
SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:            org.apache.hadoop.mapred.TextInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed:             No
Num Buckets:            -1
Bucket Columns:         []
Sort Columns:           []
Storage Desc Params:
        field.delim             \t
        serialization.format    \t
Time taken: 0.078 seconds, Fetched: 28 row(s)
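The lookup and substitution described above can be scripted. This is a minimal sketch, assuming the desc formatted output is piped in as text; the function names extract_location and rewrite_location are hypothetical, and master_ip:8020 is the same placeholder used in the distcp command.

```shell
# Hypothetical helper: read `desc formatted` output on stdin and
# print the value of the first Location: field.
extract_location() {
  awk '$1 == "Location:" { print $2; exit }'
}

# Hypothetical helper: replace the host part of an hdfs:// URI ($1)
# with the given host:port ($2), leaving the path unchanged.
rewrite_location() {
  printf '%s\n' "$1" | sed "s#^hdfs://[^/]*#hdfs://$2#"
}

# Demo on two lines taken from the sample output above.
loc=$(printf 'Owner:\thchou\nLocation:\thdfs://test01/hive/warehouse/aidemo.db/ac_ref\n' | extract_location)
rewrite_location "$loc" "master_ip:8020"
# prints hdfs://master_ip:8020/hive/warehouse/aidemo.db/ac_ref
```

On a real cluster the input would presumably come from something like hive -e 'desc formatted aidemo.ac_ref' | extract_location, with the rewritten URI then used as the source path for distcp.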

