hive加载json数据

xiaoxiao2025-08-09  25

json数据压缩成.gz文件上传到hdfs,通过hive映射,并进行后续的spark跑算法

1,在hdfs上的数据格式化后的形式,根据需求我们只要_source里的数据

{ "_index": "operatelog", "_type": "operatelog", "_id": "5938ca71-6057-40b5-a739-5687ae360640", "_source": { "id": "5938ca71-6057-40b5-a739-5687ae360640", "appId": "1000", "actionId": "1005", "code": "0", "sn": "50631032", "logLevel": "DEBUG" }, "sort": [1500349702325] }

2,这就简单了,没有复杂的格式,建表

create external table operatelog( id string, appId string, actionId string, code string, sn string, `logLevel` string ) partitioned by (etlyear string ,etlmonth string,etlday string) row format serde 'org.apache.hive.hcatalog.data.JsonSerDe' STORED AS TEXTFILE ;

此时,如果遇到FAILED: ParseException line 1:34 cannot recognize input near 'loglevel' 'string' ',' in column specification (state=42000,code=40000),这个是因为loglevel是hive的关键字段,此时用 ``飘号 (键盘左上角跟~重合的键)盖住就行

3,如果加上飘号之后建表还是报错

     FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: org.apache.hive.hcatalog.data.JsonSerDe

这是说明,环境里没有格式化json的JsonSerDe类,所以此时应该将JsonSerDe加到hive环境里

 

命令:

0: jdbc:hive2://192.168.18.14:10000> add jar ${HIVE_HOME}hcatalog/share/hcatalog/hive-hcatalog-core-0.13.1-cdh5.3.2.jar

至于${HIVE_HOME}路径是什么,可以通过echo ${HIVE_HOME} 查看,然后补全就行

4,然后再次建表,此时应该就没问题了,直接映射,select看是否完美就行。

但是,敲黑板划重点了啊,如果是有复杂结构的json格式的数据,就比如下列数据

"_source": { "id": "184TCA876817", "sn": "184TCA876817", "param": ["LdTms|38|2.1.7", "富掌柜|42|1.5.9", "消息中心|22|2.0.2", "KunTengConver|1|1.0", ] }

表结构如下

create external table apkversion( id string, sn string, `_source` array<string> ) partitioned by (etlyear string ,etlmonth string,etlday string) row format serde 'org.apache.hive.hcatalog.data.JsonSerDe' STORED AS TEXTFILE ;

建表,映射都OK,但是你在select * from apkversion limit 5; 时,你又发现一个坑

0: jdbc:hive2://192.168.18.14:10000> select * from apkversion limit 1;                 org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start of Array expected (state=,code=0)

这是因为JsonSerDe不支持复杂格式,比如array,就问你难受不?

至今我还没解决这个坑,如果有哪个大神解决了,请给小弟留言,小弟谢过了

转载请注明原文地址: https://www.6miu.com/read-5034547.html

最新回复(0)