MapReduce Basics - A Custom WordCount Walkthrough

xiaoxiao, 2021-02-28

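The WordCount application bundled with Hadoop:

1) Inspect the local file hello.txt:
   > cat hello.txt

2) Copy the file:
   > cp hello.txt hello2.txt

3) Create the d3 directory on HDFS:
   hadoop@Master:/usr/local/hadoop/share/hadoop/mapreduce$ hadoop fs -mkdir /user/hadoop/d3

4) Upload the local files hello.txt and hello2.txt to HDFS:
   hadoop@Master:/usr/local/hadoop/share/hadoop/mapreduce$ hadoop fs -put hello.txt /user/hadoop/d3
   hadoop@Master:/usr/local/hadoop/share/hadoop/mapreduce$ hadoop fs -put hello2.txt /user/hadoop/d3

5) Run the job to count the words in the text files under /user/hadoop/d3:
   > hadoop jar hadoop-mapreduce-examples-2.8.0.jar wordcount /user/hadoop/d3 /user/hadoop/d3output
   The MapReduce job prints many INFO messages while it runs. It takes anywhere from tens of seconds to a few minutes, depending on the size of the job.

6) List the output directory; the presence of a _SUCCESS file means the job succeeded:
   > hadoop fs -ls /user/hadoop/d3output

7) Display the contents of the output file:
   > hadoop fs -cat /user/hadoop/d3output/part-r-00000

The custom WordCount process:

Once the program runs correctly, export the whole project as a JAR file (I exported it as MyWordCount.jar into the local hadoop_file folder). Two documents were placed in the /user/hadoop/d3 directory on HDFS. From the export directory, run:
   > hadoop jar MyWordCount.jar MyWordCount <input path> <output path>

Then check whether the output files were written successfully. Finally, run hadoop fs -cat /user/hadoop/output0502 to view the word count results.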
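The source code of MyWordCount is not included in this post. As a rough illustration of what such a job computes, the following plain-Java sketch mimics the map, shuffle, and reduce phases in memory. It deliberately does not use the Hadoop API, so it is not the code packaged in MyWordCount.jar; the class name WordCountSketch, its method names, and the sample input lines are all assumptions made for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the WordCount data flow (no Hadoop dependency).
// All names here are hypothetical, not taken from the original project.
class WordCountSketch {

    // "Map" phase: split one line on whitespace and emit a (word, 1) pair per token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // "Shuffle" phase: group the emitted values by word, keeping words in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // "Reduce" phase: sum the values collected for a single word.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Run all three phases over the input lines and return word -> count.
    static Map<String, Integer> count(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(pairs).entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical contents standing in for hello.txt and hello2.txt.
        List<String> lines = Arrays.asList("hello hadoop", "hello world",
                                           "hello hadoop", "hello world");
        // Print one "word<TAB>count" line per word, similar in shape to part-r-00000.
        for (Map.Entry<String, Integer> e : count(lines).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

In a real Hadoop job the map and reduce steps would live in Mapper and Reducer subclasses and the framework would perform the grouping between them; the sketch only shows the shape of the computation.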

When reposting, please cite the original URL: https://www.6miu.com/read-2612380.html
