Building and packaging Spark 2.1.1 with Maven. All of the commands below are run as the root user.
A machine (or VM) with at least 4 GB of memory is recommended for the build.
Before building, install the compression/decompression libraries the build depends on: yum install -y snappy snappy-devel bzip2 bzip2-devel lzo lzo-devel lzop openssl openssl-devel
1. Install Maven 3.3.9 and Java 8, and configure their environment variables.
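For example, the variables can be set in /etc/profile or your shell startup file. This is a sketch: the install paths (and the exact JDK build number) below are assumptions; adjust them to wherever you actually unpacked the JDK and Maven tarballs.

```shell
# Assumed install locations -- adjust to your actual unpack paths.
export JAVA_HOME=/usr/local/jdk1.8.0_131
export MAVEN_HOME=/usr/local/apache-maven-3.3.9
export PATH="$JAVA_HOME/bin:$MAVEN_HOME/bin:$PATH"
```

After sourcing the file, `java -version` and `mvn -version` should report Java 8 and Maven 3.3.9 respectively.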
2. Install the R package. First enable the EPEL repository (yum list epel*; yum install epel-release), then install R (yum list R; yum -y install R). If R is not installed, ./dev/make-distribution.sh … fails with: Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.4.0:exec (sparkr-pkg) on project spark-core_2.10: Command execution failed. Process exited with an error: 127 (Exit value: 127) -> [Help 1]
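Exit code 127 means "command not found": the sparkr-pkg step shells out to Rscript, which is missing when R is not installed. A quick way to pre-check is sketched below; the helper name check_cmd is made up for illustration.

```shell
# check_cmd: succeed (exit 0) if the named command is on PATH, fail otherwise
check_cmd() { command -v "$1" >/dev/null 2>&1; }

# Warn before the build rather than failing 20 minutes into it.
check_cmd Rscript || echo "Rscript not found: install R before building with -Psparkr"
```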
3. Set the Maven options. For JDK 1.7: export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m". For JDK 1.8, PermGen was removed, so MaxPermSize is not needed: export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"
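Note that the quotes must be plain ASCII double quotes; curly quotes copied from a web page are passed through literally and break the JVM flags. For JDK 1.8 the export looks like this:

```shell
# JDK 1.8: PermGen is gone, so no MaxPermSize flag is needed
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"
```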
4. Change into the root of the unpacked Spark source tree: cd /root/spark-2.1.1
5. Use Scala 2.11 here; building with 2.10 fails. Switch the Scala version before building: ./dev/change-scala-version.sh 2.11
6. From the root of the unpacked Spark 2.1.1 source tree, run: ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -Dscala-2.11 -Phive -Phive-thriftserver -DskipTests clean package
7. Build the distributable tarball: ./dev/make-distribution.sh --name custom-spark --tgz -Psparkr -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pyarn
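For reference, steps 4–7 can be collected into one script. The sketch below only prints the commands instead of running them, so you can review the exact flags first; the source path is the one assumed above.

```shell
#!/bin/sh
# Sketch: print the build commands from steps 4-7 for review before running.
SPARK_SRC=/root/spark-2.1.1   # assumed unpack location of the Spark source
MVN_FLAGS="-Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -Dscala-2.11 -Phive -Phive-thriftserver -DskipTests"
DIST_FLAGS="--name custom-spark --tgz -Psparkr -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pyarn"

echo "cd $SPARK_SRC"
echo "./dev/change-scala-version.sh 2.11"
echo "./build/mvn $MVN_FLAGS clean package"
echo "./dev/make-distribution.sh $DIST_FLAGS"
```

On success, make-distribution.sh leaves a spark-*.tgz tarball in the source root.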