Big data learning notes, pitfalls series: Secondary NameNode failed to start via Ambari

xiaoxiao2021-02-28  26

The error log is as follows:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/snamenode.py", line 143, in <module>
    SNameNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/snamenode.py", line 51, in start
    snamenode(action="start")
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_snamenode.py", line 47, in snamenode
    create_log_dir=True
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 267, in service
    Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start secondarynamenode'' returned 1. starting secondarynamenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-secondarynamenode-slaver1.out
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000c8000000, 939524096, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 939524096 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /var/log/hadoop/hdfs/hs_err_pid30918.log

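The error at the end of the log says that not enough memory could be allocated to the JVM: committing 939524096 bytes (896 MiB) failed with errno=12. I don't know Java, so I went with the simplest fix I came across, the old standby: reboot the server.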
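Before rebooting, it can help to compare what the JVM asked for with what the node actually has free. The following is only a sketch: the function names are made up for this note, and it assumes the kernel exposes a MemAvailable line in /proc/meminfo.

```python
import os
import re

# Matches the byte count in the HotSpot warning quoted in the log above.
_COMMIT_RE = re.compile(
    r"os::commit_memory\(0x[0-9a-fA-F]+,\s*(\d+),\s*\d+\)\s*failed")


def requested_bytes(log_text):
    """Return the number of bytes the JVM failed to commit, or None."""
    match = _COMMIT_RE.search(log_text)
    return int(match.group(1)) if match else None


def mem_available_bytes(meminfo_text):
    """Return MemAvailable from /proc/meminfo-style text in bytes, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) * 1024  # the file reports kB
    return None


if __name__ == "__main__":
    warning = ("os::commit_memory(0x00000000c8000000, 939524096, 0) failed; "
               "error='Cannot allocate memory' (errno=12)")
    print("JVM requested %.0f MiB" % (requested_bytes(warning) / 2.0 ** 20))
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as fh:
            available = mem_available_bytes(fh.read())
        if available is not None:
            print("MemAvailable: %.0f MiB" % (available / 2.0 ** 20))
```

If MemAvailable is well below the requested figure, a reboot only helps as long as nothing else claims the memory again; lowering the daemon's heap size or adding RAM or swap would be the longer-term options.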

After the reboot, ambari-agent start failed to bring the agent up. I then remembered that I had not disabled Transparent Huge Pages:

# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
# echo never > /sys/kernel/mm/transparent_hugepage/defrag
# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
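The value in square brackets is the active setting. Values written into /sys with echo do not survive a reboot, so the check is worth repeating after every restart. A small sketch of that check follows; the helper name active_thp_mode is invented for this note.

```python
import os
import re

THP_FILES = (
    "/sys/kernel/mm/transparent_hugepage/enabled",
    "/sys/kernel/mm/transparent_hugepage/defrag",
)


def active_thp_mode(sysfs_text):
    """Return the bracketed value, e.g. 'always' for '[always] madvise never'."""
    match = re.search(r"\[(\w+)\]", sysfs_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    for path in THP_FILES:
        if not os.path.exists(path):
            print("%s: not present on this system" % path)
            continue
        with open(path) as fh:
            print("%s: %s" % (path, active_thp_mode(fh.read())))
```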

I ran ambari-agent start again. The output was:

Verifying Python version compatibility...
Using python  /usr/bin/python
Checking for previously running Ambari Agent...
/var/run/ambari-agent/ambari-agent.pid found with no process. Removing 1391...
Starting ambari-agent
Verifying ambari-agent process status...
Ambari Agent successfully started
Agent PID at: /var/run/ambari-agent/ambari-agent.pid
Agent out at: /var/log/ambari-agent/ambari-agent.out
Agent log at: /var/log/ambari-agent/ambari-agent.log

It reported success, but when I ran ambari-agent status I found I had celebrated too early. The output was:

Found ambari-agent PID: 1472

ambari-agent not running. Stale PID File at: /var/run/ambari-agent/ambari-agent.pid    

The agent had not actually started. I deleted ambari-agent.pid and started the agent again.
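A stale PID file means the file names a process that no longer exists. The sketch below shows one common way to test for that, by sending signal 0 to the recorded PID. It is an independent illustration, not the check ambari-agent itself performs, and pid_file_status is a made-up name.

```python
import errno
import os


def pid_file_status(pid_path):
    """Return (state, pid); state is 'missing', 'invalid', 'running' or 'stale'."""
    try:
        with open(pid_path) as fh:
            pid = int(fh.read().strip())
    except (IOError, OSError):
        return ("missing", None)
    except ValueError:
        return ("invalid", None)
    if pid <= 0:
        return ("invalid", None)
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except OSError as exc:
        if exc.errno == errno.ESRCH:
            return ("stale", pid)
        # Any other error (e.g. EPERM) means the process exists
        # but belongs to another user.
    return ("running", pid)


if __name__ == "__main__":
    print(pid_file_status("/var/run/ambari-agent/ambari-agent.pid"))
```

A "stale" result right after a start that claimed success suggests the agent died during startup, so /var/log/ambari-agent/ambari-agent.out is the next place to look.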

It reported the following error:

[root@slaver1 ~]# ambari-agent start
Verifying Python version compatibility...
Using python  /usr/bin/python
Checking for previously running Ambari Agent...
Starting ambari-agent
Verifying ambari-agent process status...
ERROR: ambari-agent start failed. For more details, see /var/log/ambari-agent/ambari-agent.out:
====================
  File "/usr/lib/python2.6/site-packages/ambari_agent/Hardware.py", line 44, in __init__
    self.hardware.update(Facter().facterInfo())
  File "/usr/lib/python2.6/site-packages/ambari_agent/Facter.py", line 522, in facterInfo
    facterInfo = super(FacterLinux, self).facterInfo()
  File "/usr/lib/python2.6/site-packages/ambari_agent/Facter.py", line 217, in facterInfo
    facterInfo['ipaddress'] = self.getIpAddress()
  File "/usr/lib/python2.6/site-packages/ambari_agent/Facter.py", line 73, in getIpAddress
    return socket.gethostbyname(self.getFqdn().lower())
gaierror: [Errno -3] Temporary failure in name resolution
====================
Agent out at: /var/log/ambari-agent/ambari-agent.out
Agent log at: /var/log/ambari-agent/ambari-agent.log

As the gaierror line shows, the hostname could not be resolved.
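The failing call in the traceback is socket.gethostbyname(self.getFqdn().lower()), so the failure can be reproduced outside the agent. In the sketch below, socket.getfqdn() stands in for the agent's getFqdn(), which is an assumption since the agent may derive the name differently. The resolver parameter exists only so the function can be exercised without real DNS.

```python
import socket


def check_resolution(name=None, resolver=socket.gethostbyname):
    """Resolve name to an IPv4 address; return (ok, name, address_or_error).

    With name=None the local FQDN is used, lowercased as in the traceback.
    """
    if name is None:
        name = socket.getfqdn().lower()
    try:
        return (True, name, resolver(name))
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return (False, name, str(exc))
```

Calling check_resolution() with no arguments on the affected node should report the same "Temporary failure in name resolution" until the name can be resolved again.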

# hostname

slaver1.novalocal

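To my surprise, there was a domain suffix on the end. I remembered that last time I had probably only set the hostname temporarily, so after the reboot it reverted to the name in /etc/hostname.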

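# hostnamectl set-hostname slaver1
This is the command for permanently changing the hostname on CentOS 7 and later; on earlier versions you have to edit the configuration file instead.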
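The post does not show /etc/hosts. Assuming name resolution on this node relies on that file (common on small clusters, but an assumption here), a quick way to confirm that the new hostname is listed is to parse it. The function name and the sample address below are hypothetical.

```python
def names_in_hosts(hosts_text):
    """Map each hostname or alias in hosts-file text to its address."""
    mapping = {}
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        for name in names:
            # Keep the first address seen for a name.
            mapping.setdefault(name.lower(), address)
    return mapping


if __name__ == "__main__":
    sample = (
        "127.0.0.1   localhost\n"
        "192.0.2.10  slaver1   # hypothetical address\n"
    )
    known = names_in_hosts(sample)
    for candidate in ("slaver1", "slaver1.novalocal"):
        print("%s -> %s" % (candidate, known.get(candidate, "not listed")))
```

With a file like the sample, slaver1 is listed while slaver1.novalocal is not, which would be consistent with the agent failing only while the hostname carried the suffix.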

Then I restarted once more:

[root@slaver1 ~]# ambari-agent start
Verifying Python version compatibility...
Using python  /usr/bin/python
Checking for previously running Ambari Agent...
/var/run/ambari-agent/ambari-agent.pid found with no process. Removing 1581...
Starting ambari-agent
Verifying ambari-agent process status...
Ambari Agent successfully started
Agent PID at: /var/run/ambari-agent/ambari-agent.pid
Agent out at: /var/log/ambari-agent/ambari-agent.out
Agent log at: /var/log/ambari-agent/ambari-agent.log

Checking the status:

[root@slaver1 ~]# ambari-agent status
Found ambari-agent PID: 1671
ambari-agent running.
Agent PID at: /var/run/ambari-agent/ambari-agent.pid
Agent out at: /var/log/ambari-agent/ambari-agent.out
Agent log at: /var/log/ambari-agent/ambari-agent.log

When reposting, please cite the original URL: https://www.6miu.com/read-2600344.html
