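I recently installed Greenplum (GP) on Linux for testing. It was a simple single-node setup. The problem: the day after the installation, starting GP with gpstart failed. The startup output: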

20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[WARNING]:-Failed segment starts = 1 <<<<<<<<
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[INFO]:- Skipped segment starts (segments are marked down in configuration) = 0
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[INFO]:-----------------------------------------------------
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[INFO]:-
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[INFO]:-Successfully started 0 of 1 segment instances <<<<<<<<
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[INFO]:-----------------------------------------------------
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[WARNING]:-Segment instance startup failures reported
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[WARNING]:-Failed start 1 of 1 segment instances <<<<<<<<
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[WARNING]:-Review /home/gpadmin/gpAdminLogs/gpstart_20170606.log
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[INFO]:-----------------------------------------------------
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[INFO]:-Commencing parallel segment instance shutdown, please wait...
20170606:23:12:30:018059 gpstart:quickstart:gpadmin-[ERROR]:-gpstart error: Do not have enough valid segments to start the array.

The error shown was:
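
Do not have enough valid segments to start the array.

Searching online turned up suggestions that shared_buffers was set too high: it should generally be below 1000, or below 250 according to some answers. That puzzled me, because I had never changed this value. I checked /gpmaster/gpseg-1/postgresql.conf and found shared_buffers set to 125MB: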

#------------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#------------------------------------------------------------------------------

# - Memory -

shared_buffers = 125MB # inserted by initdb
#shared_buffers = 128MB # min 128kB or max_connections*16kB
                        # (change requires restart)
#temp_buffers = 8MB # min 800kB
max_prepared_transactions = 250 # can be 0 or more
                                # (change requires restart)
# Note: Increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
#work_mem = 32MB # min 64kB
#maintenance_work_mem = 64MB # min 1MB
#max_stack_depth = 2MB # min 100kB

So this was unlikely to be the cause, and the only option left was to go through the startup logs.
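
The check on shared_buffers can also be scripted, which helps because postgresql.conf often contains both an active line and a commented-out default for the same setting. Below is a minimal sketch in Python. The helper name `active_settings` is hypothetical and not part of Greenplum or PostgreSQL, and the parsing is simplified: it assumes every setting uses `name = value` and that values contain no `#` characters.

```python
def active_settings(conf_text):
    """Return {name: value} for the uncommented 'name = value' lines."""
    settings = {}
    for raw_line in conf_text.splitlines():
        # Everything after '#' is a comment (simplification: no '#' in values).
        line = raw_line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip("'")
    return settings


# A few lines taken from the excerpt above.
sample = """\
shared_buffers = 125MB # inserted by initdb
#shared_buffers = 128MB # min 128kB or max_connections*16kB
#temp_buffers = 8MB # min 800kB
max_prepared_transactions = 250 # can be 0 or more
"""

print(active_settings(sample)["shared_buffers"])  # 125MB
```

Only the uncommented line counts, so the value in effect here is the 125MB written by initdb, not the commented 128MB default.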

[gpadmin@quickstart gpseg-1]$ cd /home/gpadmin/gpAdminLogs/
[gpadmin@quickstart gpAdminLogs]$ ls
gpconfig_20170518.log                          gpstart_20170515.log
gpinitsystem_20170514.log                      gpstart_20170518.log
gpinitsystem_20170606.log                      gpstart_20170519.log
gprecoverseg_20170518.log                      gpstart_20170523.log
gpsegstart.py_quickstart:gpadmin_20170514.log  gpstart_20170606.log
gpsegstart.py_quickstart:gpadmin_20170606.log  gpstate_20170518.log
gpsegstop.py_quickstart:gpadmin_20170514.log   gpstop_20170514.log
gpsegstop.py_quickstart:gpadmin_20170515.log   gpstop_20170515.log
gpsegstop.py_quickstart:gpadmin_20170518.log   gpstop_20170518.log
gpsegstop.py_quickstart:gpadmin_20170606.log   gpstop_20170606.log
gpstart_20170514.log
[gpadmin@quickstart gpAdminLogs]$ more gpstart_20170606.log

The startup log showed that a command had failed because the hostname gpServer could not be resolved:

20170606:23:12:16:018059 gpstart:quickstart:gpadmin-[INFO]:- quickstart.cloudera /home/gpadmin/primary/gpseg0 40000
20170606:23:12:23:018059 gpstart:quickstart:gpadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[INFO]:-Process results...
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[ERROR]:-No segment started for content: 0.
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[INFO]:-dumping success segments: []
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[INFO]:-----------------------------------------------------
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[INFO]:-DBID:2 FAILED host:'quickstart.cloudera' datadir:'/home/gpadmin/primary/gpseg0' with reason:'cmd had rc=255 completed=True halted=False stdout='' stderr='ssh: Could not resolve hostname gpServer: Name or service not known^M ''
20170606:23:12:29:018059 gpstart:quickstart:gpadmin-[INFO]:-----------------------------------------------------

This was strange. I had added that entry to the hosts file during the installation the day before, but when I checked, it really was gone. In other words, the entry disappeared after the machine rebooted, and I still have not worked out why. Adding gpServer to the hosts file again fixed the startup.
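
The diagnosis above comes down to two steps: pull the unresolvable hostname out of the log, then check whether the hosts file lists it. Below is a rough Python sketch of that check. All function names are hypothetical, the sample hosts content is invented for illustration (the post does not show the real file), and names are compared case-insensitively as a simplification.

```python
import re


def unresolved_hosts(log_text):
    """Hostnames that ssh reported as unresolvable, in sorted order."""
    pattern = r"Could not resolve hostname ([^\s:]+)"
    return sorted(set(re.findall(pattern, log_text)))


def hosts_file_names(hosts_text):
    """All hostnames and aliases listed in hosts-file text, lowercased."""
    names = set()
    for raw_line in hosts_text.splitlines():
        fields = raw_line.split("#", 1)[0].split()
        # The first field is the IP address; the rest are names and aliases.
        names.update(name.lower() for name in fields[1:])
    return names


def missing_host_entries(log_text, hosts_text):
    """Hostnames the log complains about that the hosts text does not list."""
    known = hosts_file_names(hosts_text)
    return [h for h in unresolved_hosts(log_text) if h.lower() not in known]


sample_log = (
    "DBID:2 FAILED host:'quickstart.cloudera' "
    "datadir:'/home/gpadmin/primary/gpseg0' with reason:'cmd had rc=255 "
    "completed=True halted=False stdout='' stderr='ssh: Could not resolve "
    "hostname gpServer: Name or service not known'"
)
# Hypothetical hosts content; the real file from the post is not shown.
sample_hosts = "127.0.0.1 localhost quickstart.cloudera\n"

print(missing_host_entries(sample_log, sample_hosts))  # ['gpServer']
```

On a live machine the same functions could be given the contents of the gpstart log and /etc/hosts, and `socket.gethostbyname(name)` from the standard library would exercise the real resolver instead of only inspecting the file.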

The lesson: when reading logs, go straight for the key cause of the problem. Taking detours is the most painful part.