[Question] Server occasionally reboots on its own

Board: Linux  Author: (玻璃做的大叔)  Time: 2016/03/13 14:13  Push/boo: 1 (107)
8 comments, 4 participants, thread 1/1
I'm running CentOS 6.7. I'm sure I have not scheduled any reboot: crontab has only three jobs. Two of them delete files older than one month from specific folders, and the third is yum -y update. All of them run in the early morning.

The machine has rebooted twice in the past month. At first I suspected a power outage or an unstable power supply, but after checking the messages log it looks like a normal (clean) reboot:

Mar 9 17:00:45 ORZ kernel: r8169 0000:02:00.0: eth0: link down
Mar 9 17:03:38 ORZ kernel: r8169 0000:02:00.0: eth0: link up
Mar 9 17:03:51 ORZ init: tty (/dev/tty1) main process (2231) killed by TERM signal
Mar 9 17:03:51 ORZ init: tty (/dev/tty2) main process (2233) killed by TERM signal
Mar 9 17:03:51 ORZ init: tty (/dev/tty3) main process (2235) killed by TERM signal
Mar 9 17:03:51 ORZ init: tty (/dev/tty4) main process (2237) killed by TERM signal
Mar 9 17:03:51 ORZ init: tty (/dev/tty5) main process (2239) killed by TERM signal
Mar 9 17:03:51 ORZ init: tty (/dev/tty6) main process (2243) killed by TERM signal
Mar 9 17:03:52 ORZ abrtd: Got signal 15, exiting
Mar 9 17:03:53 ORZ xinetd[2038]: Exiting...
Mar 9 17:03:57 ORZ acpid: exiting
Mar 9 17:04:08 ORZ openvpn[1697]: /sbin/ip route del 192.168.100.0/24
Mar 9 17:04:08 ORZ openvpn[1697]: ERROR: Linux route delete command failed: external program exited with error status: 2
Mar 9 17:04:08 ORZ openvpn[1697]: Closing TUN/TAP interface
Mar 9 17:04:08 ORZ openvpn[1697]: /sbin/ip addr del dev tun0 local 192.168.100.1 peer 192.168.100.2
Mar 9 17:04:08 ORZ openvpn[1697]: Linux ip addr del failed: external program exited with error status: 2
Mar 9 17:04:08 ORZ init: Disconnected from system bus
Mar 9 17:04:08 ORZ openvpn[1697]: SIGTERM[hard,] received, process exiting
Mar 9 17:04:08 ORZ console-kit-daemon[31704]: WARNING: no sender#012
Mar 9 17:04:08 ORZ rpcbind: rpcbind terminating on signal. Restart with "rpcbind -w"
Mar 9 17:04:08 ORZ auditd[1490]: The audit daemon is exiting.
Mar 9 17:04:08 ORZ kernel: type=1305 audit(1457514248.550:7020): audit_pid=0 old=1490 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditd_t:s0 res=1
Mar 9 17:04:08 ORZ kernel: type=1305 audit(1457514248.664:7021): audit_enabled=0 old=1 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditctl_t:s0 res=1
Mar 9 17:04:08 ORZ kernel: Kernel logging (proc) stopped.
Mar 9 17:04:08 ORZ rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="1526" x-info="http://www.rsyslog.com"] exiting on signal 15.
Mar 9 17:07:14 ORZ kernel: imklog 5.8.10, log source = /proc/kmsg started.
Mar 9 17:07:14 ORZ rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="1520" x-info="http://www.rsyslog.com"] start
Mar 9 17:07:14 ORZ kernel: Initializing cgroup subsys cpuset
Mar 9 17:07:14 ORZ kernel: Initializing cgroup subsys cpu

So something must have issued a reboot command, right? Also, last shows:

reboot system boot 2.6.32-573.18.1. Wed Mar 9 17:06 - 14:06 (3+20:59)
reboot system boot 2.6.32-573.18.1. Thu Mar 3 03:06 - 17:03 (6+13:57)

Why is it rebooting?

--
First they came for the communists, and I did not speak out, because I was not a communist;
then they came for the Jews, and I did not speak out, because I was not a Jew;
then they came for the trade unionists, and I did not speak out, because I was not a trade unionist;
then they came for the Catholics, and I did not speak out, because I was a Protestant;
finally they came for me, and there was no one left to speak out for me.
"First They Came", Pastor Martin Niemoller (1892-1984)
--
※ Origin: 批踢踢實業坊 (ptt.cc), from: 180.176.36.217
※ Article URL: https://www.ptt.cc/bbs/Linux/M.1457849592.A.6F3.html
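
The reasoning in the post (services exit on SIGTERM, rsyslogd logs "exiting on signal 15", then the kernel log starts again) can be checked mechanically. Below is a minimal Python sketch, not a definitive tool: it assumes the log wording and ordering shown in the excerpt above (rsyslog 5.8.10 on CentOS 6), and the names classify_boots, SHUTDOWN_MARK and BOOT_MARK are made up for illustration. A boot whose preceding log line is not the rsyslogd exit message is only a hint of a power loss or hard reset, not proof.

```python
import re

# Markers copied from the log excerpt in the post. Other syslog daemons or
# versions may word these lines differently (assumption).
SHUTDOWN_MARK = re.compile(r"rsyslogd: .*exiting on signal 15")
BOOT_MARK = re.compile(r"kernel: imklog .*started")


def classify_boots(lines):
    """Return (timestamp, verdict) for every boot marker found in `lines`.

    verdict is "clean" when the previous non-empty log line is the rsyslogd
    exit message, "unclean" when it is anything else, and "unknown" when the
    boot marker is the first line we see.
    """
    results = []
    prev = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if BOOT_MARK.search(line):
            if prev is None:
                verdict = "unknown"
            elif SHUTDOWN_MARK.search(prev):
                verdict = "clean"
            else:
                verdict = "unclean"
            # The first three fields are the syslog timestamp, e.g. "Mar 9 17:07:14".
            timestamp = " ".join(line.split()[:3])
            results.append((timestamp, verdict))
        prev = line
    return results


# Three lines taken from the excerpt in the post, used as sample input.
SAMPLE = [
    "Mar 9 17:04:08 ORZ kernel: Kernel logging (proc) stopped.",
    'Mar 9 17:04:08 ORZ rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" '
    'x-pid="1526" x-info="http://www.rsyslog.com"] exiting on signal 15.',
    "Mar 9 17:07:14 ORZ kernel: imklog 5.8.10, log source = /proc/kmsg started.",
]

for timestamp, verdict in classify_boots(SAMPLE):
    print(timestamp, verdict)
```

To run it against a real log, pass an open file object (for example the messages log the post refers to) to classify_boots instead of SAMPLE. A "clean" verdict only says that the OS shut down in an orderly way. It does not say who or what requested the shutdown: an ACPI power-button event or a management controller can also trigger an orderly shutdown.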

03/13 16:09, 1F
Maybe check the memory or the motherboard?

03/13 20:54, 2F
I had a similar problem with an HP Z620 I bought a while back. I had the vendor replace the power supply,

03/13 20:54, 3F
and it was fine after that.

03/13 22:06, 4F
Ugh... this kind of thing is hard to cross-test! It happens so rarely.

03/14 16:04, 5F-7F
Is the machine overheating? It looks like the reboot was triggered by hardware. If it's an HP server, you can check the iLO log; on an IBM server, check the IMM log.

03/20 09:47, 8F
I'm on CentOS. I'll keep watching it for now and make frequent backups in the meantime.
Article code (AID): #1MvGJuRp (Linux)