RHEL cluster: testing a fencing override with the fence_ack_manual command
Triggering a kernel panic on hostname2 to simulate a node crash:
[root@hostname2 ~]# echo c > /proc/sysrq-trigger
Tailing the log on the other node (hostname1):
July 1 22:57 hostname1 ccsd[5959]: Update of cluster.conf complete (version 40 -> 41).
July 1 22:10 hostname1 clurgmgrd[6229]: <notice> Reconfiguring
July 1 22:15 hostname1 clurgmgrd: [6229]: <notice> Getting status
July 1 22:01 hostname1 root: seamus simulating fencedevice\site failure
July 1 22:48 hostname1 root: seamus triggred kenel panic on hostname2
July 1 22:40 hostname1 openais[5968]: [TOTEM] The token was lost in the OPERATIONAL state.
July 1 22:40 hostname1 openais[5968]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes).
July 1 22:40 hostname1 openais[5968]: [TOTEM] Transmit multicast socket send buffer size (320000 bytes).
July 1 22:40 hostname1 openais[5968]: [TOTEM] entering GATHER state from 2.
July 1 22:42 hostname1 openais[5968]: [TOTEM] entering GATHER state from 0.
July 1 22:42 hostname1 openais[5968]: [TOTEM] Creating commit token because I am the rep.
July 1 22:42 hostname1 openais[5968]: [TOTEM] Storing new sequence id for ring 1538
July 1 22:42 hostname1 openais[5968]: [TOTEM] entering COMMIT state.
July 1 22:42 hostname1 openais[5968]: [TOTEM] entering RECOVERY state.
July 1 22:42 hostname1 openais[5968]: [TOTEM] position [0] member 10.10.90.1:
July 1 22:42 hostname1 openais[5968]: [TOTEM] previous ring seq 5428 rep 10.10.90.1
July 1 22:42 hostname1 openais[5968]: [TOTEM] aru 2f high delivered 2f received flag 1
July 1 22:42 hostname1 openais[5968]: [TOTEM] Did not need to originate any messages in recovery.
July 1 22:42 hostname1 openais[5968]: [TOTEM] Sending initial ORF token
July 1 22:42 hostname1 openais[5968]: [CLM ] CLM CONFIGURATION CHANGE
July 1 22:42 hostname1 openais[5968]: [CLM ] New Configuration:
July 1 22:42 hostname1 openais[5968]: [CLM ] r(0) ip(10.10.90.1)
July 1 22:42 hostname1 openais[5968]: [CLM ] Members Left:
July 1 22:42 hostname1 openais[5968]: [CLM ] r(0) ip(10.10.90.2)
July 1 22:42 hostname1 kernel: dlm: closing connection to node 2
July 1 22:42 hostname1 openais[5968]: [CLM ] Members Joined:
July 1 22:42 hostname1 fenced[5990]: hostname2.domain.private not a cluster member after 0 sec post_fail_delay
July 1 22:42 hostname1 openais[5968]: [CLM ] CLM CONFIGURATION CHANGE
July 1 22:42 hostname1 fenced[5990]: fencing node "hostname2.domain.private"
July 1 22:42 hostname1 openais[5968]: [CLM ] New Configuration:
July 1 22:42 hostname1 openais[5968]: [CLM ] r(0) ip(10.10.90.1)
July 1 22:42 hostname1 openais[5968]: [CLM ] Members Left:
July 1 22:42 hostname1 openais[5968]: [CLM ] Members Joined:
July 1 22:42 hostname1 openais[5968]: [SYNC ] This node is within the primary component and will provide service.
July 1 22:42 hostname1 openais[5968]: [TOTEM] entering OPERATIONAL state.
July 1 22:42 hostname1 openais[5968]: [CLM ] got nodejoin message 10.10.90.1
July 1 22:42 hostname1 openais[5968]: [CPG ] got joinlist message from node 1
July 1 22:50 hostname1 fenced[5990]: agent "fence_ipmilan" reports: Rebooting machine @ IPMI:192.168.214.33...Failed
July 1 22:50 hostname1 fenced[5990]: fence "hostname2.domain.private" failed
July 1 22:18 hostname1 root: seamus fencing failing due to incorrect ipmi password
July 1 22:48 hostname1 fenced[5990]: fencing node "hostname2.domain.private"
July 1 22:56 hostname1 fenced[5990]: agent "fence_ipmilan" reports: Rebooting machine @ IPMI:192.168.214.33...Failed
July 1 22:56 hostname1 fenced[5990]: fence "hostname2.domain.private" failed
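The repeated failures above can be counted straight from the log before deciding whether a manual override is needed. This is a minimal sketch, not part of the cluster suite: the helper name count_fence_failures is hypothetical, and the log path in the usage note is an assumption (the notes above do not say which file was being tailed).

```shell
#!/bin/sh
# count_fence_failures NODE LOGFILE
# Hypothetical helper: prints how many times fenced logged
#   fence "<NODE>" failed
# in LOGFILE. Uses a fixed-string match (-F) so the dots in the
# node name are not treated as regex wildcards.
count_fence_failures() {
    node="$1"
    logfile="$2"
    grep -F -c "fence \"$node\" failed" "$logfile"
}
```

For example, assuming the cluster daemons log to /var/log/messages: count_fence_failures hostname2.domain.private /var/log/messages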
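Both fence_ipmilan attempts fail because of the incorrect IPMI password. Running the fence_ack_manual command on hostname1 to override the failed fence: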
[root@hostname1 ~]# fence_ack_manual -e -n hostname2.domain.private
Warning: If the node "hostname2.domain.private" has not been manually fenced
(i.e. power cycled or disconnected from shared storage devices)
the GFS file system may become corrupted and all its data
unrecoverable! Please verify that the node shown above has
been reset or disconnected from storage.
Are you certain you want to continue? [yN] y
July 1 22:06 hostname1 root: seamus running fence_ack_manual -e -n hostname2.domain.private
July 1 22:23 hostname1 fenced[5990]: fence "hostname2.domain.private" overridden by administrator intervention
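To confirm afterwards that fenced accepted the override, the log can be searched for the message shown in the last line above. A small sketch under the same assumptions as before: fence_overridden is a hypothetical helper name, and the log file location is not stated in these notes.

```shell
#!/bin/sh
# fence_overridden NODE LOGFILE
# Hypothetical helper: exits 0 if LOGFILE contains the fenced message
#   fence "<NODE>" overridden by administrator intervention
# and non-zero otherwise. Prints nothing (-q).
fence_overridden() {
    node="$1"
    logfile="$2"
    grep -F -q "fence \"$node\" overridden by administrator intervention" "$logfile"
}
```

For example, assuming /var/log/messages is the log in use: fence_overridden hostname2.domain.private /var/log/messages && echo "override recorded"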