Thursday, September 1, 2016

Ceph Monitors Deadlock

Introduction
Part of my role is general maintenance of the QE department’s Ceph cluster. It usually doesn’t consume much of my time: general monitoring, providing keyrings and pools, and so on. You know, general management.
‘My’ cluster has 3 servers with 9 disks each:
  • 2 for the OS, RHEL 7 (RAID 1)
  • 1 SSD for journaling
  • 6 disks as OSDs
And it works pretty well.

The Problem

A colleague asked me a question first thing in the morning about the relationship between Ceph and OpenStack. As a firm believer in teaching by example, I logged in to one of the servers and ran the rbd command to list the images in the pool.
$ sudo rbd -p <pool name> --id <client> ls
The client failed to connect to the monitors, all 3 of them
2016-09-01 14:00:04.946448 7f2a2d2cc700  0 -- <IP address>:6789/0 >> <IP address>:6789/0 pipe(0x4cee000 sd=13 :0 s=1 pgs=0 cs=0 l=0 c=0x4967080).fault

Troubleshooting

What I could go with

First of all, when the RBD client fails to connect, it probably means that the ceph client will fail as well. Thus there is no reason, IMO, to check the cluster health
$ sudo ceph health
The reply would be the same.
The first thing that came to mind was checking the monitor daemon status on every server in the cluster
$ sudo service ceph status mon
The result was
=== mon.ceph1 ===
mon.ceph1: not running.
OK, so the daemon is down. Let us bring it back up
$ sudo service ceph start mon
No joy - the daemon was still down.
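Since the same status check has to be repeated on every server, it can help to classify the output instead of reading it by eye. This is a minimal sketch that assumes the two output formats that appear in this post; `mon_state` is a hypothetical helper name, not part of Ceph.

```shell
# Classify one line of `service ceph status mon` output.
# Assumes only the two formats seen in this post:
#   mon.ceph1: not running.
#   mon.ceph1: running {"version":"..."}
mon_state() {
    case "$1" in
        *": not running"*) echo "down" ;;
        *": running"*)     echo "up" ;;
        *)                 echo "unknown" ;;
    esac
}

# Example, on a monitor host:
#   sudo service ceph status mon | grep '^mon\.' | while read -r line; do
#       mon_state "$line"
#   done
```

The "not running" pattern is tested first because a "running" pattern alone would not tell the two cases apart reliably.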
After that, I went to the Ceph monitor log, /var/log/ceph/ceph-mon-ceph.1.log, which showed the following entries. Two messages stood out to me, starting with:
2016-09-01 09:23:49.490950 7efd5a7137c0 -1 WARNING: 'mon addr' config option 10.35.65.98:0/0 does not match monmap file continuing with monmap configuration
With this line as the punchline:
2016-09-01 09:23:49.762012 7efd5021c700  0 cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final round failed:-8190
So the problem is either with the monitors’ keyring, meaning that authentication failed, or with the monitor map configuration.
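To pull just these two signatures out of a long monitor log, a small grep wrapper is enough. This is a sketch only: the patterns are copied from the two messages quoted in this post, and `mon_log_suspects` is a hypothetical helper name.

```shell
# Print only log lines that match the two messages discussed in this post.
# With no file arguments, the function reads from standard input.
mon_log_suspects() {
    grep -E "does not match monmap|verify_authorizer could not decrypt" "$@"
}

# Example, on a monitor host:
#   sudo cat <monitor log file> | mon_log_suspects
```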

Dead ends (but should be checked)

  • The keyrings of the monitors were identical, so no authentication problem (it still might be a permission issue, where the daemon fails to read the file)
  • The NTP service is up and running, and all the clocks are in sync
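The keyring comparison above can be done by checksum once the files are gathered on one host. This is a hedged sketch: `files_identical` is a hypothetical helper, and the keyring location mentioned in the comment is the usual default, which is an assumption about this cluster.

```shell
# Succeed only if every file given has the same SHA-256 checksum.
# Returns 2 if any file cannot be read.
files_identical() {
    sums=$(sha256sum "$@") || return 2
    [ "$(printf '%s\n' "$sums" | awk '{print $1}' | sort -u | grep -c .)" -eq 1 ]
}

# Example, after copying each monitor's keyring to the current directory
# (assumed default location on each server: /var/lib/ceph/mon/ceph-<id>/keyring):
#   files_identical keyring.ceph1 keyring.ceph2 keyring.ceph3 && echo "keyrings match"
```

A matching checksum only rules out differing content. It does not rule out the permission issue, which needs a look at the file owner and mode on each server.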

The Solution

Fixing this issue required the monmaptool command.
Though Sébastien Han recommends not doing this on a live cluster, I did it anyhow, accepting the minor risk of data loss in a staging environment.
I got the cluster FSID from /etc/ceph/ceph.conf and created a new monmap with monmaptool
$ sudo monmaptool --create --add ceph1 <IP address>:6789 --add ceph2 <IP address>:6789 --add ceph3 <IP address>:6789 --fsid <Ceph’s cluster FSID> --clobber monmap
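With three monitors it is easy to drop an argument from that long command line, so it can be worth generating it. The sketch below reads the FSID from a ceph.conf-style file and prints the monmaptool command without running it. Both function names are hypothetical, and it assumes the file contains a single `fsid = <value>` line.

```shell
# Read the fsid value from a ceph.conf-style file (first match wins).
get_fsid() {
    awk -F= '/^[ \t]*fsid[ \t]*=/ { gsub(/[ \t]/, "", $2); print $2; exit }' "$1"
}

# Print (do not run) a monmaptool command.
# Arguments: fsid, output file, then "name ip" pairs for each monitor.
monmap_cmd() {
    fsid=$1; out=$2; shift 2
    cmd="monmaptool --create"
    while [ "$#" -ge 2 ]; do
        cmd="$cmd --add $1 $2:6789"
        shift 2
    done
    echo "$cmd --fsid $fsid --clobber $out"
}

# Example:
#   monmap_cmd "$(get_fsid /etc/ceph/ceph.conf)" monmap \
#       ceph1 <IP address> ceph2 <IP address> ceph3 <IP address>
```

After creating the file, `monmaptool --print monmap` shows its contents, which is a cheap check before copying it to the other servers.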
Once the file was ready, I copied it to all the servers in the cluster and stopped all the Ceph daemons
$ sudo service ceph stop


Now that the cluster is down and out, I can inject the newly created map into the monitors
$ sudo ceph-mon -i ceph<X> --inject-monmap monmap
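Because injecting a map overwrites what the monitor had, one cautious variation is to save the old map first with `ceph-mon --extract-monmap`. This was not part of the procedure described in this post. The sketch below only prints the two commands for review; `inject_plan` and the backup file name are hypothetical.

```shell
# Print the commands to back up, then replace, one monitor's map.
# Nothing is executed; run the printed lines by hand with the monitor stopped.
# Arguments: monitor id (e.g. ceph1), path to the new monmap file.
inject_plan() {
    id=$1; new_map=$2
    echo "ceph-mon -i $id --extract-monmap monmap.$id.bak"
    echo "ceph-mon -i $id --inject-monmap $new_map"
}

# Example:
#   inject_plan ceph1 monmap
```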
Timidly, I started the monitor daemons together (as much as I could) and behold!
=== mon.ceph1 ===
mon.ceph1: running {"version":"0.94.5-9.el7cp"}
Afterwards, the rest of the Ceph daemons could be started
$ sudo service ceph start
And the cluster status is HEALTH_OK
