Operating the Ceph Monitors (ceph-mon)

Adding ceph-mon daemons (VM, jewel/luminous)

Upstream documentation at http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/

Create the machine for the mon

Normally we create ceph-mons as VMs in the ceph/{hg_name}/mon hostgroup.

Example: Adding a monitor to the ceph/test cluster:

  • First, source the IT Ceph Storage Service environment on aiadm: link
  • Then create a virtual machine with the following parameters:
    • main-user/responsible: ceph-admins (the user of the VM)
    • VM Flavor: m2.2xlarge (monitors must withstand heavy loads)
    • OS: CentOS 7 (the preferred OS used in CERN applications)
    • Hostgroup: ceph/test/mon (naming convention for puppet configuration)
    • VM name: cephtest-mon- (we use a prefix to generate an id)
    • Availability zone: usually cern-geneva-[a/b/c]

Example command: (It will create a VM with the above parameters)

$ ai-bs --landb-mainuser ceph-admins --landb-responsible ceph-admins --nova-flavor m2.2xlarge \
    --cc7 -g ceph/test/mon --prefix cephtest-mon- --nova-availabilityzone cern-geneva-a \
    --nova-sshkey {your_openstack_key}

This command will create a VM named cephtest-mon-XXXXXXXXXX in the ceph/test/mon hostgroup. Puppet will take care of the initialization of the machine.

When you deploy a monitor server, you have to choose an availability zone. We spread the monitors of a cluster across different availability zones to avoid a single point of failure.
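
To spread monitors across zones, the ai-bs call from the example can be repeated once per zone. The sketch below only prints the commands (a dry run) so they can be reviewed before being run on aiadm. The helper name `mon_create_cmd`, the zone list and the key name `my_key` are illustrative assumptions; the ai-bs flags are copied from the example command.

```sh
# Dry-run sketch: print one ai-bs command per availability zone.
# mon_create_cmd is a hypothetical helper, not part of the ai-tools.
mon_create_cmd() {
    zone="$1"   # e.g. cern-geneva-a
    key="$2"    # name of your OpenStack SSH key (placeholder)
    printf '%s\n' "ai-bs --landb-mainuser ceph-admins --landb-responsible ceph-admins --nova-flavor m2.2xlarge --cc7 -g ceph/test/mon --prefix cephtest-mon- --nova-availabilityzone ${zone} --nova-sshkey ${key}"
}

# One monitor in each of the three usual zones.
for z in a b c; do
    mon_create_cmd "cern-geneva-${z}" "my_key"
done
```

Printing instead of executing keeps the sketch safe to run anywhere; paste the reviewed lines into a shell where the Ceph service environment has been sourced.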

Set roger state and enable alarming

Set the appstate and app_alarmed parameters if necessary

Example: Get the roger data for the VM cephtest-mon-d8788e3256

$ roger show cephtest-mon-d8788e3256

The output should be something similar to this:

[
    {
        "app_alarmed": false,
        "appstate": "build",
        "expires": "",
        "hostname": "cephtest-mon-d8788e3256.cern.ch",
        "hw_alarmed": true,
        "message": "",
        "nc_alarmed": true,
        "os_alarmed": true,
        "update_time": "1506418702",
        "update_time_str": "Tue Sep 26 11:38:22 2017",
        "updated_by": "tmourati",
        "updated_by_puppet": false
    }
]

You need to set the machine's state to "production", so it can be used in production.

The following command will set the target VM to production state:

$ roger update --appstate production --all_alarms=true cephtest-mon-XXXXXXXXXX

Now roger show {host} should return something like this:

[
    {
        "app_alarmed": true,
        "appstate": "production",
        "..."
    }
]
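
To confirm the change without reading the whole record, the appstate field can be pulled out of the roger show output. This is a minimal sketch that assumes the JSON layout shown above (one field per line) and uses only sed; `roger_appstate` is a hypothetical helper name.

```sh
# Hypothetical helper: read `roger show` JSON on stdin and print the appstate value.
# Assumes the field appears as "appstate": "<value>" on a single line.
roger_appstate() {
    sed -n 's/.*"appstate"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage (on a host where roger is available):
#   roger show cephtest-mon-XXXXXXXXXX | roger_appstate
```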

We now let puppet configure the machine. This takes a while, as puppet needs about two configuration cycles to apply the desired changes. After the second cycle you can SSH (as root) to the machine to check that everything is OK.

For example, you can check the cluster's status with $ ceph -s

You should see the current host in the monitor quorum.
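
The quorum check can also be scripted. The sketch below reads the JSON printed by `ceph quorum_status` and tests whether a given monitor name appears in its "quorum_names" list. The helper name `mon_in_quorum` is hypothetical, and it is an assumption that the monitor is registered under the short hostname; adjust the argument if your cluster names its mons differently.

```sh
# Hypothetical helper: succeed if the given mon name is listed in "quorum_names".
# Reads the JSON output of `ceph quorum_status` on stdin.
mon_in_quorum() {
    name="$1"
    # Strip whitespace so the list sits on one line, cut out the list, then look for the name.
    tr -d ' \n\t' |
        sed -n 's/.*"quorum_names":\[\([^]]*\)\].*/\1/p' |
        grep -qF "\"${name}\""
}

# Usage (on a cluster node):
#   ceph quorum_status | mon_in_quorum "$(hostname -s)" && echo "in quorum"
```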

Details on lbalias for mons

We prefer not to use the load-balancing service and lbclient here (https://configdocs.web.cern.ch/dnslb/). There is no scenario in ceph where we want a mon to disappear from the alias.

For a bare metal node

Instead, we use the --LOAD-N- approach to create the alias with all the mons:

  • Go to network.cern.ch
  • Click on Update information and use the FQDN of the mon machine
    • If prompted, make sure you select the host interface and not the IPMI one
  • Add "ceph{hg_name}--LOAD-N-" to the list IP Aliases under TCP/IP Interface Information
  • Multiple aliases are supported. Use a comma-separated list
  • Check the changes are correct and submit the request

For an OpenStack VM

In the case of a VM, we can't directly set an alias, but we can set a property in OpenStack to the same effect:

  • Log onto aiadm or lxplus
  • Set your environment variables to the correct tenant, e.g. `eval $(ai-rc 'Ceph Development')`
    • Check the variables are what you expect with `env | grep OS`, paying attention to OS_REGION_NAME
  • Set the alias using openstack with `openstack server set --property landb-alias=CEPH{hg_name}--LOAD-N- {hostname}`
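
Since every mon VM of a cluster needs the same property, the command can be generated per host. The following dry-run sketch prints the openstack command for each VM rather than executing it. `landb_alias_cmd` and the two hostnames are hypothetical placeholders; the alias pattern is the one given in the step above.

```sh
# Dry-run sketch: print the openstack command that sets the alias on a mon VM.
# landb_alias_cmd is a hypothetical helper name.
landb_alias_cmd() {
    hg_name="$1"    # cluster hostgroup name, e.g. test
    host="$2"       # VM hostname
    printf 'openstack server set --property landb-alias=CEPH%s--LOAD-N- %s\n' "$hg_name" "$host"
}

# Placeholder hostnames; replace with the real mon VMs of the cluster.
for h in cephtest-mon-aaaaaaaaaa cephtest-mon-bbbbbbbbbb; do
    landb_alias_cmd test "$h"
done
```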

Removing a ceph-mon daemon (jewel)

Upstream documentation at http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/

Prerequisites

  1. The cluster must be in HEALTH_OK state, i.e. the monitor must be in a healthy quorum.
  2. You should have a replacement for the current monitor already in the quorum, and there should be enough monitors for the cluster to stay healthy after one monitor is removed. Normally this means that we should have about 4 monitors in the quorum before starting.
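
These two prerequisites can be encoded as a small guard that is run before starting. This is only a sketch: `mon_removal_precheck` is a hypothetical helper, the health string and the quorum size are passed in by the operator, and the default threshold of 4 is the rule of thumb from the list above.

```sh
# Hypothetical pre-check: refuse to proceed unless the cluster is HEALTH_OK
# and at least min_mons monitors are in quorum.
mon_removal_precheck() {
    health="$1"          # first word of `ceph health`, e.g. HEALTH_OK
    mons="$2"            # number of monitors currently in quorum
    min_mons="${3:-4}"   # assumed default, taken from the prerequisites
    [ "$health" = "HEALTH_OK" ] || { echo "cluster is $health, not HEALTH_OK"; return 1; }
    [ "$mons" -ge "$min_mons" ] || { echo "only $mons mons in quorum, need $min_mons"; return 1; }
    echo "ok to proceed"
}

# Usage (quorum size read off `ceph -s` by the operator):
#   mon_removal_precheck "$(ceph health | cut -d' ' -f1)" 4
```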

Procedure

  1. Disable puppet: $ puppet agent --disable 'decommissioning mon'
  2. (If needed) remove the DNS alias from this machine and wait until the removal has taken effect:
     - For physical machines, visit http://network.cern.ch → "Update Information".
     - For a VM monitor, you can remove the alias from the `landb-alias` property. See [Cloud Docs](https://clouddocs.web.cern.ch/clouddocs/using_openstack/properties.html)
  3. Check if the monitor is ok-to-stop: $ ceph mon ok-to-stop <hostname>
  4. Stop the monitor: $ systemctl stop ceph-mon.target. You should now get a HEALTH_WARN status by running $ ceph -s, for example 1 mons down, quorum 1,2,3,4,5.
  5. Remove the monitor's configuration, data and secrets with:
```sh
$ rm /var/lib/ceph/tmp/keyring.mon.*
$ rm -rf /var/lib/ceph/mon/<hostname>
```
  6. Remove the monitor from the ceph cluster:
```sh
$ ceph mon rm <hostname>
removing mon.<hostname> at <IP>:<port>, there will be 5 monitors
```
  7. You should now have a HEALTH_OK status after the monitor removal.
  8. (If monitored by prometheus) remove the hostname from the list of endpoints to monitor. See it-puppet-hostgroup-ceph/data/hostgroup/ceph/prometheus.yaml
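
For review before a decommissioning, the on-host commands of this procedure can be printed in order as a checklist. The sketch below executes nothing. `mon_removal_plan` is a hypothetical helper, and the DNS alias and prometheus steps are left out because they are done outside the monitor host.

```sh
# Dry-run sketch: print the on-host removal commands in order for one monitor.
# Nothing is executed; run the printed commands by hand after checking each result.
mon_removal_plan() {
    host="$1"
    cat <<EOF
puppet agent --disable 'decommissioning mon'
ceph mon ok-to-stop ${host}
systemctl stop ceph-mon.target
rm /var/lib/ceph/tmp/keyring.mon.*
rm -rf /var/lib/ceph/mon/${host}
ceph mon rm ${host}
ceph -s
EOF
}

mon_removal_plan cephtest-mon-XXXXXXXXXX
```

Keeping this as a printed plan rather than an automated script is deliberate: each step has an outcome (ok-to-stop, HEALTH_WARN, HEALTH_OK) that should be checked before moving on.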

For machines hosting only the ceph-mon

  1. Move this machine to a spare hostgroup: $ ai-foreman updatehost -c ceph/spare {hostname}

  2. Run puppet once: $ puppet agent -t

  3. (If physical) Reinstall the server in the ceph/spare hostgroup:

```sh
aiadm> ai-installhost p01001532077488
...
1/1 machine(s) ready to be installed
Please reboot the host(s) to start the installation:
ai-remote-power-control cycle p01001532077488.cern.ch
aiadm> ai-remote-power-control cycle p01001532077488.cern.ch
```

Now the physical machine is installed in the ceph/spare hostgroup.

  4. (If virtual) Kill the VM with: $ ai-kill-vm {hostname}

For machines hosting other Ceph daemons

  1. Move this machine to another hostgroup (e.g., /osd) of the same cluster: $ ai-foreman updatehost -c ceph/<cluster_name>/osd {hostname}
  2. Run puppet to apply the changes: $ puppet agent -t