Operating the Ceph Metadata Servers (ceph-mds)

Adding a ceph-mds daemon (VM, luminous)

Upstream documentation here: http://docs.ceph.com/docs/master/rados/deployment/ceph-deploy-mds/

The procedure follows the same pattern as adding a monitor node (create_a_mon) to the cluster.

Make sure you add your mds to the corresponding hostgroup ceph/<cluster>/mds and prepare the Puppet code (check other ceph clusters that run cephfs as a reference).

Example for the ceph/mycluster hostgroup:

$ ai-bs --landb-mainuser ceph-admins --landb-responsible ceph-admins \
 --nova-flavor m2.2xlarge --cc7 -g ceph/<mycluster>/mds --prefix ceph<mycluster>-mds- \
 --nova-availabilityzone cern-geneva-a

Note: When deploying more than one mds, make sure that they are spread across different availability zones.

As written in the upstream documentation, a ceph filesystem needs at least two metadata servers: the first is the main server that handles the clients' requests, and the second is the backup that takes over if the main one fails. Remember to place the metadata servers in different availability zones, so the filesystem survives a problem affecting a single site.

Because of resource limitations, the flavor of the machines can be m2.xlarge instead of m2.2xlarge. In the ceph/<mycluster> cluster we use two m2.2xlarge main servers and one m2.xlarge backup server.

When the machine is available (resolvable via DNS), you can set its state to production with roger.

$ roger update --appstate production --all_alarms=true ceph<mycluster>-mds-XXXXXXXXXX

After 2-3 runs of Puppet, the ceph-mds daemon should be configured, started and visible in the cluster.
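To speed things up and to verify the result, you can trigger the Puppet runs by hand and then check that the new daemon has registered with the cluster. This is a minimal sketch, assuming you can run the first command as root on the new mds node and the second on a node that has an admin keyring:

$ puppet agent -t     # on the new mds node, repeat 2-3 times
$ ceph mds stat       # on an admin node; the new mds should show up as up:standby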

Using additional metadata servers (luminous)

Upstream documentation here: http://docs.ceph.com/docs/master/cephfs/multimds/

When your cephfs deployment can't handle the volume of client requests, and you notice mds- or request-related warnings in ceph status, you may need to use multiple active metadata servers.
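Before enabling multiple active servers, confirm that the mds really is the bottleneck by looking at the cluster health. This is only a quick check, not an exhaustive diagnosis:

$ ceph status          # check the mds line and any health warnings
$ ceph health detail   # mds-related warnings show up here, e.g. slow requests or being behind on trimming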

After adding an mds to the cluster, the mds line of ceph status will show something like the following:

mds: cephfs-1/1/1 up  {0=cephironic-mds-716dc88600=up:active}, 1 up:standby-replay, 1 up:standby

The 1 up:standby-replay is the backup server, and the newly appeared 1 up:standby is the mds we just added. To make the standby server active, we need to execute the following command:

WARNING: Your cluster may have multiple filesystems, use the right one!

ceph fs set <fs_name> max_mds 2

The name of the ceph filesystem can be retrieved by using $ ceph fs ls and looking for the name: <fs_name> key-value pair.
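For example (the filesystem and pool names below are illustrative, your cluster will show its own):

$ ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]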

Now your ceph status message should look like this:

...
mds: cephfs-2/2/2 up  {0=cephironic-mds-716dc88600=up:active,1=cephironic-mds-c4fbd7ee74=up:active}, 1 up:standby-replay
...
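For a per-rank view of the active and standby daemons, ceph fs status (available on luminous) can also be used as a complement to ceph status:

$ ceph fs status <fs_name>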