Operating the Ceph Metadata Servers (ceph-mds)
Adding a ceph-mds daemon (VM, luminous)
Upstream documentation here: http://docs.ceph.com/docs/master/rados/deployment/ceph-deploy-mds/
The procedure follows the same pattern as adding a monitor node (create_a_mon) to the cluster.
Make sure you add your mds to the corresponding hostgroup ceph/<cluster>/mds and prepare the Puppet code (check other ceph clusters with cephfs as a reference). Example for the ceph/<mycluster> hostgroup:
$ ai-bs --landb-mainuser ceph-admins --landb-responsible ceph-admins \
--nova-flavor m2.2xlarge --cc7 -g ceph/<mycluster>/mds --prefix ceph<mycluster>-mds- \
--nova-availabilityzone cern-geneva-a
Note: When deploying more than one mds, make sure they are spread across different availability zones, so that a problem affecting a single site does not take down all of them. As written in the upstream documentation, a ceph filesystem needs at least two metadata servers: the first is the main server that handles the clients' requests, and the second acts as the backup.
Because of resource limitations, the flavor of the machines could be m2.xlarge instead of m2.2xlarge. In the ceph/<mycluster> cluster we use two m2.2xlarge main servers and one m2.xlarge backup server.
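For example, the backup mds could be created with the smaller flavor and placed in a different zone. This is a sketch only: it reuses the ai-bs command from above, and the zone name cern-geneva-b is illustrative; pick whichever zones fit your cluster's layout:
$ ai-bs --landb-mainuser ceph-admins --landb-responsible ceph-admins \
--nova-flavor m2.xlarge --cc7 -g ceph/<mycluster>/mds --prefix ceph<mycluster>-mds- \
--nova-availabilityzone cern-geneva-b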
When the machine is available (reachable by the dns service), you can set its state to production with roger:
$ roger update --appstate production --all_alarms=true ceph<mycluster>-mds-XXXXXXXXXX
After 2-3 runs of puppet, the mds daemon should be configured and running, and the new server will appear in the output of ceph status.
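To verify that the daemon came up, you can force a Puppet run on the new node and then inspect the service and the cluster state. This is a sketch: it assumes the standard ceph-mds@<id> systemd unit with the id equal to the short hostname; adjust if your deployment names it differently:
$ puppet agent -t
$ systemctl status ceph-mds@$(hostname -s)
$ ceph status | grep mds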
Using additional metadata servers (luminous)
Upstream documentation here: http://docs.ceph.com/docs/master/cephfs/multimds/
When your cephfs system cannot handle the volume of client requests (you will notice warnings about the mds or about slow requests in ceph status), you may need to use multiple active metadata servers.
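Before enabling multiple active servers, it can help to look at the per-mds request load. A sketch: on luminous, ceph fs status (provided by the mgr status module) reports the rank, state, and requests per second of each mds; the filesystem name below is a placeholder:
$ ceph fs status <fs_name>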
After adding an mds to the cluster, the mds line of ceph status will show something like the following:
mds: cephfs-1/1/1 up {0=cephironic-mds-716dc88600=up:active}, 1 up:standby-replay, 1 up:standby
The 1 up:standby-replay is the backup server, and the 1 up:standby that appeared recently is the mds we just added. To make the standby server active, execute the following:
WARNING: Your cluster may have multiple filesystems, use the right one!
ceph fs set <fs_name> max_mds 2
The name of the ceph filesystem can be retrieved by running $ ceph fs ls and looking for the name: <fs_name> key-value pair.
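For reference, the output has this shape (the filesystem and pool names here are illustrative):
$ ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]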
Now your ceph status message should look like this:
...
mds: cephfs-2/2/2 up {0=cephironic-mds-716dc88600=up:active,1=cephironic-mds-c4fbd7ee74=up:active}, 1 up:standby-replay
...
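You can also confirm that the new value took effect by checking the filesystem map (a sketch):
$ ceph fs get <fs_name> | grep max_mds
max_mds	2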