Large omap object warning due to bucket index over limit

Large omap objects trigger HEALTH_WARN messages and are often caused by poorly sharded bucket indexes.

The following is an example report about an over-limit bucket on nethub, detected on 2021-05-21.

  1. Look for Large omap object found. in the ceph logs (/var/log/ceph/ceph.log):
2021-05-21 04:34:00.879483 osd.867 (osd.867) 240 : cluster [WRN] Large omap object found. Object: 7:7bae080b:::.dir.fe32212d-631b-44fe-8d35-03f5a3551af1.142704632.29:head PG: 7.d01075de (7.de) Key count: 610010 Size (bytes): 198156342
2021-05-21 04:34:11.622372 mon.cephnethub-data-c116fa59b2 (mon.0) 659324 : cluster [WRN] Health check failed: 1 large omap objects (LARGE_OMAP_OBJECTS)
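
The affected pool is also listed in the cluster health output, which points back to the cluster log:

ceph health detail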

These lines show that:

  • The pool suffering from the problem is pool number 7
  • The affected PG is 7.de
  • The offending object is a bucket index: the .dir. prefix denotes bucket indexes
  • The affected bucket has id fe32212d-631b-44fe-8d35-03f5a3551af1.142704632.29 (sadly, there is no way to map it to a name)
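
To double-check the key count reported in the warning, the omap keys of the object can be counted directly. A minimal sketch, assuming the index pool is default.rgw.buckets.index (verified below):

# Count the omap keys of the offending index object; should match "Key count" in the warning
rados -p default.rgw.buckets.index listomapkeys .dir.fe32212d-631b-44fe-8d35-03f5a3551af1.142704632.29 | wc -l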

To verify this is actually a bucket index, one can also check what pool #7 stores:

[14:21][root@cephnethub-data-98ab89f75a (production:ceph/nethub/mon*27:peon) ~]# ceph osd pool ls detail | grep "pool 7"
pool 7 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 256 pgp_num 256 last_change 30708 lfor 0/0/2063 flags hashpspool,nodelete,nopgchange,nosizechange stripe_width 0 application rgw
  2. Run radosgw-admin bucket limit check to see how bucket index sharding is doing. It might take a while, so it is recommended to dump the output to a file, as shown below.
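
For example (the output file name is arbitrary):

radosgw-admin bucket limit check > /root/bucket_limit_check.json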

  3. Check the output of radosgw-admin bucket limit check and look for buckets whose "fill_status" reports OVER:

{
    "bucket": "cboxbackproj-cboxbackproj-sftnight-lgdocs",
    "tenant": "",
    "num_objects": 767296,
    "num_shards": 0,
    "objects_per_shard": 767296,
    "fill_status": "OVER 749%"
},
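
With a dump file from step 2, the over-limit entries can be filtered out directly. A sketch with grep, assuming the JSON layout shown above:

# Print each OVER bucket plus the 5 preceding lines (which include the bucket name)
grep -B 5 '"fill_status": "OVER' /root/bucket_limit_check.json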
  4. Check in the radosgw logs (please use mco to look through all the RGWs) whether the radosgw process has recently tried to reshard the bucket but failed. Example:
ceph-client.rgw.cephnethub-data-0509dffff2.log-20210514.gz:2021-05-13 12:19:40.316 7fd2ce2a4700  1 check_bucket_shards bucket cboxbackproj-sftnight-lgdocs need resharding  old num shards 0 new num shards 18
ceph-client.rgw.cephnethub-data-0509dffff2.log-20210514.gz:2021-05-13 12:20:12.624 7fd2cd2a2700  0 NOTICE: resharding operation on bucket index detected, blocking
ceph-client.rgw.cephnethub-data-0509dffff2.log-20210514.gz:2021-05-13 12:20:12.625 7fd2cd2a2700  0 RGWReshardLock::lock failed to acquire lock on cboxbackproj-sftnight-lgdocs:fe32212d-631b-44fe-8d35-03f5a3551af1.142705079.19 ret=-16
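
A sketch for grepping such attempts on one RGW host (log directory and file naming assumed from the example above; zgrep also reads the rotated .gz files):

zgrep -h "need resharding" /var/log/ceph/ceph-client.rgw.*.log*
zgrep -h "RGWReshardLock::lock failed" /var/log/ceph/ceph-client.rgw.*.log*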

This only applies if dynamic resharding is enabled:

[14:27][root@cephnethub-data-0509dffff2 (qa:ceph/nethub/traefik*26) ~]# cat /etc/ceph/ceph.conf  | grep resharding
rgw dynamic resharding = true
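
The runtime value can also be queried from a running radosgw via its admin socket, in case ceph.conf is out of date. A sketch, assuming the default admin socket location and a single rgw daemon on the host:

ceph daemon /var/run/ceph/ceph-client.rgw.*.asok config get rgw_dynamic_resharding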
  5a. Reshard the bucket index manually by adding the bucket to the reshard queue:
radosgw-admin reshard add --bucket cboxbackproj-cboxbackproj-sftnight-lgdocs --num-shards 18
  • The number of shards can be inferred from the logs inspected at point 4. If dynamic resharding is disabled, a little math is required: check the bucket stats (radosgw-admin bucket stats --bucket <bucket_name>) and make sure usage --> rgw.main --> num_objects divided by the number of shards does not exceed 100000 (50000 is recommended); see the sketch after the example below.

Example:

[14:29][root@cephnethub-data-98ab89f75a (production:ceph/nethub/mon*27:peon) ~]# radosgw-admin bucket stats --bucket cboxbackproj-cboxbackproj-sftnight-lgdocs
{
    "bucket": "cboxbackproj-cboxbackproj-sftnight-lgdocs",
[...]
    "usage": {
        "rgw.main": {
            "size": 4985466767640,
            "size_actual": 4987395952640,
            "size_utilized": 4985466767640,
            "size_kb": 4868619891,
            "size_kb_actual": 4870503860,
            "size_kb_utilized": 4868619891,
            "num_objects": 941202
        }
    },
}

with 941202 / 18 = 52289 objects per shard, well below the 100000 hard limit.
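
The minimum shard count for a given objects-per-shard target can be computed with shell arithmetic. A sketch using the numbers above (18 shards satisfies the 100000 hard limit; meeting the recommended 50000 would need at least 19):

num_objects=941202   # usage -> rgw.main -> num_objects from bucket stats
target=50000         # recommended objects per shard (hard limit: 100000)
echo $(( (num_objects + target - 1) / target ))   # ceiling division, prints 19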

5b. Once the bucket has been added to the reshard queue, start the reshard process:

radosgw-admin reshard list
radosgw-admin reshard process
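
The progress of an individual reshard can then be followed with:

radosgw-admin reshard status --bucket cboxbackproj-cboxbackproj-sftnight-lgdocs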
  6. Check after some time that radosgw-admin bucket stats --bucket <bucket_name> reports the right number of shards and that radosgw-admin bucket limit check no longer shows OVER or WARNING for the resharded bucket, as shown below.
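
For example (assuming bucket stats reports num_shards on this Ceph version):

radosgw-admin bucket stats --bucket cboxbackproj-cboxbackproj-sftnight-lgdocs | grep num_shards
radosgw-admin bucket limit check | grep -A 5 '"bucket": "cboxbackproj-cboxbackproj-sftnight-lgdocs"'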

  7. To clear the HEALTH_WARN message for the large omap object, start a deep scrub on the affected PG:

[14:31][root@cephnethub-data-98ab89f75a (production:ceph/nethub/mon*27:peon) ~]# ceph pg deep-scrub 7.de
instructing pg 7.de on osd.867 to deep-scrub
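
Once the deep scrub completes, the warning should disappear. A sketch for verifying:

ceph pg 7.de query | grep deep_scrub_stamp   # timestamp should update after the scrub
ceph health detail                           # LARGE_OMAP_OBJECTS should be gone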