Configuring and tuning Ceph

This how-to provides configuration snippets for configuring and tuning the Ceph cluster.

See the Rook.io CephCluster documentation for the Ceph configuration options exposed by the Rook-Ceph operator, and the upstream Ceph documentation for all other Ceph configuration options.

Configure Ceph’s backing storage

To configure the component for an infrastructure which provides backing storage through the storage class localblock-storage, provide the following config.

parameters:
  rook_ceph:
    ceph_cluster:
      block_storage_class: localblock-storage
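
Before applying the config, you can verify that the storage class actually exists on the cluster. This is just a sanity check; localblock-storage is the example storage class name used above.

kubectl --as=cluster-admin get storageclass localblock-storage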

Tune backing storage for SSDs

If the backing storage provided for Ceph is itself backed by SSDs (or better), you can tune Ceph for "fast" devices with the following config.

parameters:
  rook_ceph:
    ceph_cluster:
      tune_fast_device_class: true
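
If you want to double-check which device classes Ceph detected for its OSDs, you can list the CRUSH device classes through the toolbox pod. This is a quick sketch using the same toolbox invocation as the commands later in this how-to.

kubectl --as=cluster-admin exec -it deploy/rook-ceph-tools -- ceph osd crush class ls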

Configure target size of RBD block pool

To tell the Ceph cluster that the default RBD block pool, named storagepool, is expected to take up 80% of the cluster's capacity, provide the following config.

parameters:
  rook_ceph:
    ceph_cluster:
      storage_pools:
        rbd:
          storagepool:
            config:
              parameters:
                target_size_ratio: "0.8"
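
To verify that the pool picked up the ratio, you can inspect the autoscaler status from the toolbox pod, for example as follows. The TARGET RATIO column should show the configured value of 0.8 for the pool.

kubectl --as=cluster-admin exec -it deploy/rook-ceph-tools -- ceph osd pool autoscale-status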

Configure Ceph options which aren’t exposed by the Rook-Ceph operator

To configure a Ceph option which isn’t exposed by the operator, you can provide raw Ceph configuration entries in parameter ceph_cluster.config_override.

For example, to change the OSD operations queue (op_queue) scheduler to wpq for a Ceph v17 cluster, you can provide the following config.

Ceph v17 switched to mclock_scheduler as the default. We've benchmarked both wpq and mclock_scheduler on a Ceph v17 cluster and didn't see significant differences on an idle cluster.

mclock_scheduler should provide improved client workload performance during recovery events if the cluster is heavily utilized. Additionally, recovery events should be able to consume more capacity when there’s little client load on the cluster.

See the Ceph documentation for more details.

parameters:
  rook_ceph:
    ceph_cluster:
      config_override:
        osd:
          osd_op_queue: wpq
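
To confirm that a running OSD picked up the override, you can query its effective configuration from the toolbox pod. osd.0 is just an example OSD ID here; note that a change to osd_op_queue only takes effect after the OSDs have been restarted.

kubectl --as=cluster-admin exec -it deploy/rook-ceph-tools -- ceph config show osd.0 osd_op_queue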

As discussed in the parameter's documentation, the contents of ceph_cluster.config_override are rendered into INI-style format by the component.

Each key in the parameter is used as the name of a section in the resulting INI-style configuration file.
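
For illustration, the config_override snippet above would be rendered into roughly the following INI-style snippet (exact whitespace and ordering may differ).

[osd]
osd_op_queue = wpq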

Configure Ceph full ratios

Ceph has configuration options to control at which point the cluster starts warning about potential service interruptions due to running out of disk space.

The component sets the config options mon_osd_full_ratio=0.85, mon_osd_backfillfull_ratio=0.8 and mon_osd_nearfull_ratio=0.75 in the Rook config-override configmap. With this configuration, the Ceph cluster will go into read-only mode when it’s utilized to 85% of its capacity.

You can tune these options before creating the Ceph cluster by providing the following config.

parameters:
  rook_ceph:
    ceph_cluster:
      config_override:
        global:
          mon_osd_full_ratio: '0.9' (1)
          mon_osd_nearfull_ratio: '0.8' (2)
          mon_osd_backfillfull_ratio: '0.85' (3)
1 Configure the cluster to become read-only (OSD status full) at 90% utilization. We give the ratios as strings to avoid the resulting configuration containing values like 0.90000000000000002 due to floating point rounding issues.
2 Configure the threshold at which the OSD status becomes nearfull.
3 Configure the threshold at which Ceph won’t backfill more data to the OSD.

At runtime, you can adjust the cluster's ratios with the following commands.

kubectl --as=cluster-admin exec -it deploy/rook-ceph-tools -- ceph osd set-full-ratio 0.9 (1)
kubectl --as=cluster-admin exec -it deploy/rook-ceph-tools -- ceph osd set-nearfull-ratio 0.8 (2)
kubectl --as=cluster-admin exec -it deploy/rook-ceph-tools -- ceph osd set-backfillfull-ratio 0.85 (3)
1 Configure the cluster to become read-only (OSD status full) at 90% utilization.
2 Configure the threshold at which the OSD status becomes nearfull.
3 Configure the threshold at which Ceph won’t backfill more data to the OSD.
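
To check which ratios are currently active on a running cluster, you can dump the OSD map from the toolbox pod and filter for the ratio fields, for example as shown below. The output contains the full_ratio, backfillfull_ratio and nearfull_ratio values.

kubectl --as=cluster-admin exec -it deploy/rook-ceph-tools -- ceph osd dump | grep ratio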