Investigating a Failed Ceph Drive
---------------------------------

-After deployment, when a drive fails it may cause OSD crashes in Ceph.
-If Ceph detects crashed OSDs, it will go into ``HEALTH_WARN`` state.
+A failing drive in a Ceph cluster will cause the OSD daemon to crash.
+In this case Ceph will go into the ``HEALTH_WARN`` state.
Ceph can report details about failed OSDs by running:

-.. ifconfig:: deployment['cephadm']
+.. code-block:: console

-   .. note::
+   ceph# ceph health detail

-      Remember to run ceph/rbd commands after issuing ``cephadm shell`` or
-      installing ceph clients.
-      It is also important to run the commands on the hosts with _admin label
-      (Ceph monitors by default).
+.. ifconfig:: deployment['cephadm']

-   .. code-block:: console
+   .. note::

-      ceph# ceph health detail
+      Remember to run ceph/rbd commands from within ``cephadm shell`` (the
+      preferred method) or after installing the Ceph client. Details are in the
+      official `documentation <https://docs.ceph.com/en/quincy/cephadm/install/#enable-ceph-cli>`__.
+      The host where the commands are executed must also have an admin Ceph
+      keyring present - the easiest way to achieve this is to apply the
+      `_admin <https://docs.ceph.com/en/quincy/cephadm/host-management/#special-host-labels>`__
+      label (Ceph MON servers have it by default when using the
+      `StackHPC Cephadm collection <https://github.com/stackhpc/ansible-collection-cephadm>`__).

A failed OSD will also be reported as down by running:

@@ -26,7 +30,7 @@ A failed OSD will also be reported as down by running:

Note the ID of the failed OSD.

-The failed hardware device is logged by the Linux kernel:
+The failed disk is usually logged by the Linux kernel too:

.. code-block:: console

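A minimal, illustrative sketch of the prerequisites described in the note
above: applying the ``_admin`` label (run from a host that already holds an
admin keyring, for example a Ceph monitor) and then entering ``cephadm shell``
on the newly labelled host before running ceph/rbd commands. ``<hostname>`` is
a placeholder for the target host:

.. code-block:: console

   ceph# ceph orch host label add <hostname> _admin
   ceph# cephadm shell
   ceph# ceph health detail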