When setting up a cluster with ceph-deploy, just after the ceph-deploy osd activate
phase and the distribution of keys, the OSDs should be both “up” and “in” the cluster.
One thing that is not mentioned in the ceph-deploy quick-install documentation or in the OSD monitoring and troubleshooting pages (or at least I didn’t find it) is that, upon (re-)boot, mounting the storage volumes to the mount points that ceph-deploy prepares is left to the administrator (check this discussion on the Ceph mailing list).
So, after a reboot of my storage nodes, the Ceph cluster couldn’t reach a healthy state showing the following OSD tree:
$ ceph osd tree
# id    weight  type name           up/down reweight
-1      3.64    root default
-2      1.82        host ceph-osd0
0       0.91            osd.0       down    0
1       0.91            osd.1       down    0
-3      1.82        host ceph-osd1
2       0.91            osd.2       down    0
3       0.91            osd.3       up      1
I wasn’t thinking about mounting the drives, as this step had been hidden from me during the initial installation, but a simple mount
command would have immediately unveiled the mystery :D.
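For instance, a quick check along these lines (a minimal sketch; the grep pattern assumes the default ceph-deploy mount points under /var/lib/ceph/osd) shows at a glance which OSD data directories are actually mounted:

mount | grep /var/lib/ceph/osd   # list the OSD data directories that are currently mounted
lsblk                            # or show all block devices together with their mount points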
So, the simple solution was to mount the devices:
sudo mount /dev/sd<XY> /var/lib/ceph/osd/ceph-<K>/
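If you are not sure which device belongs to which OSD directory, something like the following can help (a sketch; ceph-disk is the tool ceph-deploy used to prepare the disks, and the whoami file inside each data directory records the OSD id):

sudo ceph-disk list                      # list partitions and the Ceph OSD they were prepared for
cat /var/lib/ceph/osd/ceph-<K>/whoami    # after mounting, confirm which OSD id the directory belongs to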
and then to start the OSD daemons:
sudo start ceph-osd id=<K>
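After that, the usual status commands should confirm that the OSDs are back up and in:

ceph osd tree   # the previously down OSDs should now be reported as up
ceph -s         # overall cluster health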
For some other Ceph troubleshooting hints, you may have a look at this page.
Hi,
How can I take an OSD down and bring it back on RHEL 7.2 with Ceph version 10.2.2?
sudo start ceph-osd id=1 fails with “sudo: start: command not found”.
I have 5 OSDs in each node and I want to take down one particular OSD (sudo stop ceph-osd id=1 also fails) and see whether replicas are written to the other OSDs without any issues.
Thanks in advance.
–kanchana.
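On RHEL 7 the service manager is systemd rather than Upstart, which is why the start/stop commands are not found. With Ceph 10.2.x the OSD daemons are managed through systemctl instead; a rough sketch, assuming the default ceph-osd@<id> unit names:

sudo systemctl stop ceph-osd@1    # stop one OSD daemon; it will be reported as down
ceph osd tree                     # check the cluster view while the OSD is down
sudo systemctl start ceph-osd@1   # bring the OSD back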
ceph-disk activate /var/lib/ceph/osd/ceph-{number}
or
ceph-disk activate-all
Is this to deactivate an OSD or…activate an OSD?