Create a howto: how we maintain our disks and ceph cluster(s)
- Table of contents
- Disk handling
- How to move an OSD (Disk, SSD) to another server
- Detailed explanation
- Including the megacli commands
- Including motivation
Our servers use PERC H700 or H800 disk controllers, which can be managed with the megacli tool.
Handling foreign configuration
When plugging in a disk that was previously configured or used in a different server, we need to clear its previous configuration.
List all known foreign configurations using megacli -CfgForeign -Scan -aALL.
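Remove/clear the foreign configuration using

    megacli -CfgForeign -Clear -a<Controller>

If the controller also holds preserved cache for the old virtual disk, discard it as well:

    # for instance -L5 -a0 or -L2 -a1
    megacli -DiscardPreservedCache -L<slot-or-ID> -a<Controller>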
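A minimal sketch of the scan-then-clear flow as shell functions. It assumes that megacli's `-CfgForeign -Clear` operation is what drops the foreign configuration on a controller. The names `run`, `scan_foreign`, `clear_foreign`, `MEGACLI` and `DRY_RUN` are hypothetical helpers introduced for this example and are not part of megacli.

```shell
#!/bin/sh
# Sketch only: wraps the foreign-configuration commands in small functions.
# MEGACLI and DRY_RUN are illustrative variables, not megacli features.
MEGACLI="${MEGACLI:-megacli}"

run() {
    # With DRY_RUN=1 only print the command; otherwise execute it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

scan_foreign() {
    # List foreign configurations on all controllers.
    run "$MEGACLI" -CfgForeign -Scan -aALL
}

clear_foreign() {
    # $1: controller number, for instance 0
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: clear_foreign <controller-number>" >&2
            return 1
            ;;
    esac
    # Assumption: -CfgForeign -Clear removes the foreign configuration.
    run "$MEGACLI" -CfgForeign -Clear "-a$1"
}
```

Running with `DRY_RUN=1` first, for example `DRY_RUN=1 clear_foreign 0`, prints the command without touching the controller, which is a cheap way to double-check the controller number.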
Removal of a disk
If a disk (dead or intact) is removed from the system, the system will HANG on the next reboot, complaining that the disk is missing. For that reason, AFTER removing a disk we need to "clear the cache", that is, discard the cache the controller has preserved for it:
    # Get list of caches that need to be cleared
    megacli -GetPreservedCacheList -aALL

    # Clear cache: -L0 -a1
    megacli -DiscardPreservedCache -L<??> -a<controller>
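The two commands can be wrapped so that the `-L` and `-a` values are always passed together. This is a sketch under assumptions: `run`, `list_preserved_cache`, `discard_preserved_cache`, `MEGACLI` and `DRY_RUN` are hypothetical names for this example, and the values to pass must be read from the output of `-GetPreservedCacheList`.

```shell
#!/bin/sh
# Sketch only: helpers around the preserved-cache commands.
# MEGACLI and DRY_RUN are illustrative variables, not megacli features.
MEGACLI="${MEGACLI:-megacli}"

run() {
    # With DRY_RUN=1 only print the command; otherwise execute it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

list_preserved_cache() {
    # Show which caches need to be cleared, on all controllers.
    run "$MEGACLI" -GetPreservedCacheList -aALL
}

discard_preserved_cache() {
    # $1: value for -L, $2: controller number for -a
    if [ "$#" -ne 2 ]; then
        echo "usage: discard_preserved_cache <L-value> <controller>" >&2
        return 1
    fi
    run "$MEGACLI" -DiscardPreservedCache "-L$1" "-a$2"
}
```

For the example values from the comment (`-L0 -a1`), the call would be `discard_preserved_cache 0 1`.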
How to move an OSD (Disk, SSD) to another server
- Use ceph-osd-stop-disable from ungleich tools to stop / disable the OSD
- Clear the RAID controller information on the old host (see "Removal of a disk")
- Insert disk into new host
- Clear the foreign configuration (megacli)
- [add it to the system with megacli]
- OSD should automatically come up afterwards
- Check using ceph osd tree
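The final check can also be scripted by filtering the output of `ceph osd tree` for the moved OSD. This is a minimal sketch, assuming the status (`up` or `down`) is printed in the column directly after the `osd.<id>` name. `osd_status` is a hypothetical helper name introduced for this example.

```shell
#!/bin/sh
# Sketch only: print the status column of one OSD from `ceph osd tree` output.
# Reads the tree on stdin, so it can be tried on saved output as well.
osd_status() {
    # $1: numeric OSD id, for instance 12
    awk -v name="osd.$1" '
        {
            # Find the field holding the OSD name; the next field is the status.
            for (i = 1; i < NF; i++) {
                if ($i == name) {
                    print $(i + 1)
                    exit
                }
            }
        }'
}
```

Usage: `ceph osd tree | osd_status 12` prints the status of osd.12, or nothing if that OSD does not appear in the tree.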