Task #9487
Updated by Nico Schottelius over 3 years ago
h2. Situation
* At ungleich we are running multiple ceph clusters, usually one or more per location. Currently our ceph clusters run on Devuan (without systemd).
* Alpine Linux, which we use in combination with Devuan, does not have proper ceph packages.
* Ceph is moving away from packages in favor of containers managed by @cephadm@, which requires systemd on the host OS.
* We are not inclined to switch to systemd, due to the long-term maintenance issues we have experienced with it.
* The alternative to @cephadm@ is using ceph with https://rook.io
* We have successfully tested Alpine Linux + crio or docker + rook on test clusters
h2. Objectives
* Gain a proper operational understanding of rook (vs. "plain ceph") by running multiple test clusters and upgrading, failing, destroying and rebuilding them (handled in internal ticket #4608)
* Develop a strategy to migrate our native Ceph clusters into rook without any downtime
h2. High level Strategy per cluster
* Create a k8s cluster
* Adjust the rook settings to match the current cluster (fsid, monitors, networking, etc.)
* Phase in services in rook in a controlled manner
* Retire / replace the natively running services
* Repeat for each cluster
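
As a starting point for the "match the current cluster" step, the identity of the native cluster can be read from its existing @ceph.conf@. The following is a minimal Python sketch, not part of the agreed plan: the function name @read_cluster_identity@ and the example values are hypothetical, and it assumes that @mon host@ is a plain comma-separated address list (not the bracketed @v2:@/@v1:@ form). How these values are then passed to rook is not covered here.

```python
import configparser
import re


def read_cluster_identity(ceph_conf_text):
    """Return the fsid and monitor addresses found in the text of a ceph.conf.

    Hypothetical helper for illustration only.
    """
    # ceph.conf is INI-like; disable interpolation so '%' in values is harmless.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ceph_conf_text)

    # Ceph accepts both "mon host" and "mon_host", so normalise spaces in
    # option names to underscores before looking them up.
    global_opts = {
        key.replace(" ", "_"): value
        for key, value in parser.items("global")
    }

    # Assumption: a simple list separated by commas and/or whitespace.
    raw_monitors = global_opts.get("mon_host", "")
    monitors = [addr for addr in re.split(r"[,\s]+", raw_monitors) if addr]

    return {"fsid": global_opts["fsid"], "monitors": monitors}


# Placeholder values, not taken from any real cluster.
example_conf = """
[global]
fsid = 00000000-0000-0000-0000-000000000000
mon host = 192.0.2.1, 192.0.2.2, 192.0.2.3
"""

identity = read_cluster_identity(example_conf)
print(identity)
```

Recording these values per cluster before creating the k8s cluster gives a reference to compare the rook configuration against during the phase-in.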
h2. Operational steps / detailed plan
(TBD)