Task #5786

Updated by Nico Schottelius almost 4 years ago

h2. Introduction 

 This article describes the ungleich storage architecture that is based on ceph. It describes our architecture as well maintenance commands. 

 h2. Communication guide 

 Usually when disks fails no customer communication is necessary, as it is automatically compensated/rebalanced by ceph. However in case multiple disk failures happen at the same time, I/O speed might be reduced and thus customer experience impacted. 

 For this reason communicate whenever I/O recovery settings are temporarily tuned. 

 h2. Change ceph speed for i/o recovery 

 By default we want to keep I/O recovery traffic low to not impact customer experience. However when multiple disks fail at the same point, we might want to prioritise recover for data safety over performance. 

 The default configuration on our servers contains: 

 osd max backfills = 1 
 osd recovery max active = 1 
 osd recovery op priority = 2 

 The important settings are *osd max backfills* and *osd recovery max active*, the priority is always kept low so that regular I/O has priority. 

 To adjust the number of backfills *per osd* and to change the *number of threads* used for recovery, we can use on any node with the admin keyring: 

 ceph tell osd.* injectargs '--osd-max-backfills Y' 
 ceph tell osd.* injectargs '--osd-recovery-max-active X' 

 where Y and X are the values that we want to use. Experience shows that Y=5 and X=5 doubles to triples the recovery performance, whereas X=10 and Y=10 increases recovery performance 5 times. 

 h2. Debug scrub errors / inconsistent pg message 

 From time to time disks don't save what they are told to save. Ceph scrubbing detects these errors and switches to HEALTH_ERR. Use *ceph health detail* to find out which placement groups (*pgs*) are affected. Usually a *ceph pg repair <number> fixes the problem. 

 If this does not help, consult