ANF CEPH 2022, 03 to 07/10/2022, Sébastien Geiger

# Replacing an OSD with cephadm
Identify the disk and the server on which osd.5 is running in your environment. In this example, the disk is /dev/vdc on ceph3.

# identifying the disk
[ceph: root@ceph1 /]# ceph device ls |grep osd.5
7811529f-a85f-44c0-8  ceph3:vdc  osd.5
# remark: /dev/vdc on ceph3

# removal with draining
[ceph: root@ceph1 /]# ceph orch osd rm 5
Scheduled OSD(s) for removal
[ceph: root@ceph1 /]# ceph orch osd rm status
OSD  HOST   STATE     PGS  REPLACE  FORCE  ZAP    DRAIN STARTED AT
5    ceph3  draining  27   False    False  False  2022-08-25 16:36:26.900875
[ceph: root@ceph1 /]# ceph health
HEALTH_WARN Reduced data availability: 6 pgs inactive; Degraded data redundancy: 601/262269 objects degraded (0.229%), 74 pgs degraded, 36 pgs undersized
# remark: the OSD is marked with a WEIGHT of 0 to force the data to be rebalanced onto the remaining disks

[ceph: root@ceph2 /]# ceph orch osd rm status
OSD  HOST   STATE                    PGS  REPLACE  FORCE  ZAP    DRAIN STARTED AT
5    ceph3  done, waiting for purge  0    False    False  False  2022-08-25 17:36:04.821739
[ceph: root@ceph2 /]# ceph orch daemon stop osd.5
Scheduled to stop osd.5 on host 'ceph3'
[ceph: root@ceph2 /]# ceph osd tree   # no more trace of osd.5
[ceph: root@ceph2 /]# ceph orch device ls ceph3   # the LVM volume is still there
[ceph: root@ceph2 /]# ceph orch device zap ceph3 /dev/vdc --force
zap successful for /dev/vdc on ceph3
[ceph: root@ceph2 /]# ceph orch device ls ceph3   # after a while a new LVM volume appears
[ceph: root@ceph2 /]# ceph osd tree   # osd.5 is back, created automatically by ceph orch
[ceph: root@ceph2 /]# ceph health
HEALTH_WARN Reduced data availability: 6 pgs inactive; Degraded data redundancy: 608/260175 objects degraded (0.234%), 67 pgs degraded, 34 pgs undersized

# remark: Removing an OSD using ceph orch requires some additional cleanup
# https://www.suse.com/support/kb/doc/?id=000020642
# cephadm ceph-volume lvm zap --destroy --osd-id
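# remark: when the faulty disk is going to be swapped for a new one in the same slot, the same
# procedure can also be driven with the --replace flag of "ceph orch osd rm", which keeps the
# OSD id reserved for the new device. The lines below are a minimal sketch, not taken from the
# lab output above: the host ceph3, the device /dev/vdc and the id 5 are the ones used above,
# and the polling loop is just one way to wait for the drain to finish.
ceph orch osd rm 5 --replace                     # drain osd.5 and mark it "destroyed", keeping id 5
while ceph orch osd rm status | awk '$1 == "5"' | grep -q . ; do
    sleep 30                                     # wait until osd.5 leaves the removal queue
done
ceph orch device zap ceph3 /dev/vdc --force      # wipe the old LVM volume on the replacement disk
ceph osd tree                                    # with the OSD spec used in the lab, the orchestrator
                                                 # recreates osd.5 on the wiped device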
# using rados bench
# from cephclt
[root@cephclt ~]# rados --id prbd -p prbd bench 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_cephclt.novalocal_43843
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0       0         0         0         0         0            -           0
    1      16        25         9   36.0086        36     0.281931    0.460157
    2      16        43        27   53.9997        72       1.9051    0.862414
    3      16        59        43   57.3295        64      1.24115    0.948302
    4      16        72        56   55.9949        52      1.53419     0.96618
    5      16        87        71    56.794        60      1.70416      0.9525
    6      16       110        94   62.6592        92      1.02726    0.940576
    7      16       119       103   58.8495        36     0.111639    0.926973
    8      16       134       118   58.9914        60     0.172822    0.922839
    9      16       151       135   59.9903        68      1.92169    0.961151
   10      16       171       155   61.9896        80     0.304496    0.976974
Total time run:         10.2998
Total writes made:      171
Write size:             4194304
Object size:            4194304
Bandwidth (MB/sec):     66.4091
Stddev Bandwidth:       17.7138
Max bandwidth (MB/sec): 92
Min bandwidth (MB/sec): 36
Average IOPS:           16
Stddev IOPS:            4.42844
Max IOPS:               23
Min IOPS:               9
Average Latency(s):     0.954805
Stddev Latency(s):      0.670326
Max latency(s):         3.46445
Min latency(s):         0.0498461

[root@cephclt ~]# rados --id prbd -p prbd bench 10 seq
hints = 1
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0      16        16         0         0         0            -           0
    1      16        63        47   187.766       188    0.0854681    0.220285
    2      15       107        92   183.822       180     0.501015    0.288438
    3      15       171       156   207.857       256    0.0745509    0.286995
Total time run:       3.84347
Total reads made:     171
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   177.964
Average IOPS:         44
Stddev IOPS:          10.4403
Max IOPS:             64
Min IOPS:             45
Average Latency(s):   0.311121
Max latency(s):       1.16175
Min latency(s):       0.0167831

[root@cephclt ~]# rados --id prbd -p prbd bench 10 rand
hints = 1
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0      16        16         0         0         0            -           0
    1      16       113        97   387.588       388    0.0249663    0.134336
    2      16       218       202    403.76       420    0.0143258     0.14661
    3      16       350       334   445.136       528     0.291193      0.1369
    4      16       497       481   480.817       588     0.020007    0.128698
    5      16       696       680   543.815       796      0.10017     0.11516
    6      16       843       827   551.051       588    0.0534796    0.114604
    7      15       988       973   555.633       584     0.125233    0.113314
    8      16      1214      1198   598.576       900    0.0771962    0.105439
    9      16      1412      1396   619.859       792    0.0423539    0.101713
   10      14      1640      1626   649.837       920    0.0113432   0.0965771
Total time run:       10.1802
Total reads made:     1640
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   644.386
Average IOPS:         161
Stddev IOPS:          47.5329
Max IOPS:             230
Min IOPS:             97
Average Latency(s):   0.0974517
Max latency(s):       0.74566
Min latency(s):       0.00453869
[root@cephclt ~]# rados --id prbd -p prbd cleanup
Removed 171 objects
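# remark: the three runs above can be scripted to compare several client concurrency levels.
# The lines below are a minimal sketch, not part of the lab output: the client id prbd and the
# pool prbd are the ones used above, -t is the standard rados bench option for the number of
# concurrent operations, and the 30 s duration and log-file names are arbitrary choices.
for t in 4 16 32; do
    rados --id prbd -p prbd bench 30 write -t "$t" --no-cleanup > "bench_write_t${t}.log"
    rados --id prbd -p prbd bench 30 seq   -t "$t"              > "bench_seq_t${t}.log"
    rados --id prbd -p prbd bench 30 rand  -t "$t"              > "bench_rand_t${t}.log"
    rados --id prbd -p prbd cleanup     # remove the benchmark objects before the next round
done
# Longer runs (30 s here instead of 10 s) tend to give more stable averages than a single short run.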