After a disk failure, Ceph slowed down dramatically, so I want to run some performance measurements to figure out where the issue lies.
It is worth reading up on Ceph performance first and thinking about what the problem might be and how best to measure it.
Benchmark Test
Let's run a benchmark with the rados command.
First, list the Ceph pools as shown below,
[root@deploy ~]# ceph osd pool ls
kube-hdd
kube-ssd
images
volumes-ssd
volumes-hdd
backups
vms
cinder.volumes
kubernetes
then pick one pool and run the benchmark against it.
(Creating a new pool just for the test is also fine.)
[root@deploy ~]# rados bench -p backups 60 write -b 4M -t 16 --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 60 seconds or 0 objects
Object prefix: benchmark_data_ceph001_23644
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 21 5 20.0286 20 0.239388 0.110094
2 16 22 6 12.0081 4 1.73452 0.380832
3 16 30 14 18.6747 32 2.91985 1.13503
4 16 45 29 29.0086 60 1.17095 1.35023
5 16 50 34 27.206 20 0.0152663 1.54117
6 16 55 39 26.0045 20 2.95522 1.77672
7 16 60 44 25.1464 20 0.0141587 1.83639
8 16 62 46 23.0027 8 4.21219 1.92558
9 16 67 51 22.6689 20 2.97088 2.08094
10 16 71 55 22.0018 16 0.0137967 1.93996
11 16 81 65 23.638 40 3.68136 1.87953
12 16 90 74 24.6681 36 0.812743 1.73885
13 16 99 83 25.5397 36 4.9767 1.63823
14 16 101 85 24.2867 8 3.51051 1.65365
15 16 111 95 25.3341 40 0.0212239 1.88131
Because the command above was run with --no-cleanup, the benchmark data is left behind, as the usage below shows.
[root@deploy ~]# rados df
POOL_NAME USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD WR_OPS WR USED COMPR UNDER COMPR
backups 1.4 GiB 116 0 348 0 0 0 0 0 B 116 464 MiB 0 B 0 B
cinder.volumes 0 B 0 0 0 0 0 0 0 0 B 0 0 B 0 B 0 B
images 11 GiB 470 0 1410 0 0 0 0 0 B 0 0 B 0 B 0 B
kube-hdd 208 GiB 15937 0 47811 0 0 0 151264489 2.3 TiB 858395176 10 TiB 0 B 0 B
kube-ssd 0 B 0 0 0 0 0 0 0 0 B 0 0 B 0 B 0 B
kubernetes 304 GiB 27085 0 81255 0 0 0 1949809 86 GiB 50065864 1.1 TiB 0 B 0 B
vms 4.1 MiB 18 0 54 0 0 0 25742 421 MiB 6943 1.8 GiB 0 B 0 B
volumes-hdd 41 GiB 5888 0 17664 0 0 0 65238 54 MiB 46200 2.4 GiB 0 B 0 B
volumes-ssd 192 KiB 3 0 9 0 0 0 1902532 7.1 GiB 186863 23 GiB 0 B 0 B
total_objects 49517
total_used 543 GiB
total_avail 41 TiB
total_space 42 TiB
Listing the pool confirms that objects prefixed with benchmark_data have been created.
[root@deploy ~]# rados -p lma ls
benchmark_data_ceph001_23719_object34
benchmark_data_ceph001_23719_object16
benchmark_data_ceph001_23719_object11
benchmark_data_ceph001_23719_object21
benchmark_data_ceph001_23719_object14
benchmark_data_ceph001_23719_object4
benchmark_data_ceph001_23719_object9
benchmark_data_ceph001_23719_object31
benchmark_data_ceph001_23719_object5
benchmark_data_ceph001_23719_object12
benchmark_data_ceph001_23719_object32
benchmark_data_ceph001_23719_object20
benchmark_data_ceph001_23719_object15
benchmark_data_ceph001_23719_object8
benchmark_data_ceph001_23719_object3
These objects are no longer needed, so delete them as follows.
[root@deploy ~]# rados -p lma cleanup
Warning: using slow linear search
Removed 39 objects
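The "slow linear search" warning appears because a bare cleanup has to scan every object in the pool. If you still have the object prefix printed by the bench run, passing it narrows the search; a minimal sketch, assuming the prefix from the run above:

# Remove only the objects belonging to a specific bench run (prefix copied from the bench output).
rados -p lma cleanup --prefix benchmark_data_ceph001_23719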
The benchmark can also be run with the command below.
The 60 in the options is the runtime, so after waiting roughly 60 seconds the run completes and the cleanup is performed automatically.
[root@deploy ~]# rados bench -p images 60 write -b 4M -t 16
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 60 seconds or 0 objects
Object prefix: benchmark_data_ceph001_23891
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 20 4 16.0218 16 0.261381 0.102508
2 16 20 4 8.0052 0 - 0.102508
3 16 22 6 8.00329 4 2.646 0.851604
4 16 28 12 12.0035 24 0.0149619 1.64593
5 16 29 13 10.4023 4 2.29119 1.69556
6 16 36 20 13.3356 28 3.88388 2.24519
7 16 39 23 13.1447 12 0.754656 2.36526
8 16 40 24 12.0014 4 1.92938 2.3471
9 16 42 26 11.5566 8 8.7908 2.84277
10 16 43 27 10.8008 4 4.11371 2.88984
11 16 48 32 11.6371 20 4.88391 3.098
12 16 57 41 13.6674 36 0.0135469 2.54526
13 16 59 43 13.2313 8 12.8694 2.85755
14 16 61 45 12.8576 8 2.76138 2.88132
15 16 65 49 13.067 16 3.45294 3.5127
16 16 67 51 12.7503 8 5.02457 3.77546
17 16 70 54 12.7061 12 9.02825 3.96778
18 16 74 58 12.889 16 6.06797 4.07444
19 16 75 59 12.4212 4 2.81802 4.05314
2020-10-13 15:31:22.669988 min lat: 0.0134455 max lat: 15.4017 avg lat: 4.05314
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
20 16 75 59 11.8 0 - 4.05314
21 16 87 71 13.5238 24 0.480974 3.81639
22 16 88 72 13.0909 4 1.46078 3.78367
23 16 91 75 13.0434 12 5.83569 4.04294
24 16 93 77 12.8332 8 0.892728 3.99609
25 16 96 80 12.7999 12 1.06257 4.15543
26 16 98 82 12.6152 8 0.67585 4.10257
27 16 104 88 13.0369 24 0.378399 3.92878
28 16 104 88 12.5712 0 - 3.92878
29 16 107 91 12.5515 6 2.84133 3.97163
30 16 111 95 12.6664 16 3.17491 3.88385
31 16 112 96 12.3868 4 1.71044 3.86121
32 16 112 96 11.9997 0 - 3.86121
33 16 115 99 11.9997 6 3.02306 4.05637
34 16 117 101 11.8821 8 1.01826 3.98924
35 16 119 103 11.7711 8 4.29856 4.14715
36 16 119 103 11.4441 0 - 4.14715
37 16 121 105 11.351 4 2.6323 4.12513
38 16 121 105 11.0523 0 - 4.12513
39 16 122 106 10.8715 2 5.32104 4.13641
2020-10-13 15:31:42.671241 min lat: 0.0134455 max lat: 19.9444 avg lat: 4.23222
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
40 16 125 109 10.8997 12 4.17409 4.23222
41 16 125 109 10.6338 0 - 4.23222
42 16 126 110 10.4759 2 5.24569 4.24143
43 16 132 116 10.7904 24 0.0317681 4.74179
44 16 132 116 10.5451 0 - 4.74179
45 16 138 122 10.8441 12 2.3702 4.87268
46 16 142 126 10.9562 16 15.8002 5.05366
47 16 148 132 11.2337 24 4.53609 5.24322
48 16 151 135 11.2496 12 5.5353 5.2506
49 16 156 140 11.4282 20 2.70998 5.16667
50 16 162 146 11.6796 24 0.0166532 5.01238
51 16 168 152 11.9211 24 0.513471 4.8944
52 16 174 158 12.1534 24 7.09976 5.04459
53 16 181 165 12.4524 28 3.00569 4.94404
54 16 181 165 12.2218 0 - 4.94404
55 16 190 174 12.6541 18 0.0145103 4.77253
56 16 191 175 12.4995 4 2.35737 4.75873
57 16 196 180 12.6311 20 1.46972 4.71671
58 16 202 186 12.8271 24 1.16287 4.66769
59 16 204 188 12.7453 8 1.99294 4.63563
2020-10-13 15:32:02.672514 min lat: 0.0134455 max lat: 30.7736 avg lat: 4.62292
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
60 16 205 189 12.5995 4 2.23494 4.62292
61 16 205 189 12.3929 0 - 4.62292
62 16 206 190 12.2576 2 16.2268 4.684
63 16 206 190 12.063 0 - 4.684
64 16 206 190 11.8745 0 - 4.684
65 16 206 190 11.6918 0 - 4.684
66 16 206 190 11.5147 0 - 4.684
67 16 206 190 11.3428 0 - 4.684
68 16 206 190 11.176 0 - 4.684
69 16 206 190 11.014 0 - 4.684
70 16 206 190 10.8567 0 - 4.684
71 16 206 190 10.7038 0 - 4.684
72 8 206 198 10.9995 3.2 12.963 5.12861
73 8 206 198 10.8488 0 - 5.12861
74 8 206 198 10.7022 0 - 5.12861
Total time run: 74.7826
Total writes made: 206
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 11.0186
Stddev Bandwidth: 9.35059
Max bandwidth (MB/sec): 36
Min bandwidth (MB/sec): 0
Average IOPS: 2
Stddev IOPS: 2.36447
Max IOPS: 9
Min IOPS: 0
Average Latency(s): 5.67049
Stddev Latency(s): 6.46971
Max latency(s): 30.7736
Min latency(s): 0.0134455
Cleaning up (deleting benchmark objects)
Removed 206 objects
Clean up completed and total clean up time :7.81279
The results above are for writes; read performance can be measured the same way (see the sketch below).
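rados bench can also drive sequential and random reads, but only against objects left behind by a previous write run with --no-cleanup. A minimal sketch, reusing the backups pool from earlier:

# Write objects first and keep them so there is something to read back.
rados bench -p backups 60 write -b 4M -t 16 --no-cleanup
# Sequential read of the objects written above.
rados bench -p backups 60 seq -t 16
# Random read of the same objects.
rados bench -p backups 60 rand -t 16
# Remove the benchmark objects when finished.
rados -p backups cleanup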
In addition, you can check bench results for each individual OSD.
[root@deploy ~]# ceph tell osd.3 bench
{
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 2.2708148029999999,
"bytes_per_sec": 472844294.73573411,
"iops": 112.73486488717415
}
Read simply, the result above shows roughly 472 MB/sec of throughput and about 112 IOPS.
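After a disk failure it can be useful to run the same bench against every OSD and compare the numbers to spot the slow one; a sketch, assuming all OSDs are up and reachable:

# Run the built-in bench on every OSD and print its result.
for id in $(ceph osd ls); do
    echo "osd.${id}:"
    ceph tell osd.${id} bench
done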
OpenStack
Whenever possible, I recommend producing benchmark results with the fio command from inside a VM.
Since a problem could exist anywhere along the path, it is best to attach and mount a volume in an actual VM and test there.
Two approaches are suggested below.
fio
For the fio-based method, refer to the link below.
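The exact fio parameters depend on the workload you want to model; as a rough sketch, a 4K random-write test against a file on the volume mounted inside the VM might look like the following (the mount path, file size, and queue depth are assumptions):

# 4K random writes, direct I/O, 60-second run against a 1 GiB test file on the mounted volume.
fio --name=randwrite-test \
    --filename=/mnt/volume/fio-testfile \
    --ioengine=libaio --direct=1 \
    --rw=randwrite --bs=4k --iodepth=32 --numjobs=1 \
    --size=1G --runtime=60 --time_based \
    --group_reporting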
rbd bench --io-type write
To minimize the impact on existing services, create a new pool dedicated to the block-device benchmark.
[root@deploy001 test]# ceph osd pool create rbdbench 32 32
pool 'rbdbench' created
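Depending on the Ceph release, a freshly created pool may also need to be tagged for RBD use before images are created in it; this step is an assumption about your version, not something taken from the run above:

# Tag the new pool so Ceph knows it will hold RBD images (Luminous and later).
ceph osd pool application enable rbdbench rbd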
Create a block image (image-test) and map it.
(At the time, the kernel required the features below to be disabled, so disable them as instructed.)
[root@deploy test]# rbd create image-test --size 1024 --pool rbdbench
[root@deploy test]# rbd map image-test --pool rbdbench --name client.admin
rbd: sysfs write failed
RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable rbdbench/image-test object-map fast-diff deep-flatten".
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (6) No such device or address
[root@deploy test]# rbd feature disable rbdbench/image-test object-map fast-diff deep-flatten
[root@deploy test]# rbd map image-test --pool rbdbench --name client.admin
/dev/rbd0
[root@deploy001 test]# ls -al /dev/ | grep rbd
drwxr-xr-x. 3 root root 60 10월 13 16:13 rbd
brw-rw----. 1 root disk 252, 0 10월 13 16:13 rbd0
Once mapping succeeds, the image appears as /dev/rbd0.
Now create a filesystem on the block device (ext4 here; change it as needed).
[root@deploy test]# /sbin/mkfs.ext4 -m0 /dev/rbd/rbdbench/image-test
mke2fs 1.42.9 (28-Dec-2013)
Discarding device blocks: done
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=1024 blocks, Stripe width=1024 blocks
65536 inodes, 262144 blocks
0 blocks (0.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376
Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
[root@deploy001 test]# mkdir /mnt/ceph-block-device
[root@deploy001 test]# mount /dev/rbd/rbdbench/image-test /mnt/ceph-block-device/
With the mount point created and the device mounted as shown above, let's measure performance with the rbd bench command.
[root@deploy001 test]# rbd bench --io-type write image-test --pool=rbdbench
bench type write io_size 4096 io_threads 16 bytes 1073741824 pattern sequential
SEC OPS OPS/SEC BYTES/SEC
1 8496 6864.53 28117106.97
2 9952 4827.13 19771912.36
3 13136 4165.99 17063885.24
4 17456 3989.96 16342872.00
5 19520 3865.46 15832923.46
6 22288 2800.41 11470483.30
7 23936 2662.10 10903971.76
8 26080 2390.40 9791083.33
9 26928 2015.32 8254761.59
10 28448 1578.50 6465550.14
11 29696 1502.94 6156060.21
^C2020-10-13 16:26:10.357 7f0f4effd700 -1 received signal: Interrupt, si_code : 128, si_value (int): 534070624, si_value (ptr): 0x7ffe1fd54560, si_errno: 0, si_pid : 0, si_uid : 0, si_addr0, si_status534070624
elapsed: 16 ops: 30507 ops/sec: 1845.00 bytes/sec: 7557114.57
Check the very last line of output to see roughly how many IOPS the device sustained.
ops: 30507 ops/sec: 1845.00 bytes/sec: 7557114.57
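rbd bench defaults to 4K sequential writes; the I/O type, size, pattern, and total amount can all be changed, so read and random-write runs against the same image look roughly like this (the option values are only examples):

# Random 4K writes, capped at 1 GiB of total I/O.
rbd bench --io-type write --io-pattern rand --io-size 4K --io-total 1G --io-threads 16 rbdbench/image-test
# Sequential reads of the same image.
rbd bench --io-type read --io-size 4K --io-total 1G --io-threads 16 rbdbench/image-test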
Once testing is complete, clean up as follows.
[root@deploy001 test]# umount /mnt/ceph-block-device/
[root@deploy001 test]# rbd unmap /dev/rbd0
[root@deploy001 test]# rados -p rbdbench cleanup
Finally, delete the pool.
[root@deploy001 test]# ceph osd pool delete rbdbench rbdbench --yes-i-really-really-mean-it
pool 'rbdbench' removed
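If the delete command is rejected, pool deletion is usually disabled by default on the monitors; it can be toggled temporarily (a sketch, worth turning back off once the pool is gone):

# Allow pool deletion while cleaning up, then re-disable it.
ceph config set mon mon_allow_pool_delete true
ceph config set mon mon_allow_pool_delete false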
The rbd bench steps above were performed by following the site referenced below.
Performance Tuning
This concerns the CPU performance (governor) setting; the material dates from 2016, and nowadays it is already set to performance by default (at least in the ceph-ansible deployment I use).
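If you want to verify the governor on the OSD hosts yourself, it can be checked and set like this (cpupower is typically provided by the kernel-tools package on CentOS; this is a sketch, not part of the ceph-ansible run):

# Check the current scaling governor on every core.
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
# Switch all cores to the performance governor.
cpupower frequency-set -g performance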
Latency-related material
Adjusting Ceph PGs
Based on the link below, I adjusted pg_num and saw noticeably better performance; a sketch of the commands involved follows, and then the improved benchmark results.
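The exact target depends on OSD count and how the pool is used, but the adjustment itself boils down to raising pg_num and pgp_num on the affected pool; a sketch with an assumed target of 128 PGs:

# Check the current PG count for the pool.
ceph osd pool get images pg_num
# Raise pg_num and pgp_num (128 is only an example target).
ceph osd pool set images pg_num 128
ceph osd pool set images pgp_num 128
# On Nautilus and later the PG autoscaler can suggest or manage this instead.
ceph osd pool autoscale-status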
[root@deploy001 ~]# rados bench -p images 60 write -b 4M -t 16
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 60 seconds or 0 objects
Object prefix: benchmark_data_deploy001_52437
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 266 250 1001.6 1000 0.0137876 0.0484179
2 16 372 356 712.5 424 0.012313 0.0746386
3 16 380 364 485.526 32 0.0172027 0.0825709
4 16 391 375 375.095 44 0.0170752 0.105056
5 16 405 389 311.252 56 0.0800455 0.150878
6 16 413 397 264.694 32 0.0138733 0.1597
7 16 432 416 237.729 76 1.55959 0.227514
8 16 452 436 218.006 80 0.0132394 0.24204
9 16 460 444 197.334 32 0.379661 0.26163
10 16 478 462 184.798 72 0.0131734 0.293294
11 16 503 487 177.086 100 0.270308 0.313229
12 16 520 504 167.991 68 0.857487 0.323478
13 16 544 528 162.45 96 0.0128463 0.357975
14 16 558 542 154.845 56 0.0607749 0.36093
15 16 573 557 148.52 60 0.295766 0.391108
16 16 586 570 142.486 52 0.339325 0.398898
17 16 599 583 137.162 52 0.0512127 0.426197
18 16 611 595 132.207 48 3.71891 0.445754
19 16 632 616 129.669 84 0.30158 0.454912
2020-10-26 18:21:00.852223 min lat: 0.0115086 max lat: 6.27441 avg lat: 0.452148
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
20 16 647 631 126.185 60 0.235778 0.452148
21 16 670 654 124.556 92 0.435795 0.480625
22 16 685 669 121.621 60 0.0162513 0.498286
23 16 691 675 117.376 24 1.69131 0.510926
24 16 709 693 115.485 72 0.0147364 0.528154
25 16 732 716 114.542 92 0.013838 0.539504
26 16 749 733 112.752 68 2.70373 0.547057
27 16 769 753 111.538 80 0.0128841 0.549705
28 16 782 766 109.41 52 0.0129247 0.557695
29 16 812 796 109.774 120 0.0452509 0.575331
30 16 824 808 107.715 48 1.91946 0.572266
31 16 836 820 105.787 48 0.361165 0.575898
32 16 854 838 104.731 72 0.0150643 0.585064
33 16 870 854 103.496 64 0.0125555 0.590651
34 16 892 876 103.04 88 0.0705526 0.608154
35 16 899 883 100.896 28 1.3139 0.610418
36 16 920 904 100.426 84 0.396711 0.625457
37 16 937 921 99.5491 68 0.395234 0.626916
38 16 959 943 99.2448 88 0.255478 0.635579
39 16 969 953 97.7253 40 1.76258 0.640573
2020-10-26 18:21:20.857338 min lat: 0.0115086 max lat: 6.60594 avg lat: 0.640597
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
40 16 995 979 97.8817 104 0.0717126 0.640597
41 16 1004 988 96.372 36 0.0126092 0.641064
42 16 1022 1006 95.7914 72 0.833975 0.651894
43 16 1037 1021 94.9585 60 0.0140078 0.65484
44 16 1044 1028 93.4366 28 3.49095 0.661643
45 16 1058 1042 92.6045 56 0.307017 0.667844
46 16 1079 1063 92.4172 84 0.342982 0.673593
47 16 1094 1078 91.7273 60 0.218679 0.67623
48 16 1112 1096 91.316 72 0.0151919 0.682937
49 16 1121 1105 90.187 36 3.47224 0.6926
50 16 1144 1128 90.223 92 0.0131828 0.690731
51 16 1162 1146 89.8654 72 0.383499 0.688672
52 16 1171 1155 88.8294 36 0.353203 0.690701
53 16 1188 1172 88.4362 68 0.0755775 0.700197
54 16 1208 1192 88.2798 80 0.582108 0.705604
55 16 1220 1204 87.5473 48 0.0125563 0.705474
56 16 1235 1219 87.0551 60 0.0132853 0.705285
57 16 1257 1241 87.0713 88 0.0135484 0.710079
58 16 1274 1258 86.7423 68 0.0130393 0.714436
59 16 1286 1270 86.0855 48 0.214974 0.710559
2020-10-26 18:21:40.861172 min lat: 0.0115086 max lat: 7.89654 avg lat: 0.720517
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
60 16 1310 1294 86.2504 96 0.548454 0.720517
61 15 1311 1296 84.9675 8 0.293133 0.719887
Total time run: 61.8125
Total writes made: 1311
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 84.8372
Stddev Bandwidth: 129.698
Max bandwidth (MB/sec): 1000
Min bandwidth (MB/sec): 8
Average IOPS: 21
Stddev IOPS: 32.4246
Max IOPS: 250
Min IOPS: 2
Average Latency(s): 0.748189
Stddev Latency(s): 1.22763
Max latency(s): 8.52709
Min latency(s): 0.0115086
Cleaning up (deleting benchmark objects)
Removed 1311 objects
Clean up completed and total clean up time :0.926593
Ceph performance blogs
It will depend on your setup (all-flash, SAS/SSD mix, and so on), but these are worth a read if you run BlueStore.
- ceph.io/community/bluestore-default-vs-tuned-performance-comparison/
- ceph.io/community/part-4-rhcs-3-2-bluestore-advanced-performance-investigation/
References