Accidentally deleting a PVC

Jacob_baek 2020. 6. 23. 18:59

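Issue and cause

This post records how I dealt with a PVC that was deleted by mistake in a production environment, and how I recreated it.

Mistaking the cluster for the dev environment, I ran the following command.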

[root@kube ~]# kubectl delete pvc/spinnaker-minio -n spinnaker
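A guard along the following lines might have caught the mix-up. It is only a sketch: `kubectl_guarded` is a hypothetical wrapper, and the rule that production contexts contain `prod` in their name is an assumption, not something taken from this cluster.

```shell
# Hypothetical wrapper: ask for confirmation before destructive verbs when the
# current context looks like production.
# Assumption: production context names contain the substring "prod".
kubectl_guarded() {
  ctx=$(kubectl config current-context) || return 1
  case "$1" in
    delete|drain|replace)
      case "$ctx" in
        *prod*)
          printf 'Context "%s" looks like production. Type its name to continue: ' "$ctx" >&2
          read -r answer || answer=""
          # Abort unless the operator typed the exact context name.
          [ "$answer" = "$ctx" ] || { echo "aborted" >&2; return 1; }
          ;;
      esac
      ;;
  esac
  kubectl "$@"
}
```

Used as `kubectl_guarded delete pvc/spinnaker-minio -n spinnaker`, it would stop and ask for the context name first. It does not replace proper RBAC restrictions on the production cluster.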

Checking the VolumeAttachment resources showed that the volume was still attached, so the PV had not actually been deleted.
(Once again, a reminder that production needs stricter permission management and access controls...)

[root@kube ~]# kubectl get volumeattachment -n spinnaker
NAME                                                                   ATTACHER           PV                                         NODE                 ATTACHED   AGE
csi-1c55703a3cb26172ebdb32009649025c70e69d304ae7450f7f1578811d4022df   rbd.csi.ceph.com   pvc-03a39b1f-529e-4593-a5c3-a0b2405605b8   labs-kube-infra003   true       160d
csi-1fb754fc8fe8099b1a192cac40d2697d43c95b4330d8f4dd50911791e9c25634   rbd.csi.ceph.com   pvc-de078b0b-3971-4ff4-869b-94286ac8e25b   labs-kube-infra003   true       160d
csi-6ab5b4d38a317ecf49de5e89d18a309bd606f7191169be2c93b0c357352deafa   rbd.csi.ceph.com   pvc-724c6e4a-c31e-4fd8-bb1e-4c31b2115d2b   labs-kube-infra001   true       26h
csi-78aba44159c7b7dcb2ab1ddbf9a29636d0d523f307530a5fac75c8b0dfb2a649   rbd.csi.ceph.com   pvc-18ec78bc-7c2f-4c32-a0cd-17d1ef39a5da   labs-kube-infra003   true       26d
csi-8040726a7148a1b974a3e0d768d1149c93a3f016320cdebf163f27cdfb94ea9a   rbd.csi.ceph.com   pvc-4c17447d-9296-486f-86d0-e3fe97212d5e   labs-kube-infra003   true       154d
csi-a60256901e0359b09a21b191f0dae84cfbf13f9d5b5fb59e50707bf5d6993cc1   rbd.csi.ceph.com   pvc-b58f1a67-ef46-477e-9855-4dbd99a779ee   labs-kube-infra001   true       99d
csi-b97e177375b47e804630cbdd375b5143571dd6439522f1185e0db29e7cf75c1a   rbd.csi.ceph.com   pvc-88ad71d2-76a4-4a84-b551-e6a0528c3b6c   labs-kube-infra003   true       160d
csi-c0526b6e8e5c48a4cd03a33379c1243ba9a19f1f4dc5351e36ec58cf85fa3f61   rbd.csi.ceph.com   pvc-cb233626-914a-49e4-8a9a-2df36299334f   labs-kube-infra001   true       47d
csi-c294ba3a06564c9a61516528bf69196197c3ea50fbda647e23f0bd011cec385c   rbd.csi.ceph.com   pvc-bd62608d-4290-4452-b05e-427e3358e927   labs-kube-infra001   true       154d
csi-c94771c39008c9c5ca2ffd0fc2eded39519d54a84705cdd0fa166b3fbb5da587   rbd.csi.ceph.com   pvc-6e30e1c4-25e5-4028-a750-8d345607a0ea   labs-kube-infra001   true       47d
csi-caeb766845d9b67da72a9ce7722c1371baa0cf4b25bb88bad0106a81167a3c3d   rbd.csi.ceph.com   pvc-ec02fe57-9dc1-423d-9029-dffde5d6cf7c   labs-kube-infra003   true       26h
csi-d6325abe1a5025e5fef9542e7d4ed921cababb4c245fa210673722ea78e252ba   rbd.csi.ceph.com   pvc-abc9480e-4a12-47c4-8d93-0045aa3df8c6   labs-kube-infra002   true       26h
csi-e5dd81617dc89f8ac4f5ee2ad5a06cf8fccbfe19da9f1f8aab4052ac9cb9f844   rbd.csi.ceph.com   pvc-b2f32f76-46fe-4ad0-a7f4-018955b11e71   labs-kube-infra001   true       154d
csi-e95796dd692ddd3da775d8cb5152d315bd0999474262781cf065dcd6eefe60db   rbd.csi.ceph.com   pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7   labs-kube-infra003   true       32d

The PVC itself was stuck in the Terminating state, as shown below.

[root@kube ~]# kubectl get pvc -n spinnaker
NAME                                         STATUS        VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
halyard-home-spinnaker-spinnaker-halyard-0   Bound         pvc-bd62608d-4290-4452-b05e-427e3358e927   10Gi       RWO            csi-rbd-sc     154d
redis-data-spinnaker-redis-master-0          Bound         pvc-abc9480e-4a12-47c4-8d93-0045aa3df8c6   8Gi        RWO            csi-rbd-sc     154d
redis-data-spinnaker-redis-slave-0           Bound         pvc-b2f32f76-46fe-4ad0-a7f4-018955b11e71   8Gi        RWO            csi-rbd-sc     154d
redis-data-spinnaker-redis-slave-1           Bound         pvc-4c17447d-9296-486f-86d0-e3fe97212d5e   8Gi        RWO            csi-rbd-sc     154d
spinnaker-minio                              Terminating   pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7   10Gi       RWO            csi-rbd-sc     32d
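Terminating means the object has a deletionTimestamp set but still carries at least one finalizer. For a PVC that a pod is using, that finalizer is kubernetes.io/pvc-protection. A minimal sketch for checking both fields follows; the function name `pvc_deletion_state` is hypothetical.

```shell
# Hypothetical helper: print the deletion timestamp and finalizers of a PVC.
# A non-empty timestamp together with a remaining finalizer is what kubectl
# reports as Terminating.
pvc_deletion_state() {
  # $1 = namespace, $2 = PVC name
  kubectl get pvc "$2" -n "$1" \
    -o jsonpath='{.metadata.deletionTimestamp}{"\t"}{.metadata.finalizers}{"\n"}'
}
```

For this incident the call would be `pvc_deletion_state spinnaker spinnaker-minio`.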

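Response and resolution

My first thought was that the PVC had to be deleted and then recreated,
so I ran a metadata patch command like the one below, which removed the finalizers and let the existing PVC be deleted.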

kubectl patch pvc spinnaker-minio -n spinnaker -p '{"metadata":{"finalizers":null}}'
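Clearing the finalizers lets the PVC disappear even though a pod still mounts it. The PV in this incident uses persistentVolumeReclaimPolicy: Delete (visible in the PV manifest further down), and with that policy the cluster normally removes a released PV together with its backing volume. A more cautious order, sketched below, is to switch the PV to Retain first. The function name `protect_pv_of_pvc` is hypothetical; this step was not part of what was actually run here.

```shell
# Hypothetical helper: before forcing a PVC deletion, switch its PV to the
# Retain reclaim policy so the backing volume outlives the claim.
protect_pv_of_pvc() {
  # $1 = namespace, $2 = PVC name
  pv=$(kubectl get pvc "$2" -n "$1" -o jsonpath='{.spec.volumeName}') || return 1
  [ -n "$pv" ] || { echo "PVC $1/$2 has no bound volume" >&2; return 1; }
  kubectl patch pv "$pv" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
}
```

Calling `protect_pv_of_pvc spinnaker spinnaker-minio` before the finalizer patch would have removed the dependence on the volume still being attached.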

I then copied over a PVC manifest that I had backed up earlier and tried to restore the PVC from it.
However, as shown below, its status came up as Lost instead of the healthy Bound state.

[root@kube ~]# kubectl get pvc -n spinnaker
NAME                                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
halyard-home-spinnaker-spinnaker-halyard-0   Bound    pvc-bd62608d-4290-4452-b05e-427e3358e927   10Gi       RWO            csi-rbd-sc     154d
redis-data-spinnaker-redis-master-0          Bound    pvc-abc9480e-4a12-47c4-8d93-0045aa3df8c6   8Gi        RWO            csi-rbd-sc     154d
redis-data-spinnaker-redis-slave-0           Bound    pvc-b2f32f76-46fe-4ad0-a7f4-018955b11e71   8Gi        RWO            csi-rbd-sc     154d
redis-data-spinnaker-redis-slave-1           Bound    pvc-4c17447d-9296-486f-86d0-e3fe97212d5e   8Gi        RWO            csi-rbd-sc     154d
spinnaker-minio                              Lost     pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7   0                         csi-rbd-sc     106s

On inspection, the PV that this PVC was trying to claim was still recorded as bound to another claim, which is why the new PVC ended up in the Lost state.

[root@kube ~]# kubectl describe pvc/spinnaker-minio -n spinnaker
Name:          spinnaker-minio
Namespace:     spinnaker
StorageClass:  csi-rbd-sc
Status:        Lost
Volume:        pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7
Labels:        app=minio
               chart=minio-1.6.3
               heritage=Helm
               release=spinnaker
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: rbd.csi.ceph.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      0
Access Modes:  
VolumeMode:    Filesystem
Mounted By:    spinnaker-minio-76fb7f68c9-hrsf5
Events:
  Type     Reason         Age    From                         Message
  ----     ------         ----   ----                         -------
  Warning  ClaimMisbound  3m50s  persistentvolume-controller  Two claims are bound to the same volume, this one is bound incorrectly

Looking at the PV itself, the uid in its claimRef was not the uid of the newly created PVC but that of the old, deleted one.

[root@kube ~]# kubectl get pv/pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7 -n spinnaker -o yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: rbd.csi.ceph.com
  creationTimestamp: "2020-05-22T07:53:30Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7
  resourceVersion: "32467045"
  selfLink: /api/v1/persistentvolumes/pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7
  uid: 43e88b05-c89f-4eb3-b81a-eaee251c2247
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: spinnaker-minio
    namespace: spinnaker
    resourceVersion: "25843213"
    uid: 3685c8f5-aba0-4e0b-99df-ea339842a9d7
  csi:
    driver: rbd.csi.ceph.com
    fsType: ext4
    nodeStageSecretRef:
      name: csi-rbd-secret
      namespace: default
    volumeAttributes:
      clusterID: 5fb1204b-6152-41a8-b4cb-f819e8728a6c
      pool: kubernetes
      storage.kubernetes.io/csiProvisionerIdentity: 1589350154546-8081-rbd.csi.ceph.com
    volumeHandle: 0001-0024-5fb1204b-6152-41a8-b4cb-f819e8728a6c-0000000000000003-495c8d33-9c01-11ea-875b-3acb21213cec
  mountOptions:
  - discard
  persistentVolumeReclaimPolicy: Delete
  storageClassName: csi-rbd-sc
  volumeMode: Filesystem
status:
  phase: Released
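The mismatch can also be checked without reading the full YAML. The sketch below compares the new PVC's metadata.uid with the PV's spec.claimRef.uid; `claimref_matches` is a hypothetical name. Because PVs are cluster-scoped, no namespace flag is needed for the PV lookup.

```shell
# Hypothetical helper: succeed only if the PV's claimRef points at the
# current PVC object (same uid). Returns 2 if either lookup fails.
claimref_matches() {
  # $1 = namespace, $2 = PVC name, $3 = PV name
  pvc_uid=$(kubectl get pvc "$2" -n "$1" -o jsonpath='{.metadata.uid}') || return 2
  ref_uid=$(kubectl get pv "$3" -o jsonpath='{.spec.claimRef.uid}') || return 2
  echo "pvc uid:      $pvc_uid"
  echo "claimRef uid: $ref_uid"
  [ -n "$pvc_uid" ] && [ "$pvc_uid" = "$ref_uid" ]
}
```

Here it would be called as `claimref_matches spinnaker spinnaker-minio pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7`, and a non-zero exit status would confirm the stale reference.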

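Searching turned up the recommendation to change the uid in the PV's claimRef to the uid of the newly created PVC, either with kubectl patch or by editing that field with kubectl edit.
I went with edit, and after the change the PVC was bound normally, as shown below.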

[root@kube ~]# kubectl get pvc -n spinnaker
NAME                                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
halyard-home-spinnaker-spinnaker-halyard-0   Bound    pvc-bd62608d-4290-4452-b05e-427e3358e927   10Gi       RWO            csi-rbd-sc     154d
redis-data-spinnaker-redis-master-0          Bound    pvc-abc9480e-4a12-47c4-8d93-0045aa3df8c6   8Gi        RWO            csi-rbd-sc     154d
redis-data-spinnaker-redis-slave-0           Bound    pvc-b2f32f76-46fe-4ad0-a7f4-018955b11e71   8Gi        RWO            csi-rbd-sc     154d
redis-data-spinnaker-redis-slave-1           Bound    pvc-4c17447d-9296-486f-86d0-e3fe97212d5e   8Gi        RWO            csi-rbd-sc     154d
spinnaker-minio                              Bound    pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7   10Gi       RWO            csi-rbd-sc     113s
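I made the change with kubectl edit, but the patch route mentioned above could look roughly like this. It is a sketch rather than the exact command from any reference: `rebind_pv_to_pvc` is a hypothetical name, and it uses a JSON merge patch so that only claimRef.uid changes while the other claimRef fields stay as they are.

```shell
# Hypothetical helper: point an existing PV's claimRef at the uid of the
# recreated PVC so the PV controller can bind the two again.
rebind_pv_to_pvc() {
  # $1 = namespace, $2 = PVC name, $3 = PV name
  new_uid=$(kubectl get pvc "$2" -n "$1" -o jsonpath='{.metadata.uid}') || return 1
  [ -n "$new_uid" ] || { echo "could not read uid of PVC $1/$2" >&2; return 1; }
  kubectl patch pv "$3" --type merge \
    -p "{\"spec\":{\"claimRef\":{\"uid\":\"$new_uid\"}}}"
}
```

For this case: `rebind_pv_to_pvc spinnaker spinnaker-minio pvc-3685c8f5-aba0-4e0b-99df-ea339842a9d7`.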

In short, the fix was to delete the PVC stuck in Terminating, create a new PVC, and update the claimRef on the existing PV so that it references the new PVC instead of the old one.
