

[Kubernetes StatefulSet] MongoDB ReplicaSet by StatefulSet (2/5)


There are several ways to implement a MongoDB replica set (sidecar, init-container, and so on), but here we cover an implementation that uses only the stock docker.io mongo images, mongo:3.4 through mongo:3.7.


MongoDB ReplicaSet implementation & functional verification


[Prerequisites]


  • Running k8s cluster with persistent storage(glusterfs class, etc.)
  • Tested kubernetes version: 1.9.x, 1.11.x


[Deployment resources - mongodb-service.yaml]


# Service/endpoint for load balancing the client connection from outside
# By NodePort
apiVersion: v1
kind: Service
metadata:
  namespace: ns-mongo
  name: mongodb-svc
  labels:
    role: mongo
spec:
  type: NodePort
  ports:
    - port: 27017
      name: client
      nodePort: 30017
  selector:
    role: mongors
---
# A headless service to create DNS records
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
  namespace: ns-mongo
  name: mongodb-hs
  labels:
    name: mongo
spec:
  # the list of ports that are exposed by this service
  ports:
    - port: 27017
      name: mongodb
      targetPort: 27017
  clusterIP: None
  selector:
    role: mongors
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  namespace: ns-mongo
  name: mongod-ss
spec:
  serviceName: mongodb-hs
  replicas: 3
  template:
    metadata:
      labels:
        role: mongors
        environment: test
        replicaset: MainRepSet
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: replicaset
                  operator: In
                  values:
                  - MainRepSet
              topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 10
      volumes:
        - name: secrets-volume
          secret:
            secretName: shared-bootstrap-data
            defaultMode: 256
      containers:
        - name: mongod-container
          # Notice
          # Tested on mongo:3.4, 3.6, 3.7
          # as to mongo:3.2, an error happens like below:
          # Error parsing option "wiredTigerCacheSizeGB" as int: Bad digit "." while parsing 0.25
          image: mongo:3.4
          command:
            - "numactl"
            - "--interleave=all"
            - "mongod"
            - "--wiredTigerCacheSizeGB"
            - "0.25"
            - "--bind_ip"
            - "0.0.0.0"
            - "--replSet"
            - "MainRepSet"
            - "--auth"
            - "--clusterAuthMode"
            - "keyFile"
            - "--keyFile"
            - "/etc/secrets-volume/internal-auth-mongodb-keyfile"
            - "--setParameter"
            - "authenticationMechanisms=SCRAM-SHA-1"
          resources:
            requests:
              cpu: 0.3
              memory: 128Mi
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: secrets-volume
              readOnly: true
              mountPath: /etc/secrets-volume
            - name: mongodb-pv-claim
              mountPath: /data/db
  volumeClaimTemplates:
    - metadata:
        namespace: ns-mongo
        name: mongodb-pv-claim
        annotations:
          volume.beta.kubernetes.io/storage-class: glusterfs-storage
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 400Mi

* Composed of a Service (NodePort), a headless Service, and a StatefulSet
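The headless service (`clusterIP: None`) is what gives each StatefulSet pod a stable per-pod DNS record of the form `<pod>.<service>.<namespace>.svc.cluster.local`. As a minimal sketch, the names the manifest above produces can be derived like this (the helper function is illustrative, not part of the deployment):

```python
# Derive the stable DNS names that the headless service (mongodb-hs) creates
# for each StatefulSet pod: <statefulset>-<ordinal>.<service>.<namespace>.<domain>
# The resource names mirror the manifest above; the function is a sketch.

def stateful_pod_fqdns(statefulset: str, service: str, namespace: str,
                       replicas: int, domain: str = "svc.cluster.local") -> list:
    """Return the per-pod DNS names a headless service exposes."""
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.{domain}"
        for ordinal in range(replicas)
    ]

hosts = stateful_pod_fqdns("mongod-ss", "mongodb-hs", "ns-mongo", replicas=3)
for h in hosts:
    print(h)  # e.g. mongod-ss-0.mongodb-hs.ns-mongo.svc.cluster.local
```

These are exactly the names the replica set members will later use to reach one another, which is why a headless service is required rather than a plain ClusterIP one.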



[Deployment steps]


  • Statefulset deployment

# git clone https://github.com/DragOnMe/mongo-statefulset-glusterfs.git

# cd mongo-statefulset-glusterfs

# ./01-generate_mongo_ss.sh

namespace "ns-mongo" created

secret "shared-bootstrap-data" created

service "mongodb-svc" created

service "mongodb-hs" created

statefulset "mongod-ss" created


Waiting for the 3 containers to come up (2018. 09. 16. (일) 17:19:37 KST)...

 (IGNORE any reported not found & connection errors)

  Error from server (NotFound): pods "mongod-ss-2" not found

  ...

  Error from server (NotFound): pods "mongod-ss-2" not found

  error: unable to upgrade connection: container not found ("mongod-container")

  ...

  error: unable to upgrade connection: container not found ("mongod-container")

  connection to 127.0.0.1:27017

...mongod containers are now running (TIMESTAMP)


deployment "mongo-client" created

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                                   STORAGECLASS        REASON    AGE

pvc-17853524-b98a-11e8-8c49-080027f6d038   1G         RWO            Delete           Bound     ns-mongo/mongodb-pv-claim-mongod-ss-1   glusterfs-storage             3m

pvc-1cea12e2-b98a-11e8-8c49-080027f6d038   1G         RWO            Delete           Bound     ns-mongo/mongodb-pv-claim-mongod-ss-2   glusterfs-storage             3m

pvc-407bdb23-b989-11e8-8c49-080027f6d038   1G         RWO            Delete           Bound     ns-mongo/mongodb-pv-claim-mongod-ss-0   glusterfs-storage             9m


NAME                                                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)           AGE

svc/glusterfs-dynamic-mongodb-pv-claim-mongod-ss-0   ClusterIP   10.140.100.198   <none>        1/TCP             9m

svc/glusterfs-dynamic-mongodb-pv-claim-mongod-ss-1   ClusterIP   10.136.50.38     <none>        1/TCP             3m

svc/glusterfs-dynamic-mongodb-pv-claim-mongod-ss-2   ClusterIP   10.135.150.63    <none>        1/TCP             3m

svc/mongodb-hs                                       ClusterIP   None             <none>        27017/TCP         9m

svc/mongodb-svc                                      NodePort    10.138.204.207   <none>        27017:30017/TCP   9m


NAME                     DESIRED   CURRENT   AGE

statefulsets/mongod-ss   3         3         9m


NAME                               STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE

pvc/mongodb-pv-claim-mongod-ss-0   Bound     pvc-407bdb23-b989-11e8-8c49-080027f6d038   1G         RWO            glusterfs-storage   9m

pvc/mongodb-pv-claim-mongod-ss-1   Bound     pvc-17853524-b98a-11e8-8c49-080027f6d038   1G         RWO            glusterfs-storage   3m

pvc/mongodb-pv-claim-mongod-ss-2   Bound     pvc-1cea12e2-b98a-11e8-8c49-080027f6d038   1G         RWO            glusterfs-storage   3m

Waiting for rollout to finish: 0 of 1 updated replicas are available...

deployment "mongo-client" successfully rolled out

* The three mongod pods are created sequentially; the result is shown once the last pod (ordinal 2) has been created

* Errors such as "Error from server" and "... unable to upgrade connection" appear while creation is in progress; ignore them until the script finishes


# kubectl get pods -n ns-mongo 

NAME                           READY     STATUS              RESTARTS   AGE

mongo-client-799dc789b-p8kgv   1/1       Running             0          10m

mongod-ss-0                    1/1       Running             0          36m

mongod-ss-1                    1/1       Running             0          30m

mongod-ss-2                    0/1       ContainerCreating   0          4s


# kubectl logs -f -n ns-mongo mongod-ss-0

2018-09-16T09:11:09.843+0000 I CONTROL  [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=mongod-ss-0

2018-09-16T09:11:09.844+0000 I CONTROL  [initandlisten] db version v3.4.17

...

2018-09-16T09:11:11.409+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.

2018-09-16T09:11:11.409+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'

2018-09-16T09:11:11.834+0000 I REPL     [initandlisten] Did not find local replica set configuration document at startup;  NoMatchingDocument: Did not find replica set configuration document in local.system.replset

2018-09-16T09:11:11.838+0000 I NETWORK  [thread1] waiting for connections on port 27017


# kubectl logs -f -n ns-mongo mongod-ss-1

2018-09-16T09:11:18.847+0000 I CONTROL  [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=mongod-ss-1

...

2018-09-16T09:11:20.808+0000 I NETWORK  [thread1] waiting for connections on port 27017


# kubectl logs -f -n ns-mongo mongod-ss-2

2018-09-16T09:11:25.977+0000 I CONTROL  [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=mongod-ss-2

...

2018-09-16T09:11:27.538+0000 I CONTROL  [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.

2018-09-16T09:11:27.539+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.

...

2018-09-16T09:11:27.878+0000 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/data/db/diagnostic.data'

2018-09-16T09:11:28.224+0000 I REPL     [initandlisten] Did not find local voted for document at startup.

2018-09-16T09:11:28.224+0000 I REPL     [initandlisten] Did not find local replica set configuration document at startup;  NoMatchingDocument: Did not find replica set configuration document in local.system.replset

2018-09-16T09:11:28.226+0000 I NETWORK  [thread1] waiting for connections on port 27017

...

2018-09-16T09:11:32.214+0000 I -        [conn1] end connection 127.0.0.1:53452 (1 connection now open)

* Wait until all three mongod pods that will form the replica set show the standby message ("...waiting for connections...") in their logs


  • Initialize the replica set and create the mongod admin account

# ./02-configure_repset_auth.sh abc123

Configuring the MongoDB Replica Set

MongoDB shell version v3.4.17

connecting to: mongodb://127.0.0.1:27017

MongoDB server version: 3.4.17

{ "ok" : 1 }


Waiting for the MongoDB Replica Set to initialise...

MongoDB shell version v3.4.17

connecting to: mongodb://127.0.0.1:27017

MongoDB server version: 3.4.17

.

.

...initialisation of MongoDB Replica Set completed


Creating user: 'main_admin'

MongoDB shell version v3.4.17

connecting to: mongodb://127.0.0.1:27017

MongoDB server version: 3.4.17

Successfully added user: {

"user" : "main_admin",

"roles" : [

{

"role" : "root",

"db" : "admin"

}

]

}

* Performs the replica set configuration; the database password is set to abc123
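The script's exact contents are not shown in this post, but a configuration step like this typically passes a config document to rs.initiate() on mongod-ss-0 before creating the root user. As an assumption-labeled sketch, the document it would pass (with member hosts following the headless-service DNS pattern from the manifest) looks like:

```python
# Sketch of the replica set config document an init script such as
# 02-configure_repset_auth.sh would hand to rs.initiate() on mongod-ss-0.
# The member hosts follow the headless-service DNS pattern; the actual
# script contents are an assumption, not shown in this post.

import json

members = [
    {"_id": i, "host": f"mongod-ss-{i}.mongodb-hs.ns-mongo.svc.cluster.local:27017"}
    for i in range(3)
]
rs_config = {"_id": "MainRepSet", "version": 1, "members": members}

# This JSON is what rs.initiate(...) would receive in the mongo shell.
print(json.dumps(rs_config, indent=2))
```

Note that `_id` must match the `--replSet MainRepSet` flag passed to mongod in the StatefulSet spec, or initiation fails.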

* The MongoDB replica set implementation is now complete. Replica-set-aware apps inside the cluster can connect to the mongod pods with the URIs below. For example, with pymongo you can connect using a URI of the form mongodb://mongod-ss-0.mongodb-hs.ns-mongo.svc.cluster.local:27017,mongod-ss-1.mongodb-hs.ns-mongo.svc.cluster.local:27017,mongod-ss-2.mongodb-hs.ns-mongo.svc.cluster.local:27017/?replicaSet=MainRepSet


  - mongod-ss-0.mongodb-hs.ns-mongo.svc.cluster.local:27017

  - mongod-ss-1.mongodb-hs.ns-mongo.svc.cluster.local:27017

  - mongod-ss-2.mongodb-hs.ns-mongo.svc.cluster.local:27017
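Assembling the three per-pod hosts above into the replica-set connection URI that a driver such as pymongo expects is pure string handling; a small sketch (no driver is imported here):

```python
# Build a replica-set connection URI from the per-pod hosts listed above.
# Format: mongodb://host1,host2,host3/?replicaSet=<name>

def replica_set_uri(hosts, replica_set):
    """Join member hosts into a single mongodb:// URI."""
    return f"mongodb://{','.join(hosts)}/?replicaSet={replica_set}"

hosts = [
    f"mongod-ss-{i}.mongodb-hs.ns-mongo.svc.cluster.local:27017"
    for i in range(3)
]
print(replica_set_uri(hosts, "MainRepSet"))
```

Listing every member in the URI lets the driver discover the topology even if one pod is down when the client starts.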



[How to check if things are working properly]


  • Data synchronization between the replica set members (mongod instances)

# export MONGOD_NAMESPACE="ns-mongo"

# export MONGO_CLIENT=`kubectl get pods -n $MONGOD_NAMESPACE | grep mongo-client | awk '{print $1}'`

# kubectl exec -it -n $MONGOD_NAMESPACE $MONGO_CLIENT -- mongo mongodb://mongod-ss-0.mongodb-hs.ns-mongo.svc.cluster.local:27017

MongoDB shell version v3.4.2

connecting to: mongodb://mongod-ss-0.mongodb-hs.ns-mongo.svc.cluster.local:27017

MongoDB server version: 3.4.17

MainRepSet:PRIMARY> db.getSiblingDB('admin').auth("main_admin", "abc123");

1

MainRepSet:PRIMARY> use test;

switched to db test

MainRepSet:PRIMARY> db.testcoll.insert({a:1});

WriteResult({ "nInserted" : 1 })

MainRepSet:PRIMARY> db.testcoll.insert({b:2});

WriteResult({ "nInserted" : 1 })

MainRepSet:PRIMARY> db.testcoll.find();

{ "_id" : ObjectId("5b9fd8f0bc9812b50016a157"), "a" : 1 }

{ "_id" : ObjectId("5b9fd8f8bc9812b50016a158"), "b" : 2 }

* Connect to the primary (ordinal 0) and insert and store data


# kubectl exec -it -n $MONGOD_NAMESPACE $MONGO_CLIENT -- mongo mongodb://mongod-ss-1.mongodb-hs.ns-mongo.svc.cluster.local:27017

MongoDB shell version v3.4.2

connecting to: mongodb://mongod-ss-1.mongodb-hs.ns-mongo.svc.cluster.local:27017

MongoDB server version: 3.4.17

MainRepSet:SECONDARY> db.getSiblingDB('admin').auth("main_admin", "abc123");

1

MainRepSet:SECONDARY> use test;

switched to db test

MainRepSet:SECONDARY> db.setSlaveOk(1);

MainRepSet:SECONDARY> db.testcoll.find();

{ "_id" : ObjectId("5b9fd8f0bc9812b50016a157"), "a" : 1 }

{ "_id" : ObjectId("5b9fd8f8bc9812b50016a158"), "b" : 2 }

* Connect to the secondaries (ordinals 1 and 2) and confirm the data has been synchronized



  • Delete the primary pod, then confirm the replica set and its data survive
# kubectl get pod -n ns-mongo -o wide
NAME                           READY     STATUS    RESTARTS   AGE       IP          NODE
mongo-client-799dc789b-6rgvv   1/1       Running   0          1d        10.38.0.8   kubenode3
mongod-ss-0                    1/1       Running   6          1d        10.40.0.5   kubenode2
mongod-ss-1                    1/1       Running   6          1d        10.38.0.2   kubenode3
mongod-ss-2                    1/1       Running   0          1d        10.38.0.7   kubenode3

# kubectl delete pod -n ns-mongo mongod-ss-0
pod "mongod-ss-0" deleted

# kubectl get pod -n ns-mongo -o wide
NAME                           READY     STATUS    RESTARTS   AGE       IP          NODE
mongo-client-799dc789b-6rgvv   1/1       Running   0          1d        10.38.0.8   kubenode3
mongod-ss-0                    1/1       Running   0          4s        10.40.0.5   kubenode2
mongod-ss-1                    1/1       Running   6          1d        10.38.0.2   kubenode3
mongod-ss-2                    1/1       Running   0          1d        10.38.0.7   kubenode3

# kubectl exec -it -n $MONGOD_NAMESPACE $MONGO_CLIENT -- mongo mongodb://mongod-ss-0.mongodb-hs.ns-mongo.svc.cluster.local:27017
MongoDB shell version v3.4.2
connecting to: mongodb://mongod-ss-0.mongodb-hs.ns-mongo.svc.cluster.local:27017
MongoDB server version: 3.4.17
MainRepSet:SECONDARY> db.getSiblingDB('admin').auth("main_admin", "abc123");
1
MainRepSet:SECONDARY> use test;
switched to db test
MainRepSet:SECONDARY> db.setSlaveOk(1);
MainRepSet:SECONDARY> db.testcoll.find();
{ "_id" : ObjectId("5b9fd8f0bc9812b50016a157"), "a" : 1 }
{ "_id" : ObjectId("5b9fd8f8bc9812b50016a158"), "b" : 2 }

# kubectl exec -it -n $MONGOD_NAMESPACE $MONGO_CLIENT -- mongo mongodb://mongod-ss-2.mongodb-hs.ns-mongo.svc.cluster.local:27017
MongoDB shell version v3.4.2
connecting to: mongodb://mongod-ss-2.mongodb-hs.ns-mongo.svc.cluster.local:27017
MongoDB server version: 3.4.17
MainRepSet:PRIMARY>
* After mongod-ss-0 was deleted, it came back up with its data intact, and mongod-ss-2 was promoted to primary. As a result, the replica set stays up throughout.
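During the brief election window above, writes against the old primary fail with a "not primary" error. Conceptually (this is a generic sketch, not pymongo's actual implementation), a replica-set-aware client hides that window by retrying until a new primary answers:

```python
# Conceptual sketch of how a replica-set-aware client survives the failover
# shown above: on a "not primary" error it waits and retries, so the short
# election window is invisible to callers. Not a real driver API.

import time

class NotPrimaryError(Exception):
    pass

def with_primary_retry(op, retries=5, delay=0.01):
    """Retry op() while the replica set is electing a new primary."""
    for _ in range(retries):
        try:
            return op()
        except NotPrimaryError:
            time.sleep(delay)  # wait for the election to settle
    raise RuntimeError("no primary elected within the retry budget")

# Simulated failover: the first two attempts hit a demoted member.
calls = {"n": 0}
def insert_once():
    calls["n"] += 1
    if calls["n"] < 3:
        raise NotPrimaryError
    return {"nInserted": 1}

print(with_primary_retry(insert_once))  # succeeds on the third attempt
```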

  • Delete and recreate the services, then confirm the DB data survives

# ./03-delete_service.sh 

statefulset "mongod-ss" deleted

service "mongodb-hs" deleted

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                                   STORAGECLASS        REASON    AGE

pvc-6eb5c4f7-b990-11e8-8c49-080027f6d038   1G         RWO            Delete           Bound     ns-mongo/mongodb-pv-claim-mongod-ss-0   glusterfs-storage             1d

pvc-740db9df-b990-11e8-8c49-080027f6d038   1G         RWO            Delete           Bound     ns-mongo/mongodb-pv-claim-mongod-ss-1   glusterfs-storage             1d

pvc-78ccd2ce-b990-11e8-8c49-080027f6d038   1G         RWO            Delete           Bound     ns-mongo/mongodb-pv-claim-mongod-ss-2   glusterfs-storage             1d

# kubectl get all -n ns-mongo -l role=mongors

No resources found.

* Delete the headless Service and the StatefulSet


# ./04-recreate_service.sh 

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                                   STORAGECLASS        REASON    AGE

pvc-6eb5c4f7-b990-11e8-8c49-080027f6d038   1G         RWO            Delete           Bound     ns-mongo/mongodb-pv-claim-mongod-ss-0   glusterfs-storage             1d

pvc-740db9df-b990-11e8-8c49-080027f6d038   1G         RWO            Delete           Bound     ns-mongo/mongodb-pv-claim-mongod-ss-1   glusterfs-storage             1d

pvc-78ccd2ce-b990-11e8-8c49-080027f6d038   1G         RWO            Delete           Bound     ns-mongo/mongodb-pv-claim-mongod-ss-2   glusterfs-storage             1d

service "mongodb-svc" unchanged

service "mongodb-hs" created

statefulset "mongod-ss" created

NAME                                                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)           AGE

svc/glusterfs-dynamic-mongodb-pv-claim-mongod-ss-0   ClusterIP   10.134.165.156   <none>        1/TCP             1d

svc/glusterfs-dynamic-mongodb-pv-claim-mongod-ss-1   ClusterIP   10.131.121.182   <none>        1/TCP             1d

svc/glusterfs-dynamic-mongodb-pv-claim-mongod-ss-2   ClusterIP   10.138.205.244   <none>        1/TCP             1d

svc/mongodb-hs                                       ClusterIP   None             <none>        27017/TCP         6s

svc/mongodb-svc                                      NodePort    10.129.188.234   <none>        27017:30017/TCP   1d


NAME                     DESIRED   CURRENT   AGE

statefulsets/mongod-ss   3         3         6s


NAME                              READY     STATUS              RESTARTS   AGE

po/mongo-client-799dc789b-6rgvv   1/1       Running             0          1d

po/mongod-ss-0                    1/1       Running             0          6s

po/mongod-ss-1                    1/1       Running             0          3s

po/mongod-ss-2                    0/1       ContainerCreating   0          1s

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                                   STORAGECLASS        REASON    AGE

pvc-6eb5c4f7-b990-11e8-8c49-080027f6d038   1G         RWO            Delete           Bound     ns-mongo/mongodb-pv-claim-mongod-ss-0   glusterfs-storage             1d

pvc-740db9df-b990-11e8-8c49-080027f6d038   1G         RWO            Delete           Bound     ns-mongo/mongodb-pv-claim-mongod-ss-1   glusterfs-storage             1d

pvc-78ccd2ce-b990-11e8-8c49-080027f6d038   1G         RWO            Delete           Bound     ns-mongo/mongodb-pv-claim-mongod-ss-2   glusterfs-storage             1d


Keep running the following command until all 'mongod-ss-n' pods are shown as running:  kubectl get svc,sts,pods -n ns-mongo

# kubectl get all -n ns-mongo -l role=mongors

NAME                     DESIRED   CURRENT   AGE

statefulsets/mongod-ss   3         3         3m


NAME             READY     STATUS    RESTARTS   AGE

po/mongod-ss-0   1/1       Running   0          3m

po/mongod-ss-1   1/1       Running   0          3m

po/mongod-ss-2   1/1       Running   0          3m

* Recreate the Service and the StatefulSet. The new mongod pods bind to the existing PVs as they come up



# kubectl exec -it -n $MONGOD_NAMESPACE $MONGO_CLIENT -- mongo mongodb://mongod-ss-0.mongodb-hs.ns-mongo.svc.cluster.local:27017

MongoDB shell version v3.4.2

connecting to: mongodb://mongod-ss-0.mongodb-hs.ns-mongo.svc.cluster.local:27017

MongoDB server version: 3.4.17

MainRepSet:PRIMARY> db.getSiblingDB('admin').auth("main_admin", "abc123");

1

MainRepSet:PRIMARY> use test;

switched to db test

MainRepSet:PRIMARY> db.testcoll.find();

{ "_id" : ObjectId("5b9fd8f0bc9812b50016a157"), "a" : 1 }

{ "_id" : ObjectId("5b9fd8f8bc9812b50016a158"), "b" : 2 }

* The data on the existing PVs is preserved, the replica set returns to its initial state (mongod-ss-0 is primary again), and the DB data remains intact



- Barracuda -