
Consuming File System artifacts from Kubernetes Pods

When you deploy an application in Kubernetes (k8s) that writes artifacts to the file system dynamically, for example a Tomcat server exposed externally so that WAR files can be deployed to it, you need to make sure the file system state is always preserved. Otherwise, if the pod goes down, you might lose data.

One solution is to mount an external disk. You can indeed do that, but how robust is it? Say something happens to the external disk. How do you recover the data? You could use several disks and rsync to keep the data in sync. That sounds like a robust solution, but what happens if the rsync process gets killed? And how much will it cost to bring its reliability close to 100%?

We have a robust, simple solution: use gluster to store the data. [1] [2]



We install a gluster pod on each node. Each node has an additional disk attached, which is used as the data storage for gluster. This disk is formatted in a special format and attached to gluster.

The disk is attached at a specific path (e.g. /dev/sdb), which we specify in topology.json. Following is a sample. When you run the gk-deploy script to install gluster, you have to place topology.json in the same location.

 
{
   "clusters": [
      {
         "nodes": [
            {
               "node": {
                  "hostnames": {
                     "manage": [
                        "node0"
                     ],
                     "storage": [
                        "192.168.10.100"
                     ]
                  },
                  "zone": 1
               },
               "devices": [
                  "/dev/sdb"
               ]
            },
            {
               "node": {
                  "hostnames": {
                     "manage": [
                        "node1"
                     ],
                     "storage": [
                        "192.168.10.101"
                     ]
                  },
                  "zone": 1
               },
               "devices": [
                  "/dev/sdb"
               ]
            },
            {
               "node": {
                  "hostnames": {
                     "manage": [
                        "node2"
                     ],
                     "storage": [
                        "192.168.10.102"
                     ]
                  },
                  "zone": 1
               },
               "devices": [
                  "/dev/sdb"
               ]
            }
         ]
      }
   ]
}
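
Since every node entry in the sample has the same shape, the file can also be generated rather than written by hand. The following is only a rough sketch in Python: the function name build_topology and its parameters are my own and not part of gluster or gk-deploy, and it assumes every node uses the same device and zone, as in the sample above.

```python
import json


def build_topology(nodes, device="/dev/sdb", zone=1):
    """Build a topology dict from (manage_hostname, storage_ip) pairs.

    Assumption: all nodes share the same device path and zone,
    as in the sample topology.json.
    """
    return {
        "clusters": [
            {
                "nodes": [
                    {
                        "node": {
                            "hostnames": {"manage": [host], "storage": [ip]},
                            "zone": zone,
                        },
                        "devices": [device],
                    }
                    for host, ip in nodes
                ]
            }
        ]
    }


# Hostnames and addresses taken from the sample above.
topology = build_topology([
    ("node0", "192.168.10.100"),
    ("node1", "192.168.10.101"),
    ("node2", "192.168.10.102"),
])

# Serialize; the result can be saved as topology.json.
topology_json = json.dumps(topology, indent=3)
print(topology_json)
```

Generating the file this way keeps the brackets balanced and makes it harder to forget a node when the cluster grows.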

Once gluster is installed in k8s, the next question is how the two interconnect. Another component, heketi [3], is generally used to manage volumes. Here heketi is used to manage the gluster cluster. You can use the heketi cli [4] for advanced operations. Heketi is included in the gk-deploy script from gluster.

Note that this post is written to explain a problem and a solution. I haven't included any low-level operational or installation details.

Once gluster is installed in your k8s setup, the next question is how to use it. First you need to create a storage class in k8s with the heketi URL. This URL must be accessible from the master node of the k8s deployment. It looks like below.

kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: glusterfs-storage
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://${HEKETI_URL}"
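
The ${HEKETI_URL} placeholder has to be replaced with the real heketi address before the storage class is created. One way to do that, shown here only as a sketch, is Python's string.Template, which uses the same ${NAME} syntax. The function name render_storage_class and the address 10.0.0.5:8080 are made up for illustration; use whatever address heketi is reachable at in your setup.

```python
from string import Template

# Same manifest as above, kept as a template with the ${HEKETI_URL} placeholder.
STORAGE_CLASS_TEMPLATE = Template("""\
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: glusterfs-storage
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://${HEKETI_URL}"
""")


def render_storage_class(heketi_url):
    """Fill in the heketi address; substitute() raises KeyError if a placeholder is left unfilled."""
    return STORAGE_CLASS_TEMPLATE.substitute(HEKETI_URL=heketi_url)


# Hypothetical address, for illustration only.
manifest = render_storage_class("10.0.0.5:8080")
print(manifest)
```

The rendered text can then be saved to a file and applied to the cluster like any other manifest.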

After that, there should be a PVC (Persistent Volume Claim) that uses this storage class.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gluster-pvc
  annotations:
    volume.beta.kubernetes.io/storage-class: glusterfs-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi

Then you can use the PVC in your pod configuration as a volume mount.

volumeMounts:
  - mountPath: /mnt/files/diff
    name: sample-files

volumes:
  - name: sample-files
    persistentVolumeClaim:
      claimName: gluster-pvc
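
The name under volumeMounts has to match the name of an entry under volumes, otherwise k8s will reject the pod. A quick way to catch a mismatch before deploying is sketched below in Python. The helper undeclared_mounts is my own, and the container name and image in the example spec are placeholders, not taken from a real deployment.

```python
def undeclared_mounts(pod_spec):
    """Return the volumeMount names that no entry under `volumes` declares."""
    declared = {volume["name"] for volume in pod_spec.get("volumes", [])}
    missing = []
    for container in pod_spec.get("containers", []):
        for mount in container.get("volumeMounts", []):
            if mount["name"] not in declared:
                missing.append(mount["name"])
    return missing


# Pod spec as a plain dict; container name and image are placeholders.
pod_spec = {
    "containers": [
        {
            "name": "app",
            "image": "tomcat",
            "volumeMounts": [
                {"mountPath": "/mnt/files/diff", "name": "sample-files"},
            ],
        }
    ],
    "volumes": [
        {
            "name": "sample-files",
            "persistentVolumeClaim": {"claimName": "gluster-pvc"},
        }
    ],
}

# An empty list means every mount refers to a declared volume.
print(undeclared_mounts(pod_spec))
```

If the list is not empty, the names in it are the mounts that still need a matching volume.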

Hope this helps someone build a solution.
Good luck with your deployment!

[1] https://github.com/gluster/glusterfs
[2] https://github.com/gluster/gluster-kubernetes
[3] https://github.com/heketi/heketi
[4] https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html/container-native_storage_for_openshift_container_platform/chap-documentation-red_hat_gluster_storage_container_native_with_openshift_platform-heketi_cli
