The introduction of containers has revolutionized the software development industry and made building, testing, and deploying applications significantly easier. However, modern applications and infrastructure require an ever-growing number of containers and servers. These generate such a volume of logs and information that monitoring everything is a massive challenge. Deadlines, inexperience, culture, and management are just some of the obstacles that can affect how successful teams are at overcoming this challenge.
Monitoring an application’s current state is one of the most effective ways to anticipate problems and discover bottlenecks in a production environment. Yet, it is also currently one of the biggest challenges faced by almost all software development organizations.
The growing adoption of microservices makes logging and monitoring more complex, and Kubernetes monitoring is no exception. A large number of distributed and diverse applications communicate with each other, so a single point of failure can stop the entire process, and identifying it is becoming increasingly difficult.
Monitoring, of course, is just one challenge that microservices pose. The need to handle availability, performance, and deployments is pushing teams to create or use orchestrators that manage all services and servers. There are several cluster orchestration tools, but Kubernetes (K8s) is becoming increasingly popular compared to its competitors. A container orchestration tool such as Kubernetes manages containers across several machines and removes much of the complexity of handling distributed processing.
But how do you monitor such a tool? There are so many variables to keep track of that we need new tools and new methods to effectively capture the data that matters.
In this guide, we will discuss why monitoring K8s matters, which metrics can be used, and which approaches and tools are available for collecting them.
Why Monitor Kubernetes and What Metrics Can Be Measured
As mentioned, Kubernetes is currently the most popular container orchestrator available. It is officially available in major clouds, such as AWS, Azure, and Google Cloud, and it can also run in a local data center. Even Docker has embraced Kubernetes and now offers it as part of some of its packages.
When it comes to gathering information from your containers and monitoring your Kubernetes platform, there are several metrics to monitor. These can generally be separated into two main components:
- monitoring the cluster itself
- monitoring pods.
1. Monitoring Kubernetes Clusters
Cluster monitoring involves measuring the health of the entire Kubernetes cluster, which consists of the servers operating Kubernetes and the various pods being managed under the service.
We want to know whether all the nodes in the cluster are working properly and at what capacity, how many applications are running on each node, and the resource utilization of the entire cluster.
Node resource utilization
There are many metrics in this area, all focused on resource utilization: network bandwidth, disk utilization, CPU utilization, and memory utilization are typical examples. Using these metrics, you can decide whether to increase or decrease the number and size of nodes in the cluster.
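As a rough illustration of how such metrics can feed a sizing decision, the sketch below aggregates CPU and memory usage across nodes and suggests a direction. The NodeUsage shape, the sample numbers, and the 80% and 30% thresholds are all assumptions made for this example, not values Kubernetes prescribes.

```python
from dataclasses import dataclass


@dataclass
class NodeUsage:
    """Point-in-time usage for one node (hypothetical sample shape)."""
    name: str
    cpu_used_cores: float
    cpu_capacity_cores: float
    mem_used_gib: float
    mem_capacity_gib: float


def cluster_utilization(nodes):
    """Return (cpu_fraction, memory_fraction) used across all nodes."""
    cpu = sum(n.cpu_used_cores for n in nodes) / sum(n.cpu_capacity_cores for n in nodes)
    mem = sum(n.mem_used_gib for n in nodes) / sum(n.mem_capacity_gib for n in nodes)
    return cpu, mem


def scaling_hint(nodes, high=0.80, low=0.30):
    """Suggest a direction based on the busier of the two resources."""
    busiest = max(cluster_utilization(nodes))
    if busiest > high:
        return "scale up"
    if busiest < low:
        return "scale down"
    return "hold"


# Illustrative sample data, not taken from a real cluster.
nodes = [
    NodeUsage("node-a", 3.0, 4.0, 12.0, 16.0),
    NodeUsage("node-b", 3.5, 4.0, 14.0, 16.0),
]
print(scaling_hint(nodes))
```

In practice the thresholds would be tuned per workload, and network and disk figures could be folded in the same way.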
The number of nodes
The number of nodes available is an important metric to follow. This allows you to figure out what you are paying for (if you are using cloud providers) and discover what the cluster is being used for.
The number of pods running will show you whether the number of nodes available is sufficient and whether they will be able to handle the entire workload in case a node fails.
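The node-failure question can be sketched as a simple headroom check. The version below is deliberately simplified: it counts pods only, ignores CPU and memory requests, and assumes every node has the same pod capacity. The function name and the sample numbers are hypothetical.

```python
def survives_node_failure(pods_per_node, max_pods_per_node):
    """Check whether all running pods still fit if any single node is lost.

    pods_per_node: mapping of node name -> number of running pods.
    max_pods_per_node: assumed uniform pod capacity of each node.
    """
    if len(pods_per_node) < 2:
        return False  # no other node left to reschedule onto
    total_pods = sum(pods_per_node.values())
    remaining_capacity = (len(pods_per_node) - 1) * max_pods_per_node
    return total_pods <= remaining_capacity


# Illustrative sample data.
running = {"node-a": 40, "node-b": 35, "node-c": 30}
print(survives_node_failure(running, max_pods_per_node=60))
print(survives_node_failure(running, max_pods_per_node=50))
```

With a capacity of 60 pods per node the 105 running pods fit on the two surviving nodes; with a capacity of 50 they do not.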
2. Monitoring Kubernetes Pods
By contrast, pod monitoring focuses on the performance and operation of each individual pod, and the containers inside it, currently running under Kubernetes. Monitoring a pod can be further separated into three categories:
- Kubernetes metrics
- pod container metrics
- application metrics.
Using Kubernetes metrics, we can monitor how a specific pod and its deployment are being handled by the orchestrator. This includes measurements for the following areas:
- the number of instances a pod has at the moment and how many were expected (if the number is low, your cluster may be out of resources)
- how the in-progress deployment is going (how many instances have been changed from an older version to a new one)
- health checks
- network data available through network services.
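The first two measurements in this list can be summarized from three replica counts. The sketch below assumes you have already obtained the desired, updated, and available replica counts for a deployment from whatever source you use. The function name and the returned keys are made up for this example.

```python
def rollout_status(desired, updated, available):
    """Summarize a deployment from three replica counts.

    desired: replicas requested for the deployment
    updated: replicas already running the new version
    available: replicas currently passing their health checks
    """
    return {
        # Instances that were expected but are not serving yet.
        "missing": max(desired - available, 0),
        # Share of instances already moved to the new version.
        "rollout_progress": updated / desired if desired else 1.0,
        # Healthy only when everything is both updated and available.
        "healthy": available >= desired and updated >= desired,
    }


status = rollout_status(desired=10, updated=4, available=8)
print(status)
```

A persistently non-zero "missing" value is the kind of signal that may indicate the cluster is out of resources.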
Pod container metrics
Individual container metrics are available mostly through cAdvisor and exposed by Heapster, which queries every node about its running containers. The highlights here are metrics such as CPU, network, and memory usage compared with the maximum allowed.
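Comparing usage with the maximum allowed comes down to a ratio per container and resource. The following sketch flags containers that run close to their limit. The input shape, the container names, and the 90% threshold are assumptions for illustration, and a limit of None stands for a container with no limit configured.

```python
def containers_near_limit(samples, threshold=0.9):
    """Return (container, resource, ratio) for usage at or above threshold.

    samples: mapping of container name -> {resource: (used, limit)}.
    """
    flagged = []
    for name, resources in samples.items():
        for resource, (used, limit) in resources.items():
            if limit is None or limit <= 0:
                continue  # no limit configured, nothing to compare against
            ratio = used / limit
            if ratio >= threshold:
                flagged.append((name, resource, round(ratio, 2)))
    return flagged


# Illustrative sample data.
samples = {
    "web": {"cpu_cores": (0.48, 0.5), "memory_mib": (200, 512)},
    "worker": {"cpu_cores": (0.2, 1.0), "memory_mib": (980, 1024)},
    "sidecar": {"cpu_cores": (0.05, None), "memory_mib": (30, None)},
}
for name, resource, ratio in containers_near_limit(samples):
    print(f"{name}: {resource} at {ratio:.0%} of its limit")
```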
Finally, there are the application-specific metrics. These metrics are developed by the application teams themselves and are related to the business rules they address.
For example, database applications will want to expose metrics related to the state of their indices, along with statistics concerning tables and relationships.
E-commerce applications would highlight data such as the number of users online and how much money has been made in the last hour.
The metrics that can be exposed at this layer are numerous, and most development teams will want to gather as much information as possible about their applications without placing too much strain on overall pod performance.
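As one possible way to expose such values, the sketch below renders the e-commerce figures mentioned above as gauges in the plain-text exposition format that Prometheus scrapes. The metric names and values are hypothetical, and a real application would more likely use a metrics client library than build the text by hand.

```python
def render_metrics(metrics):
    """Render gauges in the Prometheus text exposition format.

    metrics: mapping of metric name -> (help text, current value).
    """
    lines = []
    for name, (help_text, value) in metrics.items():
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical business metrics for an e-commerce application.
app_metrics = {
    "shop_users_online": ("Users currently online", 42),
    "shop_revenue_last_hour_usd": ("Revenue over the last hour in USD", 1830.5),
}
print(render_metrics(app_metrics), end="")
```

Serving this text from an HTTP endpoint inside the pod keeps the cost of exposing application metrics low.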
Methods For Monitoring Kubernetes
We’ve spoken about the different metrics that can be gathered, but there are also different approaches to gathering these metrics. In this guide, we will focus on two approaches to collecting metrics from your cluster and exporting them to an external endpoint. There are many ways of doing this, but if you can grasp these methods, you should be able to gather as many metrics as you will likely need.
What is important, though, especially for big companies with many different containers and potential clusters, is that the metric collection should be handled consistently. Even if the system has nodes deployed in several places all over the world or in a hybrid cloud, it should handle the metrics collection in the same way and with the same reliability.
1. Monitoring Kubernetes with Kubernetes DaemonSets
The first approach to monitoring all cluster nodes is to use a DaemonSet, a special kind of Kubernetes workload that runs a copy of a pod on every node.
Kubernetes ensures that every node created has a copy of the DaemonSet pod, which effectively lets a single deployment watch each machine in the cluster. When a node is destroyed, its copy of the pod is terminated as well.
Many monitoring solutions make use of the Kubernetes DaemonSet structure to deploy an agent on every cluster node and gather data that they can then display.
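To make the structure concrete, the sketch below builds a minimal DaemonSet manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The agent name, namespace, and image are placeholders; a real monitoring agent would normally also need host mounts, resource limits, and permissions that are omitted here.

```python
import json


def agent_daemonset(name, image, namespace="monitoring"):
    """Build a minimal DaemonSet manifest for a per-node agent."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "namespace": namespace, "labels": labels},
        "spec": {
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Placeholder name and image, not a real agent.
manifest = agent_daemonset("node-agent", "example.com/monitoring-agent:1.0")
print(json.dumps(manifest, indent=2))
```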
2. Monitoring Kubernetes with Heapster
Heapster is a uniform platform adopted by Kubernetes to send monitoring metrics to an external system. It serves as a bridge between a cluster and a storage system designed to collect metrics (such as InfluxDB, DataDog, or Prometheus). Note that the Kubernetes project has since retired Heapster in favor of Metrics Server, so this approach applies mainly to older clusters.
Unlike a DaemonSet, Heapster runs as a normal pod and discovers every cluster node via the internal Kubernetes API.
Using the kubelet (the agent that runs on every node and communicates with the control plane) and cAdvisor, the open-source container monitoring agent integrated into the kubelet, Heapster gathers all relevant information about the cluster and its containers and pushes it to the configured storage solution.
A cluster can consist of thousands of nodes and an even greater number of pods, so it is practically impossible to observe each one individually on a regular basis. It is therefore important to attach multiple labels to each deployment and to align them with application or common system names, so that you can tell which metrics belong to which part of an application.
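The value of consistent labels shows up when metrics are aggregated. The sketch below groups per-pod samples by a chosen label. The label keys, application names, and CPU figures are invented for the example, and pods missing the label are collected under "unlabeled" so that gaps in labeling stay visible.

```python
from collections import defaultdict


def aggregate_by_label(samples, label):
    """Sum metric values per value of the given label.

    samples: iterable of (labels dict, metric value) pairs, one per pod.
    """
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels.get(label, "unlabeled")] += value
    return dict(totals)


# Illustrative per-pod CPU usage in cores.
pod_cpu = [
    ({"app": "checkout", "tier": "frontend"}, 0.25),
    ({"app": "checkout", "tier": "backend"}, 0.5),
    ({"app": "search", "tier": "backend"}, 0.75),
    ({"tier": "backend"}, 0.125),
]
print(aggregate_by_label(pod_cpu, "app"))
print(aggregate_by_label(pod_cpu, "tier"))
```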
Choosing Your Monitoring Solutions
Now that you know how to gather data from Kubernetes and your numerous pods, you will want to choose a monitoring solution that helps you make sense of all this data. It should provide the visualization and alerting your teams need to monitor the system and respond to events correctly.
We aren’t going to go into detail about these, as each company will have its own preferences for the tools it wants to use. Solutions such as Prometheus, Dynatrace, and the various cloud monitoring tools on the market are typically all very capable and customizable, and they let teams structure how they view the gathered data.
With so much importance placed on containers and so much data that can be gathered from Kubernetes clusters and their many pods, making sense of all the data is increasingly important. Thankfully, there are many ways of gathering data from Kubernetes and tools to support you in this aim.