Kubernetes is taking the cloud infrastructure world by storm because its distributed, clustered nature lets operators worry less about individual servers. It also puts more power into the hands of developers to deploy and manage their own workloads, which is key to releasing products more quickly. However, the distributed nature of the platform introduces new challenges, one of which is monitoring and metrics collection. When workloads are placed ad hoc on a pool of servers, and may be redistributed among them during deployments or failures, how can you reliably monitor the health and performance of your applications? In this article we take a look at some of the open source options that work well with Kubernetes.
Core Metrics API
Kubernetes 1.8 and above come with a core metrics API available in the platform. This API allows users or other controllers to view CPU and memory metrics for pods and containers in real time. The metrics API can be accessed from the command line with `kubectl top`, for example, or by querying your Kubernetes API endpoint directly. To make use of this API you will need to ensure that the Metrics Server is deployed on your cluster. The metrics API serves as a base that many of the solutions discussed below build on, and it does not persist metrics data for historical queries.
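To make the shape of the data more concrete, here is a minimal Python sketch that summarizes a pod metrics response of the kind served under `/apis/metrics.k8s.io/v1beta1`. The sample payload, the pod name `web-0`, and the helper names (`parse_cpu`, `parse_memory`, `summarize`) are assumptions made for illustration, and the suffix tables cover only the quantity suffixes commonly seen in metrics output, not the full Kubernetes quantity grammar.

```python
import json
from decimal import Decimal

# Multipliers for the CPU and memory suffixes handled by this sketch.
_CPU_SUFFIXES = {"n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3")}
_MEM_SUFFIXES = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as '250m' or '1500000n' to cores."""
    for suffix, factor in _CPU_SUFFIXES.items():
        if quantity.endswith(suffix):
            return float(Decimal(quantity[:-1]) * factor)
    return float(Decimal(quantity))


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as '64Mi' to bytes."""
    # Try two-character binary suffixes before one-character decimal ones.
    for suffix in sorted(_MEM_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * _MEM_SUFFIXES[suffix])
    return int(Decimal(quantity))


def summarize(pod_metrics_list: dict) -> dict:
    """Return {'namespace/pod': (cpu_cores, memory_bytes)}, summed over containers."""
    summary = {}
    for item in pod_metrics_list.get("items", []):
        meta = item["metadata"]
        cpu = sum(parse_cpu(c["usage"]["cpu"]) for c in item["containers"])
        mem = sum(parse_memory(c["usage"]["memory"]) for c in item["containers"])
        summary[f'{meta["namespace"]}/{meta["name"]}'] = (cpu, mem)
    return summary


# Hypothetical response body; a real one would come from the cluster's API server.
SAMPLE = json.loads("""
{
  "kind": "PodMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "items": [
    {
      "metadata": {"name": "web-0", "namespace": "default"},
      "containers": [
        {"name": "app", "usage": {"cpu": "250m", "memory": "64Mi"}},
        {"name": "sidecar", "usage": {"cpu": "1500000n", "memory": "2048Ki"}}
      ]
    }
  ]
}
""")

if __name__ == "__main__":
    for pod, (cpu, mem) in summarize(SAMPLE).items():
        print(f"{pod}: {cpu:.4f} cores, {mem} bytes")
```

Because the values are point-in-time readings, a script like this would have to run repeatedly and store its own results if you wanted any history.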
cAdvisor
cAdvisor is a metrics collection agent developed at Google that has native support for Docker containers but should also work well with other container runtimes. It gathers metrics about resource utilization, resource isolation, historical utilization, and more, at both the container level and the system level. It exposes a remote REST API endpoint for examining metrics as well as a built-in web UI for visualizing collected data. Many other metrics collection systems use cAdvisor as an underlying technology to gather metrics.
Heapster
Heapster is a performance monitoring and metrics collection system compatible with Kubernetes versions 1.0.6 and above. It collects not only performance metrics about your workloads, pods, and containers, but also events and other signals generated by your cluster. The great thing about Heapster is that it is fully open source as part of the Kubernetes project and supports a multitude of backends for persisting the data, including but not limited to InfluxDB, Elasticsearch, and Graphite.
Prometheus
Prometheus is an open source metrics collection system originally developed at SoundCloud and more recently inducted into the CNCF. Prometheus is powerful thanks to its data model, its rich set of client libraries, and its ability to create alerts based on metrics. Prometheus comes standard with its own dashboard, which is useful for running ad hoc queries or quick debugging, but the best experience comes from integrating it with a visualization tool such as Grafana. Support for bridging in data from third-party tools such as HAProxy and StatsD, or from system-level metrics, allows Prometheus to act as a centralized hub for all of your metrics data collection.
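As an illustration of the data model, the sketch below renders labeled samples in the Prometheus text exposition format, which is what a scrape target returns. It is a hand-rolled example using only the standard library; the function name `render_metric` and the sample metric are assumptions made for this article, help text is assumed to contain no newlines or backslashes, and a real service would normally rely on one of the official client libraries instead.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name: str, help_text: str, metric_type: str, samples) -> str:
    """Render one metric family.

    samples is a list of (labels, value) pairs, where labels is a dict
    of label name to label value (possibly empty).
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            # Sort labels so the output is stable between runs.
            pairs = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(
        render_metric(
            "http_requests_total",
            "Total HTTP requests.",
            "counter",
            [
                ({"method": "GET", "code": "200"}, 1027),
                ({"method": "POST", "code": "500"}, 3),
            ],
        ),
        end="",
    )
```

Each unique combination of metric name and labels forms its own time series, which is why labels such as `method` and `code` make it easy to slice the same counter in queries and alerts.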
InfluxData TICK Stack
InfluxData is a company that has developed tools specifically designed for metrics collection, aggregation, and visualization. Their product, known as the TICK Stack, is based on an open source core made up of four distinct projects: Telegraf, InfluxDB, Chronograf, and Kapacitor. These components are responsible, respectively, for collecting metrics and events from your cluster, storing them, visualizing them, and creating custom logic around alerting. As with Prometheus, alerts and visualizations are the core competency of this platform, and it handles both in a very performant way. The only downside is that in order to have high availability of the InfluxDB storage engine, users must pay for InfluxData Enterprise or for InfluxCloud, their hosted solution.
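To give a feel for what flows between the collection and storage components, here is a small Python sketch that formats a data point in InfluxDB's line protocol, the text format InfluxDB accepts for writes. The helper name `to_line` and the sample measurement, tags, and fields are hypothetical, and the escaping covers only the common cases (commas, spaces, equals signs, and quotes), so treat it as an outline rather than a complete encoder.

```python
def _escape_key(text: str) -> str:
    """Escape commas, equals signs, and spaces in tag keys, tag values, and field keys."""
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def _format_field(value) -> str:
    """Format a field value according to its Python type."""
    if isinstance(value, bool):  # bool is a subclass of int, so check it first
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry a trailing 'i'
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # strings are double-quoted


def to_line(measurement: str, tags: dict, fields: dict, timestamp_ns=None) -> str:
    """Build one line: measurement,tag=value field=value [timestamp]."""
    if not fields:
        raise ValueError("a point needs at least one field")
    head = measurement.replace(",", "\\,").replace(" ", "\\ ")
    for key in sorted(tags):  # sorted tag keys keep the output stable
        head += f",{_escape_key(key)}={_escape_key(tags[key])}"
    body = ",".join(
        f"{_escape_key(key)}={_format_field(value)}" for key, value in fields.items()
    )
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


if __name__ == "__main__":
    print(
        to_line(
            "cpu",
            {"host": "node-1", "pod": "web 0"},
            {"usage": 0.25, "cores": 2},
            1500000000000000000,
        )
    )
```

Tags are indexed strings used for filtering and grouping, while fields hold the measured values, so in this sketch the host and pod go in tags and the readings go in fields.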
There are many other tools out there for metrics collection that were not included in this list, many of which have much more advanced functionality but are paid products. Are there other open source tools that we missed in this overview? Let us know in the comments!