Containership 3.8 — Supporting Kubernetes 1.13 & GPU Clusters

GPU Support

As operations teams look to modernize their existing infrastructure, both on-premise and in the cloud, they must also ensure they are continuing to meet the needs of their internal constituents. Varying demands from stakeholders with differing compute, policy and compliance requirements have traditionally led to sprawl and “snowflake” infrastructure. Unsurprisingly, enterprises are turning to Kubernetes to provide a standardized environment upon which to build. The growing traction of a technology often leads to growing maturity, and subsequently growing adoption in the enterprise. As such, operations teams are faced with building and supporting environments running these technologies.

As the number of deep learning projects and frameworks continues to grow, operations teams are looking to offer these capabilities to their teams within a Kubernetes environment. Emerging fields such as machine learning rely heavily on graphics processing units (GPUs). While some organizations have invested in physical GPU clusters on-premise, others have chosen the cloud, given how rapidly GPUs have improved over the past few years. With increased demand for running this work in the cloud, Containership has worked with its customers to provide first-class support for GPU clusters within Containership Cloud Platform.

Starting today, users can launch NVIDIA GPU-based virtual machines on AWS, Google, Azure, and Packet. During provisioning, Containership installs the drivers, configures the Docker engine, and automatically launches the NVIDIA Kubernetes device plugin. Once a cluster is provisioned, users can request GPUs directly from their Kubernetes workload definitions.
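
To make "requesting GPUs from a workload definition" concrete, here is a minimal sketch in Python that builds a Pod manifest asking for NVIDIA GPUs. The `nvidia.com/gpu` resource name is the one advertised by the NVIDIA device plugin; the function name, pod name, and image are hypothetical placeholders, not anything shipped by Containership.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int) -> dict:
    """Build a Pod manifest that asks the scheduler for `gpus` NVIDIA GPUs.

    Hypothetical helper for illustration; not part of any Containership API.
    """
    if gpus < 1:
        raise ValueError("gpus must be a positive whole number")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "OnFailure",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # GPUs are requested under `limits` and only in whole units.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder name and image; substitute your own workload.
    manifest = gpu_pod_manifest("cuda-job", "example.com/team/cuda-job:1.0", 1)
    print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed manifest could be saved to a file and applied with `kubectl apply -f`.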

Check out some examples to get started!

Cerebral Updates

In Containership Cloud Platform 3.7 we announced initial support for node pool autoscaling through our open source Kubernetes cluster autoscaler, Cerebral. By leveraging Prometheus installed on the cluster, Containership Kubernetes Engine (CKE) node pools can be scaled automatically based on the actual CPU and memory utilization generated by pods. Now, with the latest release, node pools can also be scaled based on Kubernetes allocation metrics, which are gathered from the resource requests set on pods. Like the existing utilization-based metrics, Cerebral’s allocation-based metrics allow users to autoscale preemptively, which differs from other solutions. In many cases when a cluster is under load, waiting for a pod to become unschedulable before adding capacity is too late. By triggering scaling events at configurable thresholds, users can ensure instances are provisioned before scheduling issues arise.
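
The threshold idea described above can be sketched in a few lines of Python. This is an illustration of allocation-based scaling in general, not Cerebral's actual code or configuration format: the `Thresholds` type, the function names, and the 80%/30% values are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Thresholds:
    scale_up_at: float    # e.g. 0.80: add capacity once 80% is allocated
    scale_down_at: float  # e.g. 0.30: remove capacity at or below 30%


def allocation_ratio(pod_requests: Iterable[float],
                     node_allocatable: Iterable[float]) -> float:
    """Fraction of a node pool's capacity already claimed by pod requests."""
    total = sum(node_allocatable)
    if total <= 0:
        raise ValueError("node pool has no allocatable capacity")
    return sum(pod_requests) / total


def decide(ratio: float, thresholds: Thresholds) -> str:
    """Return 'scale_up', 'scale_down', or 'hold' for the given ratio."""
    if ratio >= thresholds.scale_up_at:
        return "scale_up"
    if ratio <= thresholds.scale_down_at:
        return "scale_down"
    return "hold"


if __name__ == "__main__":
    # Three nodes with 4 CPUs each; pods request 10 CPUs in total.
    ratio = allocation_ratio([2, 2, 3, 3], [4, 4, 4])
    print(decide(ratio, Thresholds(scale_up_at=0.80, scale_down_at=0.30)))
```

In this example every pod is still schedulable, yet the pool is already past the scale-up threshold, so capacity is requested before any pod gets stuck in a pending state.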

With the newly introduced GPU support in Containership, Cerebral now also natively supports GPU allocation metrics. Given the cost of GPU instances in the cloud, it is unsurprising that deep learning infrastructure can get expensive quickly. By leveraging Cerebral’s allocation-based autoscaling, users can decrease infrastructure costs with minimal overhead.

Since Cerebral uses pluggable engines, it even works outside of CKE clusters. For example, if you have a kops, EKS, or DigitalOcean Kubernetes Service cluster, you can take advantage of all of Cerebral’s features!

Certified and Secure

Security is our top priority. In recent weeks, major security vulnerabilities (CVE-2019-5736 and CVE-2019-1002100) have been identified in Kubernetes and related components. We have released patched Kubernetes versions to mitigate these issues and have notified all customers. Users can upgrade their clusters through the normal one-click upgrade process.

We have also recently passed conformance testing for Kubernetes v1.13.4, which includes security fixes out-of-the-box. You can either upgrade or launch a new cluster today on the latest minor release of Kubernetes.

As always, Containership remains committed to supporting the latest three minor versions of Kubernetes through a no-hassle upgrade process. Just because Kubernetes moves fast doesn’t mean you should be left behind.

Registry Support

First-class support for JFrog Artifactory’s Docker registry has landed in Containership Cloud Platform 3.8! Many enterprises that already use Artifactory as a general artifact repository also use it for Docker image storage. By configuring the server address and authentication credentials, users can connect Artifactory registries to Containership Cloud and run workloads whose containers reference images stored there. With this addition, Containership Cloud Platform now natively supports the six most popular hosted Docker registries, as well as private Docker registries.
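
For readers curious what a registry connection amounts to on the Kubernetes side, the sketch below builds a standard image pull secret of type `kubernetes.io/dockerconfigjson` from a server address and credentials. It shows the generic Kubernetes mechanism only; how Containership stores and applies registry credentials internally is not described here, and the function name, secret name, server address, and credentials are hypothetical.

```python
import base64
import json


def registry_pull_secret(secret_name: str, server: str,
                         username: str, password: str) -> dict:
    """Build a Kubernetes image pull Secret for a private Docker registry.

    Hypothetical helper for illustration; not part of any Containership API.
    """
    # Docker's config format stores "user:password" base64-encoded as `auth`.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    docker_config = {
        "auths": {
            server: {"username": username, "password": password, "auth": token}
        }
    }
    # Values under a Secret's `data` field must themselves be base64-encoded.
    encoded = base64.b64encode(json.dumps(docker_config).encode()).decode()
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": secret_name},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {".dockerconfigjson": encoded},
    }


if __name__ == "__main__":
    # Placeholder values; never commit real credentials.
    secret = registry_pull_secret(
        "artifactory-pull", "registry.example.com", "ci-bot", "change-me"
    )
    print(json.dumps(secret, indent=2))
```

A pod would then list the secret's name under `imagePullSecrets` to pull images from that registry.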

AWS CSI Integration

Containership has always supported Container Storage Interface (CSI) drivers as plugins, which make your chosen provider’s storage offering easy to use through Kubernetes. That ease of use is now available for AWS as a Containership plugin built on version 0.2.0 of the AWS CSI driver, which provisions Kubernetes volumes backed by Elastic Block Store (EBS). The plugin can be added at cluster creation or later from your cluster’s plugin page.
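
As a rough illustration of how a workload might consume an EBS-backed volume once a CSI driver is installed, the Python sketch below builds a StorageClass and a PersistentVolumeClaim. Several things here are assumptions rather than facts from this release: the provisioner string `ebs.csi.aws.com` should be checked against the driver actually installed on your cluster, the `type` parameter and `gp2` value follow the driver's documented conventions as best we understand them, and all object names are placeholders.

```python
import json


def ebs_storage_class(name: str, provisioner: str,
                      volume_type: str = "gp2") -> dict:
    """StorageClass that delegates provisioning to a CSI driver.

    `provisioner` must match the name the installed driver registers under.
    """
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        "parameters": {"type": volume_type},  # assumed driver parameter
        # Delay provisioning until a pod is scheduled, so the volume is
        # created in the same availability zone as the node.
        "volumeBindingMode": "WaitForFirstConsumer",
    }


def volume_claim(name: str, storage_class: str, size: str) -> dict:
    """PersistentVolumeClaim requesting `size` from the given StorageClass."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Assumed provisioner name; verify it before use.
    sc = ebs_storage_class("ebs-gp2", "ebs.csi.aws.com")
    pvc = volume_claim("training-data", sc["metadata"]["name"], "10Gi")
    print(json.dumps([sc, pvc], indent=2))
```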

Interested in learning more about our GPU clusters or have questions?
