Kubernetes: A comparison of managed engines

This document contains an extensive, though not exhaustive, comparison of the four most prominent managed Kubernetes offerings: Oracle Container Engine for Kubernetes (OKE), Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), and Google Kubernetes Engine (GKE). It was developed with input from the community and will continue to be revised as the technology changes.

First published 7th January, 2023, this document will be updated regularly to ensure consistent and relevant information is made available. Should you have any questions or wish to contribute feedback, you may reach us any time on Slack!

General Information

  OKE EKS AKS GKE

Currently supported Kubernetes version(s)

1.24.1, 1.23.4, 1.22.5
(source)

1.24, 1.23.7, 1.22.10, 1.21.13
(source)

1.24.6, 1.23.12, 1.22.15
Current minor version and previous 2 minor versions (source)

1.24.1, 1.23.4, 1.22.5, 1.21.5
(rolling support for the most current versions of K8s; source)

# of supported minor version releases

3

>=3 + 1 deprecated

3

4

Original GA release date

May 2018

June 2018

June 2018

August 2015

Pricing

All management costs are free

$0.10/hour (USD) per cluster + standard costs of EC2 instances and other resources

Pay-as-you-go: Standard costs of node VMs and other resources

$0.10/hour (USD) per cluster + standard costs of GCE machines and other resources

CNCF Kubernetes Conformance

Yes (also, here)

Yes

Yes

Yes

CLI support

Full support of Kubernetes clusters; Oracle Cloud Shell; kubectl support

Full support of Kubernetes clusters; kubectl support

Full support of Kubernetes clusters; kubectl support

Full support of Kubernetes clusters; kubectl support

Control-plane upgrade process

User initiated
All system components update with cluster upgrade

User initiated
User must also manually update the system services that run on nodes (e.g., kube-proxy, coredns, AWS VPC CNI)

User initiated
All system components update with cluster upgrade

Automatically upgraded by default; can be user-initiated

Node upgrade process

User initiated
Update the node pool configuration, scale out to add nodes with the new version, then remove the old nodes, which are cordoned and drained. Alternatively, create a new node pool and divert traffic to it (blue/green).

Automatically upgraded or user initiated; AKS drains and replaces nodes

Automatically upgraded during cluster maintenance window (default; can be turned off); can be user initiated; drains and replaces nodes

Node OS

Supported Images for worker nodes (source)
*new images added regularly

Linux:

Windows:

Linux:

Windows:

Linux:

  • Container-Optimized OS (COS) (default), Ubuntu

Windows:

Container runtime

  • Docker (< 1.20.7)
  • CRI-O (>= 1.20.8)

Control plane high availability options

Control plane is deployed to multiple, Oracle-managed control plane nodes which are distributed across different availability domains (where supported) or different fault domains.

Control plane is deployed across multiple Availability Zones (default)

Control plane components will be spread between the number of zones defined by the Admin

Zonal Clusters:
Single Control Plane

Regional Clusters:
Quorum of three Kubernetes control planes

Control plane SLA

99.95% SLO

99.95% (default)

Uptime SLA

Compute SLA

  • 99.99% for regions with multiple ADs
  • 99.95% for regions with one AD
  • 99.9% for single instance

Guarantees 99.95% uptime

Offers 99.95% when availability zones are enabled, and 99.9% when disabled

GKE distinguishes between cluster types, offering 99.5% uptime for zonal clusters and 99.95% for regional clusters.

SLA financially-backed

Zero cost – not applicable

Yes

Yes

Yes

GPU support

Yes (NVIDIA); selecting a compatible Oracle Linux GPU image provides pre-installed CUDA libraries, so CUDA libraries for different GPUs do not have to be included in the application container.

Yes (NVIDIA); user must install device plugin in cluster

Yes (NVIDIA); user must install device plugin in cluster

Yes (NVIDIA); user must install device plugin in cluster

Compute Engine A2 VMs are also available

Container performance metrics

  • Optional
  • Default: Off
  • Metrics are sent to Stackdriver

Monitoring

CloudWatch Container Insights
Requires additional setup, metric selection, etc.

Also supported:

  • CloudWatch Agent
  • Fluent Bit
  • Fluentd
  • Prometheus
  • Other 3rd party tools

Azure Monitor

Also supported:

  • Fluentd
  • Prometheus
  • Other 3rd party tools

Kubernetes Engine Monitor
Google Cloud’s operations suite (formerly Stackdriver)

Also supported:

  • Fluentd
  • Prometheus
  • Other 3rd party tools

Node health monitoring

Self-healing – automatically provisions new worker nodes on failure to maintain cluster availability

Detect and repair capabilities also exist within autoscaling functionality.

Container Insights metrics detect failed nodes. Can trigger replace or allow autoscaling to replace node.

Auto repair is now available. Node status monitoring is available. Use autoscaling rules to shift workloads.

Worker node auto-repair enabled by default

Autoscaling

“CA should handle up to 1000 nodes running 30 pods each. Our testing procedure is described here.” (source)

Cluster Autoscaler through:

  • K8s CA
  • EC2 Auto Scaling groups

Cluster autoscaler native capabilities

Cluster autoscaler

Serverless computing

Virtual Nodes coming soon: will deliver a complete, serverless Kubernetes experience.

Integrated with Fargate; customers can deploy pods as container instances rather than full VMs. Requires the use of an Amazon Application Load Balancer.

Virtual nodes make serverless computing possible in AKS. They do not run separately from the existing Kubernetes workloads; a customer uses virtual nodes by assigning particular workloads to them.

Cloud Run for Anthos

Accessibility

OCI has multiple realms: one commercial realm (with 32 regions) and multiple realms for Government Cloud: US Government Cloud FedRAMP authorized and IL5 authorized, and United Kingdom Government Cloud

30 regions containing 96 Availability Zones. Service availability may vary by region.
EKS is available on AWS GovCloud regions. AWS Fargate is NOT available in GovCloud regions.

Available in 57 of Azure’s 60 regions. Not all regions include availability zones
Available in GovCloud

Available in 35 regions
No GovCloud support

Bare metal worker nodes

Supports

Supports

Does not support

Does not support

Worker node types

Flex shapes, x86, ARM, HPC, GPU, clusters with mixed node types. (source)

x86, ARM (Graviton), GPU

Specific node images required for various CPU / GPU combinations. (source)

x86, ARM, GPU
Minimum 2 vCPU per worker node. Provisioned node size cannot be changed without replacement.

x86, ARM (v1.24 or later, only), GPU

Tools for developers

Oracle provided tools:

Oracle customers can take full advantage of the K8s ecosystem: Loft, Okteto, Shipa.io, Telepresence

AWS Toolkit for VS Code supports ECR and ECS, but not EKS

AWS CloudShell
AWS CloudFormation
AWS SDKs and CLI

Full support for entire K8s ecosystem.

  • Kubernetes extension in VS Code.
  • Bridge to Kubernetes, which allows local code to run as a service in a cluster. It also replicates dependencies in a local environment.

Google offers either Cloud Code or the VS Code extension to deploy, monitor, and control clusters directly in the IDE. Integrates with Cloud Run and Cloud Run for Anthos.
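The Pricing and uptime SLA rows above are easier to compare as monthly figures. The following is a minimal sketch, not an official calculator: it assumes an average month of 730 hours and a 30-day window for downtime, and the function and constant names are illustrative only. It covers only the per-cluster management fee stated in the Pricing row; node and other resource costs are excluded.

```python
# Assumption: an "average" month of 730 hours (8,760 hours per year / 12).
HOURS_PER_MONTH = 730

# Hourly management fee per cluster as listed in the Pricing row (USD).
# AKS is omitted because the row lists only node VM and resource costs for it.
MANAGEMENT_FEE_PER_HOUR = {"OKE": 0.00, "EKS": 0.10, "GKE": 0.10}


def monthly_management_fee(provider: str, clusters: int = 1) -> float:
    """Return the monthly control-plane fee in USD, excluding node costs."""
    hourly = MANAGEMENT_FEE_PER_HOUR[provider]
    return round(hourly * HOURS_PER_MONTH * clusters, 2)


def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Return the downtime (in minutes) an uptime percentage permits over `days`."""
    total_minutes = days * 24 * 60
    return round(total_minutes * (100 - sla_percent) / 100, 1)


if __name__ == "__main__":
    for provider in MANAGEMENT_FEE_PER_HOUR:
        print(provider, monthly_management_fee(provider, clusters=3))
    for sla in (99.5, 99.9, 99.95):
        print(sla, allowed_downtime_minutes(sla))
```

Under these assumptions, a $0.10/hour fee works out to about $73 per cluster per month, and a 99.95% commitment leaves roughly 21.6 minutes of downtime over 30 days, compared with 216 minutes at 99.5%.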

Service Limits

Quick reference

Service/Provider OKE EKS AKS GKE

Max clusters

15 clusters/region (Monthly Universal Credits) or 1 cluster/region (Pay-as-You-Go or Promo) by default.

100/region

5000 per subscription

100/zone + 100 regional clusters

Max nodes per cluster

Each cluster you create can have a maximum of 1000 nodes

30 (Managed node groups) * 100 (Max nodes per group) = 3000

Max nodes per node pool/group

1000

Managed node groups: 100

1000

1000

Max node pools/groups per cluster

No limit on number of node pools as long as total nodes per cluster does not exceed 1,000

Managed node groups: 30

100

Not documented

Max pods per node

110 pods/node

Linux:

Windows:

  • # of IPs per ENI - 1

110 (default)
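The limits above combine by simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the defaults are the figures quoted in this section, and the ENI-based formula is assumed to be the Windows entry in the Max pods per node row. The value passed for IPs per ENI is a made-up input, not a published instance limit.

```python
from typing import Iterable


def max_managed_nodes(node_groups: int = 30, nodes_per_group: int = 100) -> int:
    """Managed node groups * max nodes per group, as in the Max nodes per cluster row."""
    return node_groups * nodes_per_group


def windows_max_pods(ips_per_eni: int) -> int:
    """Apply the '# of IPs per ENI - 1' formula from the Max pods per node row."""
    if ips_per_eni < 1:
        raise ValueError("ips_per_eni must be at least 1")
    return ips_per_eni - 1


def fits_cluster_limit(node_pool_sizes: Iterable[int], max_nodes: int = 1000) -> bool:
    """Check that the node pools together stay within a per-cluster node limit."""
    return sum(node_pool_sizes) <= max_nodes


if __name__ == "__main__":
    print(max_managed_nodes())                  # 30 groups * 100 nodes
    print(windows_max_pods(ips_per_eni=10))     # hypothetical instance type
    print(fits_cluster_limit([400, 400, 200]))  # three pools against the 1000-node limit
```

The last helper reflects the statement that the number of node pools is unlimited as long as the total node count per cluster does not exceed 1,000.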

Networking and Security

Quick reference

Service/Provider OKE EKS AKS GKE

Network plugin/CNI

Amazon VPC Container Network Interface (CNI)

Azure CNI or kubenet

Kubernetes RBAC

Not assigned by default
Mutable after cluster creation
(see notes 1, 2, and 3 under RBAC)

Required
Immutable after cluster creation

Enabled by default
Immutable after cluster creation

Enabled by default
Mutable after cluster creation

Kubernetes Network Policy

  • Not enabled by default
  • Calico can be manually installed at any time – can be installed alongside the CNI plugin (source)

PodSecurityPolicy support (PSP)

PSP can be installed at any time (source)

PSP controller installed in all clusters with permissive default policy (v1.13+)

PSP can be installed at any time. Scheduled for deprecation on May 31st, 2021, in favor of Azure Policy

PSP can be installed at any time. Currently in Beta

Private or public IP address for cluster Kubernetes API

(see note 1 under Private or public IP Address for Clusters)

Private or Public IP addresses for nodes

  • Worker nodes can be public or private, depending on VCN Subnet configuration

Pod-to-pod traffic encryption supported by provider

Yes, with AWS App Mesh

Open Service Mesh as an add-on, by default

Yes, with Anthos Service Mesh

Firewall for cluster Kubernetes API

CIDR allow list option

CIDR allow list option

CIDR allow list option

Read-only root filesystem on node

Pod security policy required

Pod security policy required

Azure Policy required
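The "CIDR allow list option" in the Firewall row means that the Kubernetes API endpoint accepts connections only from listed address ranges. The following is a minimal sketch of that matching logic using Python's standard ipaddress module. It illustrates the concept only and is not any provider's implementation; the function name is hypothetical and the addresses come from the reserved documentation ranges.

```python
import ipaddress
from typing import Iterable


def is_allowed(client_ip: str, allow_list: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any CIDR block in allow_list."""
    addr = ipaddress.ip_address(client_ip)
    # strict=False tolerates entries such as "203.0.113.5/24" with host bits set.
    return any(
        addr in ipaddress.ip_network(cidr, strict=False) for cidr in allow_list
    )


if __name__ == "__main__":
    # Hypothetical allow list using documentation-only address ranges.
    allow_list = ["203.0.113.0/24", "198.51.100.16/28"]
    print(is_allowed("203.0.113.7", allow_list))
    print(is_allowed("192.0.2.10", allow_list))
```

An empty allow list in this sketch denies every address; how each provider treats an empty or unset list is not covered here.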

Container Image Services

Quick Reference

  OKE EKS AKS GKE

Image repository service

OCI Container Registry

ECR (Elastic Container Registry)

ACR (Azure Container Registry)

AR (Artifact Registry)

Supported formats

Access security

Supports image signing

Yes

No

Yes

Yes, with Binary Authorization and Voucher

Supports immutable image tags

Yes, and supports:

  • Identifying them with secure hashes
  • Adding versions
  • Controlling visibility and permissions

Yes

Yes, and it supports the locking of images and repositories

No

Image scanning service

Yes, Oracle Vulnerability Scanning Service
(see note 1 under Image scanning service)

Yes, free service: OS packages only

Yes, paid service: Uses the Qualys scanner in a sandbox to check for vulnerabilities

Yes, paid service: OS packages only

Registry SLA

None

99.9%; financially backed

99.9%; financially backed

None

Geo-Redundancy

No

Yes, configurable

Yes, configurable as part of the premium service

Yes, by default

Notes

General Information

Node OS

    • (note 1) some, but not all, of the latest Oracle Linux images provided by Oracle Cloud Infrastructure

Note: “Docker is not included in Oracle Linux 8 images. Instead, in node pools running Kubernetes 1.20.x and later, Container Engine for Kubernetes installs and uses the CRI-O container runtime and the crictl CLI (for more information, see Notes about Container Engine for Kubernetes Support for Kubernetes Version 1.20).”

    • (note 2) OKE images are provided by Oracle and built on top of platform images. OKE images are optimized for use as worker node base images, with all the necessary configurations and required software
    • (note 3) Custom images are provided by you and can be based on both supported platform images and OKE images. Custom images contain Oracle Linux operating systems, along with other customizations, configuration, and software that were present when you created the image.

Service Limits

Networking and Security

RBAC

  • (note 1) “By default, users are not assigned any Kubernetes RBAC roles (or clusterroles). Before attempting to create a new role (or clusterrole), you must be assigned an appropriately privileged role (or clusterrole). A number of such roles and clusterroles are always created by default, including the cluster-admin clusterrole (for a full list, see Default Roles and Role Bindings in the Kubernetes documentation). The cluster-admin clusterrole essentially confers super-user privileges. A user granted the cluster-admin clusterrole can perform any operation across all namespaces in a given cluster.” (source)
  • (note 2) “For most operations on Kubernetes clusters created and managed by Container Engine for Kubernetes, Oracle Cloud Infrastructure Identity and Access Management (IAM) provides access control.” (source)
  • (note 3) “In addition to IAM, the Kubernetes RBAC Authorizer can enforce additional fine-grained access control for users on specific clusters via Kubernetes RBAC roles and clusterroles.” (source)

Private or public IP Address for Clusters

  • (note 1) “You access the Kubernetes API on the cluster control plane through an endpoint hosted in a subnet of your VCN. This Kubernetes API endpoint subnet can be a private or public subnet. If you specify a public subnet for the Kubernetes API endpoint, you can optionally assign a public IP address to the Kubernetes API endpoint (in addition to the private IP address). You control access to the Kubernetes API endpoint subnet using security rules defined for security lists or network security groups.”

Container Image Services

Image scanning service

  • (note 1) “Create and manage container image targets and to assign them to container image scan recipes. A container image target is a collection of repositories in Container Registry that you want scanned for security vulnerabilities.”

References

  • (StackRox) EKS vs GKE vs AKS - Evaluating Kubernetes in the Cloud
  • (itoutposts) Kubernetes Engines Compared: Full Guide
  • (veritis) EKS Vs. AKS Vs. GKE: Which is the right Kubernetes platform for you?
  • (kloia) Comparison of Kubernetes Engines

Contributors: Neil Schnepf, Jeevan Joseph, Eli Schilling, Manish Kapur