10/31/2024 | Press release | Distributed by Public on 10/31/2024 07:24
In this blog post, we build on the concepts from Climbing The Ladder | Kubernetes Privilege Escalation (Part 1), which examined privilege escalation in Kubernetes environments and the danger of system pods, and take a deep dive into a concrete use case.
Part 2 of this series explores how a chain of misconfigurations in Google's GKE System Pods constitutes a vulnerability (GCP-2023-047) and how an attacker could chain them together to escalate privileges, compromise critical resources, become cluster admin, and take control of an entire Kubernetes cluster.
The use case of GCP-2023-047, found by the author of this blog, highlights a vulnerability within Google Kubernetes Engine (GKE) that stems from a chain of default misconfigurations in System Pods, each described in the sections that follow.
The security bulletin explains that, "An attacker who has compromised the Fluent Bit logging container could combine that access with high privileges required by Cloud Service Mesh (on clusters that have enabled it) to escalate privileges in the cluster." Individually, these default misconfigurations might seem minor, but combined, they offer attackers a path to escalate privileges and gain control over the entire Kubernetes cluster.
As illustrated in Figure 1, logging system pods like FluentBit are designed to collect and aggregate data from various pods across the Kubernetes cluster. This necessitates some level of access to each pod. FluentBit, the default logging agent in GKE, is deployed as a DaemonSet across all nodes, running with a configuration that inadvertently exposes sensitive pod tokens.
By mounting the /var/lib/kubelet/pods volume, it gains access to the kube-api-access directory, which contains service account tokens crucial for Kubernetes API interactions. Although FluentBit does not need direct API access, this setup exposes the cluster to significant risk.
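One way to check a workload for this kind of exposure is to inspect its hostPath volumes. The sketch below is illustrative only: `EXAMPLE_DAEMONSET` is a hypothetical, trimmed manifest (not GKE's actual FluentBit DaemonSet), and the function names are invented for this example.

```python
from pathlib import PurePosixPath

# Host directory where the kubelet keeps per-pod volumes, including the
# kube-api-access directories that hold service account tokens.
KUBELET_PODS_DIR = PurePosixPath("/var/lib/kubelet/pods")


def exposes_kubelet_pods(host_path: str) -> bool:
    """Return True if mounting host_path would expose KUBELET_PODS_DIR."""
    path = PurePosixPath(host_path)
    return (
        path == KUBELET_PODS_DIR                # the directory itself
        or path in KUBELET_PODS_DIR.parents     # an ancestor such as /var/lib
        or KUBELET_PODS_DIR in path.parents     # a subdirectory of it
    )


def risky_host_mounts(workload: dict) -> list[str]:
    """Return names of hostPath volumes in a workload that expose pod tokens."""
    pod_spec = workload.get("spec", {}).get("template", {}).get("spec", {})
    risky = []
    for volume in pod_spec.get("volumes") or []:
        host_path = (volume.get("hostPath") or {}).get("path")
        if host_path and exposes_kubelet_pods(host_path):
            risky.append(volume["name"])
    return risky


# Hypothetical manifest fragment, already parsed into a dict (assumption).
EXAMPLE_DAEMONSET = {
    "kind": "DaemonSet",
    "spec": {
        "template": {
            "spec": {
                "volumes": [
                    {"name": "varlog", "hostPath": {"path": "/var/log"}},
                    {
                        "name": "kubelet-pods",
                        "hostPath": {"path": "/var/lib/kubelet/pods"},
                    },
                ]
            }
        }
    },
}

flagged = risky_host_mounts(EXAMPLE_DAEMONSET)
```

The check also treats ancestors of the directory as risky, because a mount of `/var/lib` or `/` exposes the same token files.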
Figure 1. FluentBit DaemonSet access to the kube-api-access directory
Anthos Service Mesh (ASM), Google's managed implementation of Istio, manages inter-service communications within GKE. Its CNI DaemonSet, named istio-cni-node, is responsible for installing and configuring the Istio CNI plugin and is initially granted elevated permissions, including specific RBAC privileges. Post-installation, these heightened permissions persist unnecessarily.
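To see which roles remain attached to a DaemonSet's service account after installation, one can walk the cluster's role bindings. The following is a minimal sketch, assuming the bindings have already been exported and parsed into dicts; the service account, binding, and role names are placeholders, not the actual ASM objects.

```python
def roles_bound_to(service_account: str, namespace: str, bindings: list[dict]) -> list[str]:
    """Return the names of roles that the bindings grant to a service account."""
    bound = []
    for binding in bindings:
        for subject in binding.get("subjects") or []:
            if (
                subject.get("kind") == "ServiceAccount"
                and subject.get("name") == service_account
                and subject.get("namespace") == namespace
            ):
                bound.append(binding["roleRef"]["name"])
                break  # one match per binding is enough
    return bound


# Hypothetical ClusterRoleBinding objects (all names are placeholders).
EXAMPLE_BINDINGS = [
    {
        "kind": "ClusterRoleBinding",
        "metadata": {"name": "example-cni-binding"},
        "roleRef": {"kind": "ClusterRole", "name": "example-cni-role"},
        "subjects": [
            {"kind": "ServiceAccount", "name": "example-cni", "namespace": "kube-system"}
        ],
    },
    {
        "kind": "ClusterRoleBinding",
        "metadata": {"name": "unrelated-binding"},
        "roleRef": {"kind": "ClusterRole", "name": "view"},
        "subjects": [
            {"kind": "ServiceAccount", "name": "other", "namespace": "default"}
        ],
    },
]

cni_roles = roles_bound_to("example-cni", "kube-system", EXAMPLE_BINDINGS)
```

Any role this returns for a component that has finished its installation work is a candidate for review.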
Figure 2. Istio-CNI DaemonSet retains its high privileges
Within the kube-system namespace, GKE houses several preinstalled service accounts endowed with significant privileges. Notably, the clusterrole-aggregation-controller service account can modify cluster roles. An attacker who gains access to this account could adjust its associated roles, escalating their privileges to cluster-admin level.
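Kubernetes RBAC normally rejects a role update that grants permissions the caller does not already hold, unless the caller has the `escalate` verb on that resource. A rough way to spot roles that can rewrite cluster roles is sketched below; the rule sets are hypothetical examples rather than the exact rules GKE ships, and the helper names are invented here.

```python
RBAC_GROUP = "rbac.authorization.k8s.io"


def _covers(values: list[str], wanted: str) -> bool:
    """True if an RBAC list names the wanted value or uses the '*' wildcard."""
    return "*" in values or wanted in values


def can_rewrite_cluster_roles(rules: list[dict]) -> bool:
    """Return True if the rules allow editing ClusterRoles past the caller's own permissions."""
    verbs: set[str] = set()
    for rule in rules:
        groups = rule.get("apiGroups") or []
        resources = rule.get("resources") or []
        if _covers(groups, RBAC_GROUP) and _covers(resources, "clusterroles"):
            verbs.update(rule.get("verbs") or [])
    if "*" in verbs:
        return True
    can_write = bool(verbs & {"update", "patch"})
    # 'escalate' lifts the check that blocks granting permissions you lack.
    return can_write and "escalate" in verbs


# Hypothetical rule sets for illustration (assumptions, not GKE's real roles).
AGGREGATOR_LIKE_RULES = [
    {
        "apiGroups": [RBAC_GROUP],
        "resources": ["clusterroles"],
        "verbs": ["escalate", "get", "list", "patch", "update", "watch"],
    }
]
READ_ONLY_RULES = [
    {"apiGroups": [RBAC_GROUP], "resources": ["clusterroles"], "verbs": ["get", "list"]}
]

dangerous = can_rewrite_cluster_roles(AGGREGATOR_LIKE_RULES)
harmless = can_rewrite_cluster_roles(READ_ONLY_RULES)
```

The sketch pools verbs across all matching rules, since write access and `escalate` grant the same capability whether they appear in one rule or in two.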
Figure 3. Attacker pod YAML file
The scenario below outlines a multi-step attack on a Kubernetes cluster, starting with the attacker compromising a FluentBit pod and ending with escalation to cluster-admin permissions and full control of the cluster.
It is crucial to move beyond theoretical examinations of techniques and understand how an attack might actually unfold. In this blog, we have explored GCP-2023-047 and how a combination of GCP misconfigurations and excessive privileges could be used to perform sophisticated privilege escalation and result in control of entire clusters.
Attacks like these underscore the need for both proactive and reactive security controls for Kubernetes environments. SentinelOne Cloud Security helps organizations secure their containerized applications by providing the full range of security controls needed, including Container and Kubernetes Security and Container Runtime Security.
Lateral traversal (aka lateral movement) is a tactic used by threat actors to move from one system or environment to another. It often goes unnoticed as the activities blend in with normal operations, making it a critical activity to identify and prevent sophisticated cyberattacks.
In our next post within this blog series, we dive into how threat actors move beyond their initial landing into an environment and traverse laterally toward high-value resources using a real-world AWS Lambda example.