The Kubernetes default configuration is built for ease of getting started, not production security. After auditing a dozen clusters across finance, telecom, and SaaS companies, I found the same gaps in almost every one. This post covers the controls with the highest impact-to-effort ratio: things you can implement this week.
The Threat Model
Before hardening anything, be clear about what you’re defending against:
- Compromised workload — a container is breached via a vulnerability in the app or its dependencies. Can it reach the API server? Can it move laterally to other pods?
- Misconfigured workload — a developer accidentally ships a privileged container or mounts a sensitive host path. What’s the blast radius?
- Supply chain compromise — a malicious image is pushed to your registry. Does it run?
- Insider threat / stolen credentials — a leaked service account token. What can it access?
Most hardening controls map to limiting the blast radius of one of these scenarios.
1. Network Policies: Deny by Default
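This is the single most impactful change you can make. By default, every pod in a Kubernetes cluster can talk to every other pod, so an application compromised in namespace payments can reach user-service in namespace platform. Note that NetworkPolicy objects are only enforced if your CNI plugin supports them (Calico and Cilium do, for example).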

    # Default deny-all ingress/egress for every namespace
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: default-deny-all
      namespace: payments
    spec:
      podSelector: {}
      policyTypes:
        - Ingress
        - Egress

Then explicitly allow only what’s needed:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-payment-service
      namespace: payments
    spec:
      podSelector:
        matchLabels:
          app: payment-api
      policyTypes:
        - Ingress
        - Egress
      ingress:
        # Only the ingress controller may reach the API port
        - from:
            - namespaceSelector:
                matchLabels:
                  kubernetes.io/metadata.name: ingress-nginx
          ports:
            - protocol: TCP
              port: 8080
      egress:
        # Database traffic
        - to:
            - namespaceSelector:
                matchLabels:
                  kubernetes.io/metadata.name: postgres
          ports:
            - protocol: TCP
              port: 5432
        # Allow DNS to any destination (no "to" clause); DNS falls back to TCP for large responses
        - ports:
            - protocol: UDP
              port: 53
            - protocol: TCP
              port: 53

2. RBAC: Principle of Least Privilege
The most common RBAC mistake I see is cluster-admin bound to service accounts that don’t need it. The second most common is edit or view at the cluster level when namespace-level would suffice.
Start with a concrete audit:

    # Find all cluster-admin bindings
    kubectl get clusterrolebindings \
      -o jsonpath='{range .items[?(@.roleRef.name=="cluster-admin")]}{.metadata.name}{"\t"}{range .subjects[*]}{.kind}/{.name}{"\t"}{end}{"\n"}{end}'

    # List everything a given service account is allowed to do; look for wildcards (*)
    kubectl auth can-i --list --as=system:serviceaccount:default:my-sa

For workloads, use dedicated service accounts with minimal permissions:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: payment-api
      namespace: payments
    automountServiceAccountToken: false  # disable unless the pod needs API access
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: payment-api-role
      namespace: payments
    rules:
      - apiGroups: [""]
        resources: ["secrets"]
        resourceNames: ["payment-api-tls", "stripe-credentials"]
        verbs: ["get"]

automountServiceAccountToken: false is underused. Most application pods never need to talk to the Kubernetes API. Disabling the auto-mount removes a token that an attacker inside the container could use for lateral movement.
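A Role grants nothing until it is bound to a subject. Here is a minimal sketch of the binding that would connect the payment-api service account to payment-api-role from the manifests above. The binding name payment-api-binding is a placeholder of mine, not something from the audited clusters.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: payment-api-binding   # hypothetical name
  namespace: payments
subjects:
  # The dedicated service account defined above
  - kind: ServiceAccount
    name: payment-api
    namespace: payments
roleRef:
  # Namespace-scoped Role, not a ClusterRole
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: payment-api-role
```

One way to check the result is kubectl auth can-i get secret/stripe-credentials -n payments --as=system:serviceaccount:payments:payment-api, which should answer yes, while the same query for any other secret should answer no.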
3. Pod Security Admission
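Pod Security Admission (PSA) replaced PodSecurityPolicy, which was deprecated in Kubernetes 1.21 and removed in 1.25. PSA enforces the Pod Security Standards (privileged, baseline, restricted) per namespace through an admission controller built into the API server, so there is no webhook to deploy.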

    # Label your namespace with the desired enforcement level
    apiVersion: v1
    kind: Namespace
    metadata:
      name: payments
      labels:
        pod-security.kubernetes.io/enforce: restricted
        pod-security.kubernetes.io/enforce-version: latest
        pod-security.kubernetes.io/warn: restricted
        pod-security.kubernetes.io/audit: restricted

The restricted profile blocks:
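- Privileged containers
- hostNetwork, hostPID, hostIPC
- Dangerous capabilities (NET_RAW, SYS_ADMIN, etc.)
- Running as root
- Privilege escalation (allowPrivilegeEscalation must be false)
The profile does not require a read-only root filesystem; set readOnlyRootFilesystem: true in the container securityContext yourself if you want it.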
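For reference, here is a sketch of a pod spec that should be admitted into a namespace enforcing restricted. The image tag, UID, and container port are assumptions for illustration; adapt them to what your image actually uses.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: payment-api
  namespace: payments
  labels:
    app: payment-api
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001              # assumed non-root UID baked into the image
    seccompProfile:
      type: RuntimeDefault        # restricted requires RuntimeDefault or Localhost
  containers:
    - name: payment-api
      image: registry.company.com/payments/payment-api:1.4.2  # hypothetical tag
      ports:
        - containerPort: 8080
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true   # extra hardening on top of the profile
        capabilities:
          drop: ["ALL"]                # restricted requires dropping all capabilities
```

If any of these fields is missing, the API server rejects the pod with a message naming the violated check, which makes the fix easy to find.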
For legacy workloads that can’t be restricted immediately, use baseline (blocks the most dangerous configs) and add the warn label so developers see what needs fixing:

    labels:
      pod-security.kubernetes.io/enforce: baseline
      pod-security.kubernetes.io/warn: restricted  # warns but doesn't block

4. Image Security
Enforce Registry Allowlisting
Prevent workloads from pulling images from arbitrary registries:

    # Kyverno policy: only allow images from your private registry
    apiVersion: kyverno.io/v1
    kind: ClusterPolicy
    metadata:
      name: restrict-image-registries
    spec:
      validationFailureAction: Enforce
      rules:
        - name: validate-registries
          match:
            any:
              - resources:
                  kinds: ["Pod"]
          validate:
            message: "Images must come from registry.company.com"
            pattern:
              spec:
                containers:
                  - image: "registry.company.com/*"

Scan Images Before Deployment
Gate deployments on image scan results. In GitLab CI:

    trivy-gate:
      stage: deploy-gate
      needs: [build]
      before_script:
        - docker pull $IMAGE
      script:
        # Fail the job if any fixable CRITICAL vulnerability is found
        - trivy image --exit-code 1 --severity CRITICAL --ignore-unfixed $IMAGE

Combine this with an in-cluster scanner such as Trivy Operator (the successor to Starboard) to re-scan images already running in the cluster.
5. Secrets Management
Kubernetes Secrets are base64-encoded, not encrypted, by default. Anyone with get permission on secrets in a namespace can read them in plaintext, and without encryption at rest they are stored unencrypted in etcd.
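To see why base64 is not protection, consider this sketch of a Secret manifest. The name demo-credentials and the value hunter2 are made up for illustration.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: demo-credentials   # hypothetical name
  namespace: payments
type: Opaque
data:
  # base64 of the made-up value "hunter2"; this is encoding, not encryption
  api-key: aHVudGVyMg==
```

Anyone who can read the object can reverse the value with any base64 decoder, which is why both tight RBAC on secrets and encryption at rest matter.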
Encrypt at rest:

    # kube-apiserver flag
    --encryption-provider-config=/etc/kubernetes/encryption-config.yaml

    # /etc/kubernetes/encryption-config.yaml
    apiVersion: apiserver.config.k8s.io/v1
    kind: EncryptionConfiguration
    resources:
      - resources: ["secrets"]
        providers:
          - aescbc:
              keys:
                - name: key1
                  secret: <base64-encoded-32-byte-key>
          - identity: {}

Better: use an external secrets manager. External Secrets Operator syncs secrets from HashiCorp Vault, AWS Secrets Manager, or GCP Secret Manager into Kubernetes Secrets, with automatic rotation:

    apiVersion: external-secrets.io/v1beta1
    kind: ExternalSecret
    metadata:
      name: stripe-credentials
      namespace: payments
    spec:
      refreshInterval: 1h
      secretStoreRef:
        name: vault-backend
        kind: ClusterSecretStore
      target:
        name: stripe-credentials
      data:
        - secretKey: api-key
          remoteRef:
            key: payments/stripe
            property: api-key

Audit Checklist
Run this monthly:
- kube-bench against CIS Kubernetes Benchmark
- RBAC audit: any cluster-admin bindings added since last review?
- Network policy coverage: any new namespaces without a default-deny?
- Image scan results: any critical CVEs in running workloads?
- etcd encryption enabled and key rotation up to date?
- API server audit logs reviewed for anomalous access patterns?
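The last checklist item assumes API server audit logging is already switched on. Here is a minimal sketch of an audit policy file, which kube-apiserver reads via --audit-policy-file (with --audit-log-path setting where events are written). Which resources are logged at which level is my assumption for illustration, not a benchmark recommendation.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
# Skip the noisy RequestReceived stage for every rule
omitStages:
  - "RequestReceived"
rules:
  # Rules are evaluated in order; the first match wins.
  # Record full request and response bodies for RBAC changes
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
  # Record who touched secrets, but never the secret payload itself
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # Catch-all: metadata only for everything else
  - level: Metadata
```

Keeping secrets at Metadata level means the audit log itself does not become a second copy of your credentials.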
Where to Go Next
This covers the most impactful controls. Beyond these, the next layer includes:
- Runtime security — Falco for behavioural anomaly detection inside containers
- mTLS between services — Istio or Linkerd for zero-trust service mesh
- Supply chain signing — Cosign + policy enforcement for image signing
- Audit logging — ship kube-apiserver audit logs to your SIEM
Security is a process, not a configuration. Build the audit checklist into a recurring runbook and re-run it every time your cluster topology changes.