Multi-Region Kubernetes Security Hardening
Comprehensive security hardening guide for enterprise Kubernetes clusters with zero-trust architecture, advanced RBAC, OPA Gatekeeper policies, and compliance monitoring across multiple regions.
Securing multi-region Kubernetes clusters requires addressing multiple attack vectors and compliance requirements at once. This guide implements a zero-trust security model with defense-in-depth strategies.
Key Security Principles
- Zero Trust Architecture: Never trust, always verify
- Least Privilege Access: Minimal permissions for all components
- Defense in Depth: Multiple layers of security controls
- Continuous Monitoring: Real-time threat detection and response
- Compliance by Design: Built-in regulatory compliance
Security Architecture
Our security architecture implements multiple layers of protection:
Pod Security Standards
Pod Security Standards provide a standardized way to enforce security policies at the pod level. We'll implement the "restricted" profile with additional custom constraints.
1. Enable Pod Security Standards
```yaml
# Pod Security Standards Configuration
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
---
apiVersion: v1
kind: Namespace
metadata:
  name: staging
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```
2. Security Context Configuration
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 2000
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: app
          image: nginx:1.21-alpine
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 1000
            capabilities:
              drop:
                - ALL
              add:
                - NET_BIND_SERVICE
          resources:
            limits:
              cpu: 500m
              memory: 512Mi
            requests:
              cpu: 100m
              memory: 128Mi
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: var-cache
              mountPath: /var/cache/nginx
            - name: var-run
              mountPath: /var/run
      volumes:
        - name: tmp
          emptyDir: {}
        - name: var-cache
          emptyDir: {}
        - name: var-run
          emptyDir: {}
```
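Once the deployment has rolled out, it is worth confirming the hardened settings actually took effect. A quick sanity check, assuming kubectl access to the production namespace:

```bash
# Inspect the effective pod-level security context
kubectl -n production get pod -l app=secure-app \
  -o jsonpath='{.items[0].spec.securityContext}'

# The container should run as uid 1000, not root
kubectl -n production exec deploy/secure-app -- id

# Writes outside the emptyDir mounts should fail on the read-only root filesystem
kubectl -n production exec deploy/secure-app -- touch /test-file
```

The last command is expected to fail with "Read-only file system"; if it succeeds, the `readOnlyRootFilesystem` setting is not being enforced.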
Network Security Policies
Network policies provide micro-segmentation capabilities, allowing you to control traffic flow between pods, namespaces, and external endpoints with fine-grained rules.
1. Default Deny Network Policy
```yaml
# Default deny all ingress and egress traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Allow DNS resolution
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to: []
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```
2. Application-Specific Network Policies
```yaml
# Frontend to Backend communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: backend
      ports:
        - protocol: TCP
          port: 8080
---
# Backend ingress from frontend
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-ingress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
---
# Cross-region communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: cross-region-sync
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: data-sync
  policyTypes:
    - Egress
  egress:
    - to: []
      ports:
        - protocol: TCP
          port: 443
    - to:
        - namespaceSelector:
            matchLabels:
              name: production
          podSelector:
            matchLabels:
              app: data-sync
      ports:
        - protocol: TCP
          port: 9090
```
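Network policies are easy to misconfigure silently, so it helps to probe them from throwaway pods. A rough sketch, assuming a Service named `backend` exposes the backend pods on port 8080 (the service name and path are illustrative, not from the manifests above):

```bash
# A pod carrying the frontend label should reach the backend on 8080
kubectl -n production run netpol-test --rm -it --restart=Never \
  --image=busybox:1.36 --labels=app=frontend -- \
  wget -qO- --timeout=3 http://backend:8080/

# An unlabeled pod should be blocked by default-deny (expect a timeout)
kubectl -n production run netpol-test-deny --rm -it --restart=Never \
  --image=busybox:1.36 -- \
  wget -qO- --timeout=3 http://backend:8080/
```

If the second probe succeeds, check that your CNI plugin actually enforces NetworkPolicy; some (e.g. default kubenet) silently ignore it.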
RBAC Configuration
Role-Based Access Control (RBAC) is fundamental to Kubernetes security. We'll implement a comprehensive RBAC strategy with least privilege principles and external identity integration.
1. Service Account Security
```yaml
# Secure service account configuration
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-service-account
  namespace: production
  annotations:
    kubernetes.io/enforce-mountable-secrets: "true"
automountServiceAccountToken: false
---
# Custom role for application
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: production
  name: app-role
rules:
  - apiGroups: [""]
    resources: ["configmaps", "secrets"]
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
    # Note: resourceNames does not support wildcards (e.g. "app-*") and
    # cannot filter list/watch requests; scope access by namespace instead.
---
# Bind role to service account
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-role-binding
  namespace: production
subjects:
  - kind: ServiceAccount
    name: app-service-account
    namespace: production
roleRef:
  kind: Role
  name: app-role
  apiGroup: rbac.authorization.k8s.io
```
2. Cross-Region RBAC Configuration
```yaml
# Cross-region service account
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cross-region-sync
  namespace: production
  annotations:
    iam.gke.io/gcp-service-account: cross-region-sync@project.iam.gserviceaccount.com
automountServiceAccountToken: false
---
# ClusterRole for cross-region operations
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cross-region-operator
rules:
  - apiGroups: [""]
    resources: ["nodes", "namespaces"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets"]
    verbs: ["get", "list", "watch", "update", "patch"]
  - apiGroups: ["networking.k8s.io"]
    resources: ["networkpolicies"]
    verbs: ["get", "list", "watch"]
---
# ClusterRoleBinding for cross-region sync
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cross-region-sync-binding
subjects:
  - kind: ServiceAccount
    name: cross-region-sync
    namespace: production
roleRef:
  kind: ClusterRole
  name: cross-region-operator
  apiGroup: rbac.authorization.k8s.io
```
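`kubectl auth can-i` lets you verify least privilege from the outside without impersonating the workload. A quick check of the bindings above, assuming cluster-admin access for the check itself:

```bash
# The sync service account should be able to patch deployments...
kubectl auth can-i patch deployments.apps \
  --as=system:serviceaccount:production:cross-region-sync

# ...but should NOT be able to delete deployments or read secrets
kubectl auth can-i delete deployments.apps \
  --as=system:serviceaccount:production:cross-region-sync
kubectl auth can-i get secrets \
  --as=system:serviceaccount:production:cross-region-sync
```

The first command should print `yes` and the latter two `no`; any unexpected `yes` usually points at an overlapping ClusterRoleBinding.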
OPA Gatekeeper Policies
Open Policy Agent (OPA) Gatekeeper provides policy-as-code capabilities, allowing you to define and enforce organizational policies across your Kubernetes clusters.
1. Install OPA Gatekeeper
```bash
# Install OPA Gatekeeper
kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/release-3.14/deploy/gatekeeper.yaml

# Verify installation
kubectl get pods -n gatekeeper-system

# Check constraint templates
kubectl get constrainttemplates
```
2. Required Labels Constraint
```yaml
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

        violation[{"msg": msg}] {
          required := input.parameters.labels
          provided := input.review.object.metadata.labels
          missing := required[_]
          not provided[missing]
          msg := sprintf("Missing required label: %v", [missing])
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: must-have-security-labels
spec:
  match:
    kinds:
      - apiGroups: ["apps"]
        kinds: ["Deployment"]
    namespaces: ["production"]
  parameters:
    labels: ["app", "version", "environment", "security-scan"]
```
3. Container Security Constraints
```yaml
# Disallow privileged containers
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8spsprivileged
spec:
  crd:
    spec:
      names:
        kind: K8sPSPrivileged
      validation:
        openAPIV3Schema:
          type: object
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8spsprivileged

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          container.securityContext.privileged
          msg := "Privileged containers are not allowed"
        }

        violation[{"msg": msg}] {
          container := input.review.object.spec.initContainers[_]
          container.securityContext.privileged
          msg := "Privileged init containers are not allowed"
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPrivileged
metadata:
  name: psp-privileged
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces: ["production", "staging"]
---
# Require security context
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiresecuritycontext
spec:
  crd:
    spec:
      names:
        kind: K8sRequireSecurityContext
      validation:
        openAPIV3Schema:
          type: object
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiresecuritycontext

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not container.securityContext.runAsNonRoot
          msg := "Container must run as non-root user"
        }

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not container.securityContext.readOnlyRootFilesystem
          msg := "Container must have read-only root filesystem"
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequireSecurityContext
metadata:
  name: require-security-context
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces: ["production"]
```
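A server-side dry run is a safe way to confirm the constraints reject what they should, without creating anything. A sketch, assuming the constraints above have synced:

```bash
# A privileged pod should be denied by the K8sPSPrivileged constraint
kubectl -n production apply --dry-run=server -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: privileged-test
spec:
  containers:
    - name: test
      image: nginx:1.21-alpine
      securityContext:
        privileged: true
EOF

# Review audit findings for resources that already violate the policy
kubectl get k8spsprivileged psp-privileged -o jsonpath='{.status.totalViolations}'
```

The dry run should fail with an admission denial from Gatekeeper; note that Gatekeeper only blocks new or updated resources, so the audit status is what surfaces pre-existing violations.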
Secrets Management
Advanced secrets management involves integrating external secret stores, implementing automatic rotation, and ensuring secure cross-region secret replication.
1. External Secrets Operator
```bash
# Install External Secrets Operator
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets external-secrets/external-secrets \
  -n external-secrets-system --create-namespace
```

```yaml
# AWS Secrets Manager SecretStore
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secrets-manager
  namespace: production
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-west-2
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa
---
# Google Secret Manager SecretStore
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: gcp-secret-manager
  namespace: production
spec:
  provider:
    gcpsm:
      projectId: "my-project"
      auth:
        workloadIdentity:
          clusterLocation: us-central1
          clusterName: production-cluster
          serviceAccountRef:
            name: external-secrets-sa
---
# HashiCorp Vault SecretStore
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: vault-backend
  namespace: production
spec:
  provider:
    vault:
      server: "https://vault.company.com"
      path: "secret"
      version: "v2"
      auth:
        kubernetes:
          mountPath: "kubernetes"
          role: "external-secrets"
          serviceAccountRef:
            name: external-secrets-sa
```
2. External Secret Configuration
```yaml
# Database credentials from AWS Secrets Manager
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: production
spec:
  # Note: short intervals poll the provider API on every refresh;
  # tune this against provider rate limits and cost.
  refreshInterval: 15s
  secretStoreRef:
    name: aws-secrets-manager
    kind: SecretStore
  target:
    name: database-secret
    creationPolicy: Owner
    template:
      type: Opaque
      data:
        username: "{{ .username }}"
        password: "{{ .password }}"
        host: "{{ .host }}"
        port: "{{ .port }}"
        database: "{{ .database }}"
  data:
    - secretKey: username
      remoteRef:
        key: prod/database
        property: username
    - secretKey: password
      remoteRef:
        key: prod/database
        property: password
    - secretKey: host
      remoteRef:
        key: prod/database
        property: host
    - secretKey: port
      remoteRef:
        key: prod/database
        property: port
    - secretKey: database
      remoteRef:
        key: prod/database
        property: database
---
# API keys from Google Secret Manager
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-keys
  namespace: production
spec:
  refreshInterval: 30s
  secretStoreRef:
    name: gcp-secret-manager
    kind: SecretStore
  target:
    name: api-secret
    creationPolicy: Owner
  data:
    - secretKey: stripe-api-key
      remoteRef:
        key: stripe-api-key
    - secretKey: sendgrid-api-key
      remoteRef:
        key: sendgrid-api-key
    - secretKey: jwt-secret
      remoteRef:
        key: jwt-secret
```
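After applying the ExternalSecrets, confirm the operator has actually synced them; a failed provider lookup leaves the target Secret missing or stale. A quick check, assuming kubectl access to the production namespace:

```bash
# The READY/STATUS columns show whether the last sync succeeded
kubectl -n production get externalsecret database-credentials

# On failure, the status conditions carry the provider error message
kubectl -n production describe externalsecret database-credentials

# The generated Secret is owned by the ExternalSecret (creationPolicy: Owner)
kubectl -n production get secret database-secret \
  -o jsonpath='{.metadata.ownerReferences[0].kind}'
```

Because `creationPolicy: Owner` is set, deleting the ExternalSecret garbage-collects the generated Secret as well.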
Runtime Security Monitoring
Security monitoring provides real-time threat detection and response capabilities. We'll implement Falco for runtime security monitoring with comprehensive alerting.
1. Deploy Falco Security Monitoring
```bash
# Install Falco using Helm
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update

# Create values file for Falco
cat <<EOF > falco-values.yaml
falco:
  grpc:
    enabled: true
  grpcOutput:
    enabled: true
  httpOutput:
    enabled: true
    url: "http://falcosidekick:2801"
  jsonOutput: true
  jsonIncludeOutputProperty: true

falcosidekick:
  enabled: true
  config:
    slack:
      webhookurl: "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
      channel: "#security-alerts"
      username: "Falco"
      icon: ":warning:"
    elasticsearch:
      hostport: "elasticsearch:9200"
      index: "falco"
    prometheus:
      extralabels: "cluster=production"

driver:
  kind: ebpf

resources:
  requests:
    cpu: 100m
    memory: 512Mi
  limits:
    cpu: 1000m
    memory: 1024Mi

nodeSelector:
  kubernetes.io/os: linux

tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
  - effect: NoSchedule
    key: node-role.kubernetes.io/control-plane
EOF

# Install Falco
helm install falco falcosecurity/falco \
  -n falco-system --create-namespace -f falco-values.yaml
```
2. Custom Falco Rules
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: falco-custom-rules
  namespace: falco-system
data:
  custom_rules.yaml: |
    # Custom rules for production environment
    - rule: Detect crypto mining
      desc: Detect cryptocurrency mining activities
      condition: >
        spawned_process and
        (proc.name in (xmrig, cpuminer, ccminer, cgminer, bfgminer) or
         proc.cmdline contains "stratum+tcp" or
         proc.cmdline contains "mining.pool" or
         proc.cmdline contains "cryptonight")
      output: >
        Cryptocurrency mining detected (user=%user.name command=%proc.cmdline
        container=%container.name image=%container.image.repository)
      priority: CRITICAL
      tags: [cryptocurrency, mining, malware]

    - rule: Detect privilege escalation attempt
      desc: Detect attempts to escalate privileges
      condition: >
        spawned_process and
        (proc.name in (sudo, su, doas) or
         proc.cmdline contains "chmod +s" or
         proc.cmdline contains "setuid" or
         proc.cmdline contains "setgid")
      output: >
        Privilege escalation attempt detected (user=%user.name command=%proc.cmdline
        container=%container.name image=%container.image.repository)
      priority: HIGH
      tags: [privilege_escalation, security]

    - rule: Detect network reconnaissance
      desc: Detect network scanning and reconnaissance activities
      condition: >
        spawned_process and
        (proc.name in (nmap, masscan, zmap, unicornscan) or
         proc.cmdline contains "port scan" or
         proc.cmdline contains "-sS" or
         proc.cmdline contains "-sT")
      output: >
        Network reconnaissance detected (user=%user.name command=%proc.cmdline
        container=%container.name image=%container.image.repository)
      priority: HIGH
      tags: [reconnaissance, network, security]

    - rule: Detect container escape attempt
      desc: Detect attempts to escape from container
      condition: >
        spawned_process and
        (proc.cmdline contains "docker.sock" or
         proc.cmdline contains "/var/run/docker.sock" or
         proc.cmdline contains "runc" or
         proc.cmdline contains "cgroups" or
         proc.cmdline contains "/proc/1/root")
      output: >
        Container escape attempt detected (user=%user.name command=%proc.cmdline
        container=%container.name image=%container.image.repository)
      priority: CRITICAL
      tags: [container_escape, security]

    - rule: Detect suspicious file access
      desc: Detect access to sensitive system files
      condition: >
        open_read and
        (fd.name in (/etc/shadow, /etc/passwd, /etc/sudoers, /root/.ssh/authorized_keys) or
         (fd.name startswith "/proc/" and fd.name contains "environ") or
         fd.name startswith "/sys/")
      output: >
        Suspicious file access detected (user=%user.name file=%fd.name
        container=%container.name image=%container.image.repository)
      priority: HIGH
      tags: [file_access, security]
```
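A syntax error in a custom rules file can prevent Falco from loading any rules, so validate before rollout. A rough sketch, assuming the rules file is mounted at the path shown and a Falco binary is available in the pod or on the host:

```bash
# Validate rule syntax without starting the engine
falco -V /etc/falco/rules.d/custom_rules.yaml

# After deploying, trigger a benign test event and look for the alert
kubectl -n production exec deploy/secure-app -- cat /etc/shadow || true
kubectl -n falco-system logs -l app.kubernetes.io/name=falco --tail=50 \
  | grep -i "suspicious file access"
```

The `cat /etc/shadow` probe is deliberately harmless here (it should fail under the restricted security context) but still generates the `open_read` event the last rule matches on.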
Advanced Security
- Implement Istio service mesh security
- Set up vulnerability scanning pipelines
- Configure admission controllers
- Implement image signing with Cosign
- Set up security benchmarking
Compliance & Auditing
- Implement CIS Kubernetes benchmarks
- Set up SOC 2 compliance monitoring
- Configure PCI DSS controls
- Implement GDPR data protection
- Set up audit log analysis