Kubernetes Security: What Actually Matters in 2026
January 13, 2026
8 min read
Three months ago, a company called me in a panic. Their Kubernetes cluster had been compromised. Cryptocurrency miners were running on every node. Their AWS bill had jumped from $8,000 to $43,000 in two weeks.
"But we followed the documentation," the CTO said. "We thought we were secure."
Turns out, they'd focused on the wrong things. They had network policies configured perfectly but left their API server exposed to the internet with default credentials. Classic.
Kubernetes security is overwhelming. There are hundreds of settings, dozens of best practices, and endless blog posts telling you everything is critical. It's not. Some things matter way more than others.
Let me tell you what actually matters based on responding to real incidents, not theoretical threats.
The API Server: Your Crown Jewel
If an attacker gets access to your Kubernetes API server, game over. They control everything. Every pod, every secret, every node. It's the keys to the kingdom.
Yet I see exposed API servers constantly. Last year alone, I found 12 clients with API servers accessible from the public internet. Twelve!
Lock Down API Access (Do This First)
Your API server should never be publicly accessible. Never. I don't care what your use case is. I've heard every excuse: "but we need to access it from anywhere," "our CI/CD needs to reach it," "it's just for testing." None of these require public internet access.
Use a VPN or bastion host for administrative access. When humans need to manage your cluster, they should connect through a secure tunnel first. This adds one extra step but prevents the entire internet from probing your API server for vulnerabilities.
Use service accounts with minimal permissions for automated access. Your CI/CD pipeline doesn't need cluster-admin rights. Create a service account with just enough permissions to deploy to specific namespaces. If it gets compromised, the damage is contained.
Use kubectl proxy for local development. This creates a secure tunnel from your laptop to the cluster without exposing the API server. It's built into kubectl, so there's no excuse not to use it.
If you absolutely must expose it (and you probably don't), put it behind a firewall that only allows specific IP addresses. Whitelist your office IP, your CI/CD server IP, and nothing else. And enable audit logging so you know who's accessing it and what they're doing.
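As a concrete sketch: on EKS, the public endpoint can be disabled entirely in the cluster config (eksctl shown; the cluster name and region are placeholders, and other managed offerings have equivalent switches):

```yaml
# eksctl ClusterConfig sketch: private-only API endpoint.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: prod-cluster   # placeholder
  region: us-east-1    # placeholder
vpc:
  clusterEndpoints:
    privateAccess: true   # reachable from inside the VPC (via VPN/bastion)
    publicAccess: false   # no public internet exposure at all
```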
Enable RBAC (And Actually Use It)
Role-Based Access Control isn't optional. It's mandatory. But here's the thing: most people enable RBAC and then give everyone cluster-admin privileges. That defeats the entire purpose.
Follow the principle of least privilege. This means giving people exactly what they need to do their job, nothing more. Developers should get access to their namespaces only: they can deploy, view logs, and debug their applications, but they can't touch other teams' resources or cluster-wide settings.
CI/CD systems get just enough permissions to deploy. They don't need to read secrets from other namespaces. They don't need to modify cluster roles. They need to create deployments and services in specific namespaces. That's it.
Monitoring tools get read-only access. Prometheus needs to scrape metrics, not modify resources. Grafana needs to query data, not create pods. Read-only access is sufficient and much safer.
Nobody gets cluster-admin unless they absolutely need it. And I mean absolutely. The person managing the cluster infrastructure? Yes. The developer who wants to "just quickly check something"? No.
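As a sketch of what least privilege looks like in practice, here's a namespaced Role for a hypothetical CI/CD service account (the "team-a" namespace and all names are placeholders): it can manage Deployments and Services in one namespace and nothing else.

```yaml
# A namespaced deployer role: no secrets access, no cluster scope.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ci-deployer
  namespace: team-a
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "create", "update", "patch"]
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["get", "list", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ci-deployer-binding
  namespace: team-a
subjects:
  - kind: ServiceAccount
    name: ci-deployer
    namespace: team-a
roleRef:
  kind: Role
  name: ci-deployer
  apiGroup: rbac.authorization.k8s.io
```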
I worked with a company where 47 people had cluster-admin access. Forty-seven! We got it down to 3. Guess what? Nothing broke. Nobody actually needed those permissions. They just had them because nobody had ever said no.
Learn more in the Kubernetes RBAC documentation.
Audit Logging: Know What's Happening
Enable audit logging on your API server. This tells you who did what and when.
When that crypto mining incident happened, audit logs showed us exactly how the attacker got in (exposed API server), what they did (created privileged pods), and when it happened (2 AM on a Saturday).
Without audit logs, we would have been guessing. With them, we had a complete timeline.
Send your audit logs to a centralized logging system. Don't just leave them on the API server where an attacker can delete them.
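A minimal audit policy sketch, assuming you pass it to the API server with `--audit-policy-file`: metadata-only for secrets (so secret values never land in the logs), full bodies for mutating requests, metadata for everything else.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log who touched secrets, but never the secret contents.
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # Full request/response bodies for anything that changes state.
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
  # Everything else: just the metadata.
  - level: Metadata
```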
Pod Security: Stop Running as Root
Here's a fun fact: most containers run as root by default. That's insane.
If an attacker compromises a container running as root, they have root access to that container. From there, they can often escape to the host node. Now they have root access to your entire node.
This isn't theoretical. Container escapes happen. I've investigated several.
Pod Security Standards (Use Them)
Kubernetes has Pod Security Standards that define three levels of security restrictions. Privileged is unrestricted and dangerous: pods can do anything, including escaping to the host. This is fine for system-level components that genuinely need host access, but nothing else.
Baseline is minimally restrictive and much better for most workloads. It prevents the most dangerous configurations while still allowing common use cases. This is where you should start.
Restricted is heavily restricted and represents current best practices. It's the most secure option and what you should aim for in production. It requires more work to implement because many applications aren't designed with these restrictions in mind, but the security benefits are worth it.
Start with baseline. Get your applications working with baseline restrictions first. Then move to restricted for production workloads. Don't try to jump straight to restricted; you'll spend weeks fighting with applications that expect to run as root or write to the filesystem.
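Pod Security admission is enforced per namespace with labels. A common pattern is to enforce baseline while warning and auditing at restricted, so you can see what would break before tightening (the namespace name is a placeholder):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    pod-security.kubernetes.io/enforce: baseline   # violations are rejected
    pod-security.kubernetes.io/warn: restricted    # violations warn on apply
    pod-security.kubernetes.io/audit: restricted   # violations hit the audit log
```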
The restricted standard prevents running as root, which is the default for most containers but completely unnecessary for most applications. It prevents privileged containers that can access host resources. It blocks host network access so containers can't sniff traffic on the host. It prevents host path mounts that could expose sensitive host files. And it drops dangerous capabilities that containers rarely need.
One client had 200+ pods running in production. We applied restricted pod security standards. 180 pods failed to start. That's 90% of their workloads running with unnecessary privileges. We fixed them all in a week. None of them actually needed those privileges. They just had them because nobody said no.
Security Contexts: The Details Matter
Every pod should have a security context that:
- Runs as a non-root user
- Uses a read-only root filesystem
- Drops all capabilities
- Disables privilege escalation
Here's what that looks like:
```yaml
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
```
Yes, this will break some applications. Fix the applications. Don't compromise security because your app wants to write to /tmp.
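And when an app genuinely does need scratch space, you don't have to relax readOnlyRootFilesystem: mount a writable emptyDir at /tmp instead. A sketch (the pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.2.3   # placeholder
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
      volumeMounts:
        - name: tmp
          mountPath: /tmp   # the only writable path in the container
  volumes:
    - name: tmp
      emptyDir: {}          # scratch space, wiped when the pod dies
```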
Network Policies: Control Traffic Flow
By default, any pod can talk to any other pod. That's a problem. If an attacker compromises your frontend pod, they shouldn't be able to directly access your database pod. But without network policies, they can. They can probe your entire cluster, find vulnerable services, and move laterally through your infrastructure.
Implement network policies that deny all traffic by default. This is the foundation. Start with "nothing can talk to anything" and then explicitly allow only necessary connections. It's more work upfront, but it's the only way to ensure you're not accidentally leaving doors open.
Explicitly allow only necessary connections. Your frontend needs to talk to your API. Your API needs to talk to your database. But your frontend doesn't need direct database access. Define these relationships explicitly in network policies.
Isolate namespaces from each other. Development pods shouldn't be able to reach production pods. Team A's services shouldn't be able to access Team B's services unless there's a specific business reason. Namespaces are logical boundaries; enforce them with network policies.
Restrict egress traffic too, not just ingress. Most people focus on what can reach their pods, but forget about what their pods can reach. A compromised pod shouldn't be able to make arbitrary outbound connections to download malware or exfiltrate data. Define which external services your pods legitimately need to access and block everything else.
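Putting this together, a sketch for one namespace (names and labels are placeholders): deny everything by default, then explicitly allow frontend-to-API traffic on one port.

```yaml
# Default-deny for the whole namespace, both directions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: team-a
spec:
  podSelector: {}                     # selects every pod in the namespace
  policyTypes: ["Ingress", "Egress"]
---
# One explicit allow: frontend pods may reach API pods on 8080.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: team-a
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```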
This is defense in depth. Even if one pod is compromised, the attacker can't move laterally. They're stuck in that one pod, unable to reach other services or escape to the broader network.
Learn more about Kubernetes Network Policies.
Secrets Management: Stop Using Kubernetes Secrets
Controversial opinion: Kubernetes Secrets aren't secure enough for production.
They're base64 encoded, not encrypted. Anyone with access to etcd can read them. Anyone with read access to secrets in a namespace can decode them.
For development? Fine. For production? Use a real secrets management solution.
External Secrets Operators (The Right Way)
Use tools like AWS Secrets Manager with External Secrets Operator, HashiCorp Vault, Azure Key Vault, or Google Secret Manager. These are purpose-built secret management systems that actually understand security.
These provide encryption at rest with proper key management. Your secrets are encrypted using keys you control, not just base64 encoded. They provide audit logging of secret access: you can see who accessed which secret and when. This is critical for compliance and incident response.
They support automatic rotation. Secrets can be rotated on a schedule without manual intervention or application downtime. They provide fine-grained access control: you can specify exactly who can read which secrets, not just namespace-level access. And they offer centralized management across your entire infrastructure, not just Kubernetes.
Yes, it's more complex than Kubernetes Secrets. You need to set up the secret manager, configure the External Secrets Operator, and modify your applications to use it. But it's also actually secure, which Kubernetes Secrets are not.
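For illustration, here's roughly what an External Secrets Operator resource looks like, syncing a password from AWS Secrets Manager into a Kubernetes Secret (the store name and remote key are placeholders; the SecretStore itself is configured separately):

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: team-a
spec:
  refreshInterval: 1h              # re-syncs, so rotations propagate
  secretStoreRef:
    name: aws-secrets-manager      # placeholder SecretStore
    kind: SecretStore
  target:
    name: db-credentials           # the Kubernetes Secret that gets created
  data:
    - secretKey: password
      remoteRef:
        key: prod/db/password      # placeholder path in Secrets Manager
```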
One client had database credentials stored as Kubernetes Secrets. A developer with namespace access accidentally committed them to a public GitHub repo. Oops. With a proper secrets manager, that wouldn't have been possible. The developer would have had access to use the secret, not read it.
Encrypt etcd (At Minimum)
If you must use Kubernetes Secrets, at least encrypt etcd at rest.
This prevents someone with filesystem access to your control plane from reading secrets directly from etcd.
It's not perfect, but it's better than nothing.
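A sketch of what that configuration looks like, passed to kube-apiserver via `--encryption-provider-config` (the key is a placeholder you generate yourself; on managed Kubernetes you'd typically use the provider's KMS integration instead):

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources: ["secrets"]
    providers:
      # First provider is used for writes: AES-CBC with a local key.
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded-32-byte-key>   # placeholder
      # Fallback so data written before encryption stays readable.
      - identity: {}
```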
Image Security: Know What You're Running
You're running code from the internet in your production environment. Think about that.
Every container image is code. Code that could have vulnerabilities. Code that could be malicious. Code that you're trusting with your data.
Scan Everything (No Exceptions)
Scan every container image for vulnerabilities before deploying it. Use tools like Trivy, Snyk Container, or Anchore. These tools scan your images against databases of known vulnerabilities and tell you exactly what's wrong.
Integrate scanning into your CI/CD pipeline. Don't make scanning optional or manual. Every image that gets built should be scanned automatically. If an image has critical vulnerabilities, don't deploy it. Block the deployment, fix the vulnerabilities, then try again.
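One way to wire this up, sketched as a GitHub Actions step using the Trivy action (the image reference is a placeholder; the same gate works in any CI system):

```yaml
# Fails the job, and therefore the pipeline, on HIGH/CRITICAL findings.
- name: Scan image with Trivy
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: registry.example.com/app:${{ github.sha }}   # placeholder
    exit-code: "1"           # non-zero exit blocks the deployment
    severity: HIGH,CRITICAL
    ignore-unfixed: true     # skip findings with no available fix yet
```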
I scanned a client's production images last month. Found 347 high and critical vulnerabilities. Three hundred forty-seven! Some of them were 3 years old. They'd been running vulnerable images in production for years because nobody was checking. Every one of those vulnerabilities was a potential entry point for attackers.
Use Minimal Base Images
The smaller your image, the smaller your attack surface. Every package in your image is a potential vulnerability. Fewer packages means fewer vulnerabilities.
Don't use ubuntu:latest as your base image. It's 77MB and contains hundreds of packages you don't need. Most applications don't need a full Ubuntu installation. They need a runtime environment and their application code. That's it.
Use alpine:latest instead. It's 7MB with minimal packages. Alpine is designed specifically for containersâsmall, secure, and containing only what's necessary. For many applications, Alpine is all you need.
Or use distroless images, which are even smaller. Distroless images contain only your application and its runtime dependencies. No shell, no package manager, no utilities. This makes them incredibly secure because there's almost nothing for an attacker to exploit. If they compromise your application, they can't run commands or install tools because those tools don't exist in the image.
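A multi-stage build is the usual route to distroless: compile in a full image, then copy only the binary into the final one. A Go sketch (distroless also publishes base images for Java, Python, and Node):

```dockerfile
# Build stage: full toolchain, never shipped.
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app ./cmd/server

# Final stage: just the static binary, no shell, no package manager.
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=build /app /app
USER nonroot
ENTRYPOINT ["/app"]
```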
One client reduced their average image size from 500MB to 50MB by switching to distroless. That's 90% smaller. This means faster deployments because there's less to download, smaller attack surface because there are fewer packages, and lower storage costs because you're storing less data.
Image Signing and Verification
How do you know the image you're deploying is the image you built? Without signing, you don't. An attacker could compromise your registry and replace your images with malicious versions. Or they could perform a man-in-the-middle attack during image pull. Image signing prevents both of these attacks.
Use image signing with tools like Sigstore or Notary. These tools cryptographically sign your images during the build process. The signature proves that the image came from you and hasn't been tampered with.
This prevents deploying tampered images. If someone modifies your image after you built it, the signature won't match and the deployment will fail. It prevents supply chain attacks where malicious code is injected into your build process. And it prevents unauthorized image modifications: only people with the signing key can create valid images.
Configure admission controllers to only allow signed images in production. If it's not signed, it doesn't run. This is a hard requirement that can't be bypassed. Development environments can be more lenient, but production should be locked down.
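As one concrete option, Kyverno can verify Cosign signatures at admission time. A sketch, assuming a hypothetical private registry and your own Cosign public key:

```yaml
# Rejects any Pod in "prod" whose image lacks a valid signature.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-images
spec:
  validationFailureAction: Enforce
  rules:
    - name: verify-signature
      match:
        any:
          - resources:
              kinds: ["Pod"]
              namespaces: ["prod"]
      verifyImages:
        - imageReferences:
            - "registry.example.com/*"   # placeholder registry
          attestors:
            - entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      <your-cosign-public-key>
                      -----END PUBLIC KEY-----
```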
Runtime Security: Detect the Unexpected
Even with perfect configuration, things can go wrong. You need runtime security to detect and respond to threats.
Falco: Your Security Camera
Falco monitors your cluster for suspicious behavior in real-time. It watches for unexpected process execution: if a container that normally runs a web server suddenly starts running a shell, that's suspicious. It detects suspicious network connections: if your application pod suddenly connects to a known malicious IP, that's a problem.
It monitors file system modifications. If someone starts modifying system files or accessing sensitive data, Falco alerts you. It catches privilege escalation attempts: if a process tries to gain root access when it shouldn't have it, you'll know immediately. And it can detect container escapes: attempts to break out of the container and access the host system.
It's like a security camera for your cluster. It won't prevent attacks, but it will tell you when they're happening. And that's critical because the faster you detect an attack, the less damage it can do.
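Falco rules are plain YAML. Here's a trimmed illustration of the kind of rule it ships with (Falco's default ruleset already includes a more complete version of this; the sketch is not a replacement for it):

```yaml
# Alert when an interactive shell starts inside any container.
- rule: Shell spawned in container
  desc: A shell was started inside a container
  condition: >
    spawned_process and container and proc.name in (bash, sh, zsh)
  output: >
    Shell in container (user=%user.name container=%container.name
    image=%container.image.repository command=%proc.cmdline)
  priority: WARNING
  tags: [container, shell]
```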
I've seen Falco detect crypto miners starting in compromised pods within seconds of execution. I've watched it catch attackers trying to escape containers in real-time. I've seen it alert on malware downloading additional payloads before the malware could execute. And I've seen it catch unauthorized access to sensitive files that would have gone unnoticed otherwise.
All in real-time. That's the difference between detecting a breach in minutes versus months.
Admission Controllers: The Gatekeeper
Admission controllers intercept requests to the API server before objects are created. They can validate, mutate, or reject requests. This is your last line of defense before something gets deployed to your cluster.
Use them to enforce policies automatically. No privileged pods: if someone tries to deploy a privileged pod, the admission controller rejects it with a clear error message. All images must be from approved registries: you can block images from Docker Hub or other public registries and only allow images from your private registry.
All pods must have resource limits. Without limits, a single pod can consume all resources on a node and starve other pods. The admission controller can enforce that every pod has CPU and memory limits defined. All pods must have security contexts: the security settings we discussed earlier should be mandatory, not optional.
Tools like OPA Gatekeeper or Kyverno make this easy. You write policies in a declarative language, and they're enforced automatically.
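For example, a Kyverno policy sketch that rejects any Pod whose containers don't declare CPU and memory limits:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-limits
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "All containers must set CPU and memory limits."
        pattern:
          spec:
            containers:
              # "?*" means the field must exist and be non-empty.
              - resources:
                  limits:
                    cpu: "?*"
                    memory: "?*"
```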
One client used admission controllers to enforce that all production pods must run as non-root, have resource limits, come from their private registry, and have security contexts defined. Developers couldn't deploy non-compliant pods even if they tried. The admission controller would reject them with a helpful error message explaining what was wrong and how to fix it.
Supply Chain Security: Trust But Verify
Your application depends on dozens or hundreds of third-party components. How do you know they're safe?
Software Bill of Materials (SBOM)
Generate an SBOM for every image. This is a complete list of all components in your image.
When a new vulnerability is announced (like Log4j), you can instantly check if you're affected by searching your SBOMs.
Without SBOMs, you're manually checking every application. With SBOMs, it's automated.
Tools like Syft generate SBOMs automatically.
Verify Dependencies
Don't just trust that npm package or Docker image. Verify it.
Use dependency scanning tools to check for:
- Known vulnerabilities
- Malicious packages
- License compliance issues
- Outdated dependencies
One client discovered they were using a compromised npm package that was exfiltrating environment variables. Dependency scanning caught it before it reached production.
The Reality Check: You Can't Do Everything
I've thrown a lot at you. If you try to implement everything at once, you'll fail.
Prioritize based on risk:
Critical (Do This Week):
Lock down API server access. This is the most important thing you can do. If your API server is publicly accessible, fix that today. Not tomorrow, today. Enable RBAC with least privilege: review who has access to what and remove unnecessary permissions. Scan container images for vulnerabilities and set up automated scanning in your CI/CD pipeline. Enable audit logging so you have visibility into what's happening in your cluster.
Important (Do This Month):
Implement pod security standards. Start with baseline and work toward restricted. Set up network policies to control traffic flow between pods. Deploy runtime security monitoring with tools like Falco. Use external secrets management instead of Kubernetes Secrets for production workloads.
Good to Have (Do This Quarter):
Implement image signing and verification. Generate SBOMs for all your images. Deploy advanced admission controllers with comprehensive policies. Create comprehensive security policies that cover all aspects of your cluster security.
Start with the critical items. They'll prevent 80% of attacks with 20% of the effort.
Measuring Success: Know If You're Secure
How do you know if your security efforts are working? Measure these:
Security Incidents
Track incidents over time. This should trend downward.
If it's trending upward, either your security isn't working or your detection is improving (which is actually a good sign).
Vulnerability Remediation Time
How long from discovering a vulnerability to fixing it?
Target:
- Critical: 24 hours
- High: 1 week
- Medium: 1 month
If you're slower than this, you're accumulating security debt.
Policy Compliance Rate
What percentage of your pods comply with security policies?
Target: 100% in production, 95%+ in non-production.
If you're below this, you have work to do.
Mean Time to Detect (MTTD)
How long does it take to detect a security incident?
Target: Under 5 minutes for critical issues.
The faster you detect, the less damage an attacker can do.
The Bottom Line
Kubernetes security is complex, but it's not impossible.
Focus on the fundamentals:
- Secure your API server
- Run pods with minimal privileges
- Scan and verify images
- Monitor runtime behavior
- Manage secrets properly
Do these well, and you'll be more secure than 90% of Kubernetes deployments.
Ignore them, and you're one misconfiguration away from being the next breach headline.
Need help securing your Kubernetes clusters? Let's talk. I've secured clusters for companies from startups to Fortune 500s. I can help you prioritize what matters and avoid wasting time on what doesn't.