API Keys in GitHub — How Leaked Credentials Cause Cloud Breaches

TLDR: Automated bots scan GitHub continuously, 24 hours a day, for AWS keys, API tokens, database credentials, and private keys. The median time between a credential being committed to a public repository and first being used by an attacker is under 60 seconds according to GitGuardian’s 2025 State of Secrets Sprawl Report. This post covers how it happens, why deleting the file doesn’t fix it, how to scan your own repositories right now, and how to implement secrets management that prevents it from ever occurring.

The 16-Minute Timeline

The commit was pushed at 11:47 PM. A developer, working late, committed a config file to a public GitHub repository. The file contained an AWS access key — AKIA... format, unmistakable to any scanner.

By 12:03 AM, 16 minutes later, an automated bot had found the key, authenticated to AWS, and begun making API calls. Not probe calls. Production calls: EC2 RunInstances requests spinning up GPU instances suited to cryptocurrency mining across multiple regions.

The developer woke up at 7 AM to an AWS notification about unusual billing activity. By that point, 340 instances were running across 6 AWS regions. AWS suspended the account at around the same time. The 36-hour bill: $80,000.

AWS partially credited the account after the developer filed a detailed incident report demonstrating the key was leaked. Partial. The startup absorbed a five-figure cost from one late-night commit.

This isn’t a cautionary tale — it’s a routine occurrence. GitGuardian detected over 12.8 million secrets exposed in public GitHub commits in 2023 alone. The vast majority of those developers didn’t know until it was too late.

How Automated Scanners Work

The bots scanning GitHub for credentials are not sophisticated. They don’t need to be.

AWS access key IDs follow a predictable format: long-term keys begin with AKIA followed by 16 uppercase alphanumeric characters. A simple regular expression finds them instantly, and the matching secret key typically sits beside the ID in the same file. The same applies to Google Cloud service account JSON files, Stripe secret keys (sk_live_...), GitHub personal access tokens, Twilio auth tokens, and dozens of other credential formats. All have recognisable patterns that can be matched with a regex in milliseconds.
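
To make the pattern matching concrete, here is a minimal Python sketch of the kind of regex scan described above. The AWS pattern follows the format given in this section; the Stripe and GitHub patterns are simplified assumptions about prefix and length, and the names SECRET_PATTERNS and find_secrets are made up for illustration. The sample string uses the placeholder key ID from AWS's own documentation, not a real credential.

```python
import re

# Illustrative patterns only. The AWS pattern follows the format described
# above; the other two are simplified assumptions about prefix and length.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "stripe_live_secret_key": re.compile(r"\bsk_live_[0-9A-Za-z]{24,}\b"),
    "github_classic_token": re.compile(r"\bghp_[0-9A-Za-z]{36}\b"),
}


def find_secrets(text):
    """Return (pattern_name, matched_string) pairs found in text."""
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group(0)))
    return hits


sample = "const awsKey = 'AKIAIOSFODNN7EXAMPLE';  // placeholder from AWS docs"
print(find_secrets(sample))
# prints [('aws_access_key_id', 'AKIAIOSFODNN7EXAMPLE')]
```

Production scanners use far larger pattern sets plus entropy checks and live verification, so treat this as an illustration of why discovery is cheap rather than as a scanner.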

Security researchers and tools like TruffleHog and GitGuardian scan for these patterns legitimately, to help developers find and remediate exposures. Attackers run the same patterns continuously against GitHub’s public commit stream by polling the GitHub Events API, which exposes public push events shortly after they happen.

The entire discovery-to-exploitation cycle is automated. The attacker may never manually review the credential they use. A bot finds it and validates it with a lightweight AWS API call (sts:GetCallerIdentity, which needs no IAM permissions and so succeeds for any live key). If the key is valid, it goes into a queue and exploitation begins automatically.
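
The same call is useful defensively: if you find an old key in your own history, you can check whether it still authenticates. The sketch below assumes boto3 is installed; the function name caller_identity and the client_factory parameter are hypothetical, the latter added only so the function can be exercised with a stub instead of a real AWS call.

```python
def caller_identity(access_key_id, secret_access_key, client_factory=None):
    """Return the ARN the key pair authenticates as, or None if the check fails."""
    if client_factory is None:
        import boto3  # imported lazily so a stub can be injected in tests
        client_factory = boto3.client
    sts = client_factory(
        "sts",
        region_name="us-east-1",
        aws_access_key_id=access_key_id,
        aws_secret_access_key=secret_access_key,
    )
    try:
        return sts.get_caller_identity()["Arn"]
    except Exception:
        # Any failure (rejected key, network error) is treated as "not confirmed live".
        return None
```

A returned ARN means the key is still live and should be rotated at once. A None result is not proof the key is dead, since the check can also fail for network reasons.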

Why Deleting the File Doesn’t Help

This is the most common mistake after a credential exposure is discovered: the developer deletes the file, or removes the key from the file, commits the fix, and considers the problem resolved.

Git history is effectively append-only. Every commit stays in the repository’s history unless that history is deliberately rewritten. The deletion commit adds a new record that the key is gone, but the original commit where the key appeared is still fully accessible to anyone who clones the repository or browses the commit history.

# Anyone can recover the exposed key from history
git log --all --full-history -- config.js
git show <commit-hash>:config.js
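
The effect is easy to reproduce in a throwaway repository. This Python sketch assumes git is installed and on the PATH; the key is a fake value built for the demo, and the helper names git and demo_history_retains_secret are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run a git command inside repo and return its standard output."""
    result = subprocess.run(
        ["git", "-c", "user.name=demo", "-c", "user.email=demo@example.com",
         "-c", "commit.gpgsign=false", *args],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout


def demo_history_retains_secret():
    """Commit a fake key, remove it in a second commit, then look for it again."""
    fake_key = "AKIA" + "EXAMPLEEXAMPLE00"  # fake value, not a real credential
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init", "--quiet")
        (repo / "config.js").write_text(f"const key = '{fake_key}';\n")
        git(repo, "add", "config.js")
        git(repo, "commit", "--quiet", "-m", "add config")
        # The "fix": remove the key from the file and commit again.
        (repo / "config.js").write_text("const key = process.env.MY_KEY;\n")
        git(repo, "commit", "--quiet", "-am", "remove key")
        in_worktree = fake_key in (repo / "config.js").read_text()
        in_history = fake_key in git(repo, "log", "-p", "--all")
    return in_worktree, in_history


print(demo_history_retains_secret())
```

The first value reports whether the key is still in the working file, the second whether it is still in the history. After the fix commit the file is clean, yet the history still contains the key.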

Even if you rewrite history and force-push, the old commit may remain reachable through GitHub’s cached views and existing forks, and automated systems may already have cloned it. A fix commit does nothing to protect against a bot that found the key in the 60 seconds between your original push and your correction.

The only correct response to an exposed credential is to rotate it immediately — invalidate the old key and generate a new one. The repository history is permanent and cannot be fully sanitised without a rewrite using git filter-repo, which itself requires coordination across everyone who has cloned the repository.

How to Scan Your Own Repositories Right Now

Before implementing preventive controls, scan your existing history. Keys committed months or years ago may still be active and may have been silently in use by an attacker.

TruffleHog scans the entire commit history of a repository for secrets:

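# Install the TruffleHog CLI (Homebrew shown; other install methods are in the project README)
brew install trufflehog

# Scan the full history of the local repository in the current directory
trufflehog git file://. --json

# Scan a remote GitHub repository
trufflehog git https://github.com/yourorg/yourrepo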

GitGuardian offers a free tier that integrates with GitHub and scans your repositories and their histories, sending alerts for any secrets it finds. It covers over 350 secret types across all major services. Setup takes under 10 minutes at dashboard.gitguardian.com.

GitHub’s native secret scanning — if your repository is on GitHub, navigate to Settings → Security → Secret scanning. GitHub scans repositories for known secret formats and alerts repository owners. For public repositories this is enabled automatically; for private repositories it requires GitHub Advanced Security on paid plans.

Run at least one of these against every repository that has ever been public, or that any external contributor has had access to.

How to Properly Manage Secrets

The correct approach is to ensure secrets never exist in your codebase in the first place.

Environment variables are the minimum baseline. Secrets are stored in the operating environment and accessed via process.env.AWS_ACCESS_KEY_ID (Node), os.environ['AWS_ACCESS_KEY_ID'] (Python), or the equivalent for your stack. They never appear in code. The .env file used for local development is added to .gitignore on day one and never committed.
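
One way to apply this in Python is a small helper that fails fast when a secret is missing, so a misconfigured environment shows up at startup rather than as a confusing error later. This is a minimal sketch; require_env and the variable name MYAPP_DB_PASSWORD are hypothetical.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or fail fast if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Typical use at application startup (hypothetical variable name):
# db_password = require_env("MYAPP_DB_PASSWORD")
```

The error message names the variable but never echoes a value, which keeps secrets out of logs.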

AWS Secrets Manager is the correct solution for production. Secrets are stored in AWS, versioned, access-controlled via IAM, and automatically rotated for supported services (RDS, Redshift, and others). Applications retrieve secrets at runtime via SDK calls rather than reading them from environment variables, which reduces the risk of accidental exposure through deployment pipelines or CI/CD logs.

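import boto3
import json

# boto3 takes its own credentials from the runtime environment (for example
# an IAM role), so no access key appears in code or config
client = boto3.client('secretsmanager', region_name='ap-south-1')
# Fetch the current version of the secret at runtime
secret = client.get_secret_value(SecretId='prod/myapp/database')
# SecretString holds the JSON document stored for this secret
credentials = json.loads(secret['SecretString'])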

HashiCorp Vault is the provider-agnostic alternative — useful if you’re multi-cloud or want to centralise secrets management across AWS and non-AWS services. Vault’s documentation covers setup for most deployment scenarios.

For CI/CD pipelines, use your platform’s native secrets storage — GitHub Actions Secrets, GitLab CI/CD Variables, or equivalent — rather than hardcoding values in workflow files.

Pre-Commit Hooks: Catching Secrets Before They’re Pushed

The most effective prevention is stopping the commit before it reaches the remote repository.

detect-secrets by Yelp is a lightweight pre-commit hook that scans staged files for secrets before allowing a commit to proceed:

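# Install detect-secrets and the pre-commit framework that runs the hook
pip install detect-secrets pre-commit

# Generate baseline (records existing findings so only new secrets block a commit)
detect-secrets scan > .secrets.baseline

# Add to pre-commit config
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.5.0
    hooks:
      - id: detect-secrets
        args: ['--baseline', '.secrets.baseline']

# Activate the hook (run once in each clone)
pre-commit install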

git-secrets is AWS’s own tool, specifically designed to prevent AWS credentials from being committed:

brew install git-secrets       # macOS
git secrets --install          # install hooks in current repo
git secrets --register-aws     # add AWS credential patterns

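Both tools can be enforced across a team by including them in the project’s .pre-commit-config.yaml and requiring pre-commit hooks as part of the development environment setup. Because any client-side hook can be skipped with git commit --no-verify, treat hooks as a first line of defence and keep server-side scanning enabled as well.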
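
Under the hood, a pre-commit hook is just a program that inspects the staged changes and exits non-zero to block the commit. The Python sketch below shows that mechanism for AWS access key IDs only. It is an illustration, not a replacement for detect-secrets or git-secrets, and the function names are hypothetical.

```python
import re
import subprocess
import sys

AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def added_lines(diff_text):
    """Yield lines added in a unified diff, skipping the '+++' file headers."""
    for line in diff_text.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            yield line[1:]


def find_staged_keys(diff_text):
    """Return AWS access key IDs that appear in added lines of the diff."""
    return [m.group(0) for line in added_lines(diff_text)
            for m in AWS_KEY_ID.finditer(line)]


def main():
    """Scan staged changes; return 1 (block the commit) if a key ID is found."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        check=True, capture_output=True, text=True,
    ).stdout
    keys = find_staged_keys(diff)
    if keys:
        print(f"Blocked: {len(keys)} possible AWS access key ID(s) in staged changes",
              file=sys.stderr)
        return 1
    return 0

# When saved as an executable .git/hooks/pre-commit script, end the file with:
# sys.exit(main())
```

Only added lines are checked, so removing a key from a file does not itself block the commit that removes it.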

What to Do If You’ve Already Exposed a Key

Speed matters. Every minute the key remains valid is another minute of potential exploitation.

Step 1: Rotate immediately. In the AWS console, navigate to IAM → Users → [user] → Security Credentials → Access Keys. Create a new key, update all systems using the old key, then deactivate and delete the old one. Deactivating before the new key is in use causes an outage, so make the switch quickly, and do not leave both keys active longer than necessary. If the old key is already being abused, deactivate it first and accept the outage.
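
The rotation order in Step 1 can also be scripted. This is a minimal sketch assuming boto3 and an IAM client with permission to manage the user's keys; rotate_access_key and the deploy_new_key callback are hypothetical names, and the callback stands in for whatever updates your systems with the new key.

```python
def rotate_access_key(iam, user_name, old_key_id, deploy_new_key):
    """Create a replacement key, hand it to deploy_new_key, then retire the old key."""
    # An IAM user can hold at most two access keys, so this call fails if
    # the user already has two.
    new_key = iam.create_access_key(UserName=user_name)["AccessKey"]

    # Caller-supplied step: push the new key to every system that used the old one.
    deploy_new_key(new_key["AccessKeyId"], new_key["SecretAccessKey"])

    # Deactivate first (reversible), then delete (permanent).
    iam.update_access_key(UserName=user_name, AccessKeyId=old_key_id, Status="Inactive")
    iam.delete_access_key(UserName=user_name, AccessKeyId=old_key_id)
    return new_key["AccessKeyId"]


# Example wiring (not run here):
# import boto3
# rotate_access_key(boto3.client("iam"), "deploy-bot", "AKIA...", store_in_secrets_manager)
```

In a real incident you may prefer to pause between deactivation and deletion to confirm nothing legitimate broke.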

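Step 2: Audit CloudTrail for the exposure window. In CloudTrail → Event History, filter by the compromised Access Key ID and review every API call made with it, especially RunInstances, CreateUser, AttachUserPolicy, and any calls from unfamiliar IP addresses or regions. Event History shows management events from the last 90 days for the selected region, so repeat the search in each region. S3 object reads such as GetObject on sensitive buckets are data events and appear only if a trail was already logging them.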
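
The same audit can be run through the API. The sketch below assumes boto3 and a CloudTrail client created for one region; summarize_key_activity is a hypothetical helper that counts calls made with a given access key, grouped by event name, region, and source IP.

```python
import json
from collections import Counter


def summarize_key_activity(cloudtrail, access_key_id, start, end):
    """Count API calls made with one access key, keyed by (event, region, source IP)."""
    paginator = cloudtrail.get_paginator("lookup_events")
    pages = paginator.paginate(
        LookupAttributes=[
            {"AttributeKey": "AccessKeyId", "AttributeValue": access_key_id}
        ],
        StartTime=start,
        EndTime=end,
    )
    counts = Counter()
    for page in pages:
        for event in page["Events"]:
            # CloudTrailEvent is a JSON string holding the full event record.
            detail = json.loads(event["CloudTrailEvent"])
            key = (event["EventName"], detail.get("awsRegion"), detail.get("sourceIPAddress"))
            counts[key] += 1
    return counts


# Example wiring (not run here); lookup_events is regional, so repeat per region:
# import boto3
# client = boto3.client("cloudtrail", region_name="us-east-1")
# summarize_key_activity(client, "AKIA...", window_start, window_end)
```

Sorting the result by count quickly surfaces bursts such as hundreds of RunInstances calls from a single unfamiliar address.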

Step 3: Check for backdoors. A sophisticated attacker who has had access to your AWS account may have created new IAM users, roles, or Lambda functions to maintain persistence after the original key is rotated. Review IAM → Users and IAM → Roles for any entities created during the exposure window.
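
For accounts with many identities, the review in Step 3 can be narrowed programmatically. This minimal sketch assumes boto3 and an IAM client; created_in_window is a hypothetical helper, and start and end must be timezone-aware datetimes because boto3 returns CreateDate values with a timezone.

```python
def created_in_window(iam, start, end):
    """Return (kind, name, created) for IAM users and roles created between start and end."""
    sources = (
        ("user", "list_users", "Users", "UserName"),
        ("role", "list_roles", "Roles", "RoleName"),
    )
    found = []
    for kind, operation, list_key, name_key in sources:
        for page in iam.get_paginator(operation).paginate():
            for item in page[list_key]:
                if start <= item["CreateDate"] <= end:
                    found.append((kind, item[name_key], item["CreateDate"]))
    return found


# Example wiring (not run here):
# import boto3
# from datetime import datetime, timezone
# start = datetime(2025, 1, 1, 23, 47, tzinfo=timezone.utc)
# end = datetime(2025, 1, 2, 7, 0, tzinfo=timezone.utc)
# created_in_window(boto3.client("iam"), start, end)
```

This covers only users and roles. New access keys on existing users, changed policies, and Lambda functions need separate checks.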

Step 4: File an AWS support ticket. If significant charges were incurred, AWS has a process for reviewing and crediting costs from clear credential theft cases. Document the commit timestamp, the CloudTrail evidence, and the remediation steps taken. Outcomes vary but AWS does credit legitimate cases.

Get Your Secrets Management Reviewed

A single committed credential is one of the highest-risk events in a startup’s security history — not because of what it costs immediately, but because of what may have happened during the time between exposure and discovery that you don’t know about yet.

A cloud security review covers your secrets management practices, IAM configuration, CloudTrail coverage, and the monitoring gaps that would prevent you from detecting a compromise quickly. If you want to understand your current exposure, get in touch or explore our services. More on how we work is on our about page.

More security guides for developers and founders on the Kuboid blog.