Ben Zimmermann

The Dig

April 20, 2026

Thousands of Live Secrets Found Across Four Cloud Development Environments

TL;DR: I scanned 22 million public Cloud Development Environment projects across CodeSandbox, StackBlitz, CodePen, and JSFiddle with TruffleHog, found 8,792 verified, unique secrets, and made over $20,000 in bounties along the way. The most impactful finding was a GitHub employee token with write access to github/github.

This guest post by Ben Zimmermann was developed through Truffle Security's Research CFP program. Ben is a security researcher focused on credential exposure and secret scanning at scale.

Prior secret scanning research has heavily focused on Git platforms like GitHub, GitLab, and Bitbucket. But there is an entire class of development platform that has received zero systematic attention: Cloud Development Environments (CDEs).

What are Cloud Development Environments?

Cloud Development Environments (or CDEs) such as CodeSandbox, StackBlitz, CodePen, and JSFiddle let developers write and run code directly in the browser. They are used for prototyping, learning, sharing demos, and building full applications. Unlike Git platforms, CDEs have no native secret scanning integrations, no push protection, and no partner programs to automatically revoke leaked credentials. When a developer pastes an API key into a public project, it persists unless they manually delete it or change its visibility.

A live example: AWS credentials sitting in a public CodeSandbox project. You can view it at https://codesandbox.io/p/sandbox/jvkfty. Canary tokens used for demonstration.

This research set out to answer a simple question: are CDEs leaking credentials at scale? The answer is yes.

CodeSandbox stands out with one verified secret for every 1,299 sandboxes. This makes sense: CodeSandbox is the most full-featured of the four platforms, often used for building complete applications with backend services, environment files, and third-party integrations. Developers often treat sandboxes like private workspaces, even when they are publicly accessible.

CodePen had the lowest density, which also makes sense. Pens are typically small front-end snippets, less likely to include backend credentials. But at 10 million pens, even a low density produced nearly 1,000 live secrets.

Using sandbox creation metadata from CodeSandbox, the oldest sandbox with a live secret dated back to April 2018. The data also shows secret leakage increasing year over year, with 2025 seeing more than double the verified secrets compared to 2024.

Discovering Public Projects

Each platform required a different enumeration strategy. None of these platforms expose a simple "list all projects" API, so I had to find creative ways to discover content at scale.

CodeSandbox

I discovered that the platform uses a public Algolia search index to power its search functionality. This index contains metadata for every public sandbox, including sandbox IDs and creation timestamps. I wrote a script that queries this index day by day, paginating through all results for each date range to enumerate every public sandbox on the platform.

import requests

# ALGOLIA_* constants, MAX_PAGES, _timestamp_bounds and _build_params are defined elsewhere.
def enumerate_day(day, *, query="", hits_per_page=1000, session=None):
    # Assumption: fall back to a fresh session when the caller passes none.
    sess = session or requests.Session()
    start_ts, end_ts = _timestamp_bounds(day)
    params = {
        "x-algolia-application-id": ALGOLIA_APPLICATION_ID,
        "x-algolia-api-key": ALGOLIA_API_KEY,
    }
    hits = []
    for page in range(MAX_PAGES):
        encoded = _build_params(query, page, hits_per_page, start_ts, end_ts)
        response = sess.post(
            ALGOLIA_URL,
            params=params,
            json={"requests": [{"indexName": ALGOLIA_INDEX, "params": encoded}]},
        )
        page_hits = response.json()["results"][0].get("hits", [])
        hits.extend(page_hits)
        # A short page means there are no more results for this day.
        if len(page_hits) < hits_per_page:
            break
    return hits

This yielded 8,362,053 sandbox IDs. From there, I downloaded each sandbox's source code through CodeSandbox's API.
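The snippet above calls _timestamp_bounds and _build_params without showing them. As a rough sketch of the day-by-day windowing idea, assuming UTC day boundaries and Unix-second timestamps, the helpers might look something like the following. The names day_bounds, iter_days, and numeric_filter are hypothetical, and the attribute name inserted_at is a placeholder, not the index's real field.

```python
from datetime import date, datetime, timedelta, timezone


def day_bounds(day: date) -> tuple[int, int]:
    """Return (start, end) Unix timestamps for one UTC day; end is exclusive."""
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    return int(start.timestamp()), int(end.timestamp())


def iter_days(first: date, last: date):
    """Yield every date from first to last, inclusive."""
    current = first
    while current <= last:
        yield current
        current += timedelta(days=1)


def numeric_filter(attribute: str, start_ts: int, end_ts: int) -> str:
    """Build a range filter string; the attribute name is a placeholder."""
    return f"{attribute} >= {start_ts} AND {attribute} < {end_ts}"


if __name__ == "__main__":
    # Walk three days and show the window each query would cover.
    for day in iter_days(date(2018, 4, 1), date(2018, 4, 3)):
        start_ts, end_ts = day_bounds(day)
        print(day, numeric_filter("inserted_at", start_ts, end_ts))
```

Splitting the enumeration into one-day windows keeps each query's result set small, which presumably is what makes it possible to paginate through every hit for a given window.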

CodePen

There is no public index, so I took a different approach: social graph crawling. I captured the GraphQL endpoints backing CodePen's trending and search pages, used those to gather initial seed users, then recursively crawled each user's followers and following lists. This snowball effect expanded to 570,000 unique users. From there I pulled every public pen for each user, totaling 10,296,169 pens.

def _expand_followers(self, owner_id, cursor):
    # Fetch one page of followers for this owner via the GraphQL endpoint.
    payload = followers_payload(owner_id, self.follower_limit, cursor)
    resp = self.session.post(self.cfg.graphql_url, json=payload, timeout=20)
    data = resp.json().get("data", {}).get("ownerFollowers", {})
    owners = data.get("owners", []) or []
    next_cursor = data.get("pageInfo", {}).get("cursorEnd")
    # Queue every newly seen owner, tagged with how it was discovered.
    inserted = self.storage.add_owners(
        (entry["id"], entry.get("username", ""), f"followers:{owner_id}")
        for entry in owners if entry.get("id")
    )
    # No cursor means the follower list is exhausted; otherwise save progress.
    if not next_cursor:
        self.storage.mark_followers_complete(owner_id)
    else:
        self.storage.update_follower_cursor(owner_id, next_cursor)
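Stripped of pagination and storage, the snowball crawl described above is a breadth-first traversal of the follower graph. The sketch below illustrates that idea on an in-memory graph. The snowball function and its neighbors callback are hypothetical stand-ins, and in a real crawl the callback would return both followers and following for a user.

```python
from collections import deque
from typing import Callable, Iterable


def snowball(seeds: Iterable[str],
             neighbors: Callable[[str], Iterable[str]],
             max_users: int = 1_000_000) -> set[str]:
    """Breadth-first expansion from seed users over a social graph.

    max_users is a soft cap that is checked between users, not mid-expansion.
    """
    seen = set(seeds)
    queue = deque(seen)
    while queue and len(seen) < max_users:
        user = queue.popleft()
        for other in neighbors(user):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


if __name__ == "__main__":
    # Toy graph: each key maps to the accounts reachable from that user.
    graph = {
        "alice": ["bob", "carol"],
        "bob": ["alice", "dave"],
        "carol": [],
        "dave": ["erin"],
        "erin": [],
        "zed": ["alice"],  # never reached: nobody in the crawl links to zed
    }
    print(sorted(snowball(["alice"], lambda u: graph.get(u, []))))
```

One limitation of this approach is visible in the toy graph: accounts that are not connected to any seed are never discovered, so the resulting user count is a lower bound.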

JSFiddle & StackBlitz

Neither platform has a public search index or a social graph to crawl. Instead, I used GitHub usernames as a bridge. Developers frequently reuse the same username across platforms, so a known GitHub username is a reasonable guess for a JSFiddle or StackBlitz profile. I pulled 7.2 million GitHub usernames from BigQuery's public GitHub dataset (githubarchive) and checked them against each platform's profile pages to find valid accounts. For JSFiddle, once I had valid users I hit their public API endpoint to list and download their fiddles, yielding 608,258 fiddles. For StackBlitz, I used the same username correlation to enumerate 3,014,469 projects.
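The username correlation step can be sketched as a simple filter over candidate names. This is a minimal illustration, not the original tooling: find_valid_profiles is a hypothetical name, the URL template is a placeholder, and it assumes that a nonexistent profile answers with a non-200 status. The HTTP call is injected so the logic can be exercised without touching a real platform.

```python
from typing import Callable, Iterable, Iterator


def find_valid_profiles(usernames: Iterable[str],
                        url_template: str,
                        get_status: Callable[[str], int]) -> Iterator[tuple[str, str]]:
    """Yield (username, url) for every candidate whose profile URL returns 200."""
    for name in usernames:
        url = url_template.format(username=name)
        if get_status(url) == 200:
            yield name, url


if __name__ == "__main__":
    # Stub standing in for a real request, for example
    # requests.head(url, allow_redirects=True, timeout=10).status_code
    existing = {"https://cde.example/@octocat"}

    def fake_status(url: str) -> int:
        return 200 if url in existing else 404

    candidates = ["octocat", "no-such-user"]
    for name, url in find_valid_profiles(candidates, "https://cde.example/@{username}", fake_status):
        print(name, url)
```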

In total, I enumerated and downloaded over 22.2 million projects across all four platforms.

Scanning with TruffleHog

All project content was downloaded to a VPS, and from there I scanned everything locally using TruffleHog with the --only-verified flag. I added a hook to notify me through a Discord webhook whenever a download completed. For platforms like CodeSandbox, I used proxy rotation with round-robin cycling to distribute the roughly 8 million download requests:

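import threading
from itertools import cycle

import requests
from requests.adapters import HTTPAdapter

if proxies:
    # Build one pooled session per proxy.
    sessions = []
    for proxy in proxies:
        s = requests.Session()
        s.proxies = {"http": proxy, "https": proxy}
        adapter = HTTPAdapter(pool_connections=workers, pool_maxsize=workers)
        for scheme in ("https://", "http://"):
            s.mount(scheme, adapter)
        sessions.append(s)
    _rr = cycle(sessions)
    _rr_lock = threading.Lock()

    def next_session():
        # The lock keeps round-robin selection safe across worker threads.
        with _rr_lock:
            return next(_rr)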

Results were deduplicated by secret value to avoid inflating counts. In total, this amounted to terabytes of source code across 22 million projects.
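Deduplicating by secret value can be sketched as a single pass over TruffleHog's JSON-lines output. The sketch below is an illustration under assumptions, not the pipeline used in this research: it assumes the filesystem subcommand with the --only-verified and --json flags, and that each finding carries Verified and Raw fields, so check the field names against the TruffleHog version in use. The function names are hypothetical.

```python
import json
from typing import Iterable


def trufflehog_command(path: str) -> list[str]:
    """Command line for a verified-only filesystem scan with JSON output."""
    return ["trufflehog", "filesystem", path, "--only-verified", "--json"]


def unique_verified(lines: Iterable[str]) -> list[dict]:
    """Parse JSON-lines output and keep the first finding per secret value."""
    seen: set[str] = set()
    unique = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and any non-JSON output
        finding = json.loads(line)
        raw = finding.get("Raw")  # assumed field holding the secret value
        if not finding.get("Verified") or not raw or raw in seen:
            continue
        seen.add(raw)
        unique.append(finding)
    return unique


if __name__ == "__main__":
    sample = [
        '{"DetectorName": "AWS", "Verified": true, "Raw": "secret-a"}',
        '{"DetectorName": "AWS", "Verified": true, "Raw": "secret-a"}',
        '{"DetectorName": "Stripe", "Verified": false, "Raw": "secret-b"}',
    ]
    print(trufflehog_command("./downloads"))
    print(len(unique_verified(sample)))
```

Keying on the secret value rather than on file path or project ID means the same credential pasted into several projects is counted once.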

LLM-Assisted Triage

The 8,792 leaked secrets spanned dozens of SaaS and cloud providers, which meant I needed an efficient way to analyze them and identify the most critical findings. Unlike Git platforms where committer email addresses help attribute secrets to individuals or organizations, CDEs typically don't expose this metadata, making triage harder.

I used Claude Code to write Python scripts that pulled metadata for each secret: account info, permission scopes, accessible resources, and organization details. A typical prompt looked like:

Generate a structured Python script that takes {service}.jsonl as input, calls the {service} API to pull metadata (caller identity, permission scopes, accessible resources, organization info), and streams results to an output file. Use Firecrawl MCP to reference {service}'s API documentation for best practices.

Firecrawl is an MCP server that lets Claude Code search and extract web content, so it could pull up-to-date API docs for each service before generating the final scripts.
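The shape of script that the prompt asks for can be sketched as a streaming loop: read one finding per line, look up its metadata, and write one enriched record per line. This is a skeleton under assumptions, not one of the generated scripts. The enrich function and the fetch_metadata callback are hypothetical, and the service-specific API call is left to the callback.

```python
import io
import json
from typing import Callable, IO


def enrich(src: IO[str], dst: IO[str], fetch_metadata: Callable[[dict], dict]) -> int:
    """Stream findings from src, attach metadata, and write JSON lines to dst."""
    written = 0
    for line in src:
        if not line.strip():
            continue
        finding = json.loads(line)
        try:
            meta = fetch_metadata(finding)
        except Exception as exc:  # one failed lookup should not stop the run
            meta = {"error": str(exc)}
        dst.write(json.dumps({"finding": finding, "metadata": meta}) + "\n")
        written += 1
    return written


if __name__ == "__main__":
    # In-memory stand-ins for {service}.jsonl and the output file.
    source = io.StringIO('{"id": 1}\n{"id": 2}\n')
    output = io.StringIO()
    count = enrich(source, output, lambda f: {"scopes": ["read"], "org": "example"})
    print(count)
    print(output.getvalue())
```

Streaming line by line keeps memory use flat and means a crash partway through still leaves usable output for the records already processed.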

These scripts produced structured output that I could then feed into a local instance of gpt-oss-20b. The local model ingested the metadata and surfaced findings that warranted deeper manual investigation, like tokens with admin scopes or credentials tied to large organizations. All flagged findings were then manually verified before disclosure.

For high-impact findings I disclosed directly to the affected organizations. For the rest, I worked with Truffle Security to coordinate bulk outreach. Truffle Security helped facilitate contact with SaaS providers including AWS, GitHub, Anthropic, OpenAI, MongoDB, Stripe, SendGrid, Twilio, and others to revoke their clients' exposed credentials. Truffle Security also initiated outreach to all four CDE platforms to share the findings and discuss potential mitigations.

Access to github/github

On CodeSandbox, I found a public sandbox containing a GitHub OAuth token belonging to a GitHub employee, inside an index.ts file.

The token had repo, workflow, codespace, gist, and read:org scopes. When I tested it against the GitHub API, the response confirmed push access to github/github, the private repository that contains GitHub.com's production source code.

The response identified the repository as ID 3, created on October 29, 2007, with the description "You're lookin' at it." The token granted access to over 74,000 repositories across 26+ organizations, including Microsoft, Azure, GitHub Actions, and GitHub's internal early-access and interview organizations. With workflow permissions on top of write access, this token could have been used to modify GitHub Actions pipelines, inject code into GitHub's production codebase, or pivot into downstream supply chain attacks.
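For readers who want to assess their own tokens, the two signals described above can be read from a GitHub REST API response: the X-OAuth-Scopes response header lists a token's scopes, and the repository object returned for an authenticated request includes a permissions object. The sketch below only parses those values. The function names are hypothetical and the sample data is made up for illustration, not taken from the actual finding.

```python
def parse_scopes(header_value: str) -> set[str]:
    """Split an X-OAuth-Scopes header value such as 'repo, workflow' into a set."""
    return {s.strip() for s in header_value.split(",") if s.strip()}


def summarize_access(scopes_header: str, repo_json: dict) -> dict:
    """Combine token scopes with the repository's permissions object."""
    scopes = parse_scopes(scopes_header)
    perms = repo_json.get("permissions", {})
    can_push = bool(perms.get("push"))
    return {
        "scopes": sorted(scopes),
        "can_push": can_push,
        # Editing workflow files needs push access plus the workflow scope.
        "can_edit_workflows": can_push and "workflow" in scopes,
    }


if __name__ == "__main__":
    # Illustrative values only. A real check would request
    # https://api.github.com/repos/OWNER/REPO with an Authorization header
    # and read the header and JSON body from that response.
    header = "repo, workflow, codespace, gist, read:org"
    body = {"id": 1, "permissions": {"admin": False, "push": True, "pull": True}}
    print(summarize_access(header, body))
```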

I reported this through GitHub's bug bounty program on HackerOne. GitHub triaged and resolved the issue, and awarded a $20,000 bounty.

Other Notable Findings

On one of the CDE platforms, I discovered a GitHub personal access token belonging to a Home Depot employee. The token provided admin access to 64 repositories and push access across 664 repositories total, covering internal infrastructure, authentication systems, and secrets management. The token had been publicly exposed for approximately one year.

Separately, I found an SSH private key on another CDE platform that authenticated as a Red Hat employee with write access to eclipse-che/che, the direct upstream repository for Red Hat OpenShift Dev Spaces. I confirmed write capability by running:

ssh -i key git@github.com git-receive-pack eclipse-che/che.git

Instead of returning a read-only error, GitHub returned the full ref advertisement with delete-refs capability, confirming the key could push to and modify the repository. A threat actor with this key could have pushed malicious code into Red Hat's commercial product. I reported this to both Red Hat and the Eclipse Foundation, and Red Hat added me to their security acknowledgment page.
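The ref advertisement mentioned above uses Git's pkt-line framing: each line is prefixed with a four-digit hexadecimal length that includes the prefix itself, and the first line carries the capability list after a NUL byte. A minimal sketch of extracting that list follows. The function names are hypothetical, and the sample advertisement is constructed for the demo, not captured from a real server.

```python
def read_pkt_lines(data: bytes) -> list[bytes]:
    """Decode pkt-line payloads up to the first flush packet ('0000')."""
    lines, pos = [], 0
    while pos + 4 <= len(data):
        length = int(data[pos:pos + 4], 16)
        if length < 4:  # 0000 is the flush-pkt; other short values are special packets
            break
        lines.append(data[pos + 4:pos + length])
        pos += length
    return lines


def capabilities(advertisement: bytes) -> set[str]:
    """Return the capability names attached to the first advertised ref."""
    lines = read_pkt_lines(advertisement)
    if not lines or b"\0" not in lines[0]:
        return set()
    _, caps = lines[0].split(b"\0", 1)
    return set(caps.decode().split())


def pkt(payload: bytes) -> bytes:
    """Frame a payload as a pkt-line (helper for building the demo input)."""
    return f"{len(payload) + 4:04x}".encode() + payload


if __name__ == "__main__":
    sample = (
        pkt(b"0" * 40 + b" refs/heads/main\0report-status delete-refs ofs-delta\n")
        + pkt(b"1" * 40 + b" refs/heads/dev\n")
        + b"0000"
    )
    print("delete-refs" in capabilities(sample))
```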

Takeaways

CDEs are a blind spot. Unlike Git platforms, none of the four platforms scanned have secret scanning, push protection, or partner revocation programs. There is no automated detection or revocation to catch a pasted API key.

Disclosure at this scale is its own challenge. Coordinating revocation across dozens of SaaS providers and four CDE platforms required automation, bulk outreach, and direct engagement with security teams. Without Truffle Security's help facilitating contact with providers, most of these credentials would still be live.

The type of secret correlates directly with what each platform supports. CodeSandbox allows full backend environments, so credentials tend to be high-impact: database connections, cloud keys, and service accounts. CodePen is frontend-only, so leaks skew toward public API keys for weather or mapping services. The more capable the platform, the more sensitive the credentials found on it.

The pattern across all of this research is consistent. Wherever developers write code, secrets typically follow. Git platforms have started to build defenses. CDEs have not.

For developers, the immediate step is to audit any public CDE projects and rotate credentials that may have been exposed. Going forward, treat public sandboxes the same way you would a public GitHub repository. Never paste real credentials into one, use the platform's built-in environment variable UI where available, and check your project visibility before sharing. For the platforms themselves, secret scanning on publish, push protection, and partner revocation programs are all solved problems on Git platforms and could be adopted here.
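As a starting point for auditing a project before sharing it, a quick pattern check can flag lines that look like credentials. The sketch below is a plain regular-expression pass, not TruffleHog's detection or verification logic. It covers only three well-known formats, so a clean result does not mean a project is free of secrets. All names in it are hypothetical.

```python
import re

# A few widely documented credential formats; far from exhaustive.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "github_classic_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_candidates(text: str) -> list[tuple[str, int]]:
    """Return (pattern_name, line_number) for each line that looks like a secret."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((name, lineno))
    return hits


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
    source = 'const key = "AKIAIOSFODNN7EXAMPLE";\nconsole.log("hello");\n'
    for name, lineno in find_candidates(source):
        print(f"line {lineno}: possible {name}")
```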