tl;dr There are thousands of live API credentials and passwords in public GitHub comments. Unlike accidentally committing a secret to git, GitHub users are inserting passwords into text boxes and publicly posting them for all to see. TruffleHog now supports scanning GitHub issues, pull requests and comments.
Developers accidentally commit secrets to git repositories constantly. And it makes sense – an engineer working on a new integration hardcodes an API key and forgets to remove it before committing their changes. Fortunately, there’s an entire ecosystem of tooling to help prevent these types of leaks. But what about outside of git commits?
How often are GitHub users leaking secrets in other places?
Turns out it’s quite often.
Using TruffleHog, we sampled a small subset of GitHub’s public Pull Request and Issue comments and discovered 721 live API keys and passwords. By extrapolation, we can reasonably conclude that many thousands of live secrets currently exist in public comments.
In the past, we’ve researched secrets leakage in code repositories, but this research didn’t focus on code.
Instead, we exclusively reviewed cases where users, often knowingly, copied and pasted their credentials into a text box and clicked “Comment”.
While this isn’t surprising, it’s important to keep in mind that although tools like TruffleHog can help with secrets leakage, we must continue to educate developers on the importance of guarding secrets.
Human users authored nearly all comments (97%) containing leaked secrets in our dataset. While we identified a few cases of automated jobs (bots) publicly leaking secrets, they did not represent a statistically significant amount.
GitHub’s comment metadata indicates the commenting author’s relationship to the repository. A commenter can be the repository owner, a member of the repository owner’s organization, a collaborator/contributor, or have no relation.
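GitHub’s REST API reports this relationship in each comment’s author_association field. As a minimal sketch, assuming the comments have already been fetched as dictionaries in that shape, a breakdown by relationship could be tallied as shown here. The relation_breakdown helper, the grouping of values into four buckets, and the sample data are illustrative assumptions, not the code used for this research.

```python
from collections import Counter

# Values GitHub documents for author_association, mapped onto the four
# relationships described above. The mapping itself is an assumption.
RELATION_GROUPS = {
    "OWNER": "owner",
    "MEMBER": "organization member",
    "COLLABORATOR": "collaborator/contributor",
    "CONTRIBUTOR": "collaborator/contributor",
    "FIRST_TIME_CONTRIBUTOR": "collaborator/contributor",
    "FIRST_TIMER": "no relation",
    "MANNEQUIN": "no relation",
    "NONE": "no relation",
}


def relation_breakdown(comments):
    """Count comments per relationship group (hypothetical helper)."""
    counts = Counter()
    for comment in comments:
        association = comment.get("author_association", "NONE")
        # Unknown values fall back to "no relation".
        counts[RELATION_GROUPS.get(association, "no relation")] += 1
    return counts


# Made-up sample data in the shape of GitHub comment objects.
sample = [
    {"author_association": "NONE"},
    {"author_association": "NONE"},
    {"author_association": "OWNER"},
    {"author_association": "CONTRIBUTOR"},
]

for relation, count in relation_breakdown(sample).most_common():
    print(f"{relation}: {count}")
```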
The majority of commenters had no relation to the repository.
In addition, we found several examples of users with no association to the repository leaking keys while looking for support.
Below, we graphed all instances of a specific provider’s secrets leaking in PRs and Issues on their own repositories.
One third of all repositories containing secrets in PR and/or Issue comments also had secrets lurking in git history. While not surprising, this establishes a pattern of insecurity across the repository’s community, since the same individual did not necessarily leak both secrets.
Whether the user realized their mistake, or a fellow developer reminded them, many users edited their original comment to remove the exposed secret. Unfortunately, unless the user deleted their comment, the prior edits remained in the comment history.
In the example below, the user edited their original comment to remove the exposed key.
The drop down next to the “edited” tag revealed prior versions, including the original containing the valid API key.
This is a common behavior we’ve seen on other platforms too, such as wiki pages.
About 10% of all live secrets lived in past comment versions.
GitHub comments are markdown-rendered. Users can input any markdown-compatible text, including code blocks. The majority of comments containing leaked secrets did not leak inside of a code block (much to our surprise!). Instead, users manually typed (or copy/pasted) their secret directly into a plain text box and treated the secret like any other word. This further cements the idea that leaking secrets in comments is a fundamentally different behavior than leaking secrets in code.
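One rough way to tell the two cases apart is to strip code from the comment body and check whether the secret still appears in what remains. The sketch below assumes a simplified view of markdown: it only recognizes fenced blocks and inline backtick spans, not indented code blocks, so it approximates a real markdown parser. The secret_in_plain_text name is hypothetical.

```python
import re

# Simplified markdown code detection (an assumption, not full CommonMark):
# fenced blocks delimited by ``` or ~~~, and single-backtick inline spans.
FENCED_BLOCK = re.compile(r"```.*?```|~~~.*?~~~", re.DOTALL)
INLINE_CODE = re.compile(r"`[^`\n]+`")


def secret_in_plain_text(comment_body, secret):
    """Return True if the secret appears outside any code block or span."""
    without_code = FENCED_BLOCK.sub(" ", comment_body)
    without_code = INLINE_CODE.sub(" ", without_code)
    return secret in without_code


# Made-up comment bodies with a made-up token.
plain = "My token is abc123 and the request still fails."
fenced = "Here is my config:\n```\nTOKEN=abc123\n```\nAny ideas?"

print(secret_in_plain_text(plain, "abc123"))
print(secret_in_plain_text(fenced, "abc123"))
```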
A Pull Request provides developers with the opportunity to suggest changes to a code base. On GitHub, multiple developers often engage in a conversation about the suggestions via comments. The comment field is an HTML text box; users can upload files, link to new sites, quote lines of code, and generally add any text they want.
Developers use GitHub issues to file bug reports, submit new feature requests, or engage with the repository maintainers when they do not want to submit a Pull Request.
Similar to the PR comment field, the Issue comment field allows users to add most types of HTML content, including freeform text.
Our goal was to evaluate how often developers leak secrets by typing them into a textbox (not committing them to git). We built our data set by downloading a sample of a couple hundred million GitHub comments dating from 2012 to 2022. There are billions of public comments, so we only sampled a small portion of the total volume. We included new Issue posts, Issue Comments and Pull Request Comments.
We did not scan any code referenced inside a Pull Request comment, since referenced code comes directly from the git committed code base. Instead, we only scanned text inputted by a user.
After pulling down each comment, we ran our open-source secret scanner, TruffleHog, against the comment text.
We only included secrets that TruffleHog verified as live. Our headline would have been a lot cooler if we included expired API keys (Hundreds of Thousands of Comments Leak API Keys), but expired secrets rarely present a meaningful security vulnerability. We’re laser-focused on reducing false positives and only reporting meaningful results to our users.
After scanning each text block, we compiled metadata about the comments containing a live API key/secret and analyzed the results.
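The scan-then-verify flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not TruffleHog’s implementation: it matches only the AWS access key ID format, and find_live_secrets, the is_live callback, and the sample comment are hypothetical. A real verifier would authenticate against the provider’s API instead of using a stub.

```python
import re
from typing import Callable, Iterable

# Format of a long-term AWS access key ID. A real scanner covers
# hundreds of providers; this single pattern is for illustration only.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def find_live_secrets(
    comments: Iterable[dict],
    is_live: Callable[[str], bool],
) -> list:
    """Return one finding per candidate that the verifier confirms as live."""
    findings = []
    for comment in comments:
        for candidate in AWS_KEY_ID.findall(comment["body"]):
            # Drop anything the verifier cannot confirm, so expired or
            # fake keys never reach the results.
            if is_live(candidate):
                findings.append(
                    {"url": comment["html_url"], "secret": candidate}
                )
    return findings


# Made-up comment using AWS's documented example key ID.
comments = [
    {
        "body": "Auth fails with AKIAIOSFODNN7EXAMPLE, what am I missing?",
        "html_url": "https://example.invalid/issues/1#issuecomment-1",
    }
]

# Stub verifier: treats every candidate as expired.
print(find_live_secrets(comments, is_live=lambda key: False))
```

Keeping verification behind a callback makes the filtering rule explicit: a pattern match alone is never reported.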
We attempted to notify all impacted parties. As in most of our research projects, this proved tricky. After identifying an email address for most of the individuals with exposed keys, we sent an email that looked like this.
Our outreach was met with a variety of responses:
Some disputed the validity of our claims:
Some thanked us for informing them:
Most simply didn’t respond or act on our message. Unfortunately, thousands of keys remain publicly exposed.
There are tens of thousands of secrets lurking in public GitHub PR and Issue comments. By contrast, Truffle Security sees 1800 new secrets leaked in GitHub git pushes every day.
It’s much more likely that developers are committing secrets to a codebase than commenting them on public repositories. That said, if you maintain a public repository, we recommend regularly sweeping the Issue and PR comments for the presence of secrets.
We recently open-sourced a new feature in TruffleHog that enables users to scan their public repositories for secrets in Issues and PRs.
When you use TruffleHog’s github module, you can pass in a repository URI as well as the flags --issue-comments and --pr-comments to scan all Issue and PR comments and descriptions.
As an example, here’s how you would scan TruffleHog’s test_keys GitHub repository:
trufflehog github --repo https://github.com/trufflesecurity/test_keys --issue-comments --pr-comments --only-verified
And here’s a sample of a secret found in an Issue comment (don’t worry, it’s just a test key):
Note: the --pr-comments flag does not scan the code changes associated with the PR. It only scans the initial PR description and all user comments (which could include user-inserted code blocks).
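To post-process results in a script, one option is to add TruffleHog’s --json flag to the command and read the output line by line. The sketch below assumes each finding is a JSON object with Verified and DetectorName fields; check those names against the output of your TruffleHog version. The verified_by_detector helper and the sample lines are hypothetical.

```python
import json
from collections import Counter


def verified_by_detector(json_lines):
    """Count verified findings per detector from JSON-lines output."""
    counts = Counter()
    for line in json_lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and any non-JSON output
        finding = json.loads(line)
        # Field names are assumptions about TruffleHog's JSON output.
        if finding.get("Verified"):
            counts[finding.get("DetectorName", "unknown")] += 1
    return counts


# Made-up lines standing in for saved scanner output.
sample_output = [
    '{"DetectorName": "AWS", "Verified": true}',
    '{"DetectorName": "AWS", "Verified": false}',
    '{"DetectorName": "Slack", "Verified": true}',
    "",
]

print(verified_by_detector(sample_output))
```

With real output saved to a file, the same function accepts the open file object, since iterating over a file yields its lines.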
In addition to regular repository scanning, we recommend the following:
1. Review the output of any automated tooling that comments on PRs or Issues (such as a GitHub Actions bot) and ensure all secrets are masked.
2. Review all of your own comments to ensure they do not contain secrets in past edits.
3. If you’re a SaaS provider, ensure your users aren’t inadvertently leaking their keys in Issue support requests.
If you inadvertently expose a credential, we recommend immediately rotating that key. Simply deleting the GitHub comment containing the key is insufficient for several reasons: (1) A threat actor could already have a copy of that key, (2) GH Archive could contain that key, despite attempts to delete it, (3) Editing a comment does not always delete the previous version containing it. The only way to immediately invalidate the key and render it useless to any threat actor in possession of that key is to rotate it.