Over 13 years ago, Stack Overflow user Palec asked why his commit was still viewable on GitHub even though he had deleted it from every branch in his repository.
The answer he got was to open up a support ticket on GitHub, and hope for the best.
Today, 13 years later, you can still download the data from his commit, even though it’s not a part of the repository.
GitHub’s own documentation calls this out, saying that sensitive data will stay exposed even after it’s deleted from the repository.
Wouldn’t it be cool if TruffleHog could scan these deleted files for secrets?
Well, back in October of 2023 a member of our community actually authored a contribution to do just that. Unfortunately, it took longer than expected to merge in.
As a consequence, security researchers at Aqua Security published a blog post calling out what they described as “phantom secrets” (secrets in deleted commits) and noted that TruffleHog could not scan or detect these commits. Ouch. Fear not, though: today we’re happy to announce that this is now natively supported.
We pushed an update to TruffleHog that detects secrets in many deleted commits, such as commits in deleted branches. This means you might see secrets reported in your GitHub, GitLab, etc. repositories that you cannot reach through the history button in the provider’s UI (or through the default git clone operation).
What’s Improved?
We recently merged a PR that clones all git refs from git-based sources, such as GitHub, GitLab, etc. What this means in practice is that TruffleHog may report secrets located in commits in “deleted” branches.
Consider the following scenario: a developer works on a new feature branch and accidentally commits a secret. Luckily, they notice the secret and overwrite it in a subsequent commit. The git history of the feature branch now includes both the commit that adds the secret and the commit that overwrites it.
When the engineering team merges the branch, they run “Squash & Merge”, which collapses all commits into one commit for merging into main. This merging process does not add the secret to the main branch. Then, as is common practice, the developer deletes the feature branch.
Prior to this update, since the branch was deleted, TruffleHog would not identify these secrets. How would TruffleHog even get that data? It’s not located in main and no branches with that secret exist anymore. Right? Apparently not.
Our team discovered that when you open a Pull Request, the version control service provider (e.g., GitHub, GitLab) creates a “pseudo-branch” on the server to determine what would happen when you try to merge the comparison branch into the base branch. These “pseudo-branches” are prefixed by refs/pull/ and can be seen by running the command git ls-remote against a remote repository.
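To make the “pseudo-branch” idea concrete, here is a minimal Python sketch that filters the text printed by git ls-remote down to the refs/pull/ entries. The function name pull_request_refs and the sample output are hypothetical and only illustrate the shape of the data (one "SHA, tab, ref name" pair per line); this is not TruffleHog's implementation.

```python
def pull_request_refs(ls_remote_output):
    """Map each refs/pull/* ref name to the commit SHA it points at.

    `ls_remote_output` is the text printed by `git ls-remote <repo>`,
    which lists one "<sha>\t<ref>" pair per line.
    """
    refs = {}
    for line in ls_remote_output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        sha, ref = parts
        if ref.startswith("refs/pull/"):
            refs[ref] = sha
    return refs


# Hypothetical sample output; the SHAs are placeholders, not real commits.
sample = (
    "1111111111111111111111111111111111111111\tHEAD\n"
    "1111111111111111111111111111111111111111\trefs/heads/main\n"
    "2222222222222222222222222222222222222222\trefs/pull/1/head\n"
    "3333333333333333333333333333333333333333\trefs/pull/1/merge\n"
)

for ref, sha in pull_request_refs(sample).items():
    print(ref, sha)
```

Running the same filter over real git ls-remote output for one of your own repositories is a quick way to see how many pull request refs the server is still holding on to.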
When you git clone a repo, those pull request “pseudo-branches” are not fetched by default, since they are not located under refs/heads. However, they become accessible when you pass an additional configuration option to git clone.
This option tells Git to fetch all of the pull request refs and download the data (read: commits) needed to reconstruct them. This enables TruffleHog to scan commits that were previously squashed, merged, or deleted during a Pull Request.
Note: If you’re using the git configuration option outside of TruffleHog, please use caution since the “+” means Git will update the local ref even if it results in a non-fast-forward update (i.e., it might overwrite changes).
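As a rough sketch of what such an option can look like, the Python snippet below builds a git clone command that adds an extra fetch refspec through git clone's -c flag, and splits the refspec apart to show what the leading “+” controls. The refspec value, the example URL, and the helper names parse_refspec and clone_command are assumptions made for illustration; they are not necessarily what TruffleHog passes internally.

```python
import shlex

# Assumption: an illustrative refspec that maps GitHub-style pull request
# refs into the local remote-tracking namespace.
PR_REFSPEC = "+refs/pull/*:refs/remotes/origin/pull/*"


def parse_refspec(refspec):
    """Split a refspec into its force flag, source pattern, and destination."""
    force = refspec.startswith("+")  # "+" permits non-fast-forward updates
    body = refspec[1:] if force else refspec
    src, _, dst = body.partition(":")
    return {"force": force, "src": src, "dst": dst}


def clone_command(repo_url, dest, refspec=PR_REFSPEC):
    """Build (but do not run) a clone command with an extra fetch refspec.

    `git clone -c key=value` sets a config value in the new repository
    before the initial fetch, so the extra refspec applies to that fetch.
    """
    return ["git", "clone", "-c", "remote.origin.fetch=" + refspec, repo_url, dest]


# Hypothetical repository URL, used only to print a copy-pasteable command.
print(shlex.join(clone_command("https://github.com/example/repo.git", "repo")))
print(parse_refspec(PR_REFSPEC))
```

The command is only built, not executed, so you can inspect it first. Because the force flag is set, it is safest to try this in a fresh clone rather than in a working repository.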
What does it mean for you?
For a long time, we have championed the idea that the best and only way to remediate leaked secrets is rotation/revocation. We launched https://howtorotate.com to help with this, and these still-accessible deleted commits further solidify that there is no other solution to a leaked secret. It must be revoked.
What’s still missing?
Scanning these deleted branches surfaces a ton of secrets, but there are also secrets in deleted commits that were squashed or force-pushed in existing branches. These commits are not pulled down by the extra clone flags, but they are still available through the GitHub UI and REST API.
For example, a blog post in February described a way to enumerate these commits with the GitHub events API and pull them down with the REST API. This method is imperfect, because the events API only saves the first 10 commits in a push event and is also subject to severe rate limiting, as each commit needs to be fetched individually.
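To illustrate why the events-based approach is lossy, here is a small Python sketch that walks a list of already-downloaded events, collects the commit SHAs listed in push events, and flags pushes that report more commits than they list. The field names (type, payload, size, commits, sha) are assumptions based on the general shape of GitHub's public event payloads, the sample data is made up, and no network calls are made.

```python
def push_event_commits(events):
    """Collect commit SHAs from push events and flag truncated pushes.

    Returns (shas, truncated_ids): every SHA listed in a PushEvent, plus
    the ids of events whose reported size exceeds the commits they list.
    """
    shas, truncated_ids = [], []
    for event in events:
        if event.get("type") != "PushEvent":
            continue  # only push events carry commit lists
        payload = event.get("payload", {})
        listed = [commit["sha"] for commit in payload.get("commits", [])]
        shas.extend(listed)
        # Assumption: "size" is the total number of commits in the push.
        if payload.get("size", len(listed)) > len(listed):
            truncated_ids.append(event.get("id"))
    return shas, truncated_ids


# Hypothetical events; ids and SHAs are placeholders.
events = [
    {"id": "1", "type": "PushEvent",
     "payload": {"size": 2, "commits": [{"sha": "aaa"}, {"sha": "bbb"}]}},
    {"id": "2", "type": "WatchEvent", "payload": {}},
    {"id": "3", "type": "PushEvent",
     "payload": {"size": 25, "commits": [{"sha": "ccc"}]}},
]

shas, truncated_ids = push_event_commits(events)
print("commits to fetch individually:", shas)
print("pushes with commits missing from the event:", truncated_ids)
```

Each SHA collected this way would still need its own REST API request, which is where the rate limiting becomes painful, and the flagged pushes mark commits that this method cannot see at all.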
We’re still researching the best way to get coverage for the squashed commits, and of course we will review pull requests and ideas from the community if you have any.