In 2019, when our team founded Truffle Security, the state of the art in secret scanning was entropy checks and a smattering of regular expressions. Does this string look like a real key? If so, report it. Is that credential valid? Don’t know.
This method produced an incredibly high ratio of false positives to true positives. Scanning a large organization’s IT stack for leaked secrets generated a mountain of data. Security teams demanded a much more precise and tuned secret scanner that reports only live (valid) exposed credentials.
Enter secret verification.
What is Secret Verification?
If you were asked to determine whether a particular string is a valid API key, you’d likely review the relevant SaaS provider’s documentation and then send an authenticated request to the provider’s API with a tool like cURL. If the server response is HTTP 200 OK, you’d record it as a valid API key. If not, you’d record it as invalid.
Secret verification is programmatically checking that a credential can be used to authenticate to the issuing service. TruffleHog provides automatic verification for every single secret detector.
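As a rough illustration, that manual check could be sketched in Python as follows. The endpoint URL passed in and the Bearer header scheme are placeholders chosen for this example, not any specific provider's API, and the `opener` parameter exists only so the function can be exercised without a network.

```python
import urllib.error
import urllib.request


def is_valid_key(api_key, url, opener=urllib.request.urlopen):
    """Naive check: the key counts as valid only if the server answers 200 OK."""
    request = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
    )
    try:
        with opener(request, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # urlopen raises HTTPError for 4xx/5xx responses; treat them as invalid.
        return False
```

This version is deliberately simplistic: it treats every non-200 response as an invalid key and lets network failures propagate as exceptions.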
Secret Verification Challenges
TruffleHog detects more than 800 different key types. So, how do we engineer secret verification at scale? Does our engineering team build the equivalent of a simple cURL command for each type of secret and move on to the next one?
Unfortunately, secret verification is significantly more complex than it appears.
Endpoint Selection
The endpoint chosen to verify against matters. At Truffle Security, our goal is to keep verification stateless. We don’t alter data and we don’t create new resources that cost users money; we simply check whether the credential can authenticate and move on.
Consider how we verify valid credentials for the secret management platform, Doppler.
We make a simple GET request to the /v3/me endpoint with an authorization token. That endpoint cannot alter the user’s data in any way; instead, it merely returns basic profile information about the user’s account.
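A minimal sketch of such a read-only request is shown below. The host name and the Bearer header scheme are assumptions made for this example; only the GET method and the /v3/me path come from the description above. The function builds the request without sending it.

```python
import urllib.request

DOPPLER_ME_URL = "https://api.doppler.com/v3/me"  # host is an assumption


def build_doppler_check(token):
    """Build (but do not send) a read-only request to the /v3/me endpoint."""
    return urllib.request.Request(
        DOPPLER_ME_URL,
        method="GET",  # GET carries no body and should not change any state
        headers={"Authorization": f"Bearer {token}"},  # assumed header scheme
    )
```

Choosing a GET against a profile-style endpoint keeps the check stateless: nothing is created, modified, or billed as a side effect.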
Some security-forward organizations, like Tailscale, have released secret scanning verification endpoints, solely for the purpose of offering a stateless, data-minimizing endpoint to verify secrets against. We applaud those efforts and encourage all organizations to follow suit.
HTTP Responses
Expecting HTTP 200 OK responses for valid keys is not sufficient for many service providers.
Consider SaaS providers, like Zenscrape, who rate limit their users’ accounts. Under normal usage, the API will return 200 OK when receiving a valid credential. However, if an account has reached its rate limit, Zenscrape will return 429 Too Many Requests. In this case, both 200 and 429 status codes indicate a valid credential.
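One way to capture this is a per-provider set of status codes that imply successful authentication. The sketch below is illustrative only: the provider key, the table layout, and the default set are assumptions, not TruffleHog's actual data structures.

```python
# Status codes that imply the credential authenticated, per provider.
# The provider keys and the default are assumptions made for this sketch.
VALID_STATUSES = {
    "zenscrape": {200, 429},  # 429 means rate limited, but the key was accepted
}
DEFAULT_VALID_STATUSES = {200}


def status_means_valid(provider, status_code):
    """Return True if this status code indicates a live credential."""
    allowed = VALID_STATUSES.get(provider, DEFAULT_VALID_STATUSES)
    return status_code in allowed
```

With this table, a 429 counts as valid for the rate-limited provider but not for a provider that only has the default entry.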
Or consider another provider, like Braintree, that almost always returns 200 OK responses. The JSON inside the response governs whether the suspected API key is actually valid.
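For providers like this, the check has to look past the status line. The following sketch assumes a response body where a non-empty top-level "errors" list signals failed authentication; that shape is a hypothetical convention chosen for illustration, not Braintree's documented schema.

```python
import json


def body_means_valid(status_code, body_text):
    """Decide validity from the JSON body when the status is always 200."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        # A body we cannot parse gives no evidence the key authenticated.
        return False
    # Assumed convention: a non-empty "errors" list means authentication failed.
    return isinstance(payload, dict) and not payload.get("errors")
```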
There’s a lot of nuance across providers that prevents secret verification from being built solely on simple HTTP responses.
Network Errors or Invalid Credential?
The majority of verification requests reach the appropriate servers and TruffleHog can reliably return verification data. But what happens when the verification request fails to reach the SaaS provider’s servers, or the servers themselves fail? Does that mean the credential is invalid? Of course not. A failed request has no bearing on the validity of the credential. However, if we’re looking for an HTTP 200 OK response and receive a 500 Internal Server Error, how should TruffleHog respond?
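One way to frame the problem, purely as an illustration and not a description of TruffleHog's implementation, is to treat verification as having three outcomes instead of two. The names and the mapping of status codes below are assumptions made for this sketch.

```python
from enum import Enum


class Outcome(Enum):
    VERIFIED = "verified"            # the provider accepted the credential
    UNVERIFIED = "unverified"        # the provider rejected the credential
    INDETERMINATE = "indeterminate"  # the check itself failed; validity unknown


def classify(status_code=None):
    """status_code is None when the request never produced an HTTP response."""
    if status_code is None or status_code >= 500:
        return Outcome.INDETERMINATE
    if status_code == 200:
        return Outcome.VERIFIED
    if status_code in (401, 403):
        return Outcome.UNVERIFIED
    # Anything else is ambiguous without provider-specific knowledge.
    return Outcome.INDETERMINATE
```

Keeping the third outcome separate avoids reporting a credential as invalid when the only thing known is that the check could not be completed.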
We’ve been actively working on an elegant solution to deal with reporting network errors. More details to come soon!
APIs Change
SaaS providers like to improve their products, which often means changing their API. Staying current on which endpoints produce valid, authenticated results requires closely monitoring API provider documentation as well as swiftly reacting to unexpected scan results.
One API change in particular adds complexity to the verification process: new key types. We applaud providers that introduce more secure authentication mechanisms as well as those that uniquely prefix their keys. However, any upstream API authentication change requires TruffleHog to support multiple key types, which essentially requires maintaining an entirely new secret detector.
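To make the cost concrete, here is a sketch of a detector that has to recognize two generations of keys from the same provider. The provider, the "acme_live_" prefix, and both key formats are hypothetical; real providers each define their own formats.

```python
import re

# Hypothetical formats: an older unprefixed key and a newer prefixed one.
KEY_PATTERNS = {
    "legacy": re.compile(r"\b[0-9a-f]{32}\b"),
    "prefixed": re.compile(r"\bacme_live_[A-Za-z0-9]{24}\b"),
}


def find_keys(text):
    """Return (kind, key) pairs for every candidate key found in the text."""
    found = []
    for kind, pattern in KEY_PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((kind, match.group(0)))
    return found
```

Each entry typically also needs its own verification logic and tests, which is why a new key type behaves much like a new detector.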
Testing
Each secret detector verification function requires multiple tests. At a minimum, we need one test that includes a valid secret and a second one with an invalid secret. How do we maintain nearly 800 separate tests for APIs that can change at any point? How do we maintain valid test secrets for 800 SaaS products that we don’t need?
Constant care and attention. It’s a large undertaking, which will keep growing as we add new detectors. But adequate testing coverage is critical to keeping our industry-leading false positive rates low.
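The minimum pair of tests described above might look like the sketch below. Note that it uses a stubbed transport instead of live credentials, which is a different trade-off from testing against real APIs: it is cheap and stable, but it cannot notice when a provider changes its API. All names here are hypothetical.

```python
def verify(secret, send):
    """Return True when the injected transport reports 200 for this secret."""
    return send(secret) == 200


def fake_send(secret):
    # Stub transport: only one hard-coded secret is accepted.
    return 200 if secret == "valid-test-secret" else 401


def test_valid_secret_is_verified():
    assert verify("valid-test-secret", fake_send) is True


def test_invalid_secret_is_rejected():
    assert verify("revoked-test-secret", fake_send) is False
```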
Complex Detectors
A large portion of TruffleHog’s detectors are for REST APIs. Those implementations are usually straightforward to build. However, some credential verification is much more complex.
For example, consider a secret detector for a database connection, like Postgres or MSSQL. First, we need to construct a database connection string. Second, we need to load a database driver (which creates a new dependency). Third, we need to ping the database. What if the database returns an SSL error? That doesn’t mean the credentials are bad; let’s adjust our connection string and retry. And so on.
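The steps above can be sketched roughly as follows for Postgres. The `ping` callable stands in for a real database driver, and the two exception classes are hypothetical names invented for this example; only the `sslmode` connection parameter and its `require` and `disable` values come from Postgres itself.

```python
from urllib.parse import quote


class SSLNotSupported(Exception):
    """Hypothetical error: the server refused the requested SSL mode."""


class AuthenticationFailed(Exception):
    """Hypothetical error: the server rejected the username or password."""


def build_dsn(user, password, host, port, database, sslmode):
    # Percent-encode the credentials so special characters survive in the URL.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}?sslmode={sslmode}"
    )


def verify_postgres(user, password, host, ping, port=5432, database="postgres"):
    """Return True (valid), False (rejected), or None (could not determine)."""
    for sslmode in ("require", "disable"):
        dsn = build_dsn(user, password, host, port, database, sslmode)
        try:
            ping(dsn)  # injected stand-in for a driver's connect-and-ping
            return True
        except SSLNotSupported:
            continue  # not a credential problem; retry with the next SSL mode
        except AuthenticationFailed:
            return False
    return None
```

Returning None when every SSL mode fails keeps "could not connect" distinct from "credentials rejected".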
These more customized detectors require significantly more development resources and maintenance.
So, how do we do it?
At Truffle Security, we have a full team of world-class engineers working on these challenges every day. Complementing our team is an entire community of security engineers, developers, bug bounty hunters, and others who help add new secret detectors, update failing ones, and ensure our community has access to the most powerful and reliable secret scanner. That’s the power of open source. And that’s why TruffleHog will stay core to open source.