Karim Rahal

July 28, 2023

How Secrets Leak in CI/CD Pipelines

Secrets leak in CI/CD pipelines routinely.

Continuous integration/deployment (CI/CD) workflows typically require developers to provide valid credentials for the third-party resources their pipeline interacts with. Want to automatically deploy code changes to an EC2 instance? Provide an AWS access key. Want to publish an artifact to npm? Provide an npm API key.

Instead of hardcoding cleartext secrets into a Git repository, developers often use CI/CD platforms’ built-in functionality to inject secrets at runtime. For example, CircleCI and Travis CI users can configure jobs with pre-set environment variables containing their API keys and passwords. GitHub Actions users can add a “secrets” workflow object. Some developers choose to go outside the CI/CD platform and use a third-party secrets manager like HashiCorp Vault or AWS Secrets Manager.


Injecting Secrets in Jenkins Pipelines


In contrast to pipelines for private repositories, CI/CD pipelines that run against open-source projects, like the official Python project, typically expose their job log files publicly. Malicious actors can easily parse the log output and look for exposed secrets. How would secrets leak into the log files? Consider two examples: a --verbose flag passed to curl could expose a sensitive header, or a test script could print all environment variables. There are countless ways for secrets to leak into CI/CD logs.

As a protection measure, some CI/CD platforms attempt to mask injected secret data in their job logs. Moreover, a couple of platforms actively scan all log data for potential leaks, regardless of whether the value came from an environment variable or secrets object. Travis CI, for instance, runs open-source secret scanners to compare log strings against common formats of API keys, tokens, and other sensitive information.


CircleCI job log masking a secret NPM_TOKEN value
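To make the format-based scanning idea concrete, here is a minimal sketch of matching log text against token shapes. It is not how Travis CI or any other platform implements scanning; the pattern list, the `find_token_like_strings` helper, and the sample log are all assumptions made for illustration, and the regular expressions only approximate the published token formats.

```python
import re

# Approximate token shapes, listed only for illustration. Real scanners use
# many more patterns and often verify candidates against the provider.
TOKEN_PATTERNS = {
    "github_pat": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "npm_token": re.compile(r"npm_[A-Za-z0-9]{36}"),
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
}


def find_token_like_strings(log_text):
    """Return (pattern_name, match) pairs for strings shaped like known tokens."""
    findings = []
    for name, pattern in TOKEN_PATTERNS.items():
        for match in pattern.findall(log_text):
            findings.append((name, match))
    return findings


# Fabricated value with the right shape; not a real credential.
fake_token = "ghp_" + "a" * 36
sample_log = f"export GH_TOKEN={fake_token}\nnpm test\n"

for name, match in find_token_like_strings(sample_log):
    print(f"possible {name} in log: {match[:8]}...")
```

Unlike exact-match masking, this approach does not need to know the secret in advance, but it only catches credentials whose format is in the pattern list.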


On the surface, attackers should fail to find leaked secrets in public job logs; however, CI/CD secret protection is far from perfect. Output and secret formatting confuses CI/CD masking efforts; second-order secrets fly under the radar; and artifacts can carry sensitive information within them. The result: a sprawl of private information exposed to malicious actors.

Output and Secret Formatting is Confusing

CI/CD jobs that significantly transform secrets and then print them to a log file might evade redaction. For example, the following CircleCI job echoes the injected NPM_TOKEN environment variable and then pipes it into rev, which reverses the string. 


- run:
    name: Authenticate with npm
    command: |
      echo "TOKEN VALUE: $NPM_TOKEN" | rev
      npm set "//registry.npmjs.org/:_authToken=$NPM_TOKEN"


After pushing a commit, the CircleCI job log leaks the npm token (in reverse order).


CircleCI job log with an npm account token exposed in reverse order.


In addition, the format of stored secrets can lead to leaks. GitHub, for instance, recommends against grouping secrets in structured data formats like JSON (such as the way GCP exposes service account keys), as it makes leak detection difficult. CI/CD platforms usually check for exact string matches, so they might fail to detect a partial leak from structured data.
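The following sketch shows why exact string matching falls short in both cases described above. The `mask_exact` function is a hypothetical stand-in for a platform's redactor, not any vendor's actual implementation, and every secret value is fabricated.

```python
import json


def mask_exact(log_text, secrets, mask="****"):
    """Replace exact occurrences of each known secret value with a mask."""
    for secret in secrets:
        if secret:
            log_text = log_text.replace(secret, mask)
    return log_text


# Fabricated secrets: a token and a JSON blob stored as a single secret.
npm_token = "npm_" + "x1" * 18
service_account = json.dumps(
    {"client_email": "ci@example.invalid", "private_key": "FAKE-KEY-MATERIAL"}
)
known = [npm_token, service_account]

line = f"TOKEN VALUE: {npm_token}"

# 1. A direct print is masked as expected.
direct = mask_exact(line, known)

# 2. The same line reversed (what `rev` produces) no longer matches.
reversed_leak = mask_exact(line[::-1], known)

# 3. Printing one field of the JSON blob never matches the whole blob.
partial = mask_exact("key=" + json.loads(service_account)["private_key"], known)

print(direct)
print(reversed_leak)
print(partial)
```

Only the first line comes out masked. The reversed token and the single JSON field pass through untouched because neither contains the stored secret as an exact substring.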

Full string leaks can also confuse platform redactors. Despite Travis CI actively scanning logs for sensitive leaks, public log files still reveal secret data. For example, in the public log file shown below, a valid GitHub personal access token is viewable in cleartext.


This Travis CI build log leaked a GitHub Personal Access Token in the “Setting environment variables from repository settings” step.


Travis CI exposed these values in its environment variable initialization logs—so it’s likely that the developers injected the environment variables but failed to mark them to be hidden. Shockingly, the GitHub token had a myriad of privileges:


Terminal output of a response from GitHub’s API showing this token has various admin scopes enabled, in addition to the repo, user, and workflow scopes

ENCODING

Encoding creates additional complexity for secret identification and redaction in CI/CD logs. Sometimes developers base64-encode a secret or JSON blob in their environment variables; once decoded, the secret no longer matches the stored value, so a redactor looking for the stored value misses it. The reverse can happen too, in cases like basic authentication, where a password is loaded in cleartext and then encoded for the actual authentication request. Most CI/CD redaction systems do not decode data when reviewing logs for leaked secrets.

It’s worth noting TruffleHog automatically base64-decodes any base64 it discovers while looking for secrets.
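As a rough illustration of the decoding step, the sketch below checks a log line for a secret both as-is and inside anything that looks like base64. This is not TruffleHog's implementation; the `contains_secret` helper, the candidate regex, and the credentials are assumptions for the example.

```python
import base64
import binascii
import re

# Runs of base64-alphabet characters, optionally padded. The 16-character
# minimum is an arbitrary threshold chosen for this sketch.
BASE64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def contains_secret(log_text, secret):
    """Look for the secret in the raw text and in base64-looking substrings."""
    if secret in log_text:
        return True
    for candidate in BASE64_CANDIDATE.findall(log_text):
        try:
            decoded = base64.b64decode(candidate, validate=True)
        except (binascii.Error, ValueError):
            continue  # not valid base64 after all
        if secret in decoded.decode("utf-8", "ignore"):
            return True
    return False


# Fabricated basic-auth credentials, encoded the way an HTTP client would.
password = "s3cr3t-Passw0rd"
encoded = base64.b64encode(f"ci-user:{password}".encode()).decode()
header = f"Authorization: Basic {encoded}"

print(password in header)                 # plain substring check
print(contains_secret(header, password))  # check with base64 decoding
```

The plain substring check misses the password, while the decoding check finds it. A candidate that is glued to neighboring alphanumeric text would decode from the wrong offset, so a production scanner needs more careful candidate extraction than this.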

Second-Order Secrets

If a developer wanted to include a secret without using an environment variable or secrets object, how would they do it? 

They wouldn’t (or shouldn’t) put sensitive information in a public repository file. They could store all confidential data in an encrypted repository file and inject the decryption key in a CI/CD environment variable.

However, CI/CD platforms that don’t redact secrets outside of environment variables would fail to redact any of that file’s decrypted secrets data. The platform is only aware of—and would only mask—the decryption key since it’s in an environment variable. Below is an example of this type of second-order secret leaking in a CircleCI log:


CircleCI job log of an npm account token leaking from an encrypted file.


Another option to inject sensitive data is to use a third-party secrets storage solution, such as Vault by HashiCorp. Unfortunately, this method could also leak secrets on platforms unaware of external secrets.

Interestingly, in a blog post on Security best practices for CI/CD, CircleCI mentions both methods above (encrypted secrets files and third-party secret storage) as possible configuration options. However, the post does not discuss the security risks that those alternative storage methods introduce.

Separately, OAuth refresh tokens and AWS keys are often traded for temporary access tokens and signed authentication URLs, respectively. Most of those second-order credentials evade detection as well.
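A small sketch of why derived credentials slip past a redactor that only knows the original secret. The signing scheme here is a deliberately simplified HMAC and is not AWS Signature Version 4; `sign_url`, `mask_exact`, the host name, and the key are all hypothetical.

```python
import hashlib
import hmac


def sign_url(path, secret_key):
    """Derive a signed URL from a secret key (simplified, not AWS SigV4)."""
    signature = hmac.new(
        secret_key.encode(), path.encode(), hashlib.sha256
    ).hexdigest()
    return f"https://storage.example.invalid{path}?signature={signature}"


def mask_exact(log_text, secrets, mask="****"):
    """Replace exact occurrences of each known secret value with a mask."""
    for secret in secrets:
        log_text = log_text.replace(secret, mask)
    return log_text


secret_key = "FAKE-SECRET-KEY-0123456789"  # the only value the platform knows
signed = sign_url("/builds/app.tar.gz", secret_key)

# The redactor knows the key, but the log contains only the derived URL.
log_line = mask_exact(f"Downloading {signed}", [secret_key])
print(log_line)
```

The signed URL is printed in full: it grants access on its own, yet it shares no substring with the key the platform was told to mask.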

Leaky Artifacts

Leaks are not confined to logs. CI/CD pipelines often produce and upload artifacts like GitHub releases or npm packages. A typical workflow tests, builds, and publishes a package. What happens when a workflow exports a secret into the public artifact?

To illustrate this example, we created a simple Node.js app that interfaces with the ChatGPT API to produce a food recommendation. The application ingests a configuration file containing a cuisine and an OpenAI API key. Using that information, the application returns a food recommendation from ChatGPT:


A terminal showing an example output of the food recommendation tool.


To publish the package to npm, we specified the following CircleCI configuration file:


version: 2.1 
jobs: 
  test_and_publish: 
    docker: 
      - image: cimg/node:lts 
    steps: 
      - checkout 
      - run: 
          name: Install Dependencies 
          command: npm install 
      - run: 
          name: Run tests 
          command: | 
            echo "{\"chatgpt_api_key\": \"$CHATGPT_API_KEY\", \"cuisine\": \"Italian\"}" > test-config.json 
            npm test 
      - run: 
          name: Authenticate with npm 
          command: | 
            npm set "//registry.npmjs.org/:_authToken=$NPM_TOKEN" 
      - run: 
          name: Publish npm package
          command: npm publish 
          
workflows: 
  version: 2 
  test_and_publish: 
    jobs: 
      - test_and_publish


The job (1) creates a test configuration file with the API key and cuisine, (2) executes tests, (3) authenticates to npm with a token, and (4) publishes the package.

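This setup appears secure until you realize that the npm publish command publishes the test-config.json file. By default, npm publish packages every file in the project directory that is not excluded by a .npmignore or .gitignore file or by the files field in package.json, so the freshly written test configuration ships with the package. In other words, the workflow leaks the API key into the public package!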


An npm package page for “foode-gpt-demo” with the test configuration file, along with the ChatGPT API token, publicly exposed.


Importantly, secret redaction doesn’t apply here—the value isn’t logged.
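One way to catch this class of leak is to check the files about to be published for known secret values before the publish step runs. The sketch below is a hypothetical pre-publish check, not part of npm or CircleCI; `files_containing_secrets` and the demo directory are assumptions, and the directory walk is a stand-in for whatever file list the package manager would actually publish.

```python
import os
import tempfile


def files_containing_secrets(package_dir, secrets):
    """Return relative paths of files under package_dir holding a secret value."""
    hits = []
    for root, _dirs, files in os.walk(package_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "r", encoding="utf-8", errors="ignore") as handle:
                    content = handle.read()
            except OSError:
                continue  # unreadable file; skip it in this sketch
            if any(secret and secret in content for secret in secrets):
                hits.append(os.path.relpath(path, package_dir))
    return sorted(hits)


# Recreate the situation from the example with a fabricated API key.
with tempfile.TemporaryDirectory() as package_dir:
    fake_key = "sk-FAKE0000000000000000"
    with open(os.path.join(package_dir, "index.js"), "w", encoding="utf-8") as f:
        f.write("module.exports = {};\n")
    with open(os.path.join(package_dir, "test-config.json"), "w", encoding="utf-8") as f:
        f.write('{"chatgpt_api_key": "%s", "cuisine": "Italian"}' % fake_key)

    leaked = files_containing_secrets(package_dir, [fake_key])

print(leaked)
```

In a pipeline, the secrets list would come from the injected environment variables, and a non-empty result would fail the job before anything is published.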

Mitigation

First, identify which logs and jobs expose secrets. Don’t forget to scan old logs too!

TruffleHog supports scanning current and previous CircleCI logs for this exact purpose. Use the following command:

 

trufflehog circleci --token=<token>


Running it on the test account, TruffleHog detected the npm token purposefully exposed in the demo above.


The command line output of running TruffleHog on the test account. It detected the leaked npm account token.


TruffleHog can also be run against other CI/CD platforms; please see our documentation.

Below are some additional recommendations to prevent secrets from leaking in CI/CD jobs:

  • Put separate tasks in separate jobs. CI/CD tools isolate jobs. This means that saving a secret in one job does not make it accessible to another. In the example ChatGPT package, if we had isolated the workflow's test steps from the publish steps, the API key would not have leaked into the public artifact.

  • Practice good secret hygiene. Rotate your secrets frequently, assign them the least privilege possible, and limit exposure to different processes/steps.

  • Properly store secrets. Use platform environment variable / secret injection, and do not combine secrets into structured data. Take advantage of in-platform masking in case a leak does happen.

  • Protect your secrets from output logs. This is easier said than done. Generally, avoid using secrets in debug statements, running code that prints environment variables, or executing commands with a verbose flag.

  • Pick a standard workflow over a custom one. GitHub Actions allows importing workflows. Similarly, CircleCI supports “orbs”, which are pre-built steps that can handle secrets securely. For example, CircleCI’s Slack orb allows developers to interact with Slack, and will safely handle a SLACK_ACCESS_TOKEN environment variable.

  • Proactively scan your logs. Like we did with TruffleHog, it’s important to continuously look for leaks in your job logs, no matter how pristine your workflow configuration may be. You never know what inadvertent effects some commands may have on secret leakage.
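For the recommendation about keeping secrets out of output logs, one small habit is to never print the raw environment from a debug or test script. The helper below is a hypothetical sketch, and its name-based heuristic is an assumption that will miss secrets stored under unusual variable names, so treat it as a complement to platform masking rather than a replacement.

```python
import re

# Variable names that usually hold credentials; a heuristic, not a guarantee.
SENSITIVE_NAME = re.compile(r"(TOKEN|SECRET|PASSWORD|KEY|CREDENTIAL)", re.IGNORECASE)


def redacted_environment(environ):
    """Return a copy of environ with sensitive-looking values replaced."""
    return {
        name: "[REDACTED]" if SENSITIVE_NAME.search(name) else value
        for name, value in environ.items()
    }


# Fabricated environment resembling the jobs in this post.
sample_env = {
    "NPM_TOKEN": "npm_fake",
    "CHATGPT_API_KEY": "sk-fake",
    "NODE_ENV": "test",
}

for name, value in sorted(redacted_environment(sample_env).items()):
    print(f"{name}={value}")
```

In a real script, `os.environ` would be passed in place of `sample_env`.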

Conclusion

CI/CD workflows are an indispensable part of modern software development. They enable projects to test pull requests, build new releases, scan for security vulnerabilities, and more. However, when developers make job logs and artifacts public, an organization must monitor all workflow output for secrets leakage. CI/CD platforms may help bandage a leak via redaction, but that’s not always sufficient. With unbounded secrets sprawl—obscure formats, second-order secrets, and leaky artifacts—organizations face significant secrets exposure. Following secure secret and workflow practices helps prevent API keys and account tokens from ending up in the wrong hands.