This post is part of a series documenting how to incorporate TruffleHog into your CI/CD pipeline. If you use GitHub Actions or Circle CI, please see those posts. If you use another tool, please let us know and we’ll do our best to write a post for your use case.
In this post, we’ll walk through how to incorporate TruffleHog (Open-Source or Enterprise) into an Azure Pipeline. Click here to skip the tutorial and check out the code.
Let’s start by understanding the code required to prepare an Azure Pipeline to run TruffleHog.
The Pipeline File
To run an Azure Pipeline workflow, developers must place a file named azure-pipelines.yml in the main directory of their repository.
PIPELINE TRIGGERS
The first lines in our Azure Pipeline YAML file define when we will run TruffleHog.
At a minimum, we recommend running TruffleHog against all PRs as well as all pushes directly to the production branch (main in this example).
Note: if you host your code on Azure Repos, you cannot use the pr directive. Instead, you’ll need to set up a “Build Validation” manually inside your Azure project. If you use GitHub, GitLab or any other supported VCS hosting provider, Azure says you can use the pr directive.
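A minimal sketch of such a trigger block, assuming main is the production branch and the repository is hosted somewhere the pr directive applies. The branch filters are illustrative and should be adapted to your own branching model:

```yaml
# Run on every push made directly to the production branch.
trigger:
  branches:
    include:
    - main

# Run on pull requests, whatever branch they target.
# This block does not apply to Azure Repos, which needs a Build Validation instead.
pr:
  branches:
    include:
    - '*'
```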
DEFINE A JOB
Next, we’ll create a job named “SecretsCheck”.
This job will contain all of the code required to run TruffleHog against PRs and pushes.
jobs:
- job: SecretsCheck
  pool:
    vmImage: ubuntu-latest
Before executing any code, we have to specify which OS we want to run it on (see the pool attribute in the code above).
We chose Ubuntu (and specifically the latest version). Why? We need cURL and jq in the next step, and Ubuntu has both of those applications pre-installed.
You can use any OS that supports running TruffleHog. If your selected OS does not come with cURL and jq, you can either (a) install those tools, or (b) adjust the code in the next step.
Within an Azure Pipeline “job”, you define “steps”. Each “step” takes some action within the virtual environment you selected for that job (such as running a script or installing software). The next 3 steps prepare our git repository for scanning and then invoke TruffleHog.
STEP 1: COUNT THE COMMITS
In our first step, we execute a bash script that counts the number of commits present in the PR or Push. Unfortunately, Azure does not provide a built-in variable to reference this value in a pipeline (as far as we can tell!).
Developers must query the pipeline API to get the commit count. Using the System.AccessToken, this script executes an authenticated query against the pipeline URI for the “changes” information related to this workflow. We parse the count value from the JSON response, which is an integer representing the number of commits in the PR or Push, and set it as an environment variable named ChangesCount.
Interestingly, when executing a workflow against a PR (not a push), Azure creates an additional commit in the context of our CI/CD pipeline. It’s usually titled something like “Merge pull request <#> from <branch> into main”. This additional commit is not included in the count of changes retrieved from the API. As a result, we must increment our ChangesCount variable by one to account for the additional commit in our git history.
In the example below, we made two commits and then created a PR. Running git log during CI/CD execution revealed one commit more current than the last commit in the PR. The ChangesCount value from the pipeline API was 2; however, we needed to increment this value to 3 to properly scan all commits in the PR.
Example of an Azure-Generated Commit in our CI Runtime
Finally, we export our ChangesCount variable for use outside of this step. Just like a Python function, our script has a local context. To use the ChangesCount variable outside of this step, we export it using the odd-looking ##vso[task.setvariable] syntax. Future steps can now reference the number of commits being reviewed by this pipeline by invoking the $(ChangesCount) variable.
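Put together, Step 1 might look like the sketch below. It is an illustration under stated assumptions rather than a definitive implementation: the build “changes” endpoint path, the api-version value, and the uppercase environment variable names (Azure’s automatic mapping of predefined variables such as Build.BuildId) should all be verified against your own organization. The displayName is a hypothetical label.

```yaml
- bash: |
    # Assumed endpoint: the "changes" list for the current build.
    url="${SYSTEM_COLLECTIONURI}${SYSTEM_TEAMPROJECTID}/_apis/build/builds/${BUILD_BUILDID}/changes?api-version=7.0"

    # Authenticated query; jq extracts the integer "count" field from the JSON response.
    count=$(curl -s -H "Authorization: Bearer ${SYSTEM_ACCESSTOKEN}" "$url" | jq -r '.count')

    # PR runs contain one extra Azure-generated merge commit that the API does not count.
    if [ "${BUILD_REASON}" = "PullRequest" ]; then
      count=$((count + 1))
    fi

    # Export the value so that later steps can read $(ChangesCount).
    echo "##vso[task.setvariable variable=ChangesCount]${count}"
  displayName: Count commits in the PR or push
  env:
    # Secret variables are not exposed to scripts automatically, so map the token explicitly.
    SYSTEM_ACCESSTOKEN: $(System.AccessToken)
```

With the two-commit PR from the example above, the API would report a count of 2 and the script would export 3.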
STEP 2: GIT CHECKOUT
So why did we go through all of that trouble to get the number of commits present in the PR or Push?
Efficiency! A naive approach to adding TruffleHog into CI would clone the entire repository and run TruffleHog against the whole git history. Depending on the size of the repository, this could take several minutes (instead of seconds). Also, every pipeline execution would replicate previous TruffleHog scanning, which is a waste of computing resources and time.
Instead, our goal is to efficiently scan only the difference between the base branch and the PR or Push. This cuts down TruffleHog’s scanning time to seconds. To accomplish this, we modified Azure’s built-in “Git Checkout” task by customizing the fetchDepth.
Azure’s Git Checkout Process
Azure’s default “Git Checkout” step does the following:
1. Initialize a new, empty git repository on the temporary VM.
2. Create a remote connection between your code (hosted in Azure Repos or elsewhere) and the new git repository.
3. Fetch the relevant git data from the remote repository (i.e., where your code lives).
4. Check out the most recent commit on the remote origin, thus forcing the local git status into a detached HEAD state.
As tempting as it is to use git commands to figure out how many commits there have been in the Push or PR since the base branch, it’s not possible to do this with the default “Git Checkout” step. You would need to fully clone the repository to establish a reference to the base branch (e.g., main).
Alternatively, you could develop a custom “Git Checkout” command, but since that seemed like a lot of work (and likely less stable), we decided to customize Azure’s built-in command.
Shallow Fetch
Azure provides users with a fetchDepth argument for the “Git Checkout” step. This enables developers to specify a value for the --depth argument in the git fetch command that is run during checkout (the fetch is the third step in Azure’s Git Checkout process).
Here is the official Git documentation about the --depth flag:
Git Fetch Depth Documentation Screenshot
Essentially, the fetchDepth argument limits how far back in git history Azure grabs commits. This enables developers to only clone down the commits between the base branch and the current Push or PR commit. Azure calls this process Shallow Fetching.
Here’s the YAML code to accomplish shallow fetching:
The YAML code uses the ChangesCount variable as a value for the fetchDepth argument. This prevents the entire git history from cloning onto the CI/CD VM and limits the git history reviewed by TruffleHog to the relevant PR or Push commits.
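As a minimal sketch, a shallow checkout step along these lines could look like the following. It assumes the step is listed after the commit-counting step so that $(ChangesCount) is already set when the checkout runs. The displayName is a hypothetical label.

```yaml
- checkout: self
  # Limit git fetch --depth to the number of commits counted in Step 1.
  fetchDepth: $(ChangesCount)
  displayName: Shallow checkout of the PR or push commits
```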
STEP 3: RUN TRUFFLEHOG
In CI/CD pipelines, we recommend running TruffleHog Open-Source from Docker. It’s easier than installing Go and compiling from scratch. But you’re welcome to run it however you’d like.
Above we’re running a fancy docker command that mounts our git repository inside the docker container and checks it for secrets. Here’s a breakdown of that code:
docker run: run a command in a new container
--rm: automatically remove the container when it exits
-v "$(pwd)":/tmp: mount our Ubuntu machine’s current working directory (where our git code is) into the /tmp folder in our docker container
ghcr.io/trufflesecurity/trufflehog:latest: use the latest version of TruffleHog’s docker image
--only-verified: only report secrets that are verified to be current/valid
--fail: if a secret is discovered, exit the program (which will fail the pipeline)
--no-update: don’t reach out to our update server (you’re already using the latest version + this would only slow things down)
git file:///tmp/: look through our git repository located at /tmp/.git in the Docker container
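Assembled into a single pipeline step, the pieces in this breakdown could be sketched as follows. The flags and image name are the ones listed above; wrapping them in a bash step and the displayName are assumptions for illustration.

```yaml
- bash: |
    # Mount the checked-out repository at /tmp and scan its git history for verified secrets.
    docker run --rm -v "$(pwd)":/tmp \
      ghcr.io/trufflesecurity/trufflehog:latest \
      --only-verified --fail --no-update \
      git file:///tmp/
  displayName: Scan for secrets with TruffleHog
```

Because --fail makes TruffleHog exit with an error when it finds a secret, a finding fails this step and therefore the pipeline run.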
Now that we’ve reviewed the YAML code, let’s get this working inside your Azure project.
Configuring the Pipeline
We’ll assume you already have a project + repository set up in Azure.
The first step is to add an azure-pipelines.yml file.
If you don’t have a pipeline set up already, please copy/paste this version. Please commit your changes.
If you already have a pipeline file, copy the SecretsCheck job into your existing file. Please commit your changes.
If this is your first time setting up an Azure Pipeline in this project, you’ll need to click on “Pipelines” and then “Create Pipeline” (as shown below).
Creating a Pipeline in Azure
You should be forwarded to the “Review” stage, since we already committed a YAML file to our repository. Click the “Run” button. If you already had a pipeline set up, please manually trigger a “Run” to ensure the new job was added correctly.
Starting the First Pipeline Run
You should see all green checks from the SecretsCheck job.
TruffleHog Successfully Completing a Scan in an Azure Pipeline
If this failed, please review the error message and this tutorial to ensure you implemented it correctly. If for some reason it’s still failing, please let us know. We’re happy to help.
Azure Repos User?
Above, we mentioned that code hosted in Azure Repos cannot use the pr directive to run TruffleHog during PRs. Instead, you need to set up a Build Validation.
To get started, click on “Branches”, then the three dots to the right of whichever base branch you want to run TruffleHog on during a PR, and then “Branch Policies”.
Accessing Branch Policies in Azure
Click the “+” sign under “Build Validation”.
Adding a Build Validation to Azure
Change “Build Expiration” to “Immediately when main is updated”, name your Build Validation, and then click “Save”.
Adding an Azure Build Policy
That’s it! Test it out by creating a new PR containing a Canary Token and ensure TruffleHog catches the leaked secret.
Note: If you get an error message when saving your Build Validation, click “Cancel”, refresh the page and try again. That almost always worked for us.