Eduard Agavriloae & Matei Josephs

July 18, 2025

Guest Post: GCP CloudQuarry: Searching for Secrets in Public GCP Images

This guest post by Eduard Agavriloae and Matei Josephs, two expert cloud security researchers, was developed through Truffle Security’s Research CFP program. We first connected with Eduard and Matei after their well-received DEF CON 32 talk, AWS CloudQuarry: Digging for secrets in public AMIs, where they used TruffleHog to identify hundreds of live secrets in public AWS Images. In this follow-up, they expand their research to Google Cloud Platform (GCP).

tl;dr: We scanned 8,400+ public GCP images and did not find a single exposed secret! That’s a dramatic reversal compared to the hundreds we found in AWS AMIs and the dozens found in Azure public images. GCP’s curated, tightly controlled image marketplace has seemingly eliminated secret exposure in its cloud images.

Introduction

At DEF CON 32 last year, we presented our findings that public Amazon Machine Images (AMIs) leak hundreds of valid secrets. These images provide the software required to set up and boot an Amazon EC2 instance.

Our friend, Stefan Tita, went on to conduct similar research into Azure, looking for secrets in public Azure images. Azure had far fewer public images, but Stefan still found high-impact live secrets, such as:

7 AWS Access Keys (one of which appeared to belong to an administrator)
2 GitHub personal access tokens
1 SendGrid access token

With this in mind, we decided to complete the trinity by researching public GCP images. This is GCP CloudQuarry.

Background

Google Cloud Platform takes a very restrictive approach to publishing images. Only marketplace vendors and approved publishers can make images public. This stands in contrast to AWS, which allows any user to make an image public. That’s probably why we found 3 million public AMIs on AWS, and only 8,437 on GCP. This discrepancy immediately suggested we might see different results compared to the more open ecosystems of AWS and Azure.

Research Methodology

Image Collection and Processing

Similar to our previous AWS AMI research, we took the following approach to analyze GCP's public images:

Image Discovery: List all public images, including deprecated images, and save the metadata
Disk Creation: Create disks based on each public image
Mounting Process: Attach each disk to our analysis VM and mount the partitions
File Extraction: Run targeted find commands on mounted partitions
Data Storage: Transfer identified files to an S3 bucket for analysis
Cleanup: Detach and delete disks to prepare for the next image
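
The per-image loop above can be sketched roughly as follows. This is a minimal illustration, not the authors' scanner.py: the function process_images and the four callables it receives (create_disk, attach_and_mount, extract_files, cleanup) are hypothetical stand-ins for the real GCP API calls and shell steps. The point it shows is that cleanup runs for every disk that was created, even when mounting or extraction fails.

```python
from typing import Callable, Dict, Iterable


def process_images(
    images: Iterable[str],
    create_disk: Callable[[str], str],
    attach_and_mount: Callable[[str], str],
    extract_files: Callable[[str], int],
    cleanup: Callable[[str], None],
) -> Dict[str, int]:
    """Run the per-image loop and return {image: extracted file count}.

    All four callables are hypothetical placeholders for the real GCP and
    shell steps. An image that fails is skipped instead of aborting the run.
    """
    results: Dict[str, int] = {}
    for image in images:
        disk = None
        try:
            disk = create_disk(image)                    # Disk Creation
            mount_point = attach_and_mount(disk)         # Mounting Process
            results[image] = extract_files(mount_point)  # File Extraction
        except Exception as exc:
            print(f"skipping {image}: {exc}")
        finally:
            if disk is not None:
                cleanup(disk)                            # Cleanup always runs
    return results
```

Passing the steps in as callables keeps the loop easy to dry-run with fakes before pointing it at thousands of real disks.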

File Detection Strategy

Working with the Truffle Security team, we compiled a comprehensive list of file extensions commonly associated with secrets. We selected the 150 most popular extensions from that list and used them to construct our find commands, which generated the file paths for the files we would actually scan with TruffleHog. This approach allowed us to bypass scanning system default directories and other irrelevant files, significantly improving the efficiency and focus of our scanning process.

In a test run, our find commands extracted over 50 MB of files from almost every image, taking 1-3 minutes per image on average. Cross-checking the extracted files against the list of file extensions confirmed that we had data for almost every file type.
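
An extension-driven find command of the kind described above could be assembled along these lines. This is only a sketch under assumptions: build_find_command is a hypothetical helper, the extension list is an illustrative subset rather than the real list of 150, and the choice of -xdev and -iname (available in GNU and BSD find) is ours, not necessarily what mount_and_scan.sh does.

```python
import shlex


def build_find_command(root, extensions, names=()):
    """Return an argv list for find that matches the given extensions or names."""
    patterns = [f"*.{ext.lstrip('.')}" for ext in extensions] + list(names)
    if not patterns:
        raise ValueError("at least one extension or name is required")
    tests = []
    for pattern in patterns:
        if tests:
            tests.append("-o")  # OR the individual name tests together
        tests += ["-iname", pattern]
    # -xdev keeps find on the mounted partition; the parentheses group the
    # OR'd tests so that -type f applies to all of them.
    return ["find", root, "-xdev", "-type", "f", "(", *tests, ")", "-print"]


if __name__ == "__main__":
    # Illustrative inputs only; the mount point and patterns are assumptions.
    argv = build_find_command("/mnt/image", ["env", "pem"], names=["web.config"])
    print(shlex.join(argv))
```

Building an argv list instead of a shell string avoids quoting problems when the command is later handed to subprocess.run.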

Technical Implementation

Our automation consisted of two main components:

Python Script: Handled GCP API interactions for disk creation, attachment, and cleanup using the Google Cloud Compute API.

Bash Script: Managed the mounting process and file extraction, with logic to handle different filesystem types (NTFS, ext4, etc.) and automatically identify mountable partitions.
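
The authors did this part in Bash; purely as an illustration, the "identify mountable partitions" step might look like the Python sketch below. It assumes the JSON shape printed by `lsblk --json -o NAME,FSTYPE,MOUNTPOINT` (a top-level "blockdevices" list with nested "children"), and both the helper name mountable_partitions and the SUPPORTED_FS set are hypothetical choices of ours.

```python
import json

# Filesystems we assume the analysis VM can mount; adjust as needed.
SUPPORTED_FS = {"ext2", "ext3", "ext4", "xfs", "btrfs", "ntfs", "vfat"}


def mountable_partitions(lsblk_json, disk_name):
    """Return [(device_path, fstype)] for unmounted, supported partitions of one disk."""
    found = []

    def walk(node):
        fstype = node.get("fstype")
        # Skip swap, LVM members, unknown types, and anything already mounted.
        if fstype in SUPPORTED_FS and not node.get("mountpoint"):
            found.append((f"/dev/{node['name']}", fstype))
        for child in node.get("children", []):
            walk(child)

    for device in json.loads(lsblk_json)["blockdevices"]:
        if device["name"] == disk_name:
            walk(device)
    return found
```

Each returned pair can then be passed to mount with a matching -t option, which is where filesystem-specific handling (for example, NTFS versus ext4) would come in.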

For those interested in the actual source code, here’s a link to a GitHub Gist containing most of the code we used for our analysis. The file named scanner.py contains the script that we ran in a GCP VM to identify, create, attach, and clean up the images. The file named mount_and_scan.sh contains the mounting and sensitive file searching logic that was run locally in the same VM.

Our file extraction process specifically targeted the following types of files:

Configuration files (.env, .config, web.config)
Authentication directories (.aws, .ssh)
Development files (.git directories, various code files)
Database and application configuration files
...and many more

As with our AWS research, we generated a txt file containing absolute file paths for the targeted files for further analysis. In total, we gathered over 100 GB of files from 8,437 images.

Next, we downloaded each file locally and scanned for secrets using TruffleHog, our companion for previous research activities.

This was our TruffleHog command:

Results and Analysis

Processing Statistics

Total images retrieved: 12,893
Successfully processed images: 8,437
Total files extracted: 147,945,272 files
Data volume: Over 100 GB of potentially sensitive files
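
As a quick sanity check on these figures, the short calculation below derives the share of retrieved images that were processed and the average number of extracted files per processed image. It uses only the numbers listed above; treating every extracted file as belonging to a successfully processed image is our assumption.

```python
total_retrieved = 12_893
processed = 8_437
files_extracted = 147_945_272

# Share of retrieved images that made it through the whole pipeline.
success_rate = processed / total_retrieved

# Average extracted files per processed image (assumes all extracted
# files came from successfully processed images).
avg_files_per_image = files_extracted / processed

print(f"processed {success_rate:.1%} of retrieved images")
print(f"about {avg_files_per_image:,.0f} files per processed image")
```

That works out to roughly two thirds of the retrieved images being processed, with an average in the tens of thousands of files per image.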

Secret Detection Results

Despite our comprehensive approach and the massive volume of data collected, our analysis yielded a surprising result: zero secrets were identified.

We ran multiple scans with different TruffleHog configurations and conducted manual verification of the automation process. The consistent lack of findings suggests that GCP maintains strict validation policies for images before making them available to customers.

Technology Distribution

Our analysis revealed the most common technologies found across GCP images:

Technology	File Count
Systemd	2,499,508
SSL/TLS	1,437,697
Shell	761,673
Web	565,624
Python	387,535
SSH	81,468
Database	73,279
C/C++	47,429
MySQL/MariaDB	21,714
PostgreSQL	18,386

File Extension Analysis

The most prevalent file extensions provided insight into the composition of GCP images:

Extension	Count
.file	39,270,948
.xz	12,391,397
.dirtree	6,151,256
.mo	3,487,962
.0	1,964,226
(no extension)	57,188,148
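
A tally like the one above can be produced from a list of absolute file paths with a few lines of Python. This is a hedged sketch rather than the authors' analysis code: count_extensions is a hypothetical helper, and lower-casing suffixes and treating dotfiles such as .env as having no extension (the pathlib convention) are our choices.

```python
from collections import Counter
from pathlib import PurePosixPath


def count_extensions(paths):
    """Tally lower-cased file extensions across an iterable of absolute paths."""
    counts = Counter()
    for raw in paths:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines in the path list
        suffix = PurePosixPath(line).suffix.lower()
        counts[suffix or "(no extension)"] += 1
    return counts


if __name__ == "__main__":
    # Illustrative paths only.
    demo = ["/etc/ssl/certs/abc.0", "/usr/share/locale/de/app.mo", "/root/.env"]
    for ext, n in count_extensions(demo).most_common():
        print(ext, n)
```

Because the function accepts any iterable of strings, an open file handle on the path list can be passed directly, so the full list never has to be held in memory.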

Image Diversity

Smallest image: fedora-coreos-41-20240916-1-0-gcp-aarch64 with 1,009 files
Largest image: fedora-coreos-41-20240327-91-0-gcp-aarch64 with 64,527 files

Limitations and Future Research

Our research intentionally excluded images published under paid licenses due to budget constraints. These commercial images might represent the most promising avenue for future secret discovery research on GCP.

The restriction to marketplace vendors and approved publishers likely contributes significantly to the absence of secrets in publicly available images, as these entities presumably undergo more rigorous vetting processes.

Cross-Platform Comparison

Our comprehensive research across all three major cloud providers revealed a clear trend:

AWS: Highest number of secrets due to open public image creation and a large user base
Azure: Moderate findings with a few critical secrets identified
GCP: Zero secrets found, likely due to strict publication policies

This progression aligns with each platform's approach to public image management, from AWS's relatively open system to GCP's curated marketplace model.

Conclusions

While we expected a low rate of secret discovery in GCP images, the complete absence of findings was unexpected and noteworthy. This result highlights the effectiveness of GCP's validation policies and suggests that their curated approach to public images provides significant security benefits. Perhaps part of their curation process involves running TruffleHog against the images?!?

This research further demonstrates that negative results can be just as valuable as positive findings in cybersecurity research. The absence of secrets in GCP's public images provides important context for understanding the security posture across cloud platforms and validates the effectiveness of restrictive publication policies.

Acknowledgments

We want to give a shoutout to Truffle Security for their initiative in supporting this research and for building TruffleHog, an amazing tool for secret detection that made this comprehensive analysis possible.

This research completes our trilogy of public cloud image analysis across AWS, Azure, and GCP. While the topic has satisfied our curiosity, we encourage others to conduct periodic research to monitor the evolving state of secrets in public cloud images.

And as always, feel free to contact us if you have questions about the research!