Haoxi Tan

The Dig

December 13, 2023

Why did 1 GitHub Repo leak 5,000 Live GCP Keys?

Why did more GCP (Google Cloud Platform) keys leak onto GitHub in 2022 than any other key type? The answer is perhaps unsurprisingly related to the ebb and flow of cryptocurrency mining, and its relative popularity in 2022.

Investigating Cloud Abuse and Crypto-mining on GitHub

A few months ago, we tweeted that we had observed a disproportionately large number of GCP keys leaking in GitHub repositories compared to other cloud providers. Then, suddenly, GCP key leakage dropped to the same levels we see for AWS. We dug into the details and came across one GitHub repo (https://github.com/DinhPhuocLong/caodangbaoloc) that had a whopping 5,000+ unique (and previously live) GCP keys.

Screenshot of Caodangbaoloc Repo

The repo had 1 commit, 15 account folders and more than 5,000 GCP keys.

Each account folder had the same structure:

├── plot
├── upload
├── vps01
│ ├── setup
│ ├── 1.json
│ ├── 10.json
│ ├── 11.json
│ ├── 12.json
│ ├── 13.json

├── vps02
│ ├── setup
│ ├── 1.json
│ ├── 10.json
│ ├── 11.json

There was always a plot file, an upload file, and a list of numbered vps folders.

Each vps folder had a setup file and several JSON files. The JSON files contained GCP service account private keys. For example:

{
  "type": "service_account",
  "project_id": "saf-0nk65uhs96izhcct9t-qm9n0bl",
  "private_key_id": "f0100940013622fe2dcc75b90bbad90c7f4e8fa7",
  "private_key": "-----BEGIN PRIVATE KEY-----\n…..(SNIPPED)...\n-----END PRIVATE KEY-----\n",
  "client_email": "mfc-be4yybt58c2ge4t8lkc5fxl5at@saf-0nk65uhs96izhcct9t-qm9n0bl.iam.gserviceaccount.com",
  "client_id": "102752925430184835432",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/mfc-be4yybt58c2ge4t8lkc5fxl5at%40saf-0nk65uhs96izhcct9t-qm9n0bl.iam.gserviceaccount.com"
}
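
Files of this shape are easy to recognize mechanically. As a minimal sketch (not the tooling used in this investigation), the following Python walks a checked-out copy of a repository, picks out JSON files that look like service account keys, and deduplicates them by private_key_id. The function names and the choice of required fields are assumptions made for illustration.

```python
import json
from pathlib import Path

# Fields we assume every exported service account key contains
# (taken from the example above; the exact list is an illustrative choice).
REQUIRED_FIELDS = {"type", "project_id", "private_key_id", "private_key", "client_email"}


def looks_like_service_account_key(obj):
    """Return True if a parsed JSON value has the shape of a GCP service account key."""
    return (
        isinstance(obj, dict)
        and obj.get("type") == "service_account"
        and REQUIRED_FIELDS.issubset(obj)
    )


def collect_unique_keys(repo_root):
    """Map private_key_id -> parsed key for every key-shaped JSON file under repo_root."""
    keys = {}
    for path in Path(repo_root).rglob("*.json"):
        try:
            obj = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or not valid JSON; skip it
        if looks_like_service_account_key(obj):
            keys[obj["private_key_id"]] = obj
    return keys
```

Calling len(collect_unique_keys(path_to_checkout)) would then give the number of unique keys found under that directory.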

The Setup Files

The vps prefix in the folder names very likely stands for Virtual Private Server, a common term for virtual compute instances in the cloud.

Each account folder’s setup script contained bash commands seemingly designed to execute upon login to the relevant VPS.

Screenshot of Setup Script Rclone Config

The setup script installed tools such as rclone and chia-plotter. Chia-plotter is the equivalent of xmrig (arguably one of the most popular pieces of cryptocurrency mining software) for the Chia network (a blockchain and digital currency network).

Screenshot of Setup Script File System Formatting

The setup script also reformatted file systems on multiple devices (setting up software RAID across multiple NVMe drives), and assumed the system had 110GB of RAM available to create a tmpfs on /mnt/ram. This implies the setup script ran on very expensive dedicated cloud instances.

The plot twist

Mining Chia coins, unlike Bitcoin (which uses Proof of Work) or Ethereum (which uses Proof of Stake), uses Proof of Space and Time. In other words, it’s not just about how fast you can compute or how many coins you already have; the Chia network requires a massive amount of storage space for a miner to be competitive.

This might be why the operator used Google. Google Workspace Business Plus and Enterprise plans come with at least 5TB of storage space per user.

Screenshot of Google Workspace Drive Storage Limits

During a trial period, this storage could potentially be free.
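
To get a rough sense of scale, here is a back-of-the-envelope calculation in Python. The plot size is an assumption: the repo does not state it, so the sketch uses the commonly cited size of a k=32 Chia plot (about 101.4 GiB) and reads 5TB as 5 × 10^12 bytes.

```python
# Assumption: a k=32 plot of roughly 101.4 GiB; the repo does not state the plot size.
PLOT_BYTES = int(101.4 * 2**30)
# The 5TB per-user figure from the plans above, read as decimal terabytes.
QUOTA_BYTES_PER_USER = 5 * 10**12


def plots_that_fit(quota_bytes, plot_bytes=PLOT_BYTES):
    """Number of whole plots that fit in a storage quota."""
    return quota_bytes // plot_bytes


per_user = plots_that_fit(QUOTA_BYTES_PER_USER)
print(f"{per_user} plots fit in one 5TB user quota")
```

Multiplying the per-user figure by the number of users in a workspace gives an estimate of the total farmable space under these assumptions.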

After the setup script configured rclone for each of its 40 GCP credentials, it started the plotting process. 

Setup Script Starting Chia Plotting

Plotting is one of the two main processes required to mine Chia coin.

Chia Mining Process

The plot script invoked the chia-plotter tool with several flags:

Screenshot of the Plot Script

  • The -f flag specified the actor’s farmer key

  • The -p flag specified the (mining) pool key

  • The -n flag specified the quantity of plots being created. Setting this value to 9999 indicates that the actor attempted to create as many plots as possible.

  • The -r flag specified the number of threads running chia-plotter. Setting this value to 32 confirms this script ran on high-capacity systems (and not your average single-core $5 VPS).

  • The -2 flag specified a temporary directory where most of the plotting write operations occur. This was set to a drive corresponding to 110GB of RAM.

  • The -t flag specified another temporary directory where some of the plotting write operations occur. This was set to the mounted NVMe RAID device. 

This combination of command-line arguments indicates that the actor intended to plot Chia as efficiently as possible on many very high-capacity VPS nodes.
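
Put together, the flags described above amount to an invocation along the following lines. This is a sketch in Python that only assembles the argument list; it is not the actor's script. The executable name chia_plot, the key placeholders, and the /mnt/nvme/ mount point are assumptions for illustration; only /mnt/ram, 9999, and 32 come from the repo.

```python
def build_plot_command(farmer_key, pool_key, ram_dir, nvme_dir,
                       plot_count=9999, threads=32, binary="chia_plot"):
    """Assemble the argument list for a plotting run using the flags described above."""
    return [
        binary,                 # assumed executable name
        "-f", farmer_key,       # farmer key
        "-p", pool_key,         # pool key
        "-n", str(plot_count),  # number of plots to create
        "-r", str(threads),     # number of threads
        "-2", ram_dir,          # temporary directory on the RAM-backed tmpfs
        "-t", nvme_dir,         # temporary directory on the NVMe RAID device
    ]


# Placeholder values: the real keys are not reproduced here, and the
# NVMe mount point is hypothetical.
cmd = build_plot_command("<FARMER_KEY>", "<POOL_KEY>", "/mnt/ram/", "/mnt/nvme/")
print(" ".join(cmd))
```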

Next, the setup script created a cron job that runs the upload script every second.

Screenshot of the Upload Script

The upload script checked for completed plots and then uploaded them to Google Drive using the previously created rclone configurations, taking care not to exceed the usage limit on each drive. It looped through the 40 credential pairs in the $SAID variable (most likely representing 40 different users in the same workspace), uploading files in a round-robin fashion.

Screenshot of Upload Script Running Rclone to Upload
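
The round-robin logic can be sketched in a few lines of Python. This is an illustration of the technique, not a transcription of the upload script: the function name, the remote names, and the idea of tracking usage in a dictionary are all assumptions, and the limit is left as a parameter because the script's actual threshold is not reproduced here.

```python
from itertools import cycle


def assign_uploads(plot_sizes, remotes, limit_bytes):
    """Assign each plot to a remote in round-robin order, skipping remotes
    that would exceed limit_bytes. Returns a list of (plot_index, remote or None)."""
    used = {remote: 0 for remote in remotes}
    rotation = cycle(remotes)
    assignments = []
    for index, size in enumerate(plot_sizes):
        chosen = None
        # Try each remote at most once, continuing from where the rotation left off.
        for _ in range(len(remotes)):
            remote = next(rotation)
            if used[remote] + size <= limit_bytes:
                used[remote] += size
                chosen = remote
                break
        assignments.append((index, chosen))  # None means every remote is full
    return assignments
```

Each chosen remote would then be handed to the upload tool (rclone, in the actor's case); a plot that comes back with None simply waits until a drive has room again.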

Compromised or registered?

A threat report from Google in 2023 states that 70% of compromised GCP assets were used to mine cryptocurrency. It’s an easy way to quickly monetize compromised assets, but the way that this repo was structured suggests that it more than likely used legitimate (not compromised) Google credentials for abuse.

A quick statistical check of the project_id values in the credentials shows a very even spread across projects. This means either that the operator compromised highly privileged GCP credentials and used them to create these resources automatically (which we saw no proof of in this repo, as all GCP keys were used only for Google Drive access), or that they registered their own Google Cloud account.
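
A check of this kind could look like the following Python sketch. It is not the script used for this post; keys is assumed to be a list of parsed key files, each a dict with a project_id field, and both function names are hypothetical.

```python
from collections import Counter


def project_spread(keys):
    """Count how many keys belong to each project_id."""
    return Counter(key["project_id"] for key in keys)


def spread_summary(counts):
    """Return (number of projects, fewest keys per project, most keys per project)."""
    if not counts:
        return (0, 0, 0)
    return (len(counts), min(counts.values()), max(counts.values()))
```

When the smallest and largest per-project counts are close together, the keys are spread evenly, which is what you would expect from automated provisioning rather than from a grab bag of stolen credentials.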

Either way, according to a TruffleHog scan, the GCP keys are no longer live.

Running TruffleHog Against the GitHub Repo

But why?

Did this repository even need to be public? We are not sure. It’s very likely that leaving their GCP secrets publicly exposed on GitHub accelerated the compromise of this operation. The entire crypto-mining operation could have been run in private.

We assumed the name of the repo and its author are Vietnamese, so we Google translated the repo name “caodangbaoloc” in different variations:

A Vietnamese-speaking friend of the author confirmed that the most likely meaning is that it refers to a college called Bao Loc. How long, indeed, is the (Chia network block) height? Perhaps it was just a college project to impress friends or to study the Chia network. There is no way to tell how much of a fortune Phuoc Long amassed, unless a crypto expert can somehow derive their Chia transactions from the farmer and pool keys. That’ll be left as an exercise for the reader.

The Dig

Thoughts, research findings, reports, and more from Truffle Security Co.