Guest User

The Dig

August 3, 2021

An API Worm In The Making: Thousands Of Secrets Found In Open S3 Buckets


Background

S3 buckets are a common place to store files in AWS. These buckets have a feature that allows you to make your files readable by anyone on the internet without authentication. If the content is meant for public consumption, like HTML, CSS, and JS assets for a website, this feature can be really useful, but it’s a double-edged sword. Frequently, these files contain sensitive information, which has caused several high-profile security incidents.

Typically the data exposed is the end of the reported story, but we’ve found it’s often not the end of the security story. Since we recently added S3 support to TruffleHog, we thought scanning the set of publicly exposed buckets for credentials would be a great way to get ahead of potential security incidents, and we ended up finding thousands of distinct secrets spanning hundreds of customers.

Methodology

The first thing we needed to do was compile a list of open S3 buckets. Luckily, bucket names are globally unique and can be specified as a subdomain. For example, if a bucket were named “trufflehogbucket”, its files could be accessed at https://trufflehogbucket.s3.amazonaws.com/filename. Because DNS traffic is typically unencrypted, many bucket names are collected by DNS taps. Some vendors, like RiskIQ, expose this data via their PassiveTotal API.
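As a rough illustration of the subdomain convention described above, here is a minimal Python sketch. The function names are hypothetical, and it handles only the global `s3.amazonaws.com` hostname form used in the example, not regional endpoints. It is not how TruffleHog or any DNS tap vendor actually works.

```python
from typing import Optional
from urllib.parse import quote

# Assumption: only the global virtual-hosted-style suffix is handled here.
S3_SUFFIX = ".s3.amazonaws.com"


def object_url(bucket: str, key: str) -> str:
    """Build the public URL for an object in a bucket (hypothetical helper)."""
    return f"https://{bucket}{S3_SUFFIX}/{quote(key)}"


def bucket_from_hostname(hostname: str) -> Optional[str]:
    """Recover a bucket name from a hostname such as one seen in DNS traffic."""
    host = hostname.rstrip(".").lower()  # DNS names may carry a trailing dot
    if host.endswith(S3_SUFFIX) and len(host) > len(S3_SUFFIX):
        return host[: -len(S3_SUFFIX)]
    return None
```

For instance, `bucket_from_hostname("trufflehogbucket.s3.amazonaws.com")` yields the bucket name, while an unrelated hostname yields `None`.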



Other tools like grayhatwarfare take a different approach: they generate large lists of likely bucket names and make requests to the S3 API to determine whether each bucket exists and contains publicly exposed files. Using these and other techniques, we built our initial list of buckets. Scanning all of the exposed data quickly grew impractical, so we needed a way to narrow the list to buckets and files likely to contain secrets. Fortunately, grayhatwarfare’s API also allows you to search file names, so we searched for common names like ‘.credentials’, ‘.env’, etc. and only scanned buckets containing matching files.
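The narrowing step can be sketched as a simple file name filter. This is an illustrative sketch only: it assumes the bucket listings have already been fetched into a dictionary, and it does not call the grayhatwarfare API. The names `INTERESTING_NAMES`, `is_candidate`, and `buckets_to_scan` are hypothetical.

```python
import posixpath
from typing import Dict, List

# File names mentioned in the post; a real list would likely be longer.
INTERESTING_NAMES = {".env", ".credentials"}


def is_candidate(key: str) -> bool:
    """True if the object's base name is one of the interesting file names."""
    return posixpath.basename(key).lower() in INTERESTING_NAMES


def buckets_to_scan(listing: Dict[str, List[str]]) -> List[str]:
    """Keep only buckets that contain at least one candidate file."""
    return sorted(
        bucket
        for bucket, keys in listing.items()
        if any(is_candidate(key) for key in keys)
    )
```

Matching on the base name rather than the full key means a file like `app/config/.env` is caught no matter how deeply it is nested.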

Results

After scanning approximately 4,000 buckets containing .env and .credentials files, we found that a file containing secrets held 2.5 secrets on average, with some files holding 10 or more.


Secret results
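To make the per-file average concrete, here is a deliberately naive sketch of counting secret-like entries in .env files. It is not TruffleHog's detection logic. The variable-name heuristic is an assumption made for illustration, and the only format-specific check is the well-known `AKIA` prefix of AWS access key IDs.

```python
import re
from statistics import mean
from typing import Iterable

# AWS access key IDs are 20 characters long and commonly start with "AKIA".
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
# Assumption: variable names like these usually hold a secret value.
SUSPICIOUS_NAME = re.compile(r"SECRET|TOKEN|PASSWORD|API_KEY", re.IGNORECASE)


def count_secrets(env_text: str) -> int:
    """Count NAME=value lines that look like they hold a secret."""
    count = 0
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        value = value.strip().strip("'\"")
        if value and (AWS_KEY_ID.search(value) or SUSPICIOUS_NAME.search(name)):
            count += 1
    return count


def average_secrets_per_file(files: Iterable[str]) -> float:
    """Average over files that contain at least one secret, as in the post."""
    counts = [c for c in map(count_secrets, files) if c > 0]
    return mean(counts) if counts else 0.0
```

Note that the average is taken only over files that contain at least one secret, which matches how the figure above is phrased.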


We also found a wide variety of credential types, including:

  • AWS Keys

  • GCP service accounts

  • Azure Blob Storage connection strings

  • Coinbase API keys

  • Twilio API keys

  • Mailgun API keys

  • RDS passwords

  • Sendgrid credentials

  • Pusher credentials

  • MSSQL passwords

  • Mailtrap credentials

  • Google OAuth credentials

  • Twitter OAuth credentials

  • LinkedIn OAuth credentials

  • Google Maps API keys

  • Segment API keys

  • Sauce API keys

  • Hosted MongoDB credentials

  • Firebase credentials

  • Stripe credentials

  • Rollbar credentials

  • Twilio credentials

  • Amplitude credentials

  • Mailjet credentials

  • SMS partner credentials

  • Dropbox credentials

  • Yousign credentials

  • PayPal credentials

  • Mandrill credentials

  • Zendesk credentials

  • Hosted message queue connection strings

  • Razorpay credentials

  • Textlocal credentials

  • Application signing secrets

  • JWT signing secrets

Impact Magnifier: Wormability

It’s clear from the surrounding context that many of these credentials unlock additional buckets that require authentication. Here are two examples:


Leaked credentials leading to more buckets


Leaked credentials leading to more buckets


It’s probably fair to assume authenticated buckets contain more secrets than unauthenticated ones, due to the higher security bar that authentication implies. This means attackers can likely use the first round of buckets to find keys that unlock an additional round of buckets and expose more keys, which could expose more buckets, and so on. We did not use any of these keys or explore this possibility, for obvious reasons, but it makes this type of attack “wormable”, i.e., one bucket can lead to another bucket, and so on, magnifying the impact of the leak.


Worming through S3 buckets
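The chain reaction described above can be modeled as a graph traversal. The sketch below is a toy model over made-up, in-memory data: it makes no network requests and uses no real credentials, and the names `reachable_buckets`, `keys_in_bucket`, and `buckets_for_key` are hypothetical. Its only purpose is to show how one open bucket can transitively expose others.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def reachable_buckets(
    open_buckets: Iterable[str],
    keys_in_bucket: Dict[str, List[str]],
    buckets_for_key: Dict[str, List[str]],
) -> Set[str]:
    """Breadth-first search: bucket -> leaked keys -> buckets those keys unlock."""
    seen_buckets = set(open_buckets)
    seen_keys: Set[str] = set()
    queue = deque(seen_buckets)
    while queue:
        bucket = queue.popleft()
        for key in keys_in_bucket.get(bucket, ()):
            if key in seen_keys:
                continue  # each key only needs to be followed once
            seen_keys.add(key)
            for unlocked in buckets_for_key.get(key, ()):
                if unlocked not in seen_buckets:
                    seen_buckets.add(unlocked)
                    queue.append(unlocked)
    return seen_buckets
```

In this model, a single open bucket holding one key that unlocks a private bucket, which in turn holds a second key, is enough to reach every bucket either key can open. Defenders can use the same traversal to estimate the blast radius of a leaked credential.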


What’s worse, some of the leaked keys, such as GitHub API keys and GCP Storage API keys, led to other large data stores that may themselves contain more keys.


Worming through multiple providers


Next Steps

Naturally, at this point we needed to disclose what we found to the affected companies. This proved challenging at times because buckets often don’t carry much information connecting them to their creators. We made hundreds of disclosures and, in some cases, partnered with providers to get keys revoked for buckets whose owners we couldn’t identify. The affected organizations ranged from dozens of Fortune 500 companies to NGOs and small startups.

At any scale, it’s a good idea to have all of your buckets scanned routinely to prevent catastrophe.
