2016-02-01

Want to revert an unapproved security group (firewall) change in Amazon Web Services in 10 seconds without any external tools? As in about 10-20 minutes faster than is typically possible with a SIEM or other external tools? Then read on…

If you follow me on Twitter you may have noticed I went a bit nuts when Amazon Web Services announced their new CloudWatch events a couple of weeks ago. I saw it as an incredibly powerful tool for event driven security. I’m going to post on the concepts tomorrow, but this is one of those times when I think it’s better to just show you how it works instead of describing general concepts. This entire thing took me about 4 hours to put together, and it was my first time writing a Lambda function or using Python in 10 years.

This example configures an AWS account to automatically revert any Security Group (firewall) changes without human interaction, using nothing but native capabilities in AWS itself. No security tools, no servers, nada. Just wiring together things already built into AWS. In my limited testing it works in 10 seconds or less and it’s only 100 lines of code, including comments. Yeah, this post is WAY longer than the code to make it all work.

I’m going to walk you through setting it up manually, but in production you would want to automate the configuration so you can manage this across multiple AWS accounts. That’s what we use Trinity for, and I’ll mention more about automating automation at the end of the post. Also, this is Amazon specific since other providers don’t yet expose the capabilities you need.

For background, it might help to read the AWS CloudWatch events launch post. The short version is you can instrument a large portion of AWS, and trigger actions based on a wide set of very granular events. And yeah, this is an example of the kind of research we are focusing on as part of our cloud pivot.

This might look long, but if you follow my instructions you can set all this up in 10-15 minutes. Tops.

Prep Work: Turn on CloudTrail

If you use AWS you should have it set up already. If not, you need to activate it and feed the logs to CloudWatch using these instructions. This only takes a minute or two if you pick all the default settings.

Step 1: Configure IAM

To make your life easier I put all my code up on the Securosis public GitHub repository. You don’t need to pull the code since you will be copying and pasting everything into the AWS console anyway.

The first step is to configure an IAM policy for the workflow, and then create a role that Lambda can assume when running the code. Lambda is a service in AWS that allows you to store and run code based on triggers. Lambda code runs in a container, but you don’t have to manage containers or servers. You load the code, then it executes based on triggers. You can build entirely serverless architectures with it, which is nice if you want to eliminate most of your attack surface, but that’s a discussion for a different day.

IAM in Amazon Web Services is how you manage who can do what in your account, including Amazon services themselves. It’s ridiculously granular and powerful, and the single most critical security tool to protect your AWS accounts.

Log into the AWS console. Go to the Identity and Access Management (IAM) dashboard.

Click on Policies then Create Policy.

Choose Create Your Own Policy.

Name it lambda_revert_security_group. Put in a description, then copy and paste my policy from GitHub. This policy allows the Lambda function to access CloudWatch logs, write to the log, and allows it to view security group information and revoke ingress or egress statements (but not create new ones). Damn I love granular policies!
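To make the shape of that policy concrete, here is a minimal sketch built from the description above. It is not a copy of the policy on GitHub, which remains the authoritative version. The statement IDs, the wildcard resources, and the helper names are my assumptions; the action names are standard IAM actions for CloudWatch Logs and EC2.

```python
import json

# Hypothetical reconstruction from the prose description, NOT the GitHub policy.
REVERT_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Lets the function create and write its own log streams.
            "Sid": "WriteFunctionLogs",
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
            ],
            "Resource": "arn:aws:logs:*:*:*",
        },
        {
            # View security groups and remove rules; no Authorize* actions,
            # so the function cannot create new rules.
            "Sid": "ViewAndRevokeSecurityGroupRules",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeSecurityGroups",
                "ec2:RevokeSecurityGroupIngress",
                "ec2:RevokeSecurityGroupEgress",
            ],
            "Resource": "*",
        },
    ],
}


def allowed_actions(policy):
    """Collect every action named in the policy's Allow statements."""
    actions = set()
    for statement in policy["Statement"]:
        if statement["Effect"] == "Allow":
            listed = statement["Action"]
            actions.update([listed] if isinstance(listed, str) else listed)
    return actions


def policy_json(policy=REVERT_POLICY):
    """Render the policy as text you could paste into the console editor."""
    return json.dumps(policy, indent=2)
```

Printing policy_json() gives you something to compare against the real policy before you paste anything into the console.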

Once the policy is set, you need to Create New Role. This is the role the Lambda function will assume when it runs, and it is what we attach the policy to.

Name it lambda_revert_security_group, assign it an AWS Lambda role type, then attach the lambda_revert_security_group policy we just created.
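If you would rather script this step than click through it, the following is a rough sketch of what picking the AWS Lambda role type amounts to: a trust policy that lets the Lambda service assume the role. The function name create_lambda_role is hypothetical, and iam_client is assumed to be a boto3 IAM client (boto3.client("iam")) that you create yourself.

```python
import json

# Trust policy that allows the Lambda service to assume the role.
LAMBDA_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def create_lambda_role(iam_client, role_name, policy_arn):
    """Create the role, then attach the managed policy to it.

    iam_client is assumed to be boto3.client("iam"); policy_arn is the ARN
    of the lambda_revert_security_group policy in your own account.
    """
    iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(LAMBDA_TRUST_POLICY),
    )
    iam_client.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
```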

That’s it for the IAM changes. Now we need to set up the Lambda function and the CloudWatch event.

Step 2: Create the Lambda function

First of all, make sure you know what AWS region you are working in. I prefer us-west-2 (Oregon) for lab work since it is up to date and tends to support new capabilities pretty early. us-east-1 is the granddaddy of regions, but my lab account has so much cruft after 6+ years that things don’t always work right for me there.

Go to Lambda (it’s under Compute on the main services page) and Create a Lambda function.

Don’t pick a blueprint… hit the skip button on the next page.

Name it revertSecurityGroup. Put in a description and then pick Python for the runtime. Then paste my code into the main window. After that, pick the lambda_revert_security_group IAM role that the function will use. Then click Next, then Create function.

A few points on Lambda. You aren’t billed until the function triggers, then you are billed per request and for how long the code runs. Lambda is really good for quick tasks, but it does have a timeout (the limit has been raised over time, so check the current documentation) and the longer you run a function, the less cost-effective it is compared to a dedicated server. I actually looked at migrating Trinity to Lambda since we could offload our workflows, but at the time it had a 5 minute timeout, and at scale running hour-long workflows would likely kill us financially.

Now some notes on my code:

The main function handler includes a bunch of conditional statements you can use to only trigger reverting a security group change based on things like who requested the change, what security group was changed, if the security group is in a specified VPC, or if the security group has a particular tag. None of those lines will work since they refer to specific identifiers in my account, so you need to change them to work in your account.

By default, the function will revert any security group change in your account. You need to cut and paste the line “revert_security_group(event)” into a conditional block to only run it based on conditions.
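As an illustration of what such a conditional block might look like, here is a small sketch that is separate from the code in the repository. The names should_revert and handle, and both parameters, are hypothetical. The event field paths (detail.requestParameters.groupId and detail.userIdentity.arn) are my assumption about how CloudWatch Events delivers the CloudTrail record, so check them against a real event in your logs.

```python
def should_revert(event, protected_groups=frozenset(), trusted_arns=frozenset()):
    """Decide whether a security group change should be rolled back.

    protected_groups: if non-empty, only changes to these group IDs are reverted.
    trusted_arns: changes requested by these identities are left alone.
    """
    detail = event.get("detail") or {}
    group_id = (detail.get("requestParameters") or {}).get("groupId")
    requester = (detail.get("userIdentity") or {}).get("arn")

    if requester in trusted_arns:
        return False
    if protected_groups and group_id not in protected_groups:
        return False
    return True


def handle(event, revert, **conditions):
    """Call revert(event) only when the conditions say so.

    revert stands in for the repository's revert_security_group function.
    """
    if should_revert(event, **conditions):
        revert(event)
        return True
    return False
```

With no conditions supplied this behaves like the default described above and reverts everything.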

The function only works for inbound rule changes. It’s trivial to modify it to work for egress rule changes, or run it to restrict both ingress and egress. The IAM policy we set will work for both, you just need to change the code.
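One way to handle both directions, sketched here as an assumption rather than as the repository's approach, is to pick the revoke call based on the name of the API call that triggered the event. The helper name revoke_for_event is hypothetical. ec2_client is assumed to be a boto3 EC2 client, and ip_permissions is assumed to already be in the format boto3 expects.

```python
# Maps the triggering API call to the boto3 EC2 client method that undoes it.
REVOKE_METHOD_FOR_EVENT = {
    "AuthorizeSecurityGroupIngress": "revoke_security_group_ingress",
    "AuthorizeSecurityGroupEgress": "revoke_security_group_egress",
}


def revoke_for_event(ec2_client, event_name, group_id, ip_permissions):
    """Undo an authorize call; return False for events we do not handle."""
    method_name = REVOKE_METHOD_FOR_EVENT.get(event_name)
    if method_name is None:
        return False
    revoke = getattr(ec2_client, method_name)
    revoke(GroupId=group_id, IpPermissions=ip_permissions)
    return True
```

If you go this route, remember to add AuthorizeSecurityGroupEgress to the event rule as well, or the function will never see egress changes.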

This only works for EC2-VPC. EC2-Classic works differently, and my code won’t parse an EC2-Classic API call.

The code pulls the event details, finds the changes (which could be multiple changes submitted at the same time) and reverses them.
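The "find the changes" part is mostly a translation exercise, so here is a minimal sketch of it. This is my own illustration, not the code from the repository. It assumes the CloudTrail record wraps lists as {"items": [...]} and uses lower camel case keys such as ipProtocol and cidrIp; confirm that against an event in your own CloudWatch log. The function names are hypothetical.

```python
def _items(container):
    """Unwrap a CloudTrail-style list; empty lists are assumed to appear as {}."""
    return (container or {}).get("items", [])


def to_boto3_permissions(event):
    """Return (group_id, permissions) in the shape boto3's revoke calls accept."""
    params = event["detail"]["requestParameters"]
    permissions = []
    for rule in _items(params.get("ipPermissions")):
        permission = {"IpProtocol": rule["ipProtocol"]}
        # Copy port fields only when the record includes them.
        for source_key, target_key in (("fromPort", "FromPort"), ("toPort", "ToPort")):
            if source_key in rule:
                permission[target_key] = rule[source_key]
        # IP address based rules.
        cidrs = [{"CidrIp": r["cidrIp"]} for r in _items(rule.get("ipRanges"))]
        # Security group to security group rules.
        groups = [{"GroupId": g["groupId"]} for g in _items(rule.get("groups"))]
        if cidrs:
            permission["IpRanges"] = cidrs
        if groups:
            permission["UserIdGroupPairs"] = groups
        permissions.append(permission)
    return params["groupId"], permissions
```

The returned pair can be passed to the EC2 client's revoke_security_group_ingress call as GroupId and IpPermissions.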

There may be ways around this. I ran through it over the weekend and tested multiple ways of making an EC2-VPC security group change and it always worked, but there might be a way I don’t know about that would change the log format enough that my code won’t work. Later I plan to update it to work with EC2-Classic, but since I never use that (neither does Securosis) and we advise our clients not to use it, that’s low on the priority list. If you find a hole, please drop me a line.

This works for internal (security group to security group) changes as well as external or internal IP address based rules.

Step 3: Configure the CloudWatch Event trigger

CloudWatch is Amazon’s built-in logging service. You can’t turn it off, since it’s the same tool AWS uses to monitor and manage the performance of your instances and services. CloudWatch Logs is a newer feature you can use to store various log streams, including CloudTrail, the service that records all API calls on your account (even internal AWS calls).

Go to CloudWatch, then Events, then Create rule.

In the Event selector > Select event source pick AWS API call. This only works with CloudTrail turned on.

Pick EC2 for the Service name. Then click Specific operation(s), then AuthorizeSecurityGroupIngress. You could also add egress if you want.

For Targets pick Add target then Lambda function then select the one we just created. If you have a notification function you could add it here and get a text message or email whenever it runs, or send an alert to your SIEM.

Then name it. It’s active by default.
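For reference, the console steps above boil down to an event pattern plus a target. The sketch below shows roughly what that looks like if you script it instead. The function name create_revert_rule and the target ID are hypothetical, and events_client is assumed to be a boto3 CloudWatch Events client (boto3.client("events")).

```python
import json

# Matches EC2 API calls recorded by CloudTrail that add an ingress rule.
EVENT_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["ec2.amazonaws.com"],
        "eventName": ["AuthorizeSecurityGroupIngress"],
    },
}


def create_revert_rule(events_client, rule_name, function_arn):
    """Create an enabled rule and point it at the Lambda function.

    function_arn is the ARN of the revertSecurityGroup function in your account.
    """
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(EVENT_PATTERN),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "revert-security-group", "Arn": function_arn}],
    )
```

One difference from the console: when you create the rule through the API, you also have to grant events.amazonaws.com permission to invoke the function yourself. That step is not shown here.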

Now test it. Go into the console, make a security group change, wait about 10 seconds, then refresh the console. Your changes should be gone. You can also look in the CloudWatch log to see what happened, including the details of the API call and how the function executed.

Automating for Scale

Now this may only take 10-15 minutes if you have the code and know the process, but imagine configuring all this on hundreds or thousands of accounts at a time, which is typical for a mid or large sized organization with a lot of AWS projects.

To scale this up you need to create a new account deployment package. That’s what we use Trinity for (okay, that’s what I’m currently coding into Trinity, for our internal use right now). The idea is when you provision an account you hook into it and blast out all the configurations, settings, Lambda functions, etc. using automation code.

In last year’s Black Hat training we demonstrated that with demo code to configure alerts on IAM changes via CloudTrail and CloudWatch. The plan is to go into more detail in our new Advanced Cloud Security and Applied DevOps class this summer.

It really isn’t all that complex. Once you spend time on your cloud platform of choice and learn some basic coding via the APIs, the rest is pretty easy. It’s just basic check-a-setting, make-a-change stuff, no complex math or crazy decision trees (for the most part).

This is insanely exciting stuff, because we, as security professionals, can now directly manage, monitor, and manipulate our infrastructure using the exact same tools as development and operations. The infrastructure itself can identify and fix configuration or other issues, including security issues, faster than a person or (most) external tools.

Try it out. It’s easy to get started, and with pretty minimal work you could make my sample code work for a whole host of different situations beyond basic firewalls.

- Rich

Event Driven Security on AWS: A Practical Example