Serverless Tracking of AWS VPC Information

If you have many AWS accounts within an Organization, you likely have concerns around tracking VPC (Virtual Private Cloud – basically networking in AWS) configuration information. Let’s take a look at a hypothetical yet pretty real-world scenario: Account A is the Data Lake. Account B is a consumer of said Data Lake. In order for B to consume A, the two need to communicate, and that communication is of the TCP/IP variety. Too bad the TCP/IP stack does not like overlapping IP spaces… wonder why? Imagine two houses on the same street with the same physical address – the already confused mailman would be even more confused, and your package likely wouldn’t get delivered.

One of the best parts of being an AWS Technical Account Manager and Enterprise Support Lead is that I get to work hand-in-hand with a super small subset of AWS’ largest customers. I’ve been working with my customer’s CCOE for the better part of 2019, and we have been making huge strides in defining their landing zone strategy. Part of their account deployment pipeline is laying down the groundwork:

  • Delete the default VPC
  • Create a new VPC
  • Provide a CIDR range broad enough to meet the growth needs of the account
  • Create subnets across AZs
  • Track VPC usage – now that we’re well over 100 AWS accounts, we’re tackling this task

The solution must be easy to manage at scale, reliable, and pretty much hands-off as the environment continues to grow. OK, seems reasonable.

VPCTracker via AWS Serverless

The following AWS services were used in this VPCTracker project. I would venture to guess that all of this fits into the Free Tier as well.

  1. DynamoDB (Persistent Datastore)
  2. Lambda (Boto3 / Python 3.8)
  3. CloudWatch Event(s)

The DynamoDB table is pretty straightforward – since it’s holding very simple VPC information, I went ahead and provisioned an auto-scaling table with 5 Write Capacity Units / 5 Read Capacity Units. I set the Partition Key to VpcId, as it’s a unique identifier, but I also set the Sort Key on the table to OwnerId – aka the AWS account number. This will make querying and sorting on account numbers much easier down the road.
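As a sketch, the table setup described above could be created with boto3 (the table name VPCTracker is my choice; note that auto-scaling itself is registered separately through Application Auto Scaling – the console handles that for you when you tick the box):

```python
# Table definition: VpcId as the partition (HASH) key, OwnerId as the sort (RANGE) key.
# The 5/5 capacity units match the provisioning described above.
TABLE_DEF = {
    "TableName": "VPCTracker",
    "KeySchema": [
        {"AttributeName": "VpcId", "KeyType": "HASH"},     # partition key
        {"AttributeName": "OwnerId", "KeyType": "RANGE"},  # sort key (AWS account number)
    ],
    "AttributeDefinitions": [
        {"AttributeName": "VpcId", "AttributeType": "S"},
        {"AttributeName": "OwnerId", "AttributeType": "S"},
    ],
    "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
}

def create_tracker_table():
    # boto3 is imported lazily so the definition above can be read without AWS access
    import boto3
    return boto3.client("dynamodb").create_table(**TABLE_DEF)
```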

There are many reasons why DynamoDB is the optimal database engine here – first, DDB is a simple NoSQL key-value database that scales horizontally (out), provides superb performance, and is FULLY MANAGED. Huge fan of Dynamo, and this is a fantastic use case!

These are the items we will be populating in our Dynamo table.

  • VpcId
  • OwnerId
  • CidrBlock
  • IsDefault (Like to report on to ensure default VPC has been deleted)
  • State

Tackling the Python

I am definitely an aspiring code junkie – quite honestly, it’s one of my favorite parts of the modern-day IT professional’s toolkit. I use VSCode as my IDE (curious if you’re using something else? Share in the comments!)

This is where I’ve spent the large majority of the project – let’s break it down.

The VPC resource is a component of the EC2 service, and the method we’re after is describe_vpcs, as it provides all of the details we need for our VPCTracker.
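A minimal sketch of that step: call describe_vpcs on the EC2 client and save out just the fields we track (the helper name extract_vpc_items is mine, not from the original code):

```python
def extract_vpc_items(response):
    """Save out only the attributes we track from a describe_vpcs response."""
    items = []
    for vpc in response.get("Vpcs", []):
        items.append({
            "VpcId": vpc["VpcId"],
            "OwnerId": vpc["OwnerId"],        # the AWS account number
            "CidrBlock": vpc["CidrBlock"],
            "IsDefault": vpc["IsDefault"],    # should be False once the default VPC is gone
            "State": vpc["State"],
        })
        # Print along the way while testing – an easy sanity check on each value
        print(items[-1])
    return items

def fetch_vpc_items():
    import boto3  # lazy import; only needed when actually calling AWS
    ec2 = boto3.client("ec2")
    return extract_vpc_items(ec2.describe_vpcs())
```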

Above, we allocate our variables and pull the necessary information out of the describe_vpcs response. When testing, it’s always a best practice to print everything out along the way so you can easily see the values assigned to your variables – always a good sanity check.

If you were to print the entire response of describe_vpcs, it would be a lot of JSON, and while it’s good information, most of it is useless for our project. Hence why I saved out only the specific variables of interest above, so we can use those to populate our Dynamo table.

The last section of Python initializes the DynamoDB resource within the boto3 SDK, sets the table, and finally executes put_item to write the items into the VPCTracker table. The item of note here is that put_item requires a specifically formatted Item, so having our variables at the ready makes it super easy to build the input correctly.
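A sketch of that last section, assuming the table is named VPCTracker. One design note: the higher-level Table resource lets put_item accept native Python types for Item, whereas the low-level client would require DynamoDB-typed JSON. The `table` parameter is injectable purely so the function can be exercised without AWS:

```python
def put_vpc_items(items, table=None):
    """Write each tracked VPC record into the VPCTracker table.

    `table` defaults to the real DynamoDB Table resource but is
    injectable for testing.
    """
    if table is None:
        import boto3  # lazy import; only needed when talking to AWS
        table = boto3.resource("dynamodb").Table("VPCTracker")
    for item in items:
        # The Table resource accepts a plain dict for Item
        table.put_item(Item=item)
    return len(items)
```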

Putting the Python to Work

There are a few ways to put the Python code to work:

  1. Execute it locally at runtime
  2. Execute it on a schedule in AWS Lambda
    1. I will quickly cover this and deep-dive in the next post
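As a quick preview of the Lambda route, the handler is just the same steps wrapped in an entry point. This is a sketch under my own assumptions (table named VPCTracker, clients injectable so the logic can be tested offline), not the exact code from the project:

```python
TRACKED_FIELDS = ("VpcId", "OwnerId", "CidrBlock", "IsDefault", "State")

def handler(event, context, ec2=None, table=None):
    """Lambda entry point: describe the VPCs, write them to VPCTracker.

    ec2/table default to real AWS clients; they are parameters only so
    the function can be exercised with fakes outside of Lambda.
    """
    if ec2 is None or table is None:
        import boto3  # only needed when running against real AWS
        ec2 = ec2 or boto3.client("ec2")
        table = table or boto3.resource("dynamodb").Table("VPCTracker")
    count = 0
    for vpc in ec2.describe_vpcs().get("Vpcs", []):
        # Keep only the attributes our table tracks
        table.put_item(Item={k: vpc[k] for k in TRACKED_FIELDS})
        count += 1
    return {"vpcs_written": count}
```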

Executing the Python locally takes advantage of your AWS credentials in ~/.aws/config and ~/.aws/credentials.

When you install the AWS CLI, by default you’re prompted to set up a base configuration, [default]. You can set up alternative profiles as well – in my example I have [Prod] and [Test]. This is well documented in the AWS CLI docs, so I’m not going to bore you with those details.

Simply throw this line in directly after the import, and that specific profile will be used when the script is executed.
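The usual boto3 way to pin a profile is setup_default_session (using my [Prod] profile here as the example):

```python
import boto3

# Use the [Prod] profile from ~/.aws/credentials for all subsequent clients/resources
boto3.setup_default_session(profile_name="Prod")
```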

That’s a Wrap!

In this post I walked through a serverless architecture to populate a DynamoDB table with valuable VPC configuration info. In the next post I will walk you through getting this running in AWS Lambda with a CloudWatch Events trigger.