
AWS Elastic Load Balancing – New Visibility Into Your AWS Load Balancers

03.06.2014 | Posted by Ariel Smoliar, Senior Product Manager

After the successful launch of the Sumo Logic Application for AWS CloudTrail last November, and with numerous customers now using that application, we were excited to work with a new logging service from AWS, this time providing analytics for the log files generated by AWS Elastic Load Balancers.

Our integration with AWS CloudTrail targets use cases relevant to security, usage, and operations. Our new application for AWS Elastic Load Balancing gives customers dashboards with real-time insights into operational data. You can also build additional use cases for your own requirements by parsing the log entries and visualizing the data with our visualization tools.

Insights from ELB Log Data

Sumo Logic runs natively on the AWS infrastructure and uses AWS load balancers, so we had plenty of raw data to work with while developing this content. You will find 12 fields in the ELB logs, covering the entire request/response lifecycle. By adding the request, backend, and response processing times, we can highlight the total time (latency) from when the load balancer starts reading the request headers to when it starts sending the response headers to the client. The Latency Analysis dashboard presents a granular analysis per domain, client IP, and backend (EC2) instance.
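To make the latency calculation concrete, here is a minimal sketch in Scala (outside of the Sumo Logic product) that sums the three processing-time fields of an ELB access log line; the sample line itself is made up:

object ElbLatency {
  // Fields of an ELB access log line: timestamp, elb name, client:port, backend:port,
  // request_processing_time, backend_processing_time, response_processing_time,
  // elb_status_code, backend_status_code, received_bytes, sent_bytes, "request".
  val sampleLine = "2014-03-05T23:19:59.123456Z my-loadbalancer 192.168.131.39:2817 10.0.0.1:80 " +
    "0.000073 0.001048 0.000057 200 200 0 29 \"GET http://www.example.com:80/ HTTP/1.1\""

  // Total latency = request + backend + response processing time (fields 5-7, zero-based 4-6).
  def totalLatencySeconds(logLine: String): Double = {
    val fields = logLine.split(" ")
    fields(4).toDouble + fields(5).toDouble + fields(6).toDouble
  }

  def main(args: Array[String]): Unit =
    println(f"Total latency: ${totalLatencySeconds(sampleLine)}%.6f s")  // ~0.001178 s
}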

The application also analyzes status codes, both those returned by the ELB and those returned by the backend instances. Note that the total counts for the two will usually match, unless there are issues such as no backend response or a request rejected by the client. Additionally, for ELBs configured with a TCP listener (layer 4) rather than HTTP, the TCP requests are still logged; in that case the request field shows three dashes and there are no values for the HTTP status codes.

Alerting Frequency

The topic of scheduled searches and alerting often comes up in my discussions with Sumo Logic users. Based on our work with ELB logs, there is no single threshold we can recommend that covers every use case. The threshold should be based on the application: tiny beacon requests and huge file downloads, for example, produce very different latencies. Sumo Logic gives you the flexibility to set a threshold in a scheduled search, or simply to change the color ranges in a graph for monitoring purposes, based on the expected value range.

Visualization

I want to talk a little bit about machine data visualization. While skiing last week in Steamboat, Colorado, I kept thinking about how the beautiful Rocky Mountain landscape relates to the somewhat more mundane world of load balancer data visualization. So here is what we did to present the load balancer data in a more compelling way:

[Screenshot: ELB application dashboards]

You can slice and dice the data using our Transpose operator, as we did in the Latency by Load Balancer monitor, but I would like to focus on a different feature built by our UI team and share how we used it in this application. This feature combines the number of requests, the total request size, and the client IP address into a single view: the Total Requests and Data Volume monitor.

We first used this visualization approach in our Nginx app (Traffic Volume and Bytes Served monitor). We received very positive feedback and decided it made sense to incorporate this approach into this application as well.

Combining three fields in a single view gives you a faster overview of your environment and also lets you drill down to investigate any activity.

[Screenshot: Total Requests and Data Volume monitor]

It reminds one of the landscape above, right? :-)

To get this same visualization, click on the gear icon in the Search screen and choose the Change Series option. 

[Screenshot: Change Series option]

For each data series, you can choose how you would like to represent the data. We used Column Chart for the total requests and Line Chart for the received and sent data. 

[Screenshot: chart type selection for each data series]

I find it beautiful and useful. I hope you plan to use this visualization approach in your dashboards, and please let us know if any help is required.

One more thing…

Please stay tuned and check our posts next week… we can’t wait to share with you where we’re going next in the world of Sumo Logic Applications.


Sumo Logic Application for AWS CloudTrail

11.13.2013 | Posted by Bruno Kurtic, Founding Vice President of Product and Strategy

Cloud is opaque

One of the biggest adoption barriers for SaaS, PaaS, and IaaS is the opaqueness of, and lack of visibility into, the changes and activities that affect cloud infrastructure. When running on-premises infrastructure, you can audit activity: you can easily tell who is starting and stopping VMs in virtualization clusters, see who is creating and deleting users, and watch who is making firewall configuration changes. In the cloud, this lack of visibility has been one of the main roadblocks to adoption, even though the benefits have been compelling enough for many enterprises to adopt anyway.

This information is critical to securing infrastructure, applications, and data. It's critical to proving and maintaining compliance, critical to understanding utilization and cost, and, finally, critical to maintaining operational excellence.

Not all Clouds are opaque any longer

Today, the world's biggest cloud provider, Amazon Web Services (AWS), announced a new product that, in combination with Sumo Logic, changes the game for cloud infrastructure audit visibility. AWS CloudTrail is a raw log data feed that tells you exactly who is doing what, on which sets of infrastructure, at what time, from which IP addresses, and more. Sumo Logic integrates with AWS CloudTrail, collects this audit data in real time, and enables SOC- and NOC-style visibility and analytics.

Here are a few examples of what AWS CloudTrail data contains:

  • Network ACL changes.
  • Creation and deletion of network interfaces.
  • Authorized ingress/egress across network segments and ports.
  • Changes to privileges, passwords, and user profiles.
  • Deletion and creation of security groups.
  • Starting and terminating instances.
  • And much more.

Sumo Logic Application for AWS CloudTrail

Cloud data comes to life with the Sumo Logic Application for AWS CloudTrail, helping our customers with security and compliance, operational visibility, and cost containment. The Sumo Logic Application for AWS CloudTrail delivers:


  • Seamless integration with the AWS CloudTrail data feed.
  • SOC-style, real-time dashboards to monitor access and activity.
  • Forensic analysis to understand the “who, what, when, where, and how” of events and logs.
  • Alerts when important activities and events occur.
  • Correlation of AWS CloudTrail data with other security data sets, such as intrusion detection system data, operating system events, application data, and more.

This integration delivers an improved security posture and better compliance with the internal and external regulations that protect your brand. It also enables operational analytics that can improve SLAs and customer satisfaction. Finally, it provides deep visibility into the utilization of AWS resources, which can help improve efficiency and reduce cost.

The integration is simple: AWS CloudTrail deposits data in near real time into an S3 bucket in your account, and Sumo Logic collects it as soon as it is deposited, using an S3 Source. Sumo Logic also provides a set of pre-built dashboards and searches to analyze the CloudTrail data.
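To make the “who, what, when, where” concrete, here is a stripped-down sketch of the kind of fields a CloudTrail record carries. The field names are the real ones, but the values and the regex-based extraction are just for illustration; a real pipeline parses the full JSON:

object CloudTrailWho {
  // Trimmed-down CloudTrail record: field names are real, values are invented.
  val record =
    """{"eventTime":"2013-11-13T18:02:51Z","eventSource":"ec2.amazonaws.com",
      |"eventName":"TerminateInstances","awsRegion":"us-east-1",
      |"sourceIPAddress":"203.0.113.42",
      |"userIdentity":{"type":"IAMUser","userName":"alice"}}""".stripMargin

  // Pull a string field out of the raw JSON; good enough for a sketch, a real
  // pipeline would use a proper JSON parser.
  def field(json: String, name: String): Option[String] =
    ("\"" + name + "\"\\s*:\\s*\"([^\"]+)\"").r.findFirstMatchIn(json).map(_.group(1))

  def main(args: Array[String]): Unit = {
    val who   = field(record, "userName").getOrElse("?")
    val what  = field(record, "eventName").getOrElse("?")
    val when  = field(record, "eventTime").getOrElse("?")
    val where = field(record, "sourceIPAddress").getOrElse("?")
    println(s"$who did $what at $when from $where")  // alice did TerminateInstances ...
  }
}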

To learn more, see the details at http://www.sumologic.com/applications/aws-cloudtrail/ and read the documentation at https://support.sumologic.com/entries/30216746-Sumo-Logic-for-Amazon-CloudTrail-App.

Pragmatic AWS: 3 Tips to enhance the AWS SDK with Scala

07.12.2012 | Posted by Stefan Zier, Chief Architect

At Sumo Logic, most backend code is written in Scala. Scala is a newer JVM (Java Virtual Machine) language created in 2001 by Martin Odersky, who also co-founded our Greylock sister company, TypeSafe. Over the past two years at Sumo Logic, we’ve found Scala to be a great way to use the AWS SDK for Java. In this post, I’ll explain some use cases. 

1. Tags as fields on AWS model objects

Accessing AWS resource tags can be tedious in Java. For example, to get the value of the “Cluster” tag on a given instance, something like this is usually needed: 

   String deployment = null;
   for (Tag tag : instance.getTags()) {
     if (tag.getKey().equals("Cluster")) {
       deployment = tag.getValue();
     }
   }

While this isn’t horrible, it certainly doesn’t make code easy to read. Of course, one could turn this into a utility method to improve readability. The set of tags used by an application is usually known and small in number. For this reason, we found it useful to expose tags with an implicit wrapper around the EC2 SDK’s Instance, Volume, etc. classes. With a little Scala magic, the above code can now be written as:

val deployment = instance.cluster

Here is what it takes to make this magic work:

import com.amazonaws.services.ec2.model.Instance
import scala.collection.JavaConversions._

object RichAmazonEC2 {
  implicit def wrapInstance(i: Instance) = new RichEC2Instance(i)
}

class RichEC2Instance(instance: Instance) {
  // Look up a tag by key; returns null if the tag isn't set.
  private def getTagValue(tag: String): String =
    instance.getTags.find(_.getKey == tag).map(_.getValue).getOrElse(null)

  def cluster = getTagValue("Cluster")
}

Whenever this functionality is desired, one just has to import RichAmazonEC2._

2. Work with lists of resources

Scala 2.8.0 introduced a powerful new collections library, which is very useful when manipulating lists of AWS resources. Since the AWS SDK uses Java collections, one needs to import scala.collection.JavaConversions._, which transparently “converts” (implicitly wraps) the Java collections. Here are a few examples to show why this is powerful:

Printing a sorted list of instances, by name:
ec2.describeInstances().            // Get the list of instances.
  getReservations.
  map(_.getInstances).
  flatten.                          // Translate reservations to instances.
  sortBy(_.sortName).               // Sort the list (sortName/name come from a rich wrapper, as in section 1).
  map(i => "%-25s (%s)".format(i.name, i.getInstanceId)). // Build a display string.
  foreach(println(_))               // Print the string.

Grouping a list of instances in a deployment by cluster (returns a Map from cluster name to list of instances in the cluster):
ec2.describeInstances().            // Get the list of instances.
  getReservations.
  map(_.getInstances).
  flatten.                          // Translate reservations to instances.
  filter(_.deployment == "prod").   // Keep only the prod deployment (deployment is a rich-wrapper tag accessor).
  groupBy(_.cluster)                // Group by cluster.

You get the idea – this makes it trivial to build very rich interactions with EC2 resources.

3. Add pagination logic to the AWS SDK

When we first started using AWS, we had a utility class to provide some commonly repeated functionality, such as pagination for S3 buckets and retry logic for calls. Instead of embedding functionality in a separate utility class, implicits allow you to pretend that the functionality you want exists in the AWS SDK. Here is an example that extends the AmazonS3 class to allow listing all objects in a bucket: 

import com.amazonaws.services.s3.AmazonS3
import com.amazonaws.services.s3.model.{ListObjectsRequest, ObjectListing, S3ObjectSummary}
import scala.collection.JavaConversions._

object RichAmazonS3 {
  implicit def wrapAmazonS3(s3: AmazonS3) = new RichAmazonS3(s3)
}

class RichAmazonS3(s3: AmazonS3) {
  def listAllObjects(bucket: String, cadence: Int = 100): Seq[S3ObjectSummary] = {

    var result = List[S3ObjectSummary]()

    def addObjects(objects: ObjectListing) = result ++= objects.getObjectSummaries

    var objects = s3.listObjects(new ListObjectsRequest().withMaxKeys(cadence).withBucketName(bucket))
    addObjects(objects)

    // Keep fetching pages until the listing is no longer truncated.
    while (objects.isTruncated) {
      objects = s3.listNextBatchOfObjects(objects)
      addObjects(objects)
    }

    result
  }
}

To use this:

import RichAmazonS3._   // bring the implicit wrapper into scope

val objects = s3.listAllObjects("mybucket")

There is, of course, a risk of running out of memory given a large enough number of object summaries, but in many use cases this is not a big concern.

Summary

Scala enables programmers to implement expressive, rich interactions with AWS and greatly improves readability and developer productivity when using the AWS SDK. It's been an essential tool to help us succeed with AWS.

Pragmatic AWS: Principle of Least Privilege with IAM

06.12.2012 | Posted by Stefan Zier, Chief Architect

[Image: Lock and Chain, by Martin Magdalene]

One of the basic principles in information security is the Principle of Least Privilege. The idea is simple: give every user/process/system the minimal amount of access required to perform its tasks. In this post, I’ll describe how this principle can be applied to applications running in a cluster of EC2 instances that need access to AWS resources. 

What are we protecting?

The AWS Access Key ID and Secret are innocent-looking strings. I've seen people casually toss them around in scripts and bake them into AMIs. When compromised, however, they give an attacker full control over all of your resources in AWS. This goes beyond root access on a single box: it's “god mode” for your entire AWS world! Needless to say, it is critical to limit both the likelihood of a successful attack and the exposure in case one part of your application is compromised.

Why do we need to expose AWS credentials at all?

Since our applications run on EC2 instances and access other AWS services such as S3, SQS, and SimpleDB, they need AWS credentials to perform their functions.

Limiting the likelihood of an attack: Protecting AWS credentials

In an ideal world, we could pass the AWS credentials into applications without ever writing them to disk, keeping them encrypted in application memory. Unfortunately, this would make for a rather fragile system: after a restart, we'd need to pass the credentials into the application again. To enable automated restarts, recovery, etc., most applications store the credentials in a configuration file.

There are many other methods for doing this. Shlomo Swidler compared tradeoffs between different methods for keeping your credentials secure in EC2 instances.

At Sumo Logic, we've picked what Shlomo calls the SSH/On Disk method. The concern about credentials being left behind during AMI creation doesn't apply to us: our AMI creation is fully automated, and AWS credentials never touch those instances. The AWS credentials only come into play after we boot from the AMI. Each application in our stack runs as a separate OS user, and the configuration file holding the AWS credentials for the application can only be read by that user. We also use file system encryption wherever AWS credentials are stored.

To add a twist, we obfuscate the AWS credentials on disk by encrypting them with a hard-coded, symmetric key. This obfuscation, an additional defense-in-depth measure, makes it a little more difficult to get the plaintext credentials in case of instance compromise. It also makes shoulder surfing much more challenging.
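For illustration only, here is a minimal sketch of that style of obfuscation using the JDK's built-in AES support. The hard-coded key and the exact scheme are made up for the example; this is defense in depth, not a substitute for real key management:

import java.util.Base64
import javax.crypto.Cipher
import javax.crypto.spec.SecretKeySpec

object CredentialObfuscation {
  // Hard-coded 128-bit key: this only raises the bar (casual copying, shoulder surfing);
  // it does not protect against an attacker who also obtains this code.
  private val key = new SecretKeySpec("0123456789abcdef".getBytes("UTF-8"), "AES")

  // Cipher.getInstance("AES") defaults to ECB with PKCS5 padding; fine for obfuscation,
  // not something to rely on for real secrecy.
  def obfuscate(plainText: String): String = {
    val cipher = Cipher.getInstance("AES")
    cipher.init(Cipher.ENCRYPT_MODE, key)
    Base64.getEncoder.encodeToString(cipher.doFinal(plainText.getBytes("UTF-8")))
  }

  def deobfuscate(obfuscated: String): String = {
    val cipher = Cipher.getInstance("AES")
    cipher.init(Cipher.DECRYPT_MODE, key)
    new String(cipher.doFinal(Base64.getDecoder.decode(obfuscated)), "UTF-8")
  }
}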

Limiting exposure in case of a successful attack: Restricted access AWS credentials

Chances are that most applications only need a very small subset of the AWS portfolio of services, and only a small subset of resources within them. For example, an application using S3 to store data will likely only need access to a few buckets, and only perform a limited set of operations on them.

AWS’s IAM service allows us to set up users with limited permissions, using groups and policies. Using IAM, we can create a separate user for every application in our stack, limiting the policy to the bare minimum of resources/actions required by the application. Fortunately, the actions available in policies directly correspond to AWS API calls, so one can simply analyze which calls an application makes to the AWS API and derive the policy from this list.

For every application-specific user, we create a separate set of AWS credentials and store them in the application’s configuration file.
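As an illustration, here is a rough sketch of what creating such an application-specific user could look like with the AWS SDK for Java from Scala. The user name, bucket, and policy are hypothetical, and our actual tooling is more involved:

import com.amazonaws.auth.BasicAWSCredentials
import com.amazonaws.services.identitymanagement.AmazonIdentityManagementClient
import com.amazonaws.services.identitymanagement.model.{CreateAccessKeyRequest, CreateUserRequest, PutUserPolicyRequest}

object LeastPrivilegeIam {
  // Credentials used only by the deployment tooling, never shipped to application instances.
  val iam = new AmazonIdentityManagementClient(new BasicAWSCredentials("TOOLING_KEY_ID", "TOOLING_SECRET"))

  // Hypothetical policy: the application may only touch a single S3 bucket.
  val policyDocument =
    """{
      |  "Statement": [{
      |    "Effect": "Allow",
      |    "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
      |    "Resource": ["arn:aws:s3:::prod-frontend-data", "arn:aws:s3:::prod-frontend-data/*"]
      |  }]
      |}""".stripMargin

  // Create the user, attach its minimal policy, and mint the access key that goes
  // into the application's configuration file.
  def createAppUser(userName: String): (String, String) = {
    iam.createUser(new CreateUserRequest(userName))
    iam.putUserPolicy(new PutUserPolicyRequest()
      .withUserName(userName)
      .withPolicyName(userName + "-policy")
      .withPolicyDocument(policyDocument))
    val key = iam.createAccessKey(new CreateAccessKeyRequest().withUserName(userName)).getAccessKey
    (key.getAccessKeyId, key.getSecretAccessKey)
  }
}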

In Practice – Automate, automate, automate!

If your stack consists of more than one or two applications or instances, the most practical option for configuring IAM users is automation. At Sumo Logic, our deployment tools create a unique set of IAM users: one set of users per deployment, and one user per application within the deployment. Each user is assigned a policy that restricts access to only those of the deployment's resources that the application requires.

If the policies change, the tools update them automatically. The tools also configure per-application OS-level users and restrict file permissions for the configuration files that contain the AWS credentials for the IAM user. The configuration files themselves store the AWS credentials as obfuscated strings.

One wrinkle in this scheme is that the AWS credentials created for the IAM users need to be stored somewhere: once created, they can never be retrieved from AWS again. Since many of our instances are short-lived, we needed to make sure we could use the credentials again later. To solve this particular issue, we encrypt the credentials, then store them in SimpleDB. The key used for this encryption does not live in AWS and is well-protected on hardware tokens.

Summary

It is critical to treat your AWS credentials as secrets and assign point-of-use specific credentials with minimal privileges. IAM and automation are essential enablers to make this practical.  

Update (6/12/2012): AWS released a feature named IAM Roles for EC2 Instances today. It makes a temporary set of AWS credentials available via instance metadata. The credentials are rotated multiple times a day. IAM Roles add a lot of convenience, especially in conjunction with the AWS SDK for Java.

Unfortunately, this approach has an Achilles heel: any user with access to the instance can now execute a simple HTTP request and get a valid set of AWS credentials. To mitigate some of the risk, a local firewall, such as iptables, can be used to restrict HTTP access to a subset of users on the machine.
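For context, here is a minimal sketch of what that “simple HTTP request” looks like from Scala; the metadata paths are the standard EC2 ones, and no SDK is required:

import scala.io.Source

object InstanceMetadataCredentials {
  private val base = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"

  def main(args: Array[String]): Unit = {
    // The first request lists the role attached to the instance...
    val role = Source.fromURL(base).mkString.trim
    // ...and the second returns JSON with AccessKeyId, SecretAccessKey, Token and Expiration.
    println(Source.fromURL(base + role).mkString)
  }
}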

Comparing the two approaches 

+ User privileges and obfuscation offer a stronger defense in scenarios where a single (non-root) user is compromised.
+ Per-application (not per-instance) AWS credentials are easier to reason about.
- The rotation of IAM keys performed transparently by IAM roles adds security. An attacker has to maintain access to a compromised machine to maintain access to valid credentials.

Best of Both Worlds

AWS’s approach could be improved upon with a small tweak: Authenticate access to the temporary/rotating credentials T in instance metadata using another pair of credentials A. A itself would not have any privileges other than accessing T from within an instance. This approach would be a “best of both worlds”. Access to A could be restricted using the methods described above, but keys would still be rotated on an ongoing basis.   

Pragmatic AWS: Data Destroying Drones

06.05.2012 | Posted by Stefan Zier, Chief Architect

As we evolve our service, we occasionally delete EBS (Elastic Block Store) volumes. This releases the disk space back to AWS to be assigned to another customer. As a security precaution, we have decided to perform a secure wipe of the EBS volumes. In this post, I’ll explain how we implemented the wipe.

Caveats

Wiping EBS volumes may be slightly paranoid and not strictly needed, since AWS guarantees never to return a previous user's data via the hypervisor (as mentioned in their security white paper). We also understand that the secure wipe is not perfect: EBS may move our data around in the background and leave behind blocks that we didn't wipe. Still, we felt this additional precaution was worth the bit of extra work and cost; better safe than sorry.

Drones

We wanted to make sure secure wiping did not have any performance impact on our production deployment. Therefore, we decided to perform the secure wipe from a separate set of AWS instances: Data Destroying Drones. We also wanted them to be fire-and-forget, so we wouldn't have to manually check up on them.

To accomplish all this, we built a tool that:

  1. Finds to-be-deleted EBS volumes matching a set of tag values (we tag volumes to mark them for wiping); see the sketch after this list.
  2. Launches one t1.micro instance per EBS volume that needs wiping (using an Ubuntu AMI).
  3. Passes a cloud-init script with the volume ID and (IAM-limited) AWS credentials into the instance.
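Here is a rough sketch of what steps 1 and 2 could look like with the AWS SDK for Java from Scala; the SecureWipe tag name and the user-data templating are placeholders for illustration:

import java.util.Base64
import com.amazonaws.services.ec2.AmazonEC2Client
import com.amazonaws.services.ec2.model.{DescribeVolumesRequest, Filter, RunInstancesRequest}
import scala.collection.JavaConversions._

object DataDestroyingDrones {
  def launchDrones(ec2: AmazonEC2Client, ubuntuAmi: String): Unit = {
    // Find volumes tagged for wiping (tag name/value are placeholders).
    val volumes = ec2.describeVolumes(new DescribeVolumesRequest()
      .withFilters(new Filter("tag:SecureWipe").withValues("pending")))
      .getVolumes

    // One t1.micro drone per volume; "terminate on halt" makes it fire-and-forget.
    for (volume <- volumes) {
      val userData = cloudInitScript(volume.getVolumeId)
      ec2.runInstances(new RunInstancesRequest()
        .withImageId(ubuntuAmi)
        .withInstanceType("t1.micro")
        .withMinCount(1).withMaxCount(1)
        .withInstanceInitiatedShutdownBehavior("terminate")
        .withUserData(Base64.getEncoder.encodeToString(userData.getBytes("UTF-8"))))
    }
  }

  // Placeholder: template the volume ID (and narrowly scoped credentials) into the
  // cloud-init script shown in "The Gory Details" below.
  def cloudInitScript(volumeId: String): String =
    s"#!/bin/bash\n# ... wipe $volumeId ...\n"
}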

The Gory Details

Ubuntu has a mechanism named cloud-init. It accepts a shell script via EC2’s user data, which is passed in as part of the RunInstances API call to EC2. Here is the script we use for the Data Destroying Drones:

#!/bin/bash
set -e
export INSTANCE_ID=`wget -q -O - http://169.254.169.254/latest/meta-data/instance-id`
export VOLUME_ID=vol-12345678            # substituted per volume by our tooling
export EC2_URL=https://ec2.us-east-1.amazonaws.com
export EC2_ACCESS_KEY=[key id]
export EC2_SECRET_KEY=[key]
 
sudo apt-get install -y scrub
euca-attach-volume -i $INSTANCE_ID -d /dev/sdj $VOLUME_ID
sudo scrub -b 50M -p dod /dev/sdj > ~/sdj.scrub.log 2>&1
sleep 30
 
euca-detach-volume $VOLUME_ID
euca-delete-volume $VOLUME_ID
halt

This script automates the entire process:

  1. Attach the volume.
  2. Perform a DoD 5220.22-M secure wipe of the volume using scrub.
  3. Detach and delete the volume.
  4. Halt the instance.

The instances are configured to terminate on halt, so all involved resources disappear once the secure wipe completes. The scrub can take hours or even days, depending on the size of the EBS volume, but the low cost of t1.micro instances makes this viable: even if the process takes 48 hours, it costs less than $1 to wipe a volume.

Summary

Aside from being a fun project, the Data Destroying Drones have given us additional peace of mind and confidence that we've followed best practice and made a best effort to secure our customers' data by not leaving any of it behind in the cloud.

Pragmatic AWS: 4 Ideas for using EC2 Tags

05.15.2012 | Posted by Stefan Zier, Chief Architect

At Sumo Logic, we use Amazon Web Services (AWS) for everything. Our product, as well as all of our internal infrastructure, lives in AWS. In this series of posts, we'll share some useful practices around using AWS. In the first installment, I'll outline some useful things we do with tags.

1. Organize resources

We’ve decided on a hierarchical way of managing our EC2 (Elastic Compute Cloud) resources:

Deployment
 + Cluster
   + Instance/Node

Within an AWS account, we can have multiple “deployments”. A deployment is a complete, independent copy of our product and uses the same architecture as our production service. Besides production, we use several smaller-scale deployments for development, testing and staging. Each deployment consists of a number of clusters, and each cluster of one or more instances.

Instances and their corresponding EBS (Elastic Block Store) volumes are tagged with Deployment, Cluster, and NodeNumber tags. As an example, the third frontend node of our production deployment would be tagged like so:

Deployment=prod
Cluster=frontend
NodeNumber=3

There is also a direct mapping to DNS names. The DNS name for this node would be prod-frontend-3.

Combined with the filtering features in AWS Console (you can make any tag a column in the resource listings), this makes it very easy to navigate to a particular set of resources.
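As a rough sketch, here is how those tags could be applied with the AWS SDK for Java from Scala; the resource ID would come from whatever launches the node:

import com.amazonaws.services.ec2.AmazonEC2Client
import com.amazonaws.services.ec2.model.{CreateTagsRequest, Tag}

object NodeTagger {
  // Tag an instance (or an EBS volume) as prod-frontend-3.
  def tagNode(ec2: AmazonEC2Client, resourceId: String): Unit =
    ec2.createTags(new CreateTagsRequest()
      .withResources(resourceId)
      .withTags(
        new Tag("Deployment", "prod"),
        new Tag("Cluster", "frontend"),
        new Tag("NodeNumber", "3")))
}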

2. Display Instance Status

Tags can also be used as an easy way to display status information in the AWS console. Simply update a tag with the current status, whenever it changes.

The code that deploys our instances into EC2 updates a DeployStatus tag whenever it progresses from one step to another. For example, it could read:

2012-05-10 17:53 Installing Cassandra

This allows you to see what’s going on with instances at a glance.

3. Remember EBS Volume Devices

For EC2 instances that have multiple EBS volumes, our tools need to know which volume maps to which device on the instance when the volumes are attached.

When we first create a volume for, say, /dev/sdj, we add a DeviceName tag to the volume with a value of /dev/sdj to track where it needs to be attached. The next time we attach the volume, we know its “proper place”.
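A rough sketch of consuming that tag when reattaching, again with the AWS SDK for Java from Scala (error handling kept minimal):

import com.amazonaws.services.ec2.AmazonEC2Client
import com.amazonaws.services.ec2.model.{AttachVolumeRequest, Volume}
import scala.collection.JavaConversions._

object VolumeAttacher {
  // Attach a volume to an instance at the device recorded in its DeviceName tag.
  def attachAtRememberedDevice(ec2: AmazonEC2Client, volume: Volume, instanceId: String): Unit = {
    val device = volume.getTags.find(_.getKey == "DeviceName").map(_.getValue)
      .getOrElse(sys.error("volume has no DeviceName tag"))
    ec2.attachVolume(new AttachVolumeRequest(volume.getVolumeId, instanceId, device))
  }
}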

4. Attribute and remind people of costs

All our developers are empowered to create their own AWS resources. This is a huge benefit for full-scale testing, performance evaluations, and many other use cases. Since AWS is not a charity, however, we need to manage costs tightly. In order to do this, we tag all AWS resources with an Owner tag (either by hand, or via our automated deployment tool).

To consume this tag, we have a cron job that runs daily and emails users who have active resources in AWS to remind them to shut down what they no longer require.

The subject line of the email reads “[AWS] Your current burn rate is $725.91/month!”. The body of the email contains a table with a more detailed cost breakdown. In addition, there is also a rollup email that goes out to the entire development team.

 

Summary

EC2 tags are extremely useful for tracking state, organizing resources, and storing relationships between resources such as instances and EBS volumes. There are myriad other ways to use them. I hope these tips have been helpful.
