
November 9, 2017 By Ankit Goel

Monitor DynamoDB with Sumo Logic

What is DynamoDB?

DynamoDB is a fast and flexible NoSQL database service provided by AWS. This cloud-based database supports both document and key-value store models. It was developed internally at AWS to address the need for an incrementally scalable, highly available key-value storage system.

Thanks to its auto scaling, high performance, and flexible data schema, it has been widely adopted across industries such as gaming, mobile, ad tech, and IoT.

Sumo Logic App for DynamoDB

Sumo Logic recently released an App for Amazon DynamoDB. It is a unified logs and metrics App that collects data from two sources:

  • CloudTrail API calls
  • DynamoDB CloudWatch Metrics

The App covers the following high-level use cases:

  • DynamoDB API calls (Create/Update/Delete Table), Geolocation, and User Info
  • How to plan DynamoDB capacity
    • Capacity: number of reads/writes per second per table
    • Read/Write Throttle Events
    • Successful and Throttle Requests by Table and Operation Name
  • Latency and Errors
    • User and System Error Count
    • Latency by Table Name and Operation Name
    • Conditional Check Failed Request by Table Name

DynamoDB Overview Dashboard

Key DynamoDB Performance Metrics

Metric: Percent of Provisioned Write Consumed
Description: The percentage of provisioned write capacity consumed by a table. It should stay below 100%; if it exceeds 100%, DynamoDB can throttle requests. It is calculated as (ConsumedWriteCapacityUnits / ProvisionedWriteCapacityUnits) x 100.

Metric: Percent of Provisioned Read Consumed
Description: The percentage of provisioned read capacity consumed by a table. It should stay below 100%; if it exceeds 100%, DynamoDB can throttle your requests. It is calculated as (ConsumedReadCapacityUnits / ProvisionedReadCapacityUnits) x 100.

Metric: Read Throttle Events by Table and GSI
Description: Requests to DynamoDB that exceed the provisioned read capacity units for a table or a global secondary index.

Metric: Write Throttle Events by Table and GSI
Description: Requests to DynamoDB that exceed the provisioned write capacity units for a table or a global secondary index.
Read and write throttle events should be zero at all times. If they are not, DynamoDB is throttling your requests and you should readjust your capacity. How much to provision for a table depends heavily on your workload. You could start by provisioning something like 80% of your peaks and then adjust the table capacity depending on how many throttles you receive. Monitoring throttle events therefore helps you plan capacity against your workload.
Before you decide how much to adjust capacity, consider the best practices in "Consider Workload Uniformity When Adjusting Provisioned Throughput".

Metric: Throttle Requests by Table and Operation Name
Description: Requests to DynamoDB that exceed the provisioned throughput limits on a resource.

Metric: Number of Successful Requests by Table and Operation Name
Description: The number of successful requests (SampleCount) during the specified time period.

Metric: User Error Count
Description: The number of requests to DynamoDB that generate an HTTP 400 status code during the specified time period. An HTTP 400 usually indicates a client-side error such as an invalid combination of parameters, an attempt to update a nonexistent table, or an incorrect request signature. To ensure your services are interacting smoothly with DynamoDB, this count should always be zero.

Metric: Latency by Table Name
Description: The time taken by DynamoDB, in milliseconds, to complete processing a request. Monitor this metric; if it crosses its normal threshold or keeps increasing, investigate, because it can affect the performance of your services.
Sumo Logic’s machine-learning-based outlier detector automatically detects outliers in latency, and you can also set up a Metrics Monitor for alerting.

Metric: System Error Count by Table and Operation Name
Description: The number of requests to DynamoDB that generate an HTTP 500 status code during the specified time period. An HTTP 500 usually indicates an internal service error. This count should always be zero if your services are working correctly; if it is not, start debugging the services that generate the 500 errors immediately.

Metric: Conditional Check Failed Request Count
Description: The number of failed attempts to perform conditional writes. The PutItem, UpdateItem, and DeleteItem operations let you provide a logical condition that must evaluate to true before the operation can proceed. If this condition evaluates to false, ConditionalCheckFailedRequests is incremented by one.
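
As a rough illustration of the percent-consumed calculation and the 80%-of-peak starting point described above, here is a minimal Python sketch. The function names and sample numbers are hypothetical. It also assumes the consumed-capacity value is a sum over one CloudWatch period, so it is divided by the period length to get a per-second rate that can be compared with provisioned units.

```python
import math


def percent_consumed(consumed_sum, period_seconds, provisioned_per_second):
    """Percent of provisioned capacity used during one period.

    consumed_sum is the total Consumed{Read,Write}CapacityUnits in the
    period (assumed to be a Sum statistic), normalized here to per-second.
    """
    if period_seconds <= 0 or provisioned_per_second <= 0:
        raise ValueError("period and provisioned capacity must be positive")
    consumed_per_second = consumed_sum / period_seconds
    return consumed_per_second * 100 / provisioned_per_second


def starting_capacity(peak_per_second, percent_of_peak=80):
    """Initial provisioning guess: a percentage of the observed peak, rounded up."""
    return math.ceil(peak_per_second * percent_of_peak / 100)


# Hypothetical numbers: 2,400 write units consumed in a 60-second period
# against 50 provisioned write units per second.
print(percent_consumed(2400, 60, 50))  # 80.0
print(starting_capacity(100))          # 80
```

A value that approaches 100 for sustained periods suggests the table is close to being throttled, so the starting guess should be revised upward as throttle events appear.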

Unify DynamoDB API calls and Metrics

You can use the Log Overlay feature to check whether the number of API calls made to a specific table correlates with increased latency on that table.
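
Outside the Sumo Logic UI, the same question (do API call counts and latency move together?) can be checked numerically. The sketch below computes a Pearson correlation between per-interval API call counts and latency values. The function name and the sample data are made up for illustration, and this is not a description of how the overlay feature itself works.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-minute samples for one table.
api_calls = [120, 150, 400, 380, 130]
latency_ms = [4.1, 4.3, 9.8, 9.1, 4.0]
print(round(pearson(api_calls, latency_ms), 3))
```

A coefficient near 1 would suggest that latency rises with call volume, though correlation alone does not establish the cause.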

Metrics Outlier and Alert

The metrics outlier feature automatically identifies data points that fall outside the normal range. You can also tune the available settings to filter out noise. For example, here Sumo Logic’s machine learning algorithm automatically detects outliers in the ReadThrottleEvents count. For more info, see here.
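
To give a feel for what outlier detection on a metric involves, here is a generic rolling-window sketch in Python. It is not Sumo Logic's algorithm; the function name, the window size, and the threshold are assumptions chosen for illustration. A point is flagged when it lies more than a given number of standard deviations from the mean of the preceding window.

```python
from statistics import mean, pstdev


def find_outliers(values, window=5, threshold=3.0):
    """Return indices of points far from the mean of the preceding window."""
    outliers = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # Flat history: any change at all is treated as an outlier.
            if values[i] != mu:
                outliers.append(i)
        elif abs(values[i] - mu) > threshold * sigma:
            outliers.append(i)
    return outliers


# Hypothetical ReadThrottleEvents counts per interval.
throttle_events = [0, 0, 0, 0, 0, 0, 12, 0, 0, 0]
print(find_outliers(throttle_events))  # [6]
```

Raising the threshold or widening the window makes the detector less sensitive, which is one way to filter out noise.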

You can also set up alerts that send email or webhook notifications when a metric crosses a certain threshold. For more info, see here.

Secure your DynamoDB API calls

You can correlate DynamoDB CloudTrail events with the Sumo Logic CrowdStrike Threat Intel feed to protect your infrastructure from malicious activity and users.
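
Conceptually, this correlation is a lookup of each event's source IP address against a list of known-bad indicators. The Python sketch below shows the idea on CloudTrail-style records. The function name, the sample events, and the indicator set are hypothetical (the IP addresses come from documentation-only ranges), and the actual threat intel lookup in Sumo Logic is done through its own query language rather than code like this.

```python
DYNAMODB_EVENT_SOURCE = "dynamodb.amazonaws.com"


def flag_suspicious_calls(events, malicious_ips):
    """Return DynamoDB API calls whose source IP is in the indicator set."""
    flagged = []
    for event in events:
        if event.get("eventSource") != DYNAMODB_EVENT_SOURCE:
            continue  # ignore calls to other AWS services
        source_ip = event.get("sourceIPAddress")
        if source_ip in malicious_ips:
            flagged.append({
                "eventName": event.get("eventName"),
                "sourceIPAddress": source_ip,
                "user": event.get("userIdentity", {}).get("arn"),
            })
    return flagged


# Hypothetical CloudTrail records and threat indicators.
events = [
    {"eventSource": "dynamodb.amazonaws.com", "eventName": "DeleteTable",
     "sourceIPAddress": "203.0.113.7",
     "userIdentity": {"arn": "arn:aws:iam::123456789012:user/example"}},
    {"eventSource": "dynamodb.amazonaws.com", "eventName": "CreateTable",
     "sourceIPAddress": "198.51.100.20", "userIdentity": {}},
    {"eventSource": "s3.amazonaws.com", "eventName": "PutObject",
     "sourceIPAddress": "203.0.113.7", "userIdentity": {}},
]
malicious_ips = {"203.0.113.7"}

for hit in flag_suspicious_calls(events, malicious_ips):
    print(hit["eventName"], hit["sourceIPAddress"], hit["user"])
```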

Conclusion

With the Sumo Logic App for DynamoDB:

  • You can monitor and alert on key DynamoDB metrics.
  • You can detect outliers in metrics.
  • You can overlay DynamoDB API calls with metrics to start debugging issues easily.
  • You can find malicious activity in your DynamoDB environment by correlating CloudTrail events with the Sumo Logic Threat Intel offering.

What’s next?

If you already have a Sumo Logic account, the DynamoDB App is available to use for free. If you are new to Sumo Logic, start by signing up for a free account here.

Questions

Thanks for reading! If you have any questions or comments, feel free to reach out via email (ankit@sumologic.com) or LinkedIn.


Ankit Goel

Ankit Goel is a Solutions Architect at Sumo Logic with 10+ years of experience in designing and architecting applications. He is passionate about Machine Learning and Big Data projects. Ankit graduated from Carnegie Mellon University with a master's degree in Information Systems.
