Posts by Michael Floyd
New DevOps Site Chronicles the Changing Face of DevOps
As organizations increasingly adopt DevOps as part of their digital initiatives, the very nature of DevOps is rapidly changing. As DevOps as a Service, security-first design patterns, containerization and microservices come into focus, it’s clear that DevOps is becoming the de facto means by which organizations build, run and secure their modern applications.
Designing a Data Analytics Strategy for Modern Apps
Yesterday at AWS re:Invent 2016, Sumo Logic Co-Founder and CTO Christian Beedgen presented his vision for machine data analytics in a world where modern apps are disrupting virtually every vertical market in business.

Every business is a software business, Marc Andreessen wrote more than five years ago. Today, driven by customer demand, the need to differentiate and the push for agility, digital transformation initiatives are disrupting every industry. “We are still at the very beginning of this wave of Digital Transformation,” Christian said. “By 2020, half of all businesses will have figured out digitally enhanced products and services.”

The result is that modern apps are being architected differently than they were just three years ago. Cloud applications are being built on microservices by DevOps teams that automate to deliver new functionality faster. “It used to be that you could take the architecture and put it on a piece of paper with a couple of boxes and a couple of arrows. Our application architecture was really clean.” But with this speed and agility comes complexity, and the need for visibility has become paramount. “Today our applications look like spaghetti. Building microservices, wiring them up, integrating them so they can work with something else, foundational services, SQL databases, NoSQL databases…” You need to be able to see what’s going on, because you can’t fix what you cannot see. Modern apps require Continuous Intelligence to provide insights, continuously and in real time, across the entire application lifecycle.

Designing Your Data Analytics Strategy

Ben Newton, Sumo Logic’s Principal Product Manager for the Metrics team, took the stage to look at the various types of data and what you can do with them. Designing a data analytics strategy begins by understanding the data types that machine data produces, then focusing on the activities that data supports.
The primary activities are Monitoring, where you detect and notify (or alert), and Troubleshooting, where you identify, diagnose, restore and resolve. “What we often find is that users can use that same data to do what we call App Intelligence – the same logs and metrics that allow you to figure out something is broken also tell you what your users are doing. If you know what users are doing, you can make life better for them, because that’s what really matters.”

So who really cares about this data? When it comes to monitoring, where the focus is on user-visible functionality, it’s your DevOps and traditional IT Ops teams. Engineering and development are also responsible for monitoring their code. In troubleshooting apps, where the focus is on end-to-end visibility, customer success and technical support teams also become stakeholders. For app intelligence, where the focus is on user activity and visibility, everyone is a stakeholder, including sales, marketing and product management. “Once you have all of this data, all of these people are going to come knocking on your door,” said Ben.

Once you understand the data types you have, where they sit within your stack and the use cases they support, you can begin to use data to solve real problems. In defining what to monitor and measure, Ben highlighted: Monitor what’s important to your business and your users. Measure and monitor user-visible metrics. Build fewer, higher-impact, real-time monitors. “Once you get to the troubleshooting side, it gets back to: you can’t fix what you can’t measure.” Ben also said: You can’t improve what you can’t measure. You need both activity metrics and detailed logs. Up-to-date data drives better data-driven decisions. You need data from all parts of your stack.

So what types of data will you be looking at? Ben broke it down into the following categories. Infrastructure: rollups vs. detailed – what resolution makes sense? Is real-time necessary? Platform: rollups vs. detailed, coverage of all components, detailed logs for investigations, and architecture captured in the metadata. Custom: How is your service measured? What frustrates users? How does the business measure itself? “Everything you have produces data. It’s important to ensure you have all of the components covered.”

Once you have all of your data, it’s important to think about the metadata. Systems are complex, and the way you make sense of them is through your metadata, which you use to describe or tag your data. “For the customer, this is the code you wrote yourself. You are the only people that can figure out how to monitor that. So one of the things you have to think about is the metadata.”

Cloud Cruiser – A Case Study

Cloud Cruiser’s Lead DevOps Engineer, Ben Abrams, took the stage to show how the company collects data and to offer some tips on tagging it with metadata. Cloud Cruiser is a SaaS app that enables you to easily collect, meter, and understand your cloud spend in AWS, Azure, and GCP. Cloud Cruiser’s customers are large enterprises and mid-market players, globally distributed across all verticals, and they manage hundreds of millions in cloud spend.

Cloud Cruiser had been using an ELK (Elasticsearch, Logstash, and Kibana) stack for their log management solution, but discovered that managing their own logging solution was costly and burdensome. Ben cited the following reasons for switching: the operational burden was a distraction from the core business; security would improve; and they needed the ability to scale cost-effectively. Cloud Cruiser runs on AWS (300-500 instances) and utilizes microservices written in Java using the Dropwizard framework. Their front-end web app runs on Tomcat and uses AngularJS. Figure 1 shows the breadth of the technology stack.

In evaluating a replacement solution, Ben said, “We were spending too much time on our ELK stack.” Sumo Logic’s Unified Logs and Metrics (ULM) was also a distinguishing factor.
The inclusion of metrics meant that they didn’t have to employ yet another tool that would likewise have to be managed. “Logs are what you look at when something goes wrong. But metrics are really cool.” Ben summarized the value and benefits they achieved this way. On the logs side: reduced operational burden, reduced cost, increased confidence in log integrity, fewer people needing VPN access, and alerting based on searches that did not need ops handholding. On the metrics side: increased visibility into system and application health, used in an ongoing effort with application and infrastructure changes that reduced their monthly AWS bill by over 100%.

Ben then moved into a hands-on session, showing how they automate the configuration and installation of Sumo Logic collectors, and how they tag their data using source categories. Cloud Cruiser currently collects data from the following sources: Chef (automation of config and collector install), application Graphite metrics from Dropwizard, and other Graphite metrics forwarded by Sensu to Sumo Logic. “When I search for something I want to know what environment is it, what type of log is it, and which server role did it come from.” One of their decisions was to differentiate log data from metrics data, as shown below. Using this schema allows them to search logs and metrics by environment, type of log data and corresponding Chef role.

Ben walked through the Chef cookbook they used for deploying with Chef and shared how they automate the configuration and installation of Sumo Logic collectors. For those interested, I’ll follow up on this in the DevOps Blog. A key point from Ben, though, was “Don’t log secrets.” The access ID and key should be defined elsewhere, out of scope and stored in an encrypted data bag. Ben also walked through the searches they used to construct the following dashboard.
Through this one dashboard, Cloud Cruiser can utilize both metrics and log data to get an overview of the health of their production deployment.

Key Takeaways

Designing your data analytics strategy is highly dependent on your architecture. It’s no longer just about troubleshooting issues in production environments; ultimately, it’s about understanding the experience you provide to your users. The variety of data streaming in real time from the application, operating environment and network layers produces an ever-increasing volume of data every day. Log analytics provides the forensic data you need, and time-series-based metrics give you insight into the real-time changes taking place under the hood. To understand both the health of your deployment and the behavior and experience of your customers, you need to gather machine data from all of its sources, then apply both logs and metrics to give teams from engineering to marketing the insights they need. Download the slides and view the entire presentation below:
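The source-category schema Ben describes, encoding environment, type of data, and Chef role, can be sketched in a few lines of Python. The names and the matching helper below are illustrative assumptions, not Cloud Cruiser’s actual values or Sumo Logic’s search implementation:

```python
def source_category(environment: str, data_type: str, role: str) -> str:
    """Build a _sourceCategory string that encodes environment,
    type of data (logs vs. metrics), and Chef role."""
    return f"{environment}/{data_type}/{role}"

# Tagging log and metric sources for the same web role:
log_category = source_category("prod", "logs", "web")       # "prod/logs/web"
metric_category = source_category("prod", "metrics", "web")  # "prod/metrics/web"

def matches(category: str, pattern: str) -> bool:
    """Crude prefix-wildcard match, in the spirit of a
    _sourceCategory=prod/logs/* search scope."""
    return category.startswith(pattern.rstrip("*"))
```

With a scheme like this, a single wildcard scope such as `prod/*` selects both logs and metrics for an environment, while `prod/logs/web` narrows to one role’s logs.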
CDN with AWS CloudFront - Tutorial
Consider a situation in which you’ve developed a groundbreaking website and would like to share your content with the world. The problem is that your hosting provider is based in New York, and you’re concerned that users from other regions, such as Europe or Australia, won’t be able to view your content quickly and reliably. Amazon CloudFront is a content delivery network (CDN) solution that distributes your content from multiple edge locations around the world, accelerating delivery for users everywhere.
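Programmatically, a CloudFront distribution is created from a DistributionConfig structure. The sketch below builds a minimal config for a single custom origin, as you might pass to boto3’s create_distribution call; the field names follow the CloudFront API, but treat the specific values (domain, IDs) as illustrative assumptions:

```python
def distribution_config(origin_domain: str, caller_reference: str) -> dict:
    """Minimal CloudFront DistributionConfig for one custom origin.
    CloudFront then serves cached copies from its worldwide edge locations."""
    origin_id = "primary-origin"
    return {
        "CallerReference": caller_reference,  # idempotency token for the request
        "Comment": "CDN for " + origin_domain,
        "Enabled": True,
        "Origins": {
            "Quantity": 1,
            "Items": [{
                "Id": origin_id,
                "DomainName": origin_domain,
                "CustomOriginConfig": {
                    "HTTPPort": 80,
                    "HTTPSPort": 443,
                    "OriginProtocolPolicy": "https-only",
                },
            }],
        },
        "DefaultCacheBehavior": {
            "TargetOriginId": origin_id,
            "ViewerProtocolPolicy": "redirect-to-https",
            # Forward no query strings or cookies so objects cache well at the edge.
            "ForwardedValues": {"QueryString": False,
                                "Cookies": {"Forward": "none"}},
            "TrustedSigners": {"Enabled": False, "Quantity": 0},
            "MinTTL": 0,
        },
    }

# cfg = distribution_config("origin.example.com", "my-site-2016-11")
# boto3.client("cloudfront").create_distribution(DistributionConfig=cfg)
```

The CallerReference acts as an idempotency token: repeating the request with the same reference and config refers to the same distribution rather than creating a duplicate.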
Sumo Logic Brings Machine Data Analytics to AWS Marketplace
The founders of Sumo Logic recognized early on that in order to remain competitive in an increasingly disruptive world, companies would be moving to the cloud to build what are now being called modern apps. Hence, Sumo Logic purposefully architected its machine data analytics platform from the ground up on Amazon Web Services. Along the way, Sumo Logic has acquired vast knowledge and expertise not only in log management overlaid with metrics, but in the inner workings of the services offered by Amazon.

Today, more than six years later, we are pleased to announce that Sumo Logic is one of a handful of initial AWS partners participating in the launch of SaaS Subscription products on the Amazon Web Services (AWS) Marketplace, and the immediate availability of the Sumo Logic platform in AWS Marketplace. Now, customers already using AWS can leverage Sumo Logic’s expertise in machine data analytics to visualize and monitor workloads in real time, identify issues and expedite root cause analysis to improve operational and security insights across AWS infrastructure and services.

How it Works

AWS Marketplace is an online store that allows AWS customers to procure software and services directly in the marketplace and immediately start using them. Billing runs through your AWS account, allowing your organization to consolidate billing for all AWS services, SaaS subscriptions and software purchased through the Marketplace. To get started with Sumo Logic in the AWS Marketplace, go to the Sumo Logic page. You should see a screen similar to the following.

Pricing

As you can see, pricing is clearly marked next to the product description. Pricing is based on several factors, starting with which edition of Sumo Logic you’re using – Professional or Enterprise. Professional edition supports up to 20 users and 30 days of data retention, among other features.
Enterprise edition includes support for 20+ users and multi-year data retention as part of its services. See the Sumo Logic Pricing page for more information.

Reserved Log Ingest

Once you’ve decided on an edition, you’re ready to select the plan that’s best for you based on your anticipated ingest volume. Reserved Log Ingest Volume is the amount of logs you have contracted to send each day to the Sumo Logic service. The Reserved price is how much you pay per GB of logs ingested each day. During signup, you can select a Reserved capacity in GB per day (see below). There are no minimum days, and you can cancel at any time.

On-Demand Log Ingest

Bursting is allowed, and at the end of the billing cycle you will pay the On-Demand rate for any usage beyond the total Reserved capacity for the period. Your first 30 days of service usage are free.

Signing up

When you click Continue, you’ll be taken to a Sumo Logic signup form similar to Figure 2. Enter your email address, then click Plan to select your Reserved Log Ingest volume. At this point you will select your Reserved capacity. Plans are available in increments of 1, 3, 5, 10, 20, 30, 40 and 50 GB per day. Once you’ve selected your plan, click the signup button to be taken through the signup process. Recall that billing is managed through AWS, so no credit card is required.

What You Get

If you’re not already familiar with Sumo Logic, the platform unifies logs, metrics and events, transforming a variety of data types into real-time continuous intelligence across the entire application lifecycle, enabling organizations to build, run and secure their modern applications. Highlights of Sumo Logic include: Unified Logs and Metrics. Machine learning features like LogReduce and LogCompare to quickly identify root cause. Elasticity and bursting support without over-provisioning. Data encryption at rest, PCI DSS 3.1 compliance with log immutability, and HIPAA compliance at no additional cost.
Zero log-infrastructure management overhead. Go to sumologic.com for more information.

Sumo Logic Apps for AWS

As mentioned, Sumo Logic has tremendous expertise in AWS, and experience building and operating massively multi-tenant, highly distributed cloud systems. Sumo Logic passes that expertise along to its customers in the form of Apps for AWS services. Sumo Logic Apps for AWS contain preconfigured searches and dashboards for the most common use cases, and are designed to accelerate your time to value with Sumo Logic. Using these dashboards and searches you can quickly get an overview of your entire AWS application at the app, system and network levels. You can quickly identify operational issues, drill down using search, and apply tools like LogReduce and LogCompare to get at the root cause of the problem. You also gain operational, security and business insight into services that support your app, like S3, CloudTrail, VPC Flow and Lambda.

Apps that are generally available include: Amazon S3 Audit App, Amazon VPC Flow Logs App, Amazon CloudFront App, AWS CloudTrail App, AWS Config App, AWS Elastic Load Balancing App and AWS Lambda App. In addition, the following apps are in preview for Sumo Logic customers: Amazon CloudWatch – ELB Metrics, and Amazon RDS Metrics.

Getting Started and Next Steps

Sumo Logic is committed to educating its customers using the deep knowledge and expertise it has gained in working with AWS. If you’re new to Amazon Web Services, we’ve created AWS Hub, a portal dedicated to learning AWS fundamentals. The portal includes 101s to get you started with EC2, S3, ELB, VPC Flow and AWS Lambda. In addition, you’ll find deep-dive articles and blog posts walking you through many of the AWS service offerings. Finally, if you’re planning to attend AWS re:Invent at the end of November, stop by and get your questions answered, or take a quick tour of Sumo Logic and all that machine learning and data analytics have to offer.
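To make the Reserved-plus-On-Demand pricing model concrete, here is a back-of-envelope billing calculation. The rates used are placeholders for illustration, not Sumo Logic’s actual prices:

```python
def monthly_ingest_cost(daily_gb: list, reserved_gb_per_day: float,
                        reserved_rate: float, on_demand_rate: float) -> float:
    """Reserved capacity is billed for every day of the cycle; any usage
    beyond the total Reserved capacity for the period is billed at the
    On-Demand rate, which is how bursting is supported."""
    days = len(daily_gb)
    reserved_total = reserved_gb_per_day * days
    used_total = sum(daily_gb)
    overage = max(0.0, used_total - reserved_total)
    return reserved_total * reserved_rate + overage * on_demand_rate

# A 30-day cycle on a 5 GB/day reserved plan with two bursty days:
usage = [5.0] * 28 + [20.0, 25.0]
cost = monthly_ingest_cost(usage, 5.0, 3.00, 4.50)
# 150 GB reserved at $3.00 plus 35 GB of overage at $4.50 = $607.50
```

The point of the model is that a burst above your reserved line doesn’t require re-provisioning; it simply shows up as overage at the On-Demand rate at the end of the cycle.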
Troubleshooting Apps and Infrastructure Using Puppet Logs
If you’re working with Puppet, there’s a good chance you’ve run into problems when configuring a diverse infrastructure. It could be a problem with authorization and authentication, or perhaps with MCollective, Orchestration or Live Management. Puppet logs can provide a great deal of insight into the status of your apps and infrastructure across the data center. Knowing where to look is just the first step; knowing what to look for is another matter. Here’s a cheat sheet that will help you identify the logs that are most useful and show you what to look for. I’ll also explain how to connect Puppet to a log aggregation and analytics tool like Sumo Logic.

Where are Puppet Logs Stored?

The Puppet Enterprise platform produces a variety of log files across its software architecture. The Puppet documentation describes the file paths for the following types of log files: Master logs, the master application logs containing information such as fatal errors, reports, warnings and compilation errors. Agent logs, with information on client configuration retrieved from the Master. ActiveMQ logs, with information on ActiveMQ actions on specific nodes. MCollective service logs, with information on MCollective actions on specific nodes. Console logs, covering console errors, fatal errors and crash reports. Installer logs, containing information about Puppet installations, such as errors that occurred during installation and the last installation run. Database logs, with information on database modifications, errors, and so on. Orchestration logs, with information on orchestration changes.

The root of Puppet log storage differs depending on whether Puppet is running on a Unix-like system or in a Windows environment. For *nix-based installs, the root folder for Puppet is /etc/puppetlabs/puppet. For Windows-based installs, the root folder is C:\ProgramData\PuppetLabs\puppet\etc for all versions of Windows Server from 2008 onwards.
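Once you know where the logs live, the next step is pulling signal out of them. Below is a minimal parsing sketch in Python; the timestamp/level layout assumed here matches logback-style Puppet Server lines, but formats vary across the log files above, so treat the pattern as a starting point rather than a universal parser:

```python
import re

# Logback-style line, e.g.:
#   2016-11-29 15:00:26,815 ERROR [puppet-server] Could not compile catalog
LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>TRACE|DEBUG|INFO|WARN|ERROR)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"(?P<message>.*)$")

def parse(line: str):
    """Return a dict of timestamp/level/logger/message, or None if the
    line does not match the assumed layout."""
    m = LINE.match(line)
    return m.groupdict() if m else None

def problems(lines):
    """Keep only WARN/ERROR entries -- usually the first thing to scan for."""
    for line in lines:
        entry = parse(line)
        if entry and entry["level"] in ("WARN", "ERROR"):
            yield entry
```

A filter like `problems()` is the grep-level equivalent of what a log analytics tool does at scale: isolate the warning and error entries before digging into the surrounding context.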
Modifying Puppet Log Configuration

The main setting that needs to be configured correctly to get the best from Puppet logs is the log_level attribute within the main Puppet configuration file. The log_level parameter can have the following values, with "notice" being the default: debug, info, notice, warning, err, alert, emerg and crit.

The Puppet Server can also be configured to process logs. This is done using the Logback library for the Java Virtual Machine. An XML file is created, usually named logback.xml, which is loaded by the Puppet Server at run time. If a different filename is used, it will need to be specified in the global.conf file. The XML file allows you to override the default root logging level of "info". Possible levels include trace, debug, info, warn and error. For example, if you wanted to produce full debug data for Puppet, you would set the following in the XML file: <root level="debug">

The Most Useful Puppet Logs

Puppet produces a number of very useful log files, from basic platform logs to full application orchestration reports. The most commonly used Puppet logs include:

Platform Master Logs

These give generalized feedback on issues such as compilation errors, deprecation warnings, and crashes or fatal terminations. They can be found at the following locations: /var/log/puppetlabs/puppetserver/puppetserver.log and /var/log/puppetlabs/puppetserver/puppetserver-daemon.log

Application Orchestration Logs

Application orchestration is probably the single most attractive aspect of the Puppet platform. It enables the complete end-to-end integration of the DevOps cycle into a production software application. As a result, these logs are likely to be the most critical of all. They include: /var/log/pe-mcollective/mcollective.log – contains all of the log entries that affect the MCollective platform itself. This is a good first place to check if something has gone wrong with application orchestration.
/var/lib/peadmin/.mcollective.d/client.log – a log file found on the client connecting to the MCollective server, the twin of the log file above, and the second place to begin troubleshooting. /var/log/pe-activemq/activemq.log – a log file that contains entries for ActiveMQ. /var/log/pe-mcollective/mcollective-audit.log – a top-level view of all MCollective requests. This is a good place to look if you are unsure of exactly where the problem occurred, so that you can pinpoint the specific audit event that triggered it.

Puppet Console Logs

Also valuable are the Puppet console logs, which include the following: /var/log/pe-console-services/console-services.log – the main console log, containing entries for top-level events and requests from all services that access the console. /var/log/pe-console-services/pe-console-services-daemon.log – low-level console event logging that occurs before the standard logback system is loaded. This is a useful log to check if the problem involves the higher-level logback system itself. /var/log/pe-httpd/puppet-dashboard/access.log – a log of all HTTPS access requests made to the Puppet console.

Advanced Logging Using Further Tools

The inbuilt logging functions of Puppet mostly revolve around solving issues with the Puppet platform itself. However, Puppet offers some additional technologies to help visualize status data. One of these is the Puppet Services Status Check, both a top-level dashboard and a queryable API that provides real-time status information on the entire Puppet platform. Puppet can also be configured to support Graphite. Once this has been done, a mass of useful metrics can be analyzed using either the demo Graphite dashboard provided or a custom dashboard.
The ready-made Grafana dashboard makes a good starting point for measuring application performance that will affect end users, as it includes the following metrics by default: Active requests – a graphical measure of the current application load. Request durations – a graph of average latency/response times for application requests. Function calls – a graphical representation of the different functions called from the application catalog, potentially very useful for tweaking application performance. Function execution time – graphical data showing how fast specific application processes are executed.

Using Puppet with Sumo Logic

To get the most out of Puppet log data, you can analyze it using an external log aggregation and analytics platform, such as Sumo Logic. To work with Puppet logs on Sumo Logic, you simply use the Puppet module for installing the Sumo Logic collector. You’ll then be able to visualize and monitor all of your Puppet log data from the Sumo Logic interface, alongside logs from any other applications that you connect to Sumo Logic. You can find open source collectors for Docker, Chef, Jenkins, FluentD and many other servers at Sumo Logic Developers on GitHub.

About the Author

Ali Raza is a DevOps consultant who analyzes IT solutions, practices, trends and challenges for large enterprises and promising new startup firms. Troubleshooting Apps and Infrastructure Using Puppet Logs is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.
Sumo Logic Launches New Open Source Site on Github Pages
Customers Share their AWS Logging with Sumo Logic Use Cases
In June, Sumo Dojo (our online community) launched a contest to learn more about how our customers are using Amazon Web Services like EC2, S3, ELB, and AWS Lambda. The Sumo Logic service is built on AWS, and we have deep integration with Amazon Web Services. As an AWS Technology Partner, we’ve collaborated closely with AWS to build apps like the Sumo Logic App for Lambda. So we wanted to see how our customers are using Sumo Logic to do things like collecting logs from CloudWatch to gain visibility into their AWS applications. We thought you’d be interested in hearing how others are using AWS and Sumo Logic, too. So in this post I’ll share their stories, along with announcing the contest winner.

The contest narrowed down to two finalists. SmartThings, a Samsung company, operates in the home automation industry and provides access to a wide range of connected devices to create smarter homes that enhance comfort, convenience, security and energy management for the consumer. WHOmentors, Inc., our second finalist, is a publicly supported scientific, educational and charitable corporation, and fiscal sponsor of Teen Hackathon. The organization is, according to their site, “primarily engaged in interdisciplinary applied research to gain knowledge or understanding to determine the means by which a specific, recognized need may be met.” At stake was a DJI Phantom 3 Drone.
All entrants were awarded a $10 Amazon gift card.

AWS Logging Contest Rules

The Drone winner was selected based on the following criteria: You had to be a user of Sumo Logic and AWS. To enter the contest, a comment had to be placed on this thread in Sumo Dojo. The post could not be anonymous – you were required to log in to post and enter. Submissions closed August 15th. As noted in the Sumo Dojo posting, the winner would be selected based on our own editorial judgment and community reactions to the post (in the form of comments or “likes”), choosing the entry that was most interesting, useful and detailed.

SmartThings

SmartThings has been working on a feature to enable over-the-air (OTA) firmware updates of Zigbee devices on users’ home networks. For the uninitiated, Zigbee is an IEEE specification for a suite of high-level communication protocols used to create personal area networks with small, low-power digital radios. See the Zigbee Alliance for more information. According to one of the firmware engineers at SmartThings, there are a lot of edge cases and potential points of failure for an OTA update, including the cloud platform, an end user’s hub, the device itself, power failures, and RF interference on the mesh network. Disaster in this scenario would be a user’s device ending up in a broken state.

As Vlad Shtibin related: “Our platform is deployed across multiple geographical regions, which are hosted on AWS. Within each region we support multiple shards, and within each shard we run multiple application clusters. The bulk of the services involved in the firmware update are JVM-based application servers that run on AWS EC2 instances. Our goal for monitoring was to be able to identify as many of these failure points as possible and implement a recovery strategy. Identifying these points is where Sumo Logic comes into the picture. We use a key-value logger with a specific key/value for each of these failure points, as well as a correlation ID for each point of the flow.
Using Sumo Logic, we are able to aggregate all of these logs by passing the correlation ID when we make calls between the systems. We then created a search query (eventually a dashboard) to view the flow of the firmware updates as they went from our cloud down to the device and back up to the cloud to acknowledge that the firmware was updated. This query parses the log messages to retrieve the correlation ID, hub, device, status, firmware versions, etc. These values are then fed into a Sumo Logic transaction, enabling us to easily view the state of a firmware update for any user in the system at a micro level, and the overall health of all OTA updates at the macro level.

Depending on which part of the infrastructure the OTA update failed in, engineers are then able to dig deeper into the specific EC2 instance that had a problem. Because our application servers produce logs at the WARN and ERROR level, we can see if the update failed because of a timeout from the AWS ElastiCache service, or from a problem with a query on AWS RDS. Having quick access to logs across the cluster enables us to identify issues across our platform regardless of which AWS service we are using.”

As Vlad noted, this feature is still being tested and hasn’t been fully rolled out in production yet. “The big takeaway is that we are much more confident in our ability to identify updates, triage them when they fail and ensure that the feature is working correctly because of Sumo Logic.”

WHOmentors.com

WHOmentors.com, Inc. is a nonprofit scientific research organization and the 501(c)(3) fiscal sponsor of Teen Hackathon.
To facilitate their training in languages like Java, Python, and Node.js, each participant begins with the Alexa Skills Kit, a collection of self-service application program interfaces (APIs), tools, documentation and code samples that make it fast and easy for teens to add capabilities to Alexa-enabled products such as the Echo, Tap, or Dot. According to WHOmentors.com CEO Rauhmel Fox, “The easiest way to build the cloud-based service for a custom Alexa skill is by using AWS Lambda, an AWS offering that runs inline or uploaded code only when it’s needed and scales automatically, so there is no need to provision or continuously run servers.

With AWS Lambda, WHOmentors.com pays only for what it uses. The corporate account is charged based on the number of requests for created functions and the time the code executes. While the AWS Lambda free tier includes one million free requests per month and 400,000 gigabyte (GB)-seconds of compute time per month, it becomes a concern when the students create complex applications that tie Lambda to other expensive services, or when their Lambda programs grow too large.

Ordinarily, someone would be assigned to use Amazon CloudWatch to monitor and troubleshoot the serverless system architecture and multiple applications using existing AWS system, application, and custom log files. Unfortunately, there isn’t a central dashboard to monitor all created Lambda functions. With the integration of a single Sumo Logic collector, WHOmentors.com can automatically route all Amazon CloudWatch logs to the Sumo Logic service for advanced analytics and real-time visualization using the Sumo Logic Lambda functions on GitHub.”

Using the Sumo Logic Lambda Functions

“Instead of a ‘pull data’ model, the Sumo Logic Lambda function grabs files and sends them to the Sumo Logic web application immediately.
Their online log analysis tool offers reporting, dashboards, and alerting, as well as the ability to run specific advanced queries as needed. The real-time log analysis provided by the Sumo Logic Lambda function helps me quickly catch and troubleshoot performance issues, such as the request rate of concurrent executions from both stream-based and non-stream-based event sources, rather than having to wait hours to identify whether there was an issue.

I am most concerned about AWS Lambda limits (i.e., code storage) that are fixed and cannot be changed at this time. By default, AWS Lambda limits the total concurrent executions across all functions within a given region to 100. Why? The default limit is a safety limit that protects the corporate account from costs due to potential runaway or recursive functions during initial development and testing. As a result, I can quickly determine the performance of any Lambda function and clean up the corporate account by removing Lambda functions that are no longer used, or figure out how to reduce the code size of the Lambda functions that should not be removed, such as apps in production.”

The biggest relief for Rauhmel is that he is able to encourage the trainees to focus on coding their applications instead of pressuring them to worry about the logs associated with the Lambda functions they create.

And the Winner of the AWS Logging Contest is…

Just as at the end of an epic World Series battle between two MLB teams, you sometimes wish both could be declared the winner. Alas, there can be only one. We looked closely at the use cases, which were very different from one another. Weighing factors like the breadth of usage of the Sumo Logic and AWS platforms added to our drama. While SmartThings uses Sumo Logic broadly to troubleshoot and prevent failure points, WHOmentors.com’s use case is specific to AWS Lambda.
But we couldn’t ignore the cause of helping teens learn to write code in popular programming languages and build skills that may one day lead them to a job. Congratulations to WHOmentors.com. Your drone is on its way!
Dockerizing Microservices for Cloud Apps at Scale
Last week I introduced Sumo Logic Developers’ Thought Leadership Series, in which JFrog’s Co-founder and Chief Architect, Fred Simon, came together with Sumo Logic’s Chief Architect, Stefan Zier, to talk about optimizing continuous integration and delivery using advanced analytics. In Part 2 of this series, Fred and Stefan dive into Docker and Dockerizing microservices. Specifically, I asked Stefan about initiatives within Sumo Logic to Dockerize parts of its service. What I didn’t realize was the scale at which these Dockerized microservices must be delivered. Sumo Logic is in the middle of Dockerizing its architecture and is doing it incrementally. As Stefan says, “We’ve got a 747 in mid-air and we have to be cautious as to what we do to it mid-flight.” The goal in Dockerizing Sumo Logic is to gain more speed out of the deployment cycle. Stefan explains, “There’s a project right now to do a broader-stroke containerization of all of our microservices. We’ve done a lot of benchmarking of Artifactory to see what happens if a thousand machines pull images from Artifactory at once. That is the type of scale that we operate at. Some of our microservices have a thousand-plus instances of the service running, and when we do an upgrade we need to pull a thousand-plus images in a reasonable amount of time – especially when we’re going to do continuous deployment. You can’t say, ‘Well, we’ll roll the deployment over the next three hours and then we’re ready to run the code.’ That’s not quick enough anymore. It has to be minutes at most to get the code out there.” The Sumo Logic engineering team has learned a lot going through this process. In terms of adoption and learning curve, Stefan suggests: Developer education – Docker is a new and foreign thing and the benefits are not immediately obvious to people. Communication – Talking through why it’s important, why it’s going to help, and how to use it.
Workshops – Sumo Logic does hands-on workshops in-house to get its developers comfortable with using Docker. Culture – Build a culture around Docker. Plan for change – The tool chain is still evolving. You have to anticipate the evolution of the tools and plan for it. As a lesson learned, Stefan explains, “We’ve had some fun adventures on Ubuntu – in production we run automatic upgrades for all our patches, so you get security upgrades automatically. It turns out that when you get an upgrade to the Docker daemon, it kills all the running containers. We had one or two instances (fortunately not in production) where all the containers across the fleet went away. Eventually we traced it back to the Docker daemon, and now we’re explicitly holding back Docker daemon upgrades, making each one an explicit upgrade so that we are in control of the timing. We can do it machine by machine instead of the whole fleet at once.” JFrog on Dockerizing Microservices Fred likewise shared JFrog’s experiences, pointing out that JFrog’s customers asked early on for Docker support, so JFrog has been in it from the early days of Docker. Artifactory has supported Docker images for more than two years. To Stefan’s point, Fred says, “We had to evolve with Docker. So we Dockerized our pure SaaS [product] Bintray, which is a distribution hub for all the packages around the world. It’s highly distributed across all the continents, CDN-enabled, [utilizes a] MongoDB cluster, CouchDB, and all of this problematic distributed software. Today Bintray is fully Dockerized. We use Kubernetes for orchestration.” One of the win-wins for JFrog developers is that the components the developer is not working on are delivered via Docker, the exact same containers that will run in production, on their own local workstation. “We use Vagrant to run Docker inside a VM with all the images so the developer can connect to microservices exactly the same way.
So the developer has the immediate benefit that he doesn’t have to configure and install components developed by the other team.” Fred also mentioned that Xray, which was just released, is fully Dockerized. Xray analyzes any kind of package within Artifactory, including Docker images, Debian, RPM, zip, jar, and war files, and reports what each contains. “That’s one of the things with Docker images: it’s getting hard to know what’s inside. Xray is based on 12 microservices, and we needed a way to put the software in the hands of our customers, because Artifactory is both SaaS and on-prem; we do both. So JFrog does full Docker and Docker Compose delivery, and developers can get the first image and all images from Bintray.” “The big question to the community at large,” Fred says, “is how do you deliver microservices software to your end customer? There is still some work to be done here.” More Docker Adventures – TL;DR “Adventures” is a way of saying: we went on this journey, not everything went as planned, and here’s what we learned from our experience. If you’ve read this far, you have a good summary of the first 10 minutes, so you can jump ahead in the video to learn more. Each topic is marked by a slide so you can quickly jump to a topic of interest. Those include: Promoting containers – why it’s important to promote your containers at each stage in the delivery cycle rather than retag and rebuild. Docker shortcuts – how Sumo Logic is implementing Docker incrementally and taking a hybrid approach versus doing pure Docker. Adventures Dockerizing Cassandra. Evolving conventions for Docker distribution. New Shifts in Microservices What are the new shifts in microservices? In the final segment of this series, Fred and Stefan dive into microservices and how they put pressure on your developers to create clean APIs. Stay tuned for more adventures building, running and deploying microservices in the cloud.
CI/CD, Docker and Microservices - by JFrog and Sumo Logic’s Top Developers
Sumo Dojo Winners - Using Docker, FluentD and Sumo Logic for Deployment Automation
Recently, Sumo Dojo ran a contest in the community to see who is analyzing Docker logs with Sumo Logic, and how. The contest ran through the month of June and was presented at DockerCon. Last week, Sumo Dojo selected the winner: Brandon Milsom, from the Australia-based company Fugro Roames. Roames uses remote-sensing laser (LIDAR) technology to create interactive 3D asset models of powerline networks for energy companies in Australia and the United Kingdom. As Brandon writes: "We use Docker and Sumo Logic as part of our deployment automation. We use Ansible scripts to automatically deploy our developers’ applications onto Amazon EC2 instances inside Docker containers as part of our cloud infrastructure. These applications are automatically configured to send tagged logs to Sumo Logic using Fluentd, which our developers use to identify their running instances for debugging and troubleshooting. Not only are the application logs sent directly to Sumo Logic, but the Docker container logs are also collected using Docker’s built-in Fluentd logging driver. This forwards logs to another Docker container on the same host running a Fluentd server, which then seamlessly ships logs over to Sumo Logic. The result is that developers can easily access their application logs, and the OS logs of the container their app runs in, just by opening a browser tab. Part of our development has also been trialling drones for asset inspection, and we also have a few drone fanatics in our office. Winning a drone would also be beneficial as it would give us something to shoot at with our Nerf guns, improving office morale." Brandon’s coworker Adrian Howchin also wrote in, saying: "I think one of the best things that we’ve gained from this setup is that it allows us to keep users from connecting (SSH) in to our instances. Given our CD setup, we don’t want users connecting in to hosts where their applications are deployed (it’s bad practice).
However, we had no answer to the question of how they get their application/OS logs. Thanks to Sumo Logic (and the Docker logging driver!), we’re able to get these logs out to a centralized location, and keep the users out of the instances." Congratulations to Brandon and the team at Fugro Roames. Now you have something cool to shoot at.
Top 5 Questions From DockerCon 2016
Developers, IT Ops engineers and enterprise professionals converged on DockerCon 2016 with a vengeance, and Sumo Logic was on hand to show them how the Sumo Logic platform gives them visibility into their Docker ecosystems. We also released a new eBook for practitioners, Docker, From Code to Container, that provides best practices for building, testing and deploying containers and includes hands-on exercises with Docker Compose. The Sumo Logic Community also announced a chance to win a DJI Phantom 3 drone. The contest ends June 30, so there’s still time. With announcements of new features like Containers as a Service and tools like Docker Universal Control Plane (UCP), Docker is taking the deployment of microservices via containers to a whole new level. UCP offers automated container scanning and the ability to run signed binaries. The primarily DevOps crowd had a heavy bent toward the developer, so there was a lot of interest in Docker logging, monitoring and analytics, and we received a lot of questions about the internals of the Sumo Logic approach to collecting logs. In fact, the #1 question I got was how we implemented the container, so I thought I’d answer that and other questions here. How Does Sumo Logic Operate in a Docker Ecosystem? Sumo Logic uses a container to collect and ship data from Docker. The image itself contains a collector and a script source. You can grab the image from Docker Hub by just running a Docker pull. docker pull sumologic/appcollector:latest Before you run the container, you’ll need to create an access key in Sumo Logic (see the documentation for details). Then run the container using the Access ID and Access Key that you created previously. docker run -d -v /var/run/docker.sock:/var/run/docker.sock --name="sumologic-docker" sumologic/appcollector:latest The container creates a collector in your Sumo Logic account and establishes two sources: Docker Logs and Docker Stats. That’s it.
Once the image is installed and configured locally, you simply select the App for Docker from the Library in Sumo Logic, bring up one of the dashboards, and watch data begin to populate. If you’d like to try it out yourself and don’t already have an account, sign up for Sumo Logic Free. Can You Monitor More than One Docker Host? Another question I got was whether you could monitor more than one host. Apparently not all monitoring tools let you do this. The answer is: you can. As you can see in this Overview Dashboard, there are two Docker hosts in this example. The Sumo Logic collector image typically runs on the same host as the Docker host. You can collect data from multiple hosts by installing an image on each host. Note, however, that you can only run one instance at a time. A better approach is to run the Sumo Logic collector on one host, and have containers on all other hosts log to it by setting the syslog address accordingly when running the container. Our CTO, Christian Beedgen, explains more in this post on logging drivers. What Kind of Data Do You Capture, and What Analytics Do You Provide? To get real value from machine-generated data, Sumo Logic takes a comprehensive approach to monitoring Docker. There are five requirements to enable comprehensive monitoring: events, configurations, logs, stats, and host and daemon logs. For events, you can send each event as a JSON message, which means you can use JSON as a way of logging each event. The Sumo Logic collector enumerates all running containers, then starts listening to the event stream, collecting each running container and each start event. See my post on Comprehensive Monitoring in Docker for more detail. We call the inspect API to get configurations and send them in JSON. For logs, we call the logs API to open a stream and send each log. Now you have a record of all the configurations together with your logs, making it easy to search for them when troubleshooting.
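To make the inspect step concrete, here is a hedged sketch (the field selection and flattening scheme are my own illustration, not Sumo Logic’s collector code) of turning a Docker inspect document into a single searchable JSON log line:

```python
import json

def config_log_record(inspect_doc):
    """Flatten interesting fields of a `docker inspect` document into one
    JSON log line, so configuration is searchable alongside the logs."""
    config = inspect_doc.get("Config", {})
    record = {
        "event": "container_config",
        "id": inspect_doc.get("Id", "")[:12],         # short container ID
        "name": inspect_doc.get("Name", "").lstrip("/"),
        "image": config.get("Image"),
        "env": config.get("Env", []),
    }
    return json.dumps(record, sort_keys=True)
```

Feeding it the inspect JSON for a container named /web running nginx yields one log line carrying the short ID, image, and environment, ready to correlate with that container’s log stream.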
For statistics, we call the stats API to open a stream for each running container and each start event, and send each received JSON message as a log. For host and daemon logs, you can include a collector in host images or run a collector as a container. Do You Have Any Free Stuff? No conference would be complete without a new backpack stuffed with hoodies, T-shirts and maybe a Red Hat (thanks, guys!). But I also believe in adding value by educating developers and ops. So I’ve put together an eBook, Docker – From Code to Container, that I hope you’ll find interesting. Docker, From Code to Container explains how containerization enables Continuous Integration and Continuous Delivery processes, shows how you can take Docker to production with confidence, walks you through the process of building applications with Docker Compose, and presents a comprehensive model for monitoring both your application stack and your Docker ecosystem. Ultimately, you will learn how containers enable DevOps teams to build, run and secure their Dockerized applications. In this webinar you will learn: how Docker enables continuous integration and delivery; best practices for delivering Docker containers to production; how to build applications with Docker Compose; best practices for securing Docker containers; how to gauge the health of your Docker ecosystem using analytics; and a comprehensive approach to monitoring and logging. What’s Next? I’m glad you asked. We’re featuring a live webinar with Jason Bloomberg, president of Intellyx, and Kalyan Ramanathan, VP of Marketing for Sumo Logic, to dive deeper into the use cases for Docker monitoring. The webinar is July 20 at 10:00 am PDT. Be there or be square!
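Returning to the stats point above: each message on the Docker stats API stream is a JSON document, and sending it on as a log can start with pulling out a few headline numbers. A minimal sketch follows; the field paths mirror the Docker stats payload, while the derived percentage is my own illustration, not part of the Sumo Logic collector:

```python
def stats_to_metrics(stats_doc):
    """Extract headline metrics from one Docker stats API JSON message."""
    mem = stats_doc.get("memory_stats", {})
    cpu = stats_doc.get("cpu_stats", {}).get("cpu_usage", {})
    usage, limit = mem.get("usage", 0), mem.get("limit", 0)
    return {
        "cpu_total_usage": cpu.get("total_usage", 0),
        "memory_usage": usage,
        "memory_limit": limit,
        # Derived field: memory utilization as a percentage of the limit.
        "memory_pct": round(100.0 * usage / limit, 1) if limit else 0.0,
    }
```

Logging the flattened dict once per stats message is what turns the raw stream into something a dashboard can chart.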
JFrog Artifactory Users Gain Real-time Continuous Intelligence with New Partnership
Correlating Logs and Metrics
This week, CEO Ramin Sayar offered insights into Sumo Logic’s Unified Logs and Metrics announcement, noting that Sumo Logic is now the first cloud-native machine data analytics SaaS to handle log data and time-series metrics together. Beginning this week, Sumo Logic is providing early access to customers that are using either Amazon CloudWatch or Graphite to gather metrics. That’s good news for practitioners from developers to DevOps and release managers because, as Ben Newton explains in his blog post, you’ll now be able to view logs and metrics data together and in context. For example, when troubleshooting an application issue, developers can start with log data to narrow a problem to a specific instance, then overlay metrics to build screens that show both logs and metrics (like CPU utilization over time) in the context of the problem. What Are You Measuring? Sumo Logic already provides log analytics at three levels: system (or machine), network, and application. Unified Logs & Metrics extends the reporting of time-series data to these same three levels. So using Sumo Logic you’ll now be able to focus on application performance metrics, infrastructure metrics, custom metrics and log events. Custom Application Metrics Of the three, application metrics can be the most challenging because as your application changes, so do the metrics you need to see. Often you don’t know what you will be measuring until you encounter the problem. APM tools provide byte-code instrumentation, loading code into the JVM. That can be helpful, but the results are restricted to what the APM tool is designed or configured to report on. Moreover, the cost of instrumenting code using APM tools can be expensive. So developers, who know their code better than any tool, often resort to creating their own custom metrics to get the information they need to track and troubleshoot specific application behavior. That was the motivation behind an open-source tool called StatsD.
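As a sketch of that idea (a hypothetical illustration; the metric names are invented, and port 8125 is simply the common StatsD convention), a StatsD-style client is little more than formatting `name:value|type` and firing a UDP datagram:

```python
import socket

def format_metric(name, value, metric_type):
    """StatsD line protocol: 'metric.name:value|type' (c=counter, ms=timer, g=gauge)."""
    return f"{name}:{value}|{metric_type}"

def send_metric(name, value, metric_type="c", host="127.0.0.1", port=8125):
    """Fire-and-forget: UDP means no response and no blocking on the app's hot path."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_metric(name, value, metric_type).encode(), (host, port))
    finally:
        sock.close()
```

Calling send_metric("checkout.latency", 320, "ms") is all it takes; the first time that data point arrives, Graphite creates the checkout.latency series with no management overhead.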
StatsD allows you to create new metrics in Graphite just by sending it data for that metric. That means there’s no management overhead for engineers to start tracking something new: simply give StatsD a data point you want to track, and Graphite will create the metric. Graphite itself has become a foundational monitoring tool, and because many of our customers already use it, Sumo Logic felt it important to support it. Graphite, which is written in Python and open-sourced under the Apache 2.0 license, collects, stores and displays time-series data in real time. Graphite is fairly complex, but the short story is that it’s good at graphing a lot of different things, like dozens of performance metrics from thousands of servers. Typically you write an application that collects numeric time-series data and sends it to Graphite’s processing backend (Carbon), which stores the data in a Graphite database. The Carbon process listens for incoming data but does not send any response back to the client. Client applications typically publish metrics using plaintext, but can also use the pickle protocol or the Advanced Message Queuing Protocol (AMQP). The data can then be visualized through a web interface like Grafana. But as previously mentioned, your custom application can simply send data points to a StatsD server. Under the hood, StatsD is a simple Node.js daemon that listens for messages on a UDP port, then parses the messages, extracts the metrics data, and periodically (every 10 seconds) flushes the data to Graphite. Sumo Logic’s Unified Logs and Metrics Getting metrics into Sumo Logic is super easy. With StatsD and Graphite, you have two options: you can point your StatsD server to a Sumo Logic hosted collector, or you can install a native collector within the application environment. CloudWatch CloudWatch is Amazon’s service for monitoring applications running on AWS and system resources.
CloudWatch tracks metrics (data expressed over a period of time) and monitors log files for EC2 instances and other AWS resources like EBS volumes, ELB, DynamoDB tables, and so on. For EC2 instances, you can collect metrics on things like CPU utilization, then apply dimensions to filter by instance ID, instance type, or image ID. Pricing for AWS CloudWatch is based on data points: a standard data point (DP) covers 5 minutes of activity, while a detailed data point (DDP) covers 1 minute. Unified Logs and Metrics dashboards allow you to view metrics by category, grouped first by namespace and then by the various dimension combinations within each namespace. One very cool feature is that you can search for meta tags across EC2 instances. Sumo Logic makes the call once to retrieve meta tags and caches them. That means you no longer have to make an API call to retrieve each meta tag, which can result in cost savings since AWS charges per API call. Use Cases Monitoring – Now you’ll be able to focus on tracking KPI behavior over time with dashboards and alerts. Monitoring allows you to: track SLA adherence, watch for anomalies, respond quickly to emerging issues, and compare to past behavior. Troubleshooting – This is about determining whether there is an outage and then restoring service. With Unified Logs and Metrics you can: identify what is failing, identify when it changed, quickly iterate on ideas, and “swarm” issues. Root-cause Analysis – Focuses on determining why something happened and how to prevent it. Dashboards overlaid with log data and metrics allow you to: perform historical analysis, correlate behavior, uncover long-term fixes, and improve monitoring. Correlating Logs and Metrics When you start troubleshooting, you really want to start correlating multiple types of metrics and multiple sources of log data. Ultimately, you’ll be able to start with outliers and begin overlaying metrics and log data to quickly build views that help you quickly identify issues.
Now you’ll be able to overlay logs and metrics from two different systems and do it in real time. If you want to see what Unified Logs and Metrics can do, Product Manager Ben Newton walks you through the steps of building searches on logs and overlaying metrics in this short introduction.
Containerization: Enabling DevOps Teams
What is containerization? Software containers are a form of OS virtualization where the running container includes just the minimum operating system resources, memory and services required to run an application or service. Containers enable developers to work with identical development environments and stacks. But they also facilitate DevOps by encouraging the use of stateless designs.
Sumo Logic Add-on for Heroku Now GA
Back in October, Sumo Logic opened up the beta program for the Sumo Logic Add-on for Heroku. Today we are pleased to make the Sumo Logic Add-on generally available in the Heroku Elements marketplace, adding support for Private Spaces as well. The Sumo Logic Add-on for Heroku helps PaaS developers build, run and secure their applications. Using Sumo Logic’s pre-built dashboards, predictive analytics and features like Live Tail, Heroku developers can monitor their applications in real time and troubleshoot them from code to production. Existing Heroku developers can test drive Sumo Logic with a free trial and upgrade the add-on to a paid plan from the Heroku marketplace. The add-on is easy to set up. From Heroku Elements, simply select Sumo Logic as an add-on for your application. You can then launch the Sumo Logic service directly from your Heroku Dashboard to gain real-time access to event logs in order to monitor new deployments, troubleshoot applications, and uncover performance issues.
Build, Run, Secure
DevOps teams can monitor and troubleshoot their applications from code to production. Heroku’s Logplex consolidates application, system and network logs into a single stream and retains 1,500 lines of log data. Sumo Logic effortlessly collects terabytes of data from Logplex and your Heroku application. Data can be pre-parsed and partitioned on ingest to get separate views of your application and network streams.
Our lightweight collectors replace traditional complex setups and effortlessly collect, compress, cache, and encrypt your data for secure transfer. Using Sumo Logic, developers can monitor application event streams in real time, tail logs in production using Live Tail, and utilize other tools to understand performance, detect critical issues, correlate events, analyze trends, and detect anomalies. DevOps teams can also utilize application analytics to understand how users use their app, to analyze business KPIs in real time, and to optimize their applications to deliver the most value to customers. Secure by design, Sumo Logic maintains an array of critical certifications and attestations, including PCI DSS 3.0. Ensure that your application complies with regulations like PCI or HIPAA, and that it handles sensitive data securely. Monitor access and other user behavior and detect malicious activity. Certifications and attestations include: compliance with the U.S.–E.U. Safe Harbor framework; SOC 2 Type II attestation; attestation of HIPAA compliance; PCI DSS 3.0; and FIPS 140 compliance.
Beyond Log Collection and Centralization
Sumo Logic offers a broad set of features including monitoring, search and predictive analytics. Search and Analyze. Run searches and correlate events in real time using a simple search-engine-like syntax with operators such as PARSE, WHERE, IF, SUMMARIZE, TIMESLICE, GROUP BY, and SORT. LogReduce™ technology reduces hundreds of thousands of log events into groups of patterns. By filtering out the noise in your data, LogReduce can help reduce the mean time to identification of issues by 50% or more. Transaction Analytics automates analysis of transactional context to decrease the time associated with compiling and applying intelligence across transactions flowing through your multi-tiered Heroku application. Detect and Predict. When rules are not enough, Anomaly Detection technology powered by machine-learning algorithms detects deviations to uncover the unknowns in your data.
Outlier Detection, also powered by a unique algorithm, analyzes thousands of data streams with a single query, determines baselines, and identifies outliers in real time. Purpose-built visualization highlights abnormal behaviors, giving operations and security teams visibility into critical KPIs for troubleshooting and remediation. Predictive Analytics extends and complements Anomaly and Outlier Detection by predicting future KPI violations and abnormal behaviors through a linear projection model. The ability to observe violations that may occur in the future helps teams address issues before they impact the business. Monitor and Visualize. Custom dashboards and brilliant visualization help you easily monitor your data in real time. The dashboards unify all data streams so you can keep an eye on events that matter. Charting capabilities such as bar, pie, line, map, and combo charts help you keep an eye on the most important KPIs for your Heroku application. Alert and Notify. Custom alerts proactively notify you when specific events and outliers are identified across your data streams. Proactive notifications are generated when your data deviates from calculated baselines or exceeds thresholds, helping you address potential issues promptly.
Integrated Solutions for DevOps Teams
Sumo Logic provides out-of-the-box solutions to help you build, run and secure your Heroku applications. Docker. Provides a native collection source for your entire Docker infrastructure, with real-time monitoring of Docker stats, events and container logs. Troubleshoot issues and set alerts on abnormal container or application behavior. Artifactory. Provides insight into your JFrog Artifactory binary repository. The App provides preconfigured dashboards that include an Overview of your system, Traffic, Requests and Access, Download Activity, Cache Activity, and Non-Cached Deployment Activity. GitHub.
Coming soon, our GitHub beta will allow DevOps teams to gather metrics to facilitate code reviews, monitor team productivity, and secure intellectual property.
Next Steps
Existing Heroku customers can launch the Sumo Logic service directly from the Heroku dashboard. Select a pricing plan that works for you, or take a 30-day test drive with Sumo Logic Free. I’ve written a short getting-started guide for Ruby developers. If you’d like to contribute a tutorial in another language, feel free to share it on our developer community site.
Sumo Logic’s Christian Beedgen Speaks on Docker Logging and Monitoring
Support for Docker logging has evolved over the past two years, and the improvements made from Docker 1.6 to today have greatly simplified both the process and the options for logging. However, DevOps teams are still challenged with monitoring, tracking and troubleshooting issues in a context where each container emits its own logging data. Machine data can come from numerous sources, and containers may not agree on a common method. Once log data has been acquired, assembling meaningful real-time metrics such as the condition of your host environment, the number of running containers, CPU usage, memory consumption and network performance can be arduous. And if a logging method fails, even temporarily, that data is lost. Sumo Logic’s co-founder and CTO, Christian Beedgen, presented his vision for comprehensive container monitoring and logging to the 250+ developers who attended the Docker team’s first Meetup at Docker HQ in San Francisco this past Tuesday. Docker Logging When it comes to logging in Docker, the recommended pathway for developers has been for the container to write to its standard output and let Docker collect the output. You then configure Docker to either store it in files or send it to syslog. Another option is to write to a directory, so the plain log file is the typical /var/log thing, and then share that directory with another container. In practice, when you start the first container, you indicate that /var/log will be a “volume,” essentially a special directory, that can then be shared with another container. Then you can run tail -f in a separate container to inspect those logs. Running tail by itself isn’t extremely exciting, but it becomes much more meaningful if you want to run a log collector that takes those logs and ships them somewhere. The reason is that you shouldn’t have to synchronize between application and logging containers (for example, where the logging system needs Java or Node.js because it ships logs that way).
The application and logging containers should not have to agree on specific dependencies and risk breaking each other’s code. But as Christian showed, this isn’t the only way to log in Docker. Christian began the presentation by reminding developers of the 12-Factor App, a methodology for building SaaS applications, recommending that you limit yourself to one process per container as a best practice, with each running unbuffered and sending data to stdout. He then introduced the numerous options for container logging from the pre-Docker 1.6 days forward, quickly enumerating them and noting that some were better than others. You could: log directly from the application; install a file collector in the container; install a file collector as a container; install a syslog collector as a container; use host syslog for local syslog; use a syslog container for local syslog; log to stdout and use a file collector; log to stdout and use Logspout; collect from the Docker file systems (not recommended); or inject a collector via Docker exec. Logging Drivers in Docker Engine Christian also talked about logging drivers, which he believes have been a very large step forward in the last 12 months. He stepped through the incremental logging enhancements made to Docker from 1.6 to today. Docker 1.6 added 3 new log drivers: docker logs, syslog, and log-driver null. The driver interface was meant to support the smallest subset of functionality available for logging drivers to implement. Stdout and stderr would still be the source of logging for containers, but Docker takes the raw streams from the containers to create discrete messages delimited by writes that are then sent to the logging drivers. Version 1.7 added the ability to pass parameters to drivers, and in Docker 1.9 tags were made available to other drivers. Importantly, Docker 1.10 allows syslog to run encrypted, thus allowing companies like Sumo Logic to send data securely to the cloud.
He noted recent proposals for a Google Cloud Logging driver, and for TCP, UDP, and Unix domain socket drivers. “As part of the Docker engine, you need to go through the engine commit protocol. This is good, because there’s a lot of review and stability. But it is also suboptimal because it is not really modular, and it adds more and more dependencies on third-party libraries.” So he poses the question of whether logging drivers should be decoupled. In fact, others have suggested the drivers be external plugins, similar to how volumes and networks work. Plugins would allow developers to write custom drivers for their specific infrastructure, and would enable third-party developers to build drivers without having to get them merged upstream and wait for the next Docker release. A Comprehensive Approach to Monitoring and Logging As Christian stated, “you can’t live on logs alone.” To get real value from machine-generated data, you need to look at what he calls “comprehensive monitoring.” There are five requirements to enable comprehensive monitoring: events, configurations, logs, stats, and host and daemon logs. For events, you can send each event as a JSON message, which means you can use JSON as a way of logging each event. You enumerate all running containers, then start listening to the event stream, collecting each running container and each start event. For configurations, you call the inspect API and send the result in JSON as well. “Now you have a record,” he said. “Now we have all the configurations in the logs, and we can quickly search for them when we troubleshoot.” For logs, you simply call the logs API to open a stream and send each log as, well, a log. Similarly, for statistics, you call the stats API to open a stream for each running container and each start event, and send each received JSON message as a log. “Now we have monitoring,” says Christian. “For host and daemon logs, you can include a collector in host images or run a collector as a container.
This is what Sumo Logic is already doing, thanks to the API."

Summary

Perhaps it is a testament to the popularity of Docker, but even the Docker team seemed surprised by the huge turnout for this first meetup at HQ. As a proud sponsor of the meetup, Sumo Logic looks forward to the new features in Docker 1.10 aimed at enhancing container security, including temporary file systems, seccomp profiles, user namespaces, and content-addressable images. If you're interested in learning more about Docker logging and monitoring, you can download Christian's Docker presentation on Slideshare.
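The collect-everything-as-JSON approach described above (events, configurations, and stats all sent as JSON log lines) can be sketched as a tiny formatter. The field names mirror the Docker event stream but are illustrative; real collection would read these records from the Docker events and inspect APIs:

```python
import json

def docker_event_to_log(event):
    """Flatten a Docker-style event dict into one JSON log line.

    The input is assumed to be an already-decoded dict such as the
    Docker event stream produces; sorting keys makes the output
    stable and easy to search on.
    """
    return json.dumps({
        "stream": "docker_event",
        "status": event.get("status"),
        "container_id": event.get("id"),
        "time": event.get("time"),
    }, sort_keys=True)
```

Once every event and configuration is a line like this, a single query over `stream` and `container_id` pulls together the full history of a container at troubleshooting time.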
Introducing Sumo Logic Live Tail
In my last post I wrote about how DevOps' emphasis on frequent release cycles leads to the need for more troubleshooting in production, and that developers are frequently being drawn into that process. Troubleshooting applications in production isn't always easy: for developers, the first course of action is to drop down to a terminal, ssh into the environment (assuming you have access) and begin tailing log files to determine the current state. When the problem isn't immediately obvious, they might tail -f the logs to a file, then grep for specific patterns. But there's no easy way to search log tails in real time. Until now. Developers and team members now have a new tool, Sumo Logic Live Tail, that lets you tail log files into a window, filter for specific conditions, and use other features to troubleshoot in real time. Specifically, Live Tail lets you:

- Pause the log stream, scroll up to previous messages, then jump to the latest log line and resume the stream.
- Create keywords that are used to highlight occurrences within the log stream.
- Filter log files on the fly, in real time.
- Tail multiple log files simultaneously by multi-tailing.
- Launch Sumo Logic Search in the context of Live Tail (and vice versa).

Live Tail is immediately available from within the Sumo Logic environment, and coming soon is a command line interface (CLI) that will allow developers to launch Live Tail directly from the command line.

What Can I Do With Live Tail?

Troubleshoot Production Logs in Real Time

You can now troubleshoot without having to log into business-critical applications. Users can also harness the power of Sumo Logic by launching Search in the context of Live Tail and vice versa. There is simply no need to move between different tools to get the data you need.

Save Time Requesting and Exporting Log Files

As I mentioned, troubleshooting applications in production with tail -f isn't always easy.
First, you need to gain access to production log files. When systems hold sensitive data, admins may be reluctant to grant that access. Live Tail allows you to view your most recent logs in real time, analyze them in context, copy and share them via secure email when there's an outage, and set up searches based on Live Tail results using Sumo Logic.

Consolidate Tools to Reduce Costs

In the past, you may have toggled between two tools: one for tailing your logs and another for advanced analytics with pattern recognition to help with troubleshooting, proactive problem identification and user analysis. With Sumo Logic Live Tail, you can now troubleshoot from the Sumo Logic browser interface or from the Sumo Logic Command Line Interface without investing in a separate solution for live tail, thereby reducing the cost of owning licenses for multiple tools.

Getting Started

There are a couple of ways to initiate a Live Tail session from the Sumo Logic web app:

- Go directly to Live Tail by hovering over the Search menu and clicking the Live Tail menu item; or
- From an existing search, click the Live Tail link (just below the search interface).

In both cases, you'll need to enter the _sourceCategory, _sourceHost, _sourceName, _source, or _collector of the log you want to tail, along with any filters. Click Run to initiate the search query. That will bring up a session similar to Figure 1.

Figure 1. A Live Tail session.

To find specific information, such as errors and exceptions, you can filter by keyword. Just add your keywords to the Live Tail query and click Run or press Enter. The search is rerun with the new filter, and those keywords are highlighted on incoming messages, making it easy to spot matching conditions. The screen clears, and new results automatically scroll.

Figure 2. Using keyword highlighting to quickly locate items in the log stream.

To highlight keywords that appear in your running Live Tail, click the A button.
A dialog will open; enter the term you'd like to highlight. You may enter multi-term keywords separated by spaces, and press Enter to add additional keywords. The different keywords are then highlighted in different colors so that they are easy to find on the screen. You can highlight up to eight keywords at a time.

Multi-tailing

A single log file doesn't always give you the full picture. Using the multi-tail feature, you can tail multiple logs simultaneously. For example, after a database reboot, you can check whether it was successful by validating that the application is querying the database. But if there's an error on one server, you'll need to check the other servers to see if they are affected. You can start a second Live Tail session from the Live Tail page or from the Search page; the browser opens in split-screen mode, streaming 300-400 messages per minute. You can also "pop out" a running Live Tail session into a new browser window. This way, you can move the new window to another screen, or watch it separately from the browser window where Sumo Logic is running.

Figure 3. Multi-tailing in split-screen mode.

Launch In Context

One of the highlights of Sumo Logic Live Tail is the ability to launch in context, which allows you to seamlessly alternate between Sumo Logic Search and Live Tail in the browser. For example, when you are on the Search page and need to start tailing a log file to view the most recent raw log lines coming in, you click a button to launch the Live Tail page from Search, and the source name is carried forward automatically. If you want to perform more advanced operations like parsing, using operators, or increasing the time range to the previous day, simply click "Open in Search". This launches a new search tab that automatically includes the parameters you entered on the Live Tail page, so there is no need to re-enter them.
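Conceptually, the keyword filtering and highlighting described above behave like the small sketch below, with ANSI reverse-video standing in for the colored highlights. This is an illustration of the idea, not Live Tail's implementation:

```python
import re

def filter_and_highlight(lines, keywords):
    """Yield only the lines containing any keyword, with each match
    wrapped in ANSI reverse-video escapes (case-insensitive)."""
    pattern = re.compile(
        "|".join(re.escape(k) for k in keywords), re.IGNORECASE
    )
    for line in lines:
        if pattern.search(line):
            yield pattern.sub(
                lambda m: "\x1b[7m" + m.group(0) + "\x1b[0m", line
            )

stream = [
    "2016-03-01 12:00:01 INFO request ok",
    "2016-03-01 12:00:02 ERROR timeout talking to db",
]
for line in filter_and_highlight(stream, ["error", "timeout"]):
    print(line)
```

The same pattern generalizes to multi-tailing: run one such filter per source and interleave the surviving lines.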
For more information about using Live Tail, check out the documentation in Sumo Logic Help.
Open Source Projects at Sumo Logic
Someone recently asked me, rather smugly I might add, "Who's ever made money from open source?" At the time I naively answered with the first person who came to mind: Rod Johnson, the creator of Java's Spring Framework. My mind quickly began retrieving other examples, but in the process I began to wonder about the motivation behind the question. The inference, of course, was that open source is free. Such a sentiment speaks not only to monetization but to the premise of open source, which raises a good many questions. As Karim R. Lakhani and Robert G. Wolf wrote, "Many are puzzled by what appears to be irrational and altruistic behavior… giving code away, revealing proprietary information, and helping strangers solve their technical problems." While many thought that better jobs, career advancement, and so on were the main drivers, Lakhani and Wolf discovered that how creative a person feels when working on a project (what they call "enjoyment-based intrinsic motivation") is the strongest and most pervasive driver. They also found that user need, intellectual stimulation derived from writing code, and improving programming skills are top motivators for project participation.

Open Source Projects at Sumo Logic

Here at Sumo Logic, we have some very talented developers on the engineering team, and they are passionate about both the Sumo Logic application and giving back. To showcase some of the open-source projects our developers are working on, as well as other commits from our community, we've created a gallery on our developer site where you can quickly browse projects and dive into the repos, code, and gists we've committed. Here's a sampling of what you'll find:

Sumoshell

Parsing out fields on the command line can be cumbersome. Aggregating is basically impossible, and there is no good way to view the results. Written by Russell Cohen, Sumoshell is a collection of CLI utilities written in Go that you can use to improve analyzing log files.
Grep can't tell that some log lines span multiple individual lines. In Sumoshell, each individual command acts as a phase in a pipeline to get the answer you want. Sumoshell brings a lot of the functionality of Sumo Logic to the command line.

Sumobot

As our Chief Architect, Stefan Zier, explains in this blog post, all changes to production environments at Sumo Logic follow a well-documented change management process. In the past, we manually tied together JIRA and Slack to get from a proposal to an approved change in the most expedient manner. So we built a plugin for our sumobot Slack bot. Check out both the post and the plugin.

Sumo Logic Python SDK

Written by Yoway Buorn, the SDK provides a Python interface to the Sumo Logic REST API. The idea is to make it easier to hit the API from Python code. Feel free to add your scripts and programs to the scripts folder.

Sumo Logic Java Client

Sumo Logic provides a cloud-based log management solution that can process and analyze log files at petabyte scale. This library provides a Java client to execute searches on the data collected by the Sumo Logic service.

Growing Number of Projects

Machine data and analytics is about more than just server logging and aggregation. There are some interesting problems yet to be solved. Currently, you'll find numerous appenders for .NET and Log4j, search utilities for Ruby and Java, Chef cookbooks, and more. We could use additional examples calling our REST APIs from different languages. As we build our developer community, we'd like to invite you to contribute. Check out the open-source projects landing page and browse through the projects. Feel free to fork a project and share, or add examples to folders where indicated.
DevOps Visibility - Monitor, Track, Troubleshoot
As organizations embrace the DevOps approach to application development, they face new challenges that can't be met with legacy monitoring tools. Teams need DevOps visibility. While continuous integration, automated testing and continuous delivery have greatly improved the quality of software, clean code doesn't mean software always behaves as expected. A faulty algorithm or failure to account for unforeseen conditions can cause software to behave unpredictably. Within the continuous delivery (CD) pipeline, troubleshooting can be difficult, and in cases like debugging in a production environment it may not even be possible. DevOps teams are challenged with monitoring, tracking and troubleshooting issues in a context where applications, systems, networks, and tools across the toolchain all emit their own logging data. In fact, we are generating an ever-increasing variety, velocity, and volume of data.

Challenges of Frequent Release Cycles

The mantra of DevOps is to "release faster and automate more." But these goals can also become pain points: frequent releases introduce new complexity, and automation obscures that complexity. In fact, DevOps teams cite deployment complexity as their #1 challenge. The current challenges for DevOps teams are:

- Difficulty collaborating across silos.
- Difficulty syncing multiple development work streams.
- Frequent performance or availability issues.
- No predictive analytics to project future KPI violations.
- No proactive push notifications to alert on service outages.

DevOps drives cross-organizational team collaboration. However, organizations in the midst of a DevOps adoption are finding it difficult to collaborate across silos. Frequent release cycles also add pressure when it comes to syncing multiple development work streams.
These forces are driving the need for more integration between existing legacy tools, and for new tools that cross-organizational teams can use collaboratively. Because of its emphasis on automated testing, DevOps has also created a need for toolsets that enable troubleshooting and root-cause analysis. Why? Because, as I've said, clean code doesn't mean software always behaves as expected. That's why one of the greatest pain points for many of these teams is additions and modifications to packaged applications; often these are deployed to multi-tenant cloud environments.

Troubleshooting from the Command Line

DevOps teams are discovering that performance and availability problems have increased with more frequent releases. That means Ops is spending more time troubleshooting, and development is being drawn into production troubleshooting. In response, developers typically ssh into a server or cloud environment, drop down to the command line, and tail -f the log file. When the problem isn't readily seen, they begin grepping the logs with regular expressions, hunting for patterns and clues to the problem. But grep doesn't scale. Simply put, log data is everywhere. Application, system and network logs are stored in different locations on each server, and may be distributed across locations in the cloud or other servers. Sifting through terabytes of data can take days. The difficulty is that there's no consistency, no centralization, and no visibility.

No Consistency

- Ops is spending more time troubleshooting.
- Development is drawn into production troubleshooting.
- Service levels have degraded with more frequent releases.
- Performance and availability problems have increased.

No Centralization

- Logs live in many locations on each server.
- Logs are distributed across locations in the cloud or various servers.
- SSH + grep doesn't scale.
No DevOps Visibility

- High-value data is buried in petabytes.
- Meaningful views are difficult to assemble.
- No real-time visibility.
- Immense size of log data.

DevOps Visibility Across the Toolchain

Sumo Logic provides a single, tool-agnostic solution that delivers visibility throughout the continuous integration/continuous delivery pipeline, as well as across the entire DevOps toolchain. Sumo Logic delivers a comprehensive strategy for monitoring, tracking and troubleshooting applications at every stage of the build, test, deliver, and deploy release cycle.

- Full-stack DevOps visibility: gather event streams from applications at every stage, from sandbox development to final deployment and beyond. Combine these with system and infrastructure data to get a complete view of your application and infrastructure stack in real time.
- No integration hassles: Sumo Logic can be integrated with a host of DevOps tools across the entire continuous delivery pipeline, not just server data.
- Increased availability and performance: because you can monitor deployments in real time, issues can be identified before they impact the application and customer. Precise, proactive analytics quickly uncover hidden root causes across all layers of the application and infrastructure stack.
- Streamlined continuous delivery: troubleshoot issues and set alerts on abnormal container or application behavior; visualize key metrics and KPIs, including image usage, container actions and faults, as well as CPU/memory/network statistics; easily create custom and aggregate KPIs and metrics using Sumo Logic's powerful query language; and apply advanced analytics powered by LogReduce, Anomaly Detection, Transaction Analytics, and Outlier Detection.

Versatility

One reaction I often hear from customers is surprise: an organization will typically apply Sumo Logic to a specific use case such as security compliance.
Then they discover the breadth of the product and apply it to use cases they had never thought of.

"Many benefits and features of Sumo Logic came to us as a surprise. The Sumo Logic service continues to uncover different critical issues and deliver new insight throughout the development and support lifecycles of each new version we release." -- Nathan Smith, Technical Director, Outsmart Games

Sumo Logic enables DevOps teams to get deep, real-time visibility into their entire toolchain and production environment to help create better software faster. You can check out Sumo Logic right now with a free trial. It's easy to set up and lets you explore the wealth of features, including LogReduce, our pattern-matching algorithm that quickly detects anomalies, errors and trending patterns in your data.
New Heroku Add-on for Sumo Logic Goes Beta
Today, Sumo Logic is pleased to announce that it is partnering with Heroku to bring a new level of real-time visibility to Heroku logs. Heroku developers can now select the Sumo Logic add-on directly from the Heroku marketplace and connect application, system and event logs to the Sumo Logic service with just a few clicks. Developers can then launch the Sumo Logic service directly from their Heroku dashboard to gain real-time access to event logs in order to monitor new deployments, troubleshoot applications, and uncover performance issues. They can take advantage of Sumo Logic's powerful search language to quickly search unstructured log data and isolate the application node, module or library where the root cause of a problem hides. Developers can also use patent-pending LogReduce™ to reduce hundreds of thousands of log events down to groups of patterns while filtering out the noise in the data. LogReduce can help reduce the mean time to identification (MTTI) of issues by 50% or more. Developers also have access to outlier and anomaly detection. Often, analysis and troubleshooting are centered on known data in systems. However, most errors and security breaches stem from unknown data, or data that is new to a system. Analyzing this data requires highly scalable infrastructure and advanced algorithms. This is what Sumo Logic enables with its Anomaly Detection feature.

Extending Heroku's Logplex

Heroku is a polyglot Platform-as-a-Service (PaaS) that allows developers to build applications locally, then push changes via Git up to Heroku for deployment. Heroku provides a managed container environment that supports popular development stacks including Java, Ruby, Scala, Clojure, Node.js, PHP, Python and Go.
For logging, Heroku provides a service called Logplex that captures events and output streams from your app's running processes, system components and other relevant platform-level events, and routes them into a single channel. Logplex aggregates log output from your application (including logs generated from an application server and libraries), system logs (such as restarting a crashed process), and API logs (e.g., deploying new code). The caveat is that Heroku only stores the last 1,500 lines of consolidated logs. To get Sumo Logic's comprehensive logging with advanced search, pattern matching, outlier detection, and anomaly detection, you previously had to create a Heroku log drain (a network service that can consume your app's logs) and then configure an HTTPS service for sending logs to Sumo Logic.

Seamless UX

The Heroku add-on simplifies this process while providing developers with a seamless experience from the Heroku dashboard. Now, with the Heroku add-on for Sumo Logic, you simply push your changes up to Heroku, then run the following on the command line to create your app:

heroku addons:create sumologic --app <my_app_name>

This creates the application on Heroku, configures the app, and points the log drain to the Sumo Logic service for you automatically. To view your logs, simply go to your Heroku dashboard and click on the Sumo Logic add-on; that will open Sumo Logic.

Heroku Add-on for Sumo Logic Quick Start

I've created a quick start that shows you how to build a simple Ruby app on Heroku, install the Sumo Logic add-on, and connect your new app to the Sumo Logic service. You can use Sumo Free to test your configuration, and you can run through the entire quick start in 15 minutes.

About the Author

Michael is the Head of Developer Programs at Sumo Logic. You can follow him on Twitter @CodeJournalist or LinkedIn.
Heroku Add-on for Sumo Logic Quick Start
New DevOps Community Enables Continuous Delivery Practitioners
According to Puppet Labs' 2015 State of DevOps Report, high-performing IT organizations deploy 30 times more frequently with 200 times shorter lead times; they have 60 times fewer failures and recover 168 times faster. While staggering, those numbers should also lend credence to the fact that DevOps, however you spell it, is working.
Deploying “Hello, World!” DevOps Style
Change Management in a Change-Dominated World
DevOps isn't just about change -- it's about continuous, automated change. It's about ongoing stakeholder input and shifting requirements; about rapid response and fluid priorities. In such a change-dominated world, how can the concept of change management mean anything? But maybe that's the wrong question. Maybe a better question would be this: can a change-dominated world even exist without some kind of built-in change management? Change management is always an attempt to impose orderly processes on disorder. That, at least, doesn't change. What does change is the nature and scope of the disorder, and the nature and scope of the processes that must be imposed on it. This is what makes the DevOps world look so different, and appear so alien to any kind of recognizable change management. Traditional change management, after all, seems inseparable from waterfall and other traditional development methodologies. You determine which changes will be part of a project, you schedule them, and there they are on a Gantt chart, each one following its predecessor in proper order. Your job is as much to keep out ad-hoc chaos as it is to manage the changes in the project. And in many ways, Agile change management is a more fluid and responsive version of traditional change management, scaled down from project level to iteration level, with a shifting stack of priorities replacing the Gantt chart. Change management's role is to determine if and when there is a reason why a task should move higher or lower in the priority stack, but not to freeze priorities (as would have happened in the initial stages of a waterfall project). Agile change management is priority management as much as it is change management -- but it still serves as a barrier against the disorder of ad-hoc decision-making. In Agile, the actual processes involved in managing changes and priorities are still in human hands and are based on human decisions.
DevOps moves many of those management processes out of human hands and places them under automated control. Is it still possible to manage changes or even maintain control over priorities in an environment where much of the on-the-ground decision-making is automated? Consider what automation actually is in DevOps -- it's the transfer of human management policies, decision-making, and functional processes to an automatically operating computer-based system. You move the responsibilities that can be implemented in an algorithm over to the automated system, leaving the DevOps team free to deal with the items that need actual, hands-on human attention. This immediately suggests what naturally tends to happen with change management in DevOps. It splits into two forks, each of which is important to the overall DevOps effort. One fork consists of change management as implemented in the automated continuous release system, while the other fork consists of human-directed change management of the somewhat more traditional kind. Each of these requires first-rate change management expertise on an ongoing basis. It isn't hard to see why an automated continuous release system that incorporates change management features would require the involvement of human change management experts during its initial design and implementation phases. Since the release system is supposed to incorporate human expertise, it naturally needs expert input at some point during its design. Input from experienced change managers (particularly those with a good understanding of the system being developed) can be extremely important during the early design phases of an automated continuous release system; you are in effect building their knowledge into the structure of the system. But DevOps continuous release is by its very nature likely to be a continually changing process itself, which means that the automation software that directs it is going to be in a continual state of change.
This continual flux will include the expertise that is embodied in the system, which means that its frequent revision and redesign will require input from human change management experts. And not all management duties can be automated. After human managers have been relieved of all the responsibilities that can be automated, they are left with the ones that for one reason or another do not lend themselves well to automation -- in essence, anything that can't easily be turned into an algorithm. This is likely to include at least some (and possibly many) of the kinds of decisions that fall under the heading of change management. These unautomated responsibilities will require someone (or several people) to take the role of change manager. And DevOps change management generally does not take its cue from waterfall in the first place. It is more likely to be a lineal descendant of Agile change management, with its emphasis on managing a flexible stack of priorities during the course of an iteration rather than a static list of requirements that must be included in the project. This kind of priority-balancing requires more human involvement than does waterfall's static list, which means that Agile-style change management is likely to result in a greater degree of unautomated change management than one would find with waterfall. This shouldn't be surprising. As the more repetitive, time-consuming, and generally uninteresting tasks in any system are automated, more time is left for complex and demanding tasks involving analysis and decision-making. This in turn makes it easier to implement methodologies which might not be practical in a less automated environment. In other words, human-based change management will now focus on managing shifting priorities and stakeholder demands, not because it has to, but because it can. So what place does change management have in a change-dominated world?
It transforms itself from being a relatively static discipline imposed on an inherently slow process (waterfall development) to an intrinsic (and dynamic) part of the change-driven environment itself. DevOps change management manages change from within the machinery of the system itself, while at the same time allowing greater latitude for human guidance of the flow of change in response to the shifting requirements imposed by that change-driven environment. To manage change in a change-dominated world, one becomes the change.