11.20.2014 | Posted by Binh Nguyen
At Misfit Wearables, we’ve been using Sumo Logic with great success, and we wanted to share our story. We create smart devices that promote fitness and wellness, and capturing and analyzing data are critical to developing great devices. We previously used a well-known open source log management tool. It was slow and limited, and didn’t really deliver the value we were looking for, yet for some time we had come to merely accept what we had configured it to do. When we made the change to Sumo Logic, we saw some big changes. In short, Sumo Logic is an effective tool for managing logs and analyzing data, and it is widely used by all our engineers at Misfit. But it goes well beyond that: the feedback from our product team has been tremendous, and the biggest difference is performance. Today, with Sumo Logic running in our environment, a job that once took five to ten minutes now takes several seconds. And that was only the beginning of our Sumo Logic story, because we are now implementing the Partition feature throughout our environment and are already seeing results. Partitions have taken us to another level of performance improvement.
The Misfit Environment
Our setup is not unusual, but we do collect a lot of data. Each day, we collect various logs from different collectors such as servers, clients, customers, websites and stores. Using the Data Forwarding option, all of these can be backed up to an AWS S3 bucket. Pictured below is a report from the Sumo Logic dashboard of the total queries run daily.
Fig 1. Daily query by Sumo Logic
Sumo Logic also provides an Anomaly Detection tool that helps us automatically uncover security and other issues in real time. For those new to the service, there are a number of useful applications that make things easier, such as Amazon CloudFront, Data Volume, AWS CloudTrail and Log Analysis Quick Start. Adoption and value have thus come quickly at Misfit, and we are constantly finding ways to save time and effort with this powerful tool.
Fig 2. Sumo Logic Applications Collection
Recently, Sumo Logic introduced a feature called “Partitions,” and we have started putting it to use in a number of productive ways. For example, we can now easily filter a subset of one collector into a partition by creating an index. With this approach, we have seen drastically improved search query performance, because the total number of messages that must be searched is reduced, and all partition indexes can be automatically included in searches.
To better understand the new feature, we have set up the following test:
1. We chose a small-size collector, which only has around 2% of total daily volume.
2. Then, we measured the query time using the default index versus a dedicated index for this collector over the last N days (N = 1, 2, 3, …, 14).
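In practice, the only change needed to run a query against the partition is scoping it to the partition’s index; everything else stays the same. A sketch (the partition name here is hypothetical):

```
_index=small_collector error
| count by _sourceCategory
```

Running the same query without the `_index=` clause searches the default index, which is what we measured against in step 2.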
Fig 3. Query time by using the default index versus an index by partitions. (a) Query time by the default index. (b) Query time by using an index for this collector. (c) The ratio between the query time by using an index and the query time by using the default index.
In Figure 3, for example, searching logs over the “Last 14 days” takes about 154 seconds with the default index, but only 12 seconds with an index created for this collector. The ratio between query times for the non-partitioned and partitioned indexes over various time periods is shown in Figure 3c. This new feature saves us a lot of time, effort and resources, which helps shorten our product development cycles.
Saving time and resources on queries and analysis is critical to our product. We at Misfit Wearables have enjoyed using Sumo Logic, and we look forward to further features from the company that help us improve our product day by day, and even hour by hour.
Binh Nguyen @ Misfit Wearables
11.17.2014 | Posted by Remy Guercio
When choosing an Application Performance Management (APM) provider to complement your log analytics solution, it is important to consider factors such as ease of integration, breadth of features offered and alignment with your growing business needs. APM solutions are metrics focused: they do a great job of identifying latency spikes, drop-offs in user interaction, and other metrics-based issues. However, they can’t always identify the root cause of a problem or detect complex anomalies such as new error types in applications. Together, APM and log analytics solutions provide end-to-end visibility into your application stack. This post aims to provide some insight into the features and integrations that various APM providers offer with log analytics solutions. The APM vendors are listed in random order, not by any specific ranking.
New Relic provides a SaaS based APM solution aimed at businesses with apps of all sizes. The APM solution provided by New Relic supports applications built on Ruby, Java, Node.js, PHP, Python, iOS and Android. The base APM solution integrates with their other solutions providing metrics based analytics from browsers, mobile platforms and synthetic testing monitors (in beta). New Relic was named as a leader in Gartner’s 2014 Magic Quadrant for APM.
- Code level visibility
- Transaction tracing
- SQL query analysis
- Alerting integrations with HipChat, Jira, PagerDuty and Campfire
- Network request tracking
- SLA compliance reporting
- Mobile: Crash reporting
- Mobile: Device analytics
- Synthetic monitoring integration
- 14 day free Pro trial
- Lite – 24hr data retention
- Pro – unlimited data retention, transaction tracing, phone support, and service SLA
- Enterprise – dedicated account manager, greater support
- Lite – 24hr data retention, summary data
- Standard – 1 week data retention, response time metrics, user interaction overview
- Enterprise – 3 month data retention, device metrics, user interaction traces
New Relic provides iPhone and Android applications where customers can view and receive alerts for important metrics, and on the web, New Relic offers a slick dashboard experience for looking into real-time application performance.
AppNeta provides a SaaS-based APM solution that supports applications written in Java, .NET, Python, PHP, Node.js and Ruby. AppNeta was named as a niche player in Gartner’s 2014 Magic Quadrant for APM.
- Synthetic monitoring
- Code level visibility
- Transaction tracing
- SLA Compliance Reporting
TraceView (APM solution):
- Free – 1 application, 1 hour of data retention, 1 user
- Startup – 1 application, 24 hours of data retention, 3 users
- Enterprise – unlimited applications, 45 days retention, unlimited users
AppView (Synthetic monitoring):
- Small – 5 monitors
- Medium – 10 monitors
- Large – 40 monitors
AppNeta’s heat-map charting interface makes it very intuitive and easy to spot patterns, trends, and outliers in your app’s metrics.
AppDynamics focuses on providing a SaaS application performance management solution to both large and small businesses, with support for Java, .NET, PHP, Node.js, iOS and Android. They offer add-ons that provide metrics-based analytics from browsers and mobile platforms. Gartner named AppDynamics as a leader in the 2014 Magic Quadrant for APM.
- Real-time user monitoring
- Network request snapshots
- Alerting integrations with ServiceNow, PagerDuty, and Jira
- Code level visibility
- Mobile: Crash reporting
- Mobile: Device analytics
- Synthetic monitoring
Note: Pricing is done in units. A unit usually corresponds to one process, except for Node.js, where 10 processes equal 1 unit.
- Lite – 24hr data retention, one unit
- Pro – up to 10 units
AppDynamics’ dashboards allow for the creation of visually appealing and robust application component maps that show the detailed breakdown of your application’s performance in real time.
11.11.2014 | Posted by Brandon Mensing
Our app for AWS CloudTrail now offers a dashboard specifically for monitoring console login activity. In the months since the AWS team added this event type, we decided to break out these user activities in order to provide better visibility into what’s going on with your AWS account.
Many of you might think of this update as incremental and not newsworthy, but I’m actually writing here today to tell you otherwise! More and more people are using APIs and CLIs (and third-party tools) to work with AWS outside the console. As console logins become rarer and as more business-critical assets are deployed in AWS, it’s critical to always know who’s logged into your console and when.
For a great and terrifying read about just how badly things can go wrong when someone gains access to your console, look no further than the story of Code Spaces. With one story opening with “was a company” and another with “abruptly closed,” there isn’t exactly a lot of suspense about how things turned out for this company. After attackers managed to gain access to Code Spaces’ AWS console, they built themselves a stronghold of backdoors and began an attempt to extort money from the company. When the attackers’ accounts were removed, they quickly used the additional users they had created to get back in and begin taking out infrastructure and data. With the service down and their customers’ data in disarray, all trust in their product was lost. The company was effectively destroyed in a matter of hours.
The new dashboard in our updated CloudTrail app allows you to quickly see who’s attempting to log in to your console, from where, and whether or not they’re using multi-factor authentication (which we highly recommend).
If you haven’t installed the app previously, be sure to follow the simple steps in our documentation to set up the appropriate permissions in AWS. For those of you who have already installed the app, you can install it again to get a new copy with the additional dashboard included. From there, we encourage you to customize the queries for your specific situation, and even consider setting up a scheduled search to alert you to a problematic situation.
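As a sketch of the kind of customization we mean (the source category is whatever you assigned to your CloudTrail collector, and the parsing is abbreviated), a scheduled search along these lines could flag console logins made without MFA:

```
_sourceCategory=aws/cloudtrail "ConsoleLogin"
| json "userIdentity.userName", "sourceIPAddress", "additionalEventData.MFAUsed" as user, src_ip, mfa
| where mfa != "Yes"
| count by user, src_ip
```

The field names come from the CloudTrail console sign-in event format; adjust the scope and threshold to match your environment.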
Keeping an eye out for suspicious activity on your AWS console can provide invaluable insight. As attackers get more sophisticated, it’s harder and harder to keep your business secure and operational. With the help of Sumo Logic and logs from AWS CloudTrail, you can stay ahead of the game by catching the most obvious (and most insidious) types of breaches. With functionality like this, perhaps Code Spaces would still be in business.
11.05.2014 | Posted by Eillen Voellinger
“How do I build trust with a cloud-based service?” This is the question Sumo Logic is asked most often, and we’ve got you covered. We built the service to be not just an effortless choice for enterprise customers but the obvious one, and building trust through a secure architecture was one of the first things we took care of.
Sumo Logic is SOC 2 Type 2 and HIPAA compliant. Sumo Logic also complies with the U.S.–E.U. Safe Harbor framework and will soon be PCI DSS 3.0 compliant. No other cloud-based log analytics service can say this. For your company, this means you can safely get your logs into Sumo Logic – a service you can trust and a service that will protect your data just as you would.
These are no small accomplishments, and it takes an A-team to get it done. It all came together when we hired Joan Pepin, a phreak and a hacker by her own admission. Joan is our VP of Security and CISO. She was employee number 11 at Sumo Logic, and her expertise has helped shape our secure service.
Our secure architecture is also a perfect match for our “Customer First” policy and agile development culture. We make sure that we are quickly able to meet customer needs and to fix issues in real-time without compromising our secure software development processes. From network security to secure software development practices, we ensured that our developers are writing secure code in a peer-reviewed and process-driven fashion.
Sumo Logic was built from the ground up to be secure, reliable, fast, and compliant. Joan understands what it means to defend a system, keep tabs on it, and watch it function live. Joan worked for the Department of Defense. She can’t actually talk about what she did when she was there, but we can confirm that she was there because the Department of Defense, as she puts it, “thought my real world experience would balance off the Ph.Ds.”
Joan learned the craft from Dr. Who, a member of the Legion of Doom (http://en.wikipedia.org/wiki/Legion_of_Doom_(hacking)). If hacker groups were rock and roll, the Legion of Doom would be Muddy Waters, Chuck Berry, Buddy Holly. They created the idea of a hacker group. They hacked into a number of state 911 systems and stole the documentation on them, distributing it through BBSes across the United States (http://phrack.org/issues/31/5.html#article). They were the original famous hacking group. Joan is no Jane-come-lately; she’s got the best resume you can have in this business.
We’re frequently asked about all the security procedures we adopt at Sumo Logic. Security is baked into every component of our service. Other than the various attestations I mentioned earlier, we also encrypt data at rest and in transit. Other security processes that are core to the Sumo Logic service include:
+ Centrally managed, FIPS-140 two-factor authentication devices for operations personnel
+ Biometric access controls
+ Whole-disk encryption
+ Thread-level access controls
+ Whitelisting of individual processes, users, ports and addresses
+ Strong AES-256-CBC encryption
+ Regular penetration tests and vulnerability scans
+ A strong Secure Development Life-Cycle (SDLC)
+ Threat intelligence and managed vulnerability feeds to stay current with the constantly evolving threatscape and security trends
If you’re still curious about the extent to which our teams have gone to keep your data safe, check out our white paper on the topic:
We use our own service to capture our logs, which has helped us achieve these security and compliance accomplishments. We’ve done the legwork so your data is secure and so you can use Sumo Logic to meet your unique security and compliance needs. We’ve been there and done that with the Sumo Logic service, and now it’s your turn.
10.28.2014 | Posted by Karthik Anantha Padmanabhan
10.22.2014 | Posted by Ariel Smoliar, Senior Product Manager
The new Sumo Logic Transaction capability allows users to analyze related sequences of machine data. The comprehensive views uncover user behavior as well as operational and security insights that can help organizations optimize business strategy, plans and processes.
The new capability allows you to monitor transactions by a specific transaction ID (session ID, IP, user name, email, etc.) while handling data from distributed systems, where a request is passed through several different systems, each with its own transaction ID.
Over the past two months, we have worked with beta customers on a variety of use cases, including:
- Tracking transactions in a payment processing platform
- Following typical user sessions, detecting anomalous checkout transactions and catching checkout drop-off in e-commerce websites
- Tracking renewals, upgrades and new signup transactions
- Monitoring phone registration failures over a specific period
- Tracking on-boarding of new users in SaaS products
The last use case is reflective of what SaaS companies care most about: truly understanding the behavior of users on their website that drive long-term engagement. We’ve used our new transaction analytics capabilities to better understand how users find our site, the process by which they get to our Sumo Logic Free page, and how quickly they sign up. Our customer success team uses Transaction Analytics to monitor how long it takes users to create a dashboard, run a search, and perform other common actions. This enables them to provide very specific feedback to the product team for future improvements.
This screenshot depicts a query with IP as the transaction ID and the various states mapped from the logs
Sankey diagram visualizes the flow of the various components/states of a transaction on an e-commerce website
Many of our customers already use tools such as Google Analytics to monitor visitor flow on their websites and understand customer behavior. We are not launching this new capability to replace Google Analytics (which is not fully embraced in some countries, such as Germany). What we bring on top of monitoring visitor flow is the ability to identify divergence in state sequences and to better understand the transitions between states, for example in terms of latency. You have probably seen announcements from other companies about plugins for log management platforms that detect anomalies and monitor user behavior and sessions. The team’s product philosophy is that we would like to provide our users a well-rounded capability that enables them to make smart choices without requiring external tools, all from their machine data within the Sumo Logic product.
It was a fascinating journey working on the transaction capability with our analytics team. It’s a natural evolution of our analytics strategy which now includes: 1) real-time aggregation and correlation with our Dashboards; 2) machine learning to automatically uncover anomalies and patterns; and 3) now transaction analytics to rapidly uncover relationships across distributed events.
We are all excited to launch Transaction Analytics. Please share with us your feedback on the new capability and let us know if we can help with your use cases. The transaction searches and the new visualization are definitely our favorite content.
10.20.2014 | Posted by Amanda Saso, Principal Tech Writer
Ever had that sinking feeling when you start a new job and wonder just why you made the jump? I had a gut check when, shortly after joining Sumo Logic in June of 2012, I realized that we had less than 50 daily hits to our Knowledge Base on our support site. Coming from a position where I was used to over 7,000 customers reading my content each day, I nearly panicked. After calming down, I realized that what I was actually looking at was an amazing opportunity.
Fast forward to 2014. I’ve already blogged about the work I’ve done with our team to bring new methods to deliver up-to-date content. (If you missed it, you can read the blog here.) Even with these improvements I couldn’t produce metrics that proved just how many customers and prospects we have clicking through our Help system. Since I work at a data analytics company, it was kind of embarrassing to admit that I had no clue how many visitors were putting their eyes on our Help content. I mean, this is some basic stuff!
Considering how much time I’ve spent working with our product, I knew that I could get all the information I needed using Sumo Logic…if I could get my hands on some log data. I had no idea how to get logging enabled, not to mention how logs should be uploaded to our Service. Frankly, my English degree is not conducive to solving engineering challenges (although I could write a pretty awesome poem about my frustrations). I’m at the mercy of my Sumo Logic co-workers to drive any processes involving how Help is delivered and how logs are sent to Sumo Logic. All I could do was pitch my ideas and cross my fingers.
I am very lucky to work with a great group of people who are happy to help me out when they can. This is especially true of Stefan Zier, our Chief Architect, who once again came to my aid. He decommissioned old Help pages (my apologies to anyone who found their old bookmarks rudely displaying 404’s) and then routed my Help from the S3 bucket through our product, meaning that Help activity can be logged. I now refer to him as Stefan, Patron Saint of Technical Writers. Another trusty co-worker we call Panda helped me actually enable the logging.
Once the logging began, we could finally start creating some Monitors to build out a Help Metrics Dashboard. In addition to getting the number of hits and the number of distinct users, we really wanted to know which pages were generating the most hits (no surprise that search-related topics bubbled right to the top). We’re still working on other metrics, but let me share just a few data points with you.
Take a look at the number of hits our Help site has handled since October 1st:
We now know that Wednesday is when you look at Help topics the most:
And here’s where our customers are using Help, per our geo lookup operator Monitor:
It’s very exciting to see how much Sumo Logic has grown, and how many people now look at content written by our team, from every corner of the world. Personally, it’s gratifying to feel a sense of ownership over a dataset in Sumo Logic, thanks to my friends.
What’s next from our brave duo of tech writers? Beyond adding additional logging, we’re working to find a way to get feedback on Help topics directly from users. If you have any ideas or feedback, in the short term, please shoot us an email at email@example.com. We would love to hear from you!
10.16.2014 | Posted by Derek Hall
10.16.2014 | Posted by Kumar Saurabh, Co-Founder & VP of Engineering
In 1965, Dr. Hubert Dreyfus, a professor of philosophy at MIT, later at Berkeley, was hired by RAND Corporation to explore the issue of artificial intelligence. He wrote a 90-page paper called “Alchemy and Artificial Intelligence” (later expanded into the book What Computers Can’t Do) questioning the computer’s ability to serve as a model for the human brain. He also asserted that no computer program could defeat even a 10-year-old child at chess.
Two years later, in 1967, several MIT students and professors challenged Dreyfus to play a game of chess against MacHack (a chess program that ran on a PDP-6 computer with only 16K of memory). Dreyfus accepted. At one point, Dreyfus found a move that could have captured the enemy queen. The only way the computer could get out of this was to keep Dreyfus in check with its own queen until it could fork Dreyfus’s queen and king, and then exchange queens. And that’s what the computer did. The computer checkmated Dreyfus in the middle of the board.
I’ve brought up this “man vs. machine” story because I see another domain where a similar change is underway: the field of Machine Data.
Businesses run on IT, and IT infrastructure is getting bigger by the day, yet IT operations still depend heavily on analytics tools with very basic monitoring logic. As systems become more complex (and more agile), simple monitoring just doesn’t cut it. We cannot support or sustain the necessary speed and agility unless the tools become much more intelligent.
We believed this when we started Sumo Logic, and with the lessons learned from running a large-scale system ourselves, we continue to invest in making operational tooling more intelligent. We knew the market needed a system that complemented human expertise. Humans don’t scale that well – our memory is imperfect – so the ideal tools should pick up on signals that humans cannot, and at a scale that matches today’s business needs and volume of IT data exhaust.
Two years ago we launched our service with a pattern recognition technology called LogReduce, and about five months ago we launched Structure-Based Anomaly Detection. The last three months of the journey have been a lot like teaching a chess program new tricks – the game remains the same, but the system keeps getting better at it and more versatile.
We are now extending our Structure-Based Anomaly Detection capabilities with Metric-Based Anomaly Detection. A metric could be just that – a time series of numerical values. You can take any log and filter, aggregate and pre-process it however you want – and if you can turn that into a number with a timestamp, we can baseline it and automatically alert you when the current value of the metric goes outside an expected range based on its history. We developed this new engine in collaboration with the Microsoft Azure Machine Learning team, and they have some really compelling models for detecting anomalies in a time series of metric data – you can read more about that here.
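As a toy illustration of the baselining idea (not the actual engine, whose models are far more sophisticated): treat the metric as a series of timestamped values, estimate a baseline mean and standard deviation over a trailing window, and flag points that land outside the expected band.

```scala
// Toy metric anomaly detector: flag points more than k standard
// deviations away from the mean of a trailing baseline window.
case class Point(timestamp: Long, value: Double)

def anomalies(series: Seq[Point], window: Int = 30, k: Double = 3.0): Seq[Point] =
  series.indices.flatMap { i =>
    // Baseline: up to `window` values immediately preceding point i
    val baseline = series.slice(math.max(0, i - window), i).map(_.value)
    if (baseline.size < 2) None // not enough history to judge
    else {
      val mean = baseline.sum / baseline.size
      val std  = math.sqrt(baseline.map(v => (v - mean) * (v - mean)).sum / baseline.size)
      val p    = series(i)
      // Guard against a zero std (perfectly flat baseline)
      if (math.abs(p.value - mean) > k * math.max(std, 1e-9)) Some(p) else None
    }
  }
```

A production engine must also handle seasonality, trends and cold starts, which is where the Azure Machine Learning models come in.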
The hard part about anomaly detection is not detecting anomalies – it is detecting anomalies that are actionable. Making an anomaly actionable begins with making it understandable. Once analysts or operators can grok an anomaly, they are much more amenable to alerting on it, building a playbook around it, or even hooking automated remediation to the alert – the Holy Grail.
And not all anomaly detection engines are equal. As with chess programs, there are ones that can beat a five-year-old and others that can beat even the grandmasters. We are well on our way to building a comprehensive Anomaly Detection engine that becomes a critical tool in every operations team’s arsenal. The key question to ask is: does the engine tell you something insightful and actionable that you could not have found with standard monitoring tools?
Below is an example of an actual Sumo production use case where some of our nodes were spending a lot of time in garbage collection impacting refresh rates for our dashboards for some of the customers.
If this looks interesting, our Metric Based Anomaly Detection service based on Azure Machine Learning is being offered to select customers in a limited beta release and will be coming soon to machines…err..a browser near you (we are a cloud based service after all).
P.S. If you like stories, here is another one for you. Thirty years after MacHack beat Dreyfus, in 1997, Kasparov (arguably one of the best human chess players ever) played the Caro-Kann Defence against Deep Blue. He then allowed Deep Blue to commit a knight sacrifice, which wrecked his defenses and forced him to resign in fewer than twenty moves. Enough said.
10.09.2014 | Posted by David Andrzejewski, Data Sciences Engineer
Abstraction is a fundamental concept in software development. Identifying and building abstractions well-suited to the problem at hand can make the difference between clear, maintainable code and a teetering, Jenga-like monolith duct-taped together by a grotesque ballet of tight coupling and special case handling. While a well-designed abstraction can shield us from detail, it can also suffer from leakage, failing to behave as expected or specified and causing problems for code built on top of it. Ensuring the reliability of our abstractions is therefore of paramount concern.
In previous blog posts, we’ve separately discussed the benefits of using type classes in Scala to model abstractions, and using randomized property testing in Scala to improve tests. In this post we discuss how to combine these ideas in order to build more reliable abstractions for use in your code. If you find these ideas interesting please be sure to check out the references at the end of this post.
Type classes for fun and profit
Type classes allow us to easily build additional behaviors around data types in a type-safe way. One simple and useful example is associated with the monoid abstraction, which represents a set of items which can be combined with one another (such as the integers, which can be combined by addition). Loosely, a monoid consists of
- a collection of objects (e.g., integers)
- a binary operation for combining these objects to yield a new object of the same type (e.g., addition)
- an identity object whose combination leaves an object unchanged (e.g., the number 0)
```scala
// The scalaz Monoid, simplified (Semigroup and Monoid collapsed into one trait):
trait Monoid[F] {
  def zero: F                      // identity element
  def append(f1: F, f2: => F): F  // associative combination
}
```
The utility of this machinery is that it gives us a generalized way to use types that support some notion of “addition” or “combination”, for example:
```scala
import scalaz._
import Scalaz._

// A generic "sum" over any type with a Monoid instance:
def sumAll[F](xs: List[F])(implicit m: Monoid[F]): F =
  xs.foldLeft(m.zero)((a, b) => m.append(a, b))

sumAll(List(1, 2, 3, 4))     // 10 (Int under addition)
sumAll(List("a", "b", "c"))  // "abc" (String under concatenation)
```
As described in our earlier machine learning example, this can be more convenient than requiring that the data types themselves subtype or inherit from some kind of “Addable” interface.
In Scala, the Monoid[F] trait definition (combined with the compiler type-checking) buys us some important sanity checks with respect to behavior. For example, the function signature append(x: F, y: F): F guarantees that we’re never going to get a non-F result.
However, there are additional properties that an implementation of Monoid[F] must satisfy in order to truly conform to the conceptual definition of a monoid, but which are not easily encoded into the type system. For example, the monoid binary operation must satisfy left and right identity with respect to the “zero” element. For integers under addition the zero element is 0, and we do indeed have x + 0 = 0 + x = x for any integer x.
We can codify this requirement in something called a type class law. When defining a particular type class, we can add some formal properties or invariants which we expect implementations to obey. The codification of these constraints can then be kept alongside the type class definition. Again returning to scalaz Monoid, we have
```scala
// From scalaz: the laws a lawful Monoid[F] must satisfy, expressed with
// the help of an Equal[F] instance.
trait MonoidLaw extends SemigroupLaw {
  def leftIdentity(a: F)(implicit F: Equal[F]): Boolean =
    F.equal(a, append(zero, a))
  def rightIdentity(a: F)(implicit F: Equal[F]): Boolean =
    F.equal(a, append(a, zero))
}
```
An interesting observation is that this implementation depends upon another type class instance Equal[F], which simply supplies an equal() function for determining whether two instances of F are indeed equal. Of course, Equal[F] comes supplied with its own type class laws for properties any well-defined notion of equality must satisfy, such as symmetry (x==y iff y==x), reflexivity (x==x), and transitivity (if a==b and b==c then a==c).
A machine learning example
We now consider an example machine learning application where we are evaluating some binary classifier (like a decision tree) over test data. We run our evaluation over different sets of data, and for each set we produce a very simple output indicating how many predictions were made, and of those, how many were correct:
We can implement Monoid[Evaluation] in order to combine our experimental results across multiple datasets:
```scala
// Evaluation results for one dataset (field names assumed for illustration)
case class Evaluation(numTotal: Long, numCorrect: Long)

implicit val evaluationMonoid: Monoid[Evaluation] = Monoid.instance(
  (a, b) => Evaluation(a.numTotal + b.numTotal, a.numCorrect + b.numCorrect),
  Evaluation(0, 0))
```
We’d like to ensure that our implementation satisfies the relevant type class laws. We could write a handful of unit tests against one or more hand-coded examples, for example using ScalaTest:
```scala
import org.scalatest.FlatSpec

class EvaluationMonoidSpec extends FlatSpec {
  val M = Monoid[Evaluation]
  val e = Evaluation(100, 75)

  "Monoid[Evaluation]" should "satisfy left identity for this example" in {
    assert(M.append(M.zero, e) == e)
  }

  it should "satisfy right identity for this example" in {
    assert(M.append(e, M.zero) == e)
  }
}
```
However, this merely gives us an existence result: there exists some value for which the desired property holds. We’d like something a little stronger. This is where we can use ScalaCheck to do property testing, randomly generating as many arbitrary instances of Evaluation as we’d like. If the law holds for all generated instances, we can have a higher degree of confidence in the correctness of our implementation. To accomplish this, we simply need to supply a means of generating random Evaluation instances via ScalaCheck Gen:
```scala
import org.scalacheck.{Arbitrary, Gen, Properties}
import org.scalacheck.Prop.forAll

object EvaluationLaws extends Properties("Monoid[Evaluation]") {
  // Generate arbitrary Evaluation instances (with numCorrect <= numTotal)
  val genEvaluation: Gen[Evaluation] = for {
    total   <- Gen.choose(0L, 10000L)
    correct <- Gen.choose(0L, total)
  } yield Evaluation(total, correct)

  implicit val arbEvaluation: Arbitrary[Evaluation] = Arbitrary(genEvaluation)
  implicit val equalEvaluation: Equal[Evaluation] = Equal.equalA

  property("left identity")  = forAll { (e: Evaluation) =>
    Monoid[Evaluation].monoidLaw.leftIdentity(e)
  }
  property("right identity") = forAll { (e: Evaluation) =>
    Monoid[Evaluation].monoidLaw.rightIdentity(e)
  }
}
```
Now that’s an abstraction we can believe in!
This level of confidence becomes important when we begin to compose type class instances, mixing and matching this machinery to achieve our desired effects. Returning to our Evaluation example, we may want to evaluate different models over these datasets, storing the results for each dataset in a Map[String,Evaluation] where the keys refer to which model was used to obtain the results. In scalaz, we get the Monoid[Map[String,Evaluation]] instance “for free”, given an instance of Monoid[Evaluation]:
```scala
import scalaz._
import Scalaz._

// Results keyed by model name (names illustrative)
val resultsA = Map(
  "decision-tree" -> Evaluation(100, 80),
  "naive-bayes"   -> Evaluation(100, 70))

val resultsB = Map(
  "decision-tree" -> Evaluation(50, 45),
  "svm"           -> Evaluation(50, 40))

// Given Monoid[Evaluation], scalaz derives Monoid[Map[String, Evaluation]]
// "for free": values under matching keys are combined, others pass through.
val combined = resultsA |+| resultsB
// Map(decision-tree -> Evaluation(150, 125),
//     naive-bayes   -> Evaluation(100, 70),
//     svm           -> Evaluation(50, 40))
```
Conclusion and references
If you are using the scalaz library, many of the provided type classes come “batteries included” with type class laws. Even if you are not, these ideas can help you to build more reliable type class instances which can be composed and extended with confidence. See below for some additional references and readings on this subject:
- Law Enforcement using Discipline
- Haskell’s Type Classes: We Can Do Better
 Omitting associativity and explicit discussion of closure.
 For brevity, these code snippets do not show library (scalaz, ScalaTest, ScalaCheck) imports.
 Excluding the unfortunate possibilities of null return values or thrown Exceptions.