01.14.2014 | Posted by Joan Pepin, Director of Security
Today we announced that Sumo Logic has successfully completed the Service Organization Controls (SOC) 2 Type 2 examination of the Trust Service Principles: Security, Availability and Confidentiality. Frankly, this is a pretty big deal and something we have been working towards for a while (we achieved our SOC 2 Type 1 in August of 2012), so I’m here to explain a little bit about what that means for you.
In case you’re not familiar with the SOC 2 Type 2, it may help you to know that the SOC family of reports was implemented by the American Institute of Certified Public Accountants (the AICPA) as a replacement for the venerable old SAS-70 report back in 2011. (So if you’re still asking your vendors for their SAS-70, you’re a bit behind the times. I get this a lot; it’s usually followed by questions about our backup tapes on security assessment paperwork that hasn’t been updated since it was noisily written in Lotus Notes(™) on this bad-boy…)
The main purpose of the SOC 2 Type 2 report is to show our customers that an independent third party has evaluated our controls and our adherence to those controls over a period of time. In the words of the AICPA, a SOC 2 report is ideal for:
“A Software-as-a-Service (SaaS) or Cloud Service Organization that offers virtualized computing environments or services for user entities and wishes to assure its customers that the service organization maintains the confidentiality of its customers’ information in a secure manner and that the information will be available when it is needed. A SOC 2 report addressing security, availability and confidentiality provides user entities with a description of the service organization’s system and the controls that help achieve those objectives. A type 2 report also helps user entities perform their evaluation of the effectiveness of controls that may be required by their governance process.”
The major areas of the SOC report are called “Trust Service Principles” because Trust is what this is all about. Once again in the words of the AICPA:
“Trust Services helps differentiate entities from their competitors by demonstrating to stakeholders that the entities are attuned to the risks posed by their environment and equipped with the controls that address those risks. Therefore, the potential beneficiaries of Trust Services assurance reports are consumers, business partners, creditors, bankers and other creditors, regulators, outsourcers and those using outsourced services, and any other stakeholders who in some way rely on electronic commerce (e-commerce) and IT systems.”
You know how you handle your data, but before you hand it over to someone else, you should know a good deal about how they are going to handle it. And because trust is based on openness, your data services vendors should be extremely open about that.
Because trust is an important factor in any business relationship, our report lists 263 controls around Security, Availability and Confidentiality put into effect at Sumo Logic and the tests that our examiners (the wonderful people at Brightline CPAs & Associates) performed. This is an extremely thorough overview of what we do to ensure that we deserve your trust, and if you are considering sending us your data, you should ask us for a copy and look it over. And if you are considering any of our competitors, you should also ask to see their third-party assessment. (Hint: They don’t have one.)
05.09.2013 | Posted by Joan Pepin, Director of Security
Pharmacy networks, electronic medical records, third-party billing, referrals: the medical establishment in this country runs on shared data. To ensure the safety and proper use of all of this highly sensitive and widely-shared information, the US Congress passed the Health Insurance Portability and Accountability Act of 1996 (HIPAA). This law has changed the way healthcare-related businesses operate inside the United States, and has had wide-reaching and expensive effects on every aspect of the healthcare industry.
There is no central certification authority for HIPAA, and the onus is on individual medical providers to ensure they are compliant with all of the appropriate “rules” within the act. HIPAA, while affording important protection, is a complex and cumbersome regulation with potentially severe civil and criminal penalties for violation. As such, compliance with the act is of utmost importance to “covered entities” (largely, billing providers, employer-sponsored health plans, health insurers, and medical service providers, including doctors’ offices and pharmacies) who must ensure that any service provider they do business with is compliant if there is any chance that “Protected Health Information” is involved.
In order to provide our cutting-edge log management and analytics platform to these businesses, we need to assure them that Sumo Logic can be trusted to handle this highly sensitive information in a secure and compliant manner. To accomplish this, Sumo Logic has undergone an extensive examination by a well-respected Certified Public Accounting firm, which determined that Sumo Logic’s information security program “incorporates the essential elements of the HIPAA final security rule, including but not limited to administrative, physical and technical safeguards.”
This report (available to Sumo Logic customers and prospects under NDA) is easily digestible by the compliance office at any medical company and will demonstrate our best-in-class dedication to the security of our customers’ data. Our commitment to data security and privacy makes Sumo Logic the only cloud-based log management solution able to demonstrate the ability to operate in a HIPAA regulated environment (as well as the only cloud-based log management service to carry a SOC 2 attestation, the replacement for the venerable SAS70).
And our compliance story is just beginning! We have several other very exciting initiatives on the way over the next 12 months which will continue to prove that our dedication to enterprise-grade information security practices sets us clearly apart from the rest.
12.03.2012 | Posted by Joan Pepin, Director of Security
A couple of weeks ago I gave a cool little web presentation about cloud security best practices and design principles. (I say cool because I like doing those decently more than I like sitting at my desk, and I say little because I went for 33 minutes, and I know I could have gone for 90… I will be giving this talk again, BTW, on January 9th for the Amazon Web Services Ecosystem.) I got a pretty good question from one of the viewers. They wanted to know “what mistakes do people make when they utilize cloud based infrastructure providers?” I thought that was an excellent question, and since not all of you were there to hear my answer, I’m here to share it, and expound on it a little bit.
In my opinion, the biggest mistake you can make in adopting Infrastructure as a Service (IaaS) providers is to just move your data-center into the cloud wholesale in basically the same shape it is already in.
Now, certainly I understand this temptation! You have probably spent a lot of time creating scripts and setting up access controls and logging mechanisms, and everything that comes with building your deployment in a traditional (what I call ‘data-center-centric’) way. And so it may seem that the best, fastest, easiest and cheapest thing to do is to simply pick it up and move it, as it were, to your new “hosting provider”, but this approach may well leave you missing out on some of the best reasons to run in the cloud.
IaaS providers, such as Amazon Web Services, offer a multitude of services and features that can vastly improve your operational efficiency, scalability, and security, but they must be properly leveraged. Cloud computing, while similar in some respects to hosting, is an entirely different paradigm, and in order to take full advantage of its benefits, some time and care needs to be taken in the design phase of such a project.
I like to compare the differences in cloud versus data-center configurations in terms of two types of gambling/entertainment most of you will be familiar with: playing Three Card Monte on the street (your data-center) and going to gamble in a major casino (the cloud).
In a Three Card Monte scenario, the ‘house’ makes its money by keeping a tight level of control over the game. They know exactly which card the token is under at all times, they can palm or move the token at will, and they will have one or more shills in the audience to help them control the crowd’s reactions. This can be a very profitable endeavor for the ‘house’, but it is not scalable to large crowds, multiple dealers, or to environments where there is a high degree of scrutiny.
In contrast, a casino is designed to achieve the same ends (to take your money and provide some entertainment in the process) but does so in a very different way. The casino relies on statistics in order to win over any given day. The casino can’t control (due to regulators) which slot machines will pay out exactly how much exactly when, nor can they control which blackjack dealers will have good or bad nights, and they cannot ‘fix’ the roulette wheel, and yet the house always wins. This model is scalable to large crowds, multiple dealers and games, and even high degrees of scrutiny. It is through exercising control at a higher level and giving up control at the lower level that they are able to achieve this scalability and profitability.
It works the same in the cloud. Rather than having precise control over your hardware and network connections, you exercise control at the design level by creating feedback loops, auto-scaling triggers and by catching and reacting to exceptions. This allows you, much the same as the casino, to give up control over many of the details, and still ensure you always win at the end of the day.
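To make "control at the design level" a little more concrete, here is a minimal sketch of a feedback loop with an auto-scaling trigger. It is an illustration only, not how any particular provider implements auto-scaling: the function names, thresholds and instance limits are all hypothetical.

```python
def desired_instances(current, cpu_percent, low=30.0, high=70.0,
                      min_instances=2, max_instances=20):
    """Return the instance count a simple threshold trigger would ask for.

    All thresholds and limits are illustrative assumptions.
    """
    if cpu_percent > high:
        target = current + 1      # scale out under load
    elif cpu_percent < low:
        target = current - 1      # scale in when idle
    else:
        target = current          # inside the comfortable band: do nothing
    # Clamp so a bad metric can never scale us to zero or without bound.
    return max(min_instances, min(max_instances, target))


def run_feedback_loop(readings, start=2):
    """Feed a sequence of average-CPU readings through the trigger."""
    count = start
    history = []
    for cpu in readings:
        try:
            count = desired_instances(count, float(cpu))
        except (TypeError, ValueError):
            # Catch and react to the exception: keep the last known-good
            # count rather than letting one bad reading stop the loop.
            pass
        history.append(count)
    return history
```

Notice that nothing here knows or cares which physical machine is misbehaving. Like the casino, the loop only watches the aggregate and nudges it back into range.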
So just as it would be impractical to set up a Three Card Monte table in a modern casino, simply hauling your existing design into the cloud is not the best approach. Take the time to re-design your system to utilize all of the great advantages that IaaS providers such as Amazon provide.
Tune in later for Part II.
09.10.2012 | Posted by Joan Pepin, Director of Security
In Part 1 of this post, I discussed standards and regulations in general and some basic compliance concepts. In Part 2, I explore some current standards and regulations and their relevance.
What Happened to SAS70?
SAS 70 is no more. You can take a look here to read all about how and why that happened. (I assure you it is riveting.) Suffice it to say the original standard had become rather stretched and bogged-down, so it has been retired and replaced by a suite of standards under the banner of SSAE 16 SOC reports.
09.05.2012 | Posted by Joan Pepin, Director of Security
Lately I’ve been on a lot of calls and email threads with customers and salespeople concerning compliance with various standards and regulations. I have also been working very closely with our auditors over at Brightline to attain a couple of attestations and a certification for Sumo Logic. I have come to realize that there is a lot of confusion out there regarding all of this. Further adding to the general confusion is the relative newness of cloud-based service organizations like Sumo Logic that leverage IaaS providers. I’d like to clear that all up (in a sort of final way that I can link to when it comes up the next time).
06.15.2012 | Posted by Joan Pepin, Director of Security
As I mentioned in one of my previous posts, here at Sumo Logic we believe cloud-based services provide excellent value due to their ease of setup, convenience and scalability, and we leverage them extensively to provide internal services that would be far more time, labor and cash intensive to manage ourselves. Today I’m going to talk about some of the services we use for collaboration, operations and I/T, why we use them, and how they simplify our lives.
Campfire is a huge part of our productivity and culture at Sumo Logic. While I would lump this and Skype together under something like “Managed Corporate Messaging”, they fill two very different niches in our environment.
Campfire from 37signals is a fantastic tool for group conversations. Using the Campfire service, we have set up multiple chat rooms for various types of issues, including Production Issues, Development Issues, Sales/Customer-Support Issues, and of course, a free-for-all chat-room where we try to make one another spontaneously erupt into chaotic LOLs.
These group-chats provide a critical space where we can work together to troubleshoot and solve problems cooperatively. Campfire makes it very easy to upload pictures and share large amounts of information in real-time with co-workers who can be anywhere. The conversations are all archived for later reference, which allows us to use the Production Incidents room as a 24×7 conference call and canonical forum of record for anything happening to production systems. Our Production on-call devs are expected to echo their actions into the Production channel and keep up with events there as they transpire.
Campfire also has a cool feature which allows you to start a voice conference with participants if needed, which is a great option in certain situations. These calls can also be archived for later reference. One downside to the text and audio archives is that they are not easily searchable, so it helps to know approximately when something happened, and we have found it necessary to consult other records to determine where to look.
Skype is, of course, the very popular IM and VOIP service that was purchased by Microsoft a while back. We use Skype extensively for 1:1 chatting and easy and secure file-transfers throughout the company. We also make extensive use of the wide array of available emoticons. (Stefan Zier is a particularly prolific and artistic user of these.)
We also use Skype video chat for interviews and to collaborate with team members abroad. We have a conference room with a TV and Skype camera just for this application.
Running a large-scale cloud-based service requires a lot of operational awareness. One of the ways we achieve this is through Cloudkick. Cloudkick was recently acquired by Rackspace and is evolving into Rackspace Cloud Monitoring. We are still on the legacy Cloudkick service, which we have come to use heavily.
We automatically install Cloudkick agents on all of our production instances and use them to collect a wide array of status codes from the O/S and through JMX as well as by running our own custom scripts which we use to check for the existence of critical processes and to detect if things like HPROF files exist.
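A custom check of the kind described above might look roughly like this sketch. The function names and the `status ...` output line are assumptions for illustration; the exact output format a monitoring agent expects should be taken from that agent's own documentation. The PID probe is POSIX-only.

```python
import glob
import os


def pid_alive(pid):
    """True if a process with this PID exists (POSIX; signal 0 only probes)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True   # the process exists but belongs to another user
    return True


def find_hprof_files(directory):
    """Return heap-dump files (*.hprof) found directly in `directory`."""
    return sorted(glob.glob(os.path.join(directory, "*.hprof")))


def check_status(process_alive, hprof_files):
    """Turn raw observations into a one-line status for a monitoring agent."""
    if not process_alive:
        return "status err critical process is not running"
    if hprof_files:
        return "status warn %d heap dump(s) found" % len(hprof_files)
    return "status ok process running, no heap dumps"
```

Keeping the observation (`pid_alive`, `find_hprof_files`) separate from the verdict (`check_status`) makes the rule itself easy to test without a live process.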
The Cloudkick website has a “show only failures” mode which we call the “What’s Wrong? Page”. This is a very helpful tool that allows our EverybodyOps team to quickly assess issues with our production environment.
Of course, we also need to be proactively alerted to failures and crossed thresholds that could indicate trouble, and for this we rely on PagerDuty. (Affectionately known as P. Diddy to many of us, nickname coined by Christian). PagerDuty is another great tool which allows us to maximize the benefits of our EverybodyOps culture.
Within PagerDuty we have a number of on-call rotations: one for our Production Primary role and one for the Secondary role, as well as another role for monitoring test failures and a lesser-known role for those of us who monitor the temperature in the one small server room we do have. P. Diddy allows us to easily cover for each other using exceptions or by simply switching the Primary and Secondary roles on the fly if the Primary needs to go AFK for a while.
P. Diddy allows each user to set their own personal escalation policy which can include texting, calling, and emailing with a configurable number of re-tries and timeouts. Another nice touch is that the rotation calendars can be imported into our personal calendars to remind us of when we are up next. This all makes the on-call rotation run pretty flawlessly from an administrative perspective with no gnarly configuration and management on our end.
I must admit, I do have a personal habit of “Joaning” my secondary when I am on call… To properly “Joan” your secondary you accidentally escalate an alert to them that you meant to resolve, (I blame the comma after “Resolv”!)
Like many companies of all sizes we rely on Google for our email service. While some Sumos (like myself and Stefan) use mail clients to read our email, most Sumos are happy with the standard web interface from Google. We also heavily use internal groups for team communications.
We also make good use of Google Docs for document authoring and sharing. (This blog post was written and communally edited using Google Docs. In fact, due to the impressive real-time collaboration, Stefan Zier is watching me add this bit in order to resolve his comment right now!) We use Google Calendar for our scheduling needs (and calendar-stalking exercises!).
We also use Google Analytics to obsess over you.
Also, as Sumo Logic’s Director of Security (which makes me partially responsible for managing the users and groups in Google Apps), I appreciate the richness of their security settings and especially the two-factor authentication and mobile device policy management.
These are just some of our SaaS providers. In an upcoming post I’ll talk more about some of the services that help us support and bill our customers and test and develop our product.
We have found all of these providers deliver valuable and even crucial services that it would be far more expensive and time consuming for us to manage ourselves. We hope you may find some of them helpful too!
05.10.2012 | Posted by Joan Pepin, Director of Security
As the Director of Security for a big data company operating in the public cloud, “objection handling” is becoming an increasingly important part of my job. So far I’ve been largely engaging in this proactively: educating our sales and marketing forces, writing blogs and putting together a white paper discussing our security philosophy and some of our design principles.
I have also been talking one-on-one with some of our customers who have security concerns and discussing our plans to obtain various certifications and attestations. For these customers, most of whom are cloud companies themselves, there is no barrier to entry to the cloud other than having all of the proper paperwork in order (which is something we are working diligently towards).
None of these endeavors has yet put me face to face with anything I would label a true objection to our security or the security of the public cloud in general. Yet I know these objections exist. I have heard that some companies will not even consider a cloud-based solution due to their vague “policy” regarding anything cloud.
I would like to think that these objections and vague policies are more than just the knee-jerk reactions of my policy-writing colleagues in the security world to rapidly emerging technology that they have not yet taken the time to understand. I would like to think that their policies and controls are based on well thought-through logic and grounded solidly in their respective business needs and security postures. I would sincerely hope that the vein of technological conservatism that runs within the information security community is not so deeply ingrained as to blind us to the many advantages that are available in the public cloud.
Because the fact of the matter is that the economics make cloud adoption inevitable, and the current over-crowded, expensive-to-maintain legacy situation in many enterprise data-centers is untenable. The increased productivity and decreased time to market for new and powerful services alone are enough of a driver to counterbalance some of the risks inherent in taking on any new technology or platform.
With discipline and adherence to age-old best practices surrounding data encryption and operational security, there is no reason to trust the Public Cloud any less than you trust the Public Internet or the Public Switched Telephone Network on whose shoulders this new Public Cloud firmly stands. And I will also note that massive volumes of highly sensitive data transit these other public networks constantly as a matter of business, and we as an industry and a society deal with that just fine.
At Sumo Logic, we employ encryption end-to-end and we take our security and processes very seriously. We believe that we offer a highly secure service and we have employed some of the best penetration testers in the world to shake us down, and in case that isn’t enough, we have built features into our service that allow you to control at a very granular level what data you send to us.
I would like to start handling any serious objections that still remain out there. If there is FUD, I would like to address it head on. Where there are legitimate concerns, I want to hear about them and ensure that here at Sumo Logic we work together with each other and our service and infrastructure providers to take the proper steps to address and solve those issues. I believe that the cloud is both safe and inevitable, and considering and responding to concerns will lead to even more solid and secure solutions.
04.26.2012 | Posted by Joan Pepin, Director of Security
For the last two years, Sumo Logic has been (quietly) building a secure, massively scalable, multi-tenant data management and analytics platform in the cloud. For us at Sumo Logic, the Cloud is a concept we believe in and have internalized deeply into our culture, our processes and infrastructure. In our office we only own two equipment racks, and they are less than half full. The boxes there are our back-up server, some security gear, a VOIP box, and a single small server to provide network and AAA services for the LAN. We have ‘dogfooded’ not only our own product here (we make extensive use of our own product for troubleshooting and operations, see Stefan Zier’s series “Sumo on Sumo”), but the entire idea of the cloud itself. From our email and build environment, through our CRM and our product itself, we live in the Cloud. Through adopting best practices and developing some of our own we operate there in a way that is designed to be secure, and I’d like to share some of the insights we’ve picked up along the way.
Of course, the “Cloud” is a nebulous term and here at Sumo Logic we use several different types of cloud-based services, which mostly break down into two categories: SaaS and IaaS. On the SaaS side, we have our email and CRM, testing, support and billing, as well as a number of services we use to monitor and alert on our service availability. On the IaaS side, we use AWS to host our build environment and its associated bug-tracker, wiki and code-repository. One of the many advantages to this model is exemplified by our build environment (Hudson). Hosting this in EC2 provides us with great flexibility in bringing up new build-slaves at peak times, such as before a major release or branch.
In general, the SaaS providers we use provide excellent security features. For instance, we mandate the use of two-factor authentication and strong passwords for access to Sumo Logic email, and our provider has a rich variety of security controls and features such as the two-factor authentication that we can (and do) leverage. This is much simpler than keeping this level of security would be if we ran the whole mess ourselves.
On the IaaS side, Stefan Zier has done an amazing job of setting us up in AWS. One example of how IaaS features can be leveraged is the way in which he handled access to our AWS-hosted resources. In addition to username and password authentication to our cloud-based services (more on that later), Amazon “security groups” are used to limit network-level access to these services to only certain IP addresses on a whitelist. In order to handle the automation of that whitelist, we make use of a dynamic DNS provider that assigns hostnames to authorized systems. Stefan wrote a program which polls for the addresses of those authorized hosts and updates the corresponding security group in AWS. We plan on getting this set up on AWS’ Virtual Private Cloud sometime soon, which will allow us to layer a VPN on top of this already very secure solution.
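The polling idea can be sketched in a few lines. This is not Stefan's program, just a minimal illustration of the technique under stated assumptions: every function name here is hypothetical, and `authorize` and `revoke` are caller-supplied placeholders standing in for whatever calls your provider's API actually offers for adding and removing ingress rules.

```python
import socket


def resolve_hosts(hostnames, resolver=socket.gethostbyname):
    """Resolve each authorized hostname to a /32 CIDR; skip names that fail."""
    cidrs = set()
    for name in hostnames:
        try:
            cidrs.add(resolver(name) + "/32")
        except OSError:
            # In this sketch a host that does not resolve simply drops
            # out of the whitelist until its DNS record comes back.
            continue
    return cidrs


def plan_update(current_cidrs, wanted_cidrs):
    """Return (to_authorize, to_revoke) so the group matches the wanted set."""
    current, wanted = set(current_cidrs), set(wanted_cidrs)
    return sorted(wanted - current), sorted(current - wanted)


def sync_whitelist(hostnames, current_cidrs, authorize, revoke,
                   resolver=socket.gethostbyname):
    """One polling pass: resolve hosts, then apply only the differences.

    `authorize` and `revoke` are placeholders for provider API wrappers;
    they are not real AWS functions.
    """
    to_add, to_remove = plan_update(current_cidrs,
                                    resolve_hosts(hostnames, resolver))
    for cidr in to_add:
        authorize(cidr)
    for cidr in to_remove:
        revoke(cidr)
    return to_add, to_remove
```

Applying only the difference keeps each pass cheap and leaves existing rules alone. One design choice worth thinking about: here a failed lookup removes the host, whereas a more forgiving tool might keep the last known address for a while.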
Another layer of protection we incorporate is anonymity. All of our cloud-based company infrastructure is attached to a domain which is not connected to Sumo Logic in any way. Similarly, we have used anonymized labels for our private git repositories, etc. The public cloud allows for some obscurity and anonymity, and we leverage that.
Of course, there is still work involved in keeping things secure. Living in the cloud means having a lot of accounts. A LOT of them. Our process for on-boarding and off-boarding employees requires the creation or deletion of a very large number of accounts and the adding or removing of a lot of tags, groups, lists and checkboxes. Having solid documented procedures for this is the only way to keep it straight and running smoothly. We also have to host our own LDAP server for AAA to some of our tools, and we also use this for our VPN authentication. So we have to manage that, and it is a pain. Centralized AAA and policy/group management services exist for cloud-based services, and we’ve looked at some. Unfortunately, none of them also supported hosting or managing an LDAP instance for us, and keeping that synced up and tied-into the rest of the mess would be a killer feature. We certainly feel there is a gap in the market here that we wish somebody would fill.
From an end-user perspective, there are a lot of accounts to keep straight and a lot of passwords to remember. In order to make this both secure and manageable, we provide (and mandate the use of) a password management tool that runs on both Mac and Windows (and has a usable web-interface for Linux and others) and also runs on Android and iPhone. It uses a cloud-based file-storage service to sync its encrypted password database between devices. This allows us to mandate that users have extremely strong passwords that are different for every account and it gives our users the tools to actually comply with that rule.
Of course, building a secure cloud-based service ourselves requires a lot of thought and engineering well beyond just leveraging our provider’s consoles. We have put a lot of thought into how to build a secure service leveraging IaaS and we have written a paper about some of the design principles and practices we employ. If you are interested you can download it here.
03.06.2012 | Posted by Joan Pepin, Director of Security
Logs are the Cornerstone of Security Best Practices.
Anyone who has worked in the field of Information Security for any length of time will tell you: there is a lot of security and security-relevant data out there in the enterprise. Access logs, database logs, application logs from web, email and other services are all useful and sometimes essential in bringing a security investigation to a successful conclusion. Of course, traditional firewalls alone generate a tremendous volume of logs, and web-proxy log volumes can be staggering. Intrusion Detection Systems (IDS) are very noisy, and then you have the Anti-Virus logs, the DHCP logs, Active Directory or LDAP logs, authentication logs from disparate operating systems spread across the globe (often in different time-zones).
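The time-zone point is easy to underestimate, so here is a minimal sketch of why it matters. It assumes each log source reports naive local timestamps in a `YYYY-MM-DD HH:MM:SS` format along with a known fixed UTC offset; the function names and sample events are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def to_utc(timestamp, utc_offset_hours):
    """Parse a naive 'YYYY-MM-DD HH:MM:SS' local timestamp and return UTC."""
    local = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    local = local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return local.astimezone(timezone.utc)


def merge_events(sources):
    """Merge (timestamp, utc_offset_hours, message) tuples into UTC order."""
    events = [(to_utc(ts, offset), msg) for ts, offset, msg in sources]
    return sorted(events)


events = merge_events([
    ("2012-03-06 09:00:05", -8, "login failed (San Francisco)"),
    ("2012-03-06 17:00:01", 0, "login failed (London)"),
    ("2012-03-07 02:00:03", 9, "login ok (Tokyo)"),
])
```

Sorted by their raw local timestamps, these three events look like they happened many hours apart. Normalized to UTC, they turn out to fall within five seconds of one another, which is exactly the kind of pattern an investigation needs to see.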