May 21, 2013 By Sebastian Dijmarescu

From Academia to Sumo Logic

While I was wrapping up my Ph.D. thesis, my girlfriend (now wife) and I decided that we wanted to leave Germany to live and work in a different country. Prior to my Ph.D., I started off in computer gaming (ported “Turrican 2” to the PC when I was a kid). Following that, I did my MSCS and Ph.D. in distributed systems and computer networks in Karlsruhe, Germany.

I have been working as a Software Engineer at Sumo Logic since October 2012. At first I was skeptical about how intellectually engaging and challenging a commercial venture in log management could be. However, after working at Sumo Logic for more than six months, I have to admit that I misjudged the academic and engineering challenges of log management.

Why? I underestimated both the problem and its potential!

In contrast to academia, where algorithms are tested under controlled and reproducible conditions, we face the full force of unexpected behaviors of a live system here at Sumo Logic. When we turn algorithms into reality, we are responsible for the entire development process, including planning, testing, and implementing the finished component in a production environment.

No other company is approaching Big Data-scale log management like Sumo Logic. As a main differentiator, Sumo Logic offers enterprise-class log file processing in the Cloud. Sumo Logic ingests terabytes per day of unstructured log files that need to be processed in real time. In contrast to websites or other content, log files need exact processing; e.g., a needle in the haystack of logs can consist of merely 16 characters (out of the terabytes of data ingested and stored). Thus, there are only a few heuristics we can use to increase efficiency. This makes developing new algorithms to process log data challenging and interesting.

Furthermore, all our databases need to answer queries in a timely manner. Databases with unpredictable latencies on certain queries are not suitable for the problems we are solving. For that reason, we mix and match open source technologies and customized in-house solutions.

In addition, our customers trust us with information of vital importance to them. Security concerns influence design decisions across many levels, ranging from full hard drive encryption at the operating system level to role-based access control (RBAC) at the application level. We have to select algorithms carefully to balance security against performance (encrypted log files can challenge the efficient use of our cloud resources) while continuing to isolate customers, so that one customer’s demands don’t impact the performance of another.

In summary, I am glad I took the opportunity to join Sumo Logic and turn my academic research into solutions that customers use to process terabytes of their critical data in real time. This experience has brought self-improvement with each challenge, full-stack knowledge, and a sense of engineering not possible in any other environment.

