How to Build a Scalable, Secure IoT Platform on GCP in 10 Days
The cloud is an attractive proposition not just for enterprises but also for startups. By reducing cost and time to market, it lets innovative startup teams disrupt legacy business models and markets. Most developers have a good understanding of Amazon Web Services (AWS), but few have taken the plunge into Google Cloud Platform (GCP) and its offerings yet.
As a data engineer, I’ve been using GCP for nearly two years to build high-performance, data-intensive applications, and I have been a longtime fan of the platform. For anyone new to cloud development, my goal is to help you better understand the various services available on GCP, how they interoperate, why you would choose them, and what time and resources are needed. This article walks step by step through our design for a minimum viable product (MVP) for an internet of things (IoT) customer use case. It shows how to build a platform that manages IoT hardware, collects sensor data, performs analytics, and supports web and machine learning applications, all without deploying any infrastructure.
The customer for this specific use case was a newly created digital innovations team at one of the world’s largest farm equipment manufacturers. The digital innovations team first had to develop a prototype environment sensor to capture local farm parameters such as temperature, pressure, humidity, moisture, soil pH and solar radiation. Development of this hardware prototype was the first step towards addressing a $4 billion worldwide market for data-driven precision agriculture.
Next, the team needed to quickly build out an MVP analytics platform with machine learning capabilities to acquire and analyze terabytes of data. Development costs had to stay low because the MVP was a technology demonstrator for stakeholders and partners. Once approved, however, the platform had to scale without rework to handle tens of thousands of sensors at product launch.
The data gathered hourly from sensors will be used to train machine learning algorithms for precision farming and will also be analyzed on-demand by resident data scientists. A web application will also be required for device owners to see the latest data points in real time.
We took on the challenge to build an MVP for the customer while keeping upfront costs low. The following goals were identified for the project:
Elastic scalability and performance at launch
Support for software updates to sensors
Platform security and integrity
Data warehousing to enable analytics
Real-time, location-aware database queries
I chose GCP because I was already familiar with it, and because GCP’s service offerings in the big data and artificial intelligence (AI) space are excellent and well suited to this project. The team at Google has also been very proactive in answering technical questions, and we could call on their engineers if we needed to. We decided on a serverless architecture that could scale seamlessly and meet the requirements of the project while keeping development costs low and operating costs predictable.
Here is what the architecture for the MVP looks like:
Hardware – The customer chose a Raspberry Pi 3 running Raspbian as the sensor operating system. Google also offers a stripped-down version of Android for IoT devices called “Android Things.” Android Things lets developers configure and push over-the-air software updates to the operating system and applications on devices via an API call.
Cloud IoT Core – Cloud IoT Core is a serverless platform that enables device enrollment and data collection for IoT devices. The service load-balances automatically and scales to support data ingestion from millions of devices. Devices communicate with an MQTT broker that publishes data to Cloud Pub/Sub. IoT Core is comparable to similar services offered by AWS and Microsoft Azure.
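To make the device side concrete, here is a minimal Python sketch of what a sensor might prepare before publishing over MQTT. The client ID and the /devices/{device-id}/events topic follow the naming scheme of the Cloud IoT Core MQTT bridge as I understand it. The project, registry and device names and the JSON schema are assumptions made up for illustration. The sketch opens no network connection and leaves out authentication.

```python
import json
from datetime import datetime, timezone


def mqtt_client_id(project: str, region: str, registry: str, device: str) -> str:
    """Build the path-style MQTT client ID for a registered device."""
    return (
        f"projects/{project}/locations/{region}"
        f"/registries/{registry}/devices/{device}"
    )


def telemetry_topic(device: str) -> str:
    """MQTT topic on which a device publishes telemetry events."""
    return f"/devices/{device}/events"


def encode_reading(device: str, readings: dict, when: datetime) -> bytes:
    """Serialize one reading as UTF-8 JSON. The schema here is an assumption."""
    message = {
        "deviceId": device,
        "timestamp": when.astimezone(timezone.utc).isoformat(),
        "readings": readings,
    }
    return json.dumps(message, sort_keys=True).encode("utf-8")


# Hypothetical project, registry and device names, for illustration only.
client_id = mqtt_client_id("farm-mvp", "us-central1", "field-sensors", "sensor-001")
topic = telemetry_topic("sensor-001")
payload = encode_reading(
    "sensor-001",
    {"temperature_c": 21.5, "humidity_pct": 48.0, "soil_ph": 6.4},
    datetime(2018, 6, 1, 12, 0, tzinfo=timezone.utc),
)
print(client_id)
print(topic)
print(payload.decode("utf-8"))
```

An MQTT client library on the device would then publish the payload bytes to that topic. Sending the device ID inside the payload as well lets downstream consumers attribute a reading without inspecting transport metadata.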
Pub/Sub – Pub/Sub is a highly reliable messaging service in the cloud. It can scale to handle spikes in data volume when many devices simultaneously respond to events in the physical world, and it requires no infrastructure to manage. Each device sends data points and a deviceID to a Pub/Sub topic (events). Any application that needs to read data from devices can then subscribe to that single topic to receive all the data.
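The fan-out described above, where several applications each receive every message sent to "events", can be illustrated with a toy in-memory model. This is only a sketch of the publish/subscribe pattern. It does not use the real Pub/Sub client library, and the class, handler names and message fields are hypothetical.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class InMemoryTopic:
    """Toy stand-in for a topic: every subscriber receives every message."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subscribers: List[Handler] = []

    def subscribe(self, handler: Handler) -> None:
        self._subscribers.append(handler)

    def publish(self, message: dict) -> int:
        """Deliver a copy of the message to each subscriber; return the count."""
        for handler in self._subscribers:
            handler(dict(message))  # copy so consumers cannot affect each other
        return len(self._subscribers)


warehouse_rows: List[dict] = []          # consumer 1: append-only history
latest_by_device: Dict[str, dict] = {}   # consumer 2: newest reading per device


def store_row(message: dict) -> None:
    warehouse_rows.append(message)


def update_latest(message: dict) -> None:
    latest_by_device[message["deviceId"]] = message


events = InMemoryTopic("events")
events.subscribe(store_row)
events.subscribe(update_latest)
events.publish({"deviceId": "sensor-001", "temperature_c": 21.5})
events.publish({"deviceId": "sensor-001", "temperature_c": 22.0})
print(len(warehouse_rows), latest_by_device["sensor-001"]["temperature_c"])
```

The design point is that devices publish once and never need to know which consumers exist. Adding a new consumer, such as a model-training job, means adding a subscriber rather than changing the devices.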
BigQuery – BigQuery is Google’s serverless, fully managed data warehouse that can run SQL-like queries against terabytes of data in seconds. Using BigQuery allows the customer’s data scientists to query terabytes of data at very reasonable cost, without needing to deploy and manage Hadoop clusters.
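As a rough illustration of the kind of question a data scientist might ask, the sketch below runs an aggregate SQL query over a few sensor readings. It uses Python's built-in sqlite3 module as a local stand-in, not BigQuery itself, and the table name, column names and sample values are assumptions. BigQuery's SQL dialect and client API differ in detail, but a query of this shape carries over.

```python
import sqlite3

# In-memory SQLite database used purely as a local stand-in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (device_id TEXT, hour TEXT, temperature_c REAL)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("sensor-001", "2018-06-01T12:00", 20.0),
        ("sensor-001", "2018-06-01T13:00", 22.0),
        ("sensor-002", "2018-06-01T12:00", 18.0),
    ],
)

# Average temperature and number of readings per device.
QUERY = """
    SELECT device_id,
           AVG(temperature_c) AS avg_temperature_c,
           COUNT(*)           AS reading_count
    FROM readings
    GROUP BY device_id
    ORDER BY device_id
"""
rows = conn.execute(QUERY).fetchall()
for device_id, avg_temp, count in rows:
    print(f"{device_id}: {avg_temp:.1f} C over {count} readings")
conn.close()
```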
Cloud Machine Learning – Once we collect enough data, my team plans to use Cloud ML and TensorFlow to train the models the customer needs on cloud-hosted infrastructure. Once trained, the models can be easily exported to run elsewhere. Google offers a free crash course if you are new to machine learning.
Firebase Firestore – When you need a database that supports location-aware queries in real time, look no further than Firestore. It is a NoSQL document database. Pokémon GO was built on a predecessor of Firestore called Datastore. Firestore comes with software development kits (SDKs) for most programming languages and is really easy to develop with.
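To show what a location-aware query expresses, here is a plain-Python sketch of a radius filter: given a point, return the sensors within a given distance of it. It does not call Firestore or any SDK. The sensor IDs and coordinates are hypothetical, and a production query would rely on the database's own indexing rather than scanning every record as this does.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two latitude/longitude points, in km."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def sensors_within(sensors: dict, lat: float, lon: float, radius_km: float) -> list:
    """Return sorted IDs of sensors whose position lies within radius_km."""
    return sorted(
        sensor_id
        for sensor_id, (s_lat, s_lon) in sensors.items()
        if haversine_km(lat, lon, s_lat, s_lon) <= radius_km
    )


# Hypothetical sensor positions as (latitude, longitude) pairs.
SENSORS = {
    "sensor-001": (0.0, 0.5),
    "sensor-002": (0.0, 2.0),
    "sensor-003": (0.3, 0.0),
}
nearby = sensors_within(SENSORS, 0.0, 0.0, radius_km=100.0)
print(nearby)
```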
Firebase Hosting – For the web app, my team used the Vue.js framework. The compiled application code was deployed to Firebase Hosting, which caches application files on edge servers all over the world. This means fast load times regardless of user location.
Think Serverless – Writing applications that leverage fully managed and interoperable cloud services allows you to develop products faster and with less complexity. You no longer need to worry about provisioning servers and services, clustering, software updates, security monitoring, etc., and can focus on developing your ideas into products. Scalability, performance, security and availability come built in.
Cost – Cost is by far the biggest motivation to go serverless. Running a cloud function can mean cost savings of 80 percent or more compared to running dedicated servers with application logic. With a generous free trial credit, building an IoT platform on Google Cloud cost us nothing. You can estimate running costs in advance using GCP’s pricing calculator.
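The comparison behind a savings figure like this is simple arithmetic, sketched below. Every number in it is a hypothetical placeholder rather than a quoted price or a measurement from this project. The model also counts only per-invocation charges and ignores compute time, memory and network costs, so use the pricing calculator for a real estimate.

```python
def monthly_invocations(sensors: int, readings_per_day: int, days: int = 30) -> int:
    """One function invocation per reading per sensor."""
    return sensors * readings_per_day * days


def serverless_cost(invocations: int, price_per_million: float) -> float:
    """Pay-per-use cost: billed only for invocations actually made."""
    return invocations / 1_000_000 * price_per_million


def savings_percent(dedicated_cost: float, pay_per_use_cost: float) -> float:
    """Savings of pay-per-use relative to an always-on server, in percent."""
    return (1 - pay_per_use_cost / dedicated_cost) * 100


# All figures below are hypothetical placeholders, not quoted prices.
invocations = monthly_invocations(sensors=10_000, readings_per_day=24)
function_cost = serverless_cost(invocations, price_per_million=0.40)
server_cost = 50.00  # assumed monthly cost of one always-on server
savings = savings_percent(server_cost, function_cost)
print(f"{invocations:,} invocations per month")
print(f"serverless: ${function_cost:.2f} vs dedicated: ${server_cost:.2f}")
print(f"savings: {savings:.1f}%")
```

Plugging in your own sensor count, reporting frequency and prices shows how sensitive the result is to workload: hourly readings favor pay-per-use, while a constant high-volume stream narrows the gap.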
Time – I spent 10 days designing the architecture and another 10 days building the platform with the help of a frontend engineer. Eliminating the need to deploy infrastructure reduced the time commitment by a factor of three.
GCP vs. AWS vs. Azure – which to choose? If costs are similar, choose the platform that you can most easily support. The AWS community is very active right now. However, GCP does a great job at reducing barriers to entry. I foresee GCP quickly gaining market share over the next two years.
I hope you can now walk away with a better understanding of GCP and how to construct a completely serverless platform for your next project, paying only for the resources consumed. Happy building!
Amit Deshmukh has been writing code for over 15 years, is a GCP Certified data engineer and a contributing blogger to Machine Data Almanac. He is also a consultant for 80 Analytics, where he works on blockchain and big data projects. Follow him on Twitter: @AmitDeshmukh.