
March 27, 2019 | By Sadequl Hussain

What is Serverless and AWS Lambda?

What is serverless?

Serverless computing is a cloud-based application architecture where the application’s infrastructure and support services layer is completely abstracted from the software layer.

Any computer program needs hardware to run on, so serverless applications are not really “serverless” - they do run on servers - it’s just that the servers are not exposed as physical or virtual machines to the developer running the code. In a truly serverless paradigm, program code runs on infrastructure hosted and managed by a third party - typically a cloud service - which not only takes care of provisioning, scaling, load balancing and securing the infrastructure, but also installs and manages operating systems, patches, code libraries and all necessary support services. As far as the user is concerned, a serverless back-end would scale and load-balance automatically as the application load increases or decreases, all the while keeping the application online. The user would only need to pay for the resources consumed by a running application.

Theoretically at least, this has the promise of drastically reduced development cycles and low operational costs. And that’s why serverless is a hot buzzword in today’s IT world.

Types of serverless

Serverless applications can be made up of two types of components:

  • Serverless functions
  • Serverless backends

Of these, serverless functions are hosted and managed by a type of service known as “Function as a Service” or FaaS. FaaS is the primary platform for running serverless program code. With FaaS, developers write independent pieces of code known as “functions” and upload those functions to the FaaS. The code can then be triggered by an event or run on a schedule. Popular examples of FaaS are AWS Lambda, Azure Functions and Google Cloud Functions.
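
To make the idea of a “function” concrete, here is a minimal sketch of what one might look like in Python. It assumes the `handler(event, context)` entry-point shape used by AWS Lambda's Python runtime; the `name` field in the event is made up for illustration.

```python
import json


def handler(event, context):
    """Entry point the FaaS platform calls once per triggering event."""
    # 'event' carries the trigger payload; 'context' carries runtime metadata.
    # The "name" field is a hypothetical payload field, not a platform standard.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local invocation with a hand-made event; no cloud account is needed.
    print(handler({"name": "serverless"}, None))
```

The function holds no server or port configuration of its own: it only receives an event and returns a result, which is what lets the platform decide where and when to run it.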

Serverless backends, on the other hand, are managed services that serverless functions can make use of. These services are typically used for storage, database, messaging, notifications, orchestration or security. As with FaaS, users don’t need to provision and manage any infrastructure when using a serverless backend.

Another feature of serverless backends is that they are not coupled with FaaS only. This means non-serverless applications can also make use of serverless backends.

An example of a serverless backend is Amazon Simple Queue Service (SQS), a managed message queuing service. Similarly, Amazon Aurora Serverless is a serverless database service. It is distinctly different from Amazon RDS or standard Aurora, which - although managed services - require users to provision and manage database instances.

Some common serverless services

The following table lists some serverless services from three major cloud vendors.

| Vendor | Function-as-a-Service | Serverless backends |
| --- | --- | --- |
| Amazon Web Services | AWS Lambda | Simple Storage Service (S3) for object storage, Elastic File System (EFS) for file storage; DynamoDB, DocumentDB, Aurora Serverless for data store; Simple Notification Service (SNS) for notifications; Simple Queue Service (SQS) for asynchronous messaging; Kinesis for streaming data |
| Microsoft Azure | Azure Functions | Azure Storage for object storage; Azure Cosmos DB for data store; Azure Active Directory for user authentication; Azure Event Grid for event routing, Service Bus for messaging; Azure Kubernetes Service for orchestrating serverless containers |
| Google Cloud Platform | Google Cloud Functions | Cloud Storage for object storage; Cloud Firestore for NoSQL database; BigQuery for data warehouse; Cloud Pub/Sub for messaging; Cloud Dataflow for streaming or batch data |

In this article, we will primarily focus on serverless functions.

Why serverless?

Businesses realize a number of benefits when they start using the serverless model. These benefits are also its well-known features:

  • There is no system administration involved: no servers to provision, no network bottlenecks to worry about, no failover to manage, no firewalls to configure and no runtimes to install. Operational maintenance cost drops to almost zero, and application developers can concentrate on the app’s functionality.

  • The application automatically scales with load. Whether it’s ten users over an hour or hundreds of thousands of concurrent users, the underlying infrastructure elastically scales to meet demand.

  • There are no wasted resources. The underlying container running serverless code is ephemeral: it’s destroyed once the code finishes running.

  • It’s a pay-per-use model - users only pay for the computing resources consumed by the code at run time or the number of API calls made. This drastically reduces an application’s Total Cost of Ownership (TCO).

  • Low management overhead means agile teams can develop, test and deploy new features more frequently, decreasing the overall time-to-market.
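
The pay-per-use point can be made concrete with a little arithmetic. The sketch below assumes a pricing model with a per-request charge plus a charge per GB-second of compute, which is a common FaaS billing shape; the rates passed in are hypothetical placeholders, not any vendor's actual prices.

```python
def monthly_cost(invocations, avg_duration_ms, memory_mb,
                 price_per_million_requests, price_per_gb_second):
    """Estimate a month's bill from usage alone; idle time costs nothing."""
    request_cost = invocations / 1_000_000 * price_per_million_requests
    # Compute is billed as memory (in GB) multiplied by run time (in seconds).
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute_cost = gb_seconds * price_per_gb_second
    return request_cost + compute_cost


if __name__ == "__main__":
    # Hypothetical rates: 0.20 per million requests, 0.00002 per GB-second.
    cost = monthly_cost(3_000_000, 200, 512, 0.20, 0.00002)
    print(f"{cost:.2f}")  # prints 6.60
```

With zero invocations the estimate is zero, which is the key contrast with a server that is billed whether or not it is doing work.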

Common use cases of serverless

Here are some applications of serverless architecture:

  • Microservice architecture is a way of building applications as collections of small, loosely-coupled components or “services”. Each component performs a particular small function and has a small number of well-defined interfaces to it. Other functions communicate with it using lightweight protocols.

    Serverless functions are ideal as microservice building blocks. Each service can be written as a serverless function, tested, deployed and maintained independently.

  • Sometimes, processes need to run on schedule. For example, a file may need to be copied from an FTP site to an object storage medium once every day. Code written in any popular language can do this. Instead of dedicating a whole server to run the code, it can be turned into a scheduled serverless function.

  • A serverless function can be used to perform one or more actions in response to an event. Using the example above, when the FTP file is copied to object storage, the file copy event can start another serverless function which processes the file’s contents.
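
As a rough sketch of the event-driven case, the function below reacts to a “file arrived” event. It assumes the event follows the shape of an Amazon S3 notification (a `Records` list with `s3.bucket.name` and `s3.object.key`); the bucket and key names in the sample are invented, and the actual file processing is left as a placeholder.

```python
def handle_file_copied(event, context):
    """Runs when the storage service reports that new objects have arrived."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # A real function would download and parse the object here.
        processed.append(f"{bucket}/{key}")
    return {"processed": processed}


if __name__ == "__main__":
    # Hand-made sample event with hypothetical bucket and key names.
    sample_event = {
        "Records": [
            {"s3": {"bucket": {"name": "daily-imports"},
                    "object": {"key": "2019/03/27/report.csv"}}}
        ]
    }
    print(handle_file_copied(sample_event, None))
```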

Some of the most common use cases of serverless also include: performing authentication, serving static web content, running ETL jobs and database queries, processing IoT and streaming data as well as real time file processing.

The image below shows a simplified application architecture that makes use of some serverless functions and services:

Here, a user is accessing a web application from a mobile device. The user’s HTTP request is routed by a DNS server. The static content of the site is served up by a content delivery network (CDN) that interfaces with an object storage medium. The dynamic content is served by a web server which sends its requests to an API gateway. The gateway routes application requests to different serverless functions. One function is used for authentication, another one takes care of reading from and writing to a backend database while a third one saves session states in a key-value NoSQL database.
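
The routing role of the API gateway can be sketched in a few lines. This is only an in-process illustration of the idea, not how a managed gateway is configured; the paths, function names and request fields are all hypothetical.

```python
def authenticate(request):
    return {"status": 200, "body": "token issued"}


def read_write_data(request):
    return {"status": 200, "body": f"stored {request.get('payload')!r}"}


def save_session(request):
    return {"status": 200, "body": "session saved"}


# Each (method, path) pair maps to exactly one independent function.
ROUTES = {
    ("POST", "/login"): authenticate,
    ("POST", "/data"): read_write_data,
    ("PUT", "/session"): save_session,
}


def gateway(request):
    """Dispatch a request to the function registered for its method and path."""
    target = ROUTES.get((request["method"], request["path"]))
    if target is None:
        return {"status": 404, "body": "no route"}
    return target(request)


if __name__ == "__main__":
    print(gateway({"method": "POST", "path": "/login"}))
    print(gateway({"method": "GET", "path": "/unknown"}))
```

Because each route points at a separate function, any one of them can be redeployed or scaled without touching the others.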

Common features of serverless

Serverless functions are “stateless”: they cannot save and share session state between two runs of the same function, or between different functions. Nor can they rely on data or files written to the underlying disk subsystem surviving once a run ends. A serverless function will also often call other serverless functions as part of an application stack, and may need to communicate with them using asynchronous messaging.

Developers can make use of serverless backend services to overcome these challenges. For example:

  • NoSQL databases can be used for saving session states
  • Object storage mediums can be used for writing files
  • Serverless relational databases can be used for persisting structured data
  • Serverless ETL platforms can be used for data extract, transform and load
  • Managed messaging services can be used for passing messages between functions
  • API gateways can be used to route application calls to serverless functions
  • Serverless logging and monitoring services can be used for recording runtime events
  • Serverless directory service, authentication service, token service or key management service can be used for security.
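
The first item in the list, keeping session state in an external store, can be sketched as follows. The `SessionStore` class is a hypothetical in-memory stand-in for a managed key-value service so that the example runs locally; in a real deployment the same two calls would go to the managed service.

```python
class SessionStore:
    """In-memory stand-in for a managed key-value service (hypothetical)."""

    def __init__(self):
        self._items = {}

    def get(self, key, default=None):
        return self._items.get(key, default)

    def put(self, key, value):
        self._items[key] = value


def count_visit(event, store):
    """Stateless handler: all state lives in the store, none in the function."""
    session_id = event["session_id"]
    visits = store.get(session_id, 0) + 1
    store.put(session_id, visits)
    return {"session_id": session_id, "visits": visits}


if __name__ == "__main__":
    store = SessionStore()
    print(count_visit({"session_id": "abc"}, store))
    print(count_visit({"session_id": "abc"}, store))
```

The function itself remembers nothing between calls, so any copy of it, on any container, gives the same answer as long as it talks to the same store.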

Serverless disadvantages

The serverless model may not be the answer in certain cases.

  • Long-running processes or continuously running applications are not suitable for serverless. This is because serverless functions time out after a maximum period. A large database backup is an example of a long-running process that can fail due to this timeout.

  • There is a certain time lag when a serverless function is called for the first time. This happens because the system has to provision a container for the program to run. This is called a “cold start”. For a frequently running function, the container may not be destroyed, which will make subsequent runs faster. However, the time lag may not be acceptable in certain cases, such as online gaming or e-commerce applications, where users expect consistently low latency.

  • Refactoring complex applications into serverless components can be costly and time-consuming. Dozens or even hundreds of functions may need to be orchestrated and managed - a complex task in itself.

  • There are limited ways of testing serverless applications. Debugging a serverless app usually means running it “live”, so testing and debugging actually cost money.

  • Last but not least, serverless also means a lock-in with a service provider. Interoperability between third-party vendors’ FaaS APIs can be a major headache. There is also the risk of pricing changes or data leakage from the vendor’s side. Changing the provider may need a significant amount of code change.
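
The cold start effect described in the list above can be imitated locally. In this sketch, module-level state stands in for a container that the platform keeps alive between calls, and a short sleep stands in for expensive start-up work such as loading libraries or opening connections; both are simplifying assumptions.

```python
import time

_INIT_COUNT = 0
_RESOURCE = None


def _get_resource():
    """Create the expensive resource once per container, then reuse it."""
    global _INIT_COUNT, _RESOURCE
    if _RESOURCE is None:
        time.sleep(0.05)  # stands in for slow start-up work
        _RESOURCE = {"ready": True}
        _INIT_COUNT += 1
    return _RESOURCE


def handler(event, context):
    started = time.perf_counter()
    _get_resource()
    elapsed_ms = (time.perf_counter() - started) * 1000
    return {"elapsed_ms": elapsed_ms, "initialisations": _INIT_COUNT}


if __name__ == "__main__":
    print(handler({}, None))  # first call pays the start-up cost
    print(handler({}, None))  # later calls reuse the warm resource
```

The first call is noticeably slower than the second, and the initialisation counter stays at one for as long as the same process (the “warm” container) keeps serving calls.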

Serverless vs. containers

Another hot topic in today’s computing world is containers.

A container is a stand-alone, scaled-down unit of computing environment that can run executable code. It’s a lightweight version of a virtual machine that has everything installed for running programs: an operating system, runtime libraries, system tools, mapping to persistent storage and an “entry point”.

Containers also run on servers, but the program running inside the container is not aware of the underlying hardware. Behind the scenes, serverless functions also run on containers. Serverless goes one step further by abstracting this compute layer from the user.

The table below shows general differences between containers and serverless functions.

| Containers | Serverless functions |
| --- | --- |
| The user is responsible for writing a container definition file that installs the operating system, software and necessary runtimes, maps storage and configures networking. The user then creates an image from the file, uploads the image to a registry and instantiates the container from that image. | The entire process is abstracted from the user. A user only needs to write program code supported by the serverless platform and upload it. The service provider takes care of provisioning the computing environment. |
| Once started, a container will keep running unless explicitly shut down or destroyed. | The underlying computing environment of a serverless function is destroyed when it completes running. |
| Even when no program is executing, a running container will need a server to be available. This results in operational expense. | Serverless functions are charged only for the computing resources they consume when running. |
| Code running in a container is not restricted by any timeout. | Serverless functions are usually constrained by timeouts. |
| Containers can run in a cluster of machines. | Any underlying hardware is transparent to users. |
| A container can save data to its ephemeral storage or a mapped storage volume. | There is no option for saving data to ephemeral storage from a serverless function. Data is usually saved in an object storage medium. |
| Containers can host complex applications or simple microservices. | Serverless functions are best suited for microservices. |
| The user can choose the language or runtime for an application running in a container. | The choice of language for serverless functions is limited to what the service provider supports. |
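
One practical consequence of the timeout constraint is that a function processing a batch of work may need to stop early and hand back what is left. The sketch below assumes a context object that exposes `get_remaining_time_in_millis()`, as the AWS Lambda Python runtime does; `FakeContext` is a hypothetical local stand-in, and doubling each item is a placeholder for real work.

```python
import time


class FakeContext:
    """Local stand-in for the runtime context object (hypothetical)."""

    def __init__(self, timeout_ms):
        self._deadline = time.monotonic() + timeout_ms / 1000

    def get_remaining_time_in_millis(self):
        return max(0, int((self._deadline - time.monotonic()) * 1000))


def process_batch(items, context, safety_margin_ms=100):
    """Process items until the time budget is nearly used, then return the rest."""
    done = []
    for index, item in enumerate(items):
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            # Stop cleanly so the leftover items can be re-queued by the caller.
            return {"done": done, "remaining": items[index:]}
        done.append(item * 2)  # placeholder for real work
    return {"done": done, "remaining": []}


if __name__ == "__main__":
    print(process_batch([1, 2, 3], FakeContext(timeout_ms=5000)))
```

A container-hosted process has no such deadline, so it would not need this kind of bookkeeping.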

Monitoring serverless applications

There are two aspects of serverless application monitoring: monitoring application logs and measuring performance.

Serverless logs

Any well-designed application should generate run-time logs. Log messages are invaluable for troubleshooting application failures or slow performance. Most languages used for writing serverless apps have dedicated logging frameworks. Using these frameworks, developers can embed log messages in their code which are emitted during run time.
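
As an illustration, here is a minimal sketch using Python's standard `logging` module inside a function handler. The logger name, the order fields and the handler itself are hypothetical; the example writes to standard output, which FaaS platforms typically capture and forward to their logging service.

```python
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO,
                    format="%(levelname)s %(name)s %(message)s")
logger = logging.getLogger("orders")


def handler(event, context):
    logger.info("received order %s", event.get("order_id"))
    try:
        total = sum(item["price"] * item["qty"] for item in event["items"])
    except KeyError:
        # logger.exception records the message together with the stack trace.
        logger.exception("malformed order payload")
        raise
    logger.info("order total %.2f", total)
    return {"total": total}


if __name__ == "__main__":
    handler({"order_id": 1, "items": [{"price": 2.5, "qty": 2}]}, None)
```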

Since serverless functions cannot save any data in local storage, the log messages are routed to a different location - typically, another managed service. For example:

  • AWS Lambda functions’ log messages are recorded in CloudWatch Logs.
  • Google Cloud Functions’ log messages are sent to Stackdriver Logging.
  • Azure Functions’ log messages are saved in Table Storage key-value store.

Log messages from these sources can be sent to third-party log management solutions.

Serverless metrics

FaaS providers also expose metrics which can help monitor a serverless function’s performance. The table below shows some typical metrics.

| Metric | What it means |
| --- | --- |
| Executions | Number of times a function was run in the last sample period, categorized by its status (ok, timeout or error). |
| Errors | Number of times a function failed in the last sample period due to internal errors like timeouts, out-of-memory, insufficient privileges or unhandled exceptions. Ideally, this value should be zero. A consistent non-zero trend requires troubleshooting. |
| Throttles | Number of times a function was stopped from running in the last sample period because the rate at which it was being called exceeded its allowed limit of concurrent runs. |
| Duration or Execution times | The time in milliseconds or nanoseconds a function was running. |
| Memory usage | Maximum amount of memory used by the function during execution. |

Note how there are no disk or CPU related metrics here. This is expected because the underlying storage or computing resources are abstracted.
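
To show how these metrics relate to individual runs, the sketch below aggregates a list of invocation records into the same five figures. The record format (`status`, `duration_ms`, `memory_mb`) is invented for the example; real providers compute and expose these metrics for you.

```python
from collections import Counter


def summarise(invocations):
    """Roll up per-invocation records into typical FaaS metrics."""
    status_counts = Counter(inv["status"] for inv in invocations)
    # Throttled calls never ran, so they have no duration or memory figures.
    ran = [inv for inv in invocations if inv["status"] != "throttled"]
    durations = [inv["duration_ms"] for inv in ran]
    return {
        "executions": {s: status_counts[s] for s in ("ok", "timeout", "error")},
        "errors": status_counts["timeout"] + status_counts["error"],
        "throttles": status_counts["throttled"],
        "avg_duration_ms": sum(durations) / len(durations) if durations else 0.0,
        "max_memory_mb": max((inv["memory_mb"] for inv in ran), default=0),
    }


if __name__ == "__main__":
    sample = [
        {"status": "ok", "duration_ms": 120, "memory_mb": 64},
        {"status": "ok", "duration_ms": 80, "memory_mb": 70},
        {"status": "error", "duration_ms": 40, "memory_mb": 60},
        {"status": "throttled"},
    ]
    print(summarise(sample))
```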

Monitoring serverless with Sumo Logic

Sumo Logic is a cloud-native, Software-as-a-Service (SaaS) platform for machine data analytics. It’s used for storing, analyzing and creating insights from machine-generated data and logs. It’s also a powerful SIEM (Security Information and Event Management) tool.

Users can easily subscribe to Sumo Logic and start sending logs and performance data from their on-premises or cloud-hosted assets. The ingested data can then be meaningfully interpreted with Sumo Logic “apps”.

Apps are pre-configured searches and dashboards for different types of data sources. Sumo Logic comes with a number of out-of-the-box apps. These apps help quickly turn raw data into critical insights.

For serverless, the following Sumo Logic apps are available:

To see how Sumo Logic can help monitor your serverless footprint, sign up for free.

Sadequl Hussain

Sadequl Hussain is an information technologist, trainer, and guest blogger for Machine Data Almanac. He comes from a strong database background and has 20 years of experience in development, infrastructure engineering, database management, training, and technical authoring. He loves working with cloud technologies and anything related to databases and big data. When he is not blogging or making training videos, he can be found spending time with his young family.

