[OpenWhisk] The Promise Of Serverless (Part 1)

Motivation

For us “mostly front-end” developers, server maintenance and deployment can be a pain in the neck. Can I install and run a Linux server? Sure. Do I actually enjoy it? Not really.

This is not to say that it can’t be fun or engaging, just that when most of your time is spent pushing pixels to a screen, the vagaries of interlinked services that all have to run smoothly just to keep your infrastructure limping along can be daunting. Not to mention all the security issues and whatnot. So, while I can trawl my way through man pages and end up with something that can host this blog, do some mail, and run my various APIs, there’s always that nagging sense that it is too complicated just for that.

Let me be clear: I participated in the early efforts to port Yellow Dog Linux to the Mac, and I’m very comfortable with the CLI. But sometimes, we all wish for things to be simpler, ya know?

Enter the container thing

If you ask people like me what piece of technology gives them hope for server stuff, the answer is the whole concept of containers (LXC, Docker, whichever kind you like best). Sometimes they even attempt to explain what a container actually is, and rarely does the average person come away understanding the quality-of-life improvement it provides. I shall now attempt that explanation myself, knowing full well it’s an uphill battle.

Every computer needs an OS to function. Simplifying to the extreme, the operating system is the layer that sits between your apps and processes and the actual hardware they run on. Your app needs the contents of a file? It asks the OS to hand them over. Your app plays audio? It pumps the bits out to the OS, which forwards them to the audio chip, which in turn makes the speakers speak. The same is true for the hundreds of invisible programs that run without ever putting a pixel on the screen: your cloud folder that syncs, your web server, your backup thing, etc.

What’s wrong with that? Nothing, it has worked beautifully since, well, forever. But as services and apps grow more complex, all these things that sit on top of the OS are both interconnected and in competition for resources. What should happen if two programs open the same file? The OS has to handle that, and it’s a simple enough problem to solve. But what if two programs rely on two different versions of the OS? “Simple! You just upgrade.” Well, sure, but then you’re totally dependent on the goodwill of whoever writes the software to maintain OS compatibility. And what happens if one program gets updated and breaks its interconnection with another? Same deal: you’ll have to wait, potentially forever.

Containers are an attempt at solving both of these problems in one fell swoop: they isolate a group of programs from the others, so you can be sure that the versions which “work well together” won’t be broken by an update to something one of them depends on, and they provide a sort of miniature OS that supplies everything those processes need to function. They also isolate them in the sense that two containers can’t fight over the same resources, because from each container’s point of view, it is alone on the OS.

Let’s imagine a fairly straightforward example:

  • I have a program that takes an image and grayscales it, using version 1.3 of GraphicsLib
  • I have another program that scales it up to 1080p, but it relies on version 1.5 of the same library, which doesn’t support grayscale anymore. I can’t use version 1.3 for it because it doesn’t support 1080p yet.

Bummer, right? Well, I could have two containers: one with version 1.3 of the library and the first program, and one with version 1.5 and the second program. That way, my only problem is getting an image into the first container, running the first program, getting the result out and into the second container, running the second program, and extracting the final result. Neither program is any the wiser about the existence of something that is potentially harmful to its proper behavior.
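
To make that shuffling concrete, here is a minimal sketch of how the two-container pipeline could be driven from Python. The image names (graphicslib-tools:1.3 and :1.5), the programs inside them (grayscale, upscale), and the paths are all made up for the sake of illustration; only the docker run plumbing around them is real.

    import subprocess

    def run_in_container(image, command, workdir):
        """Run `command` in a throwaway container based on `image`,
        with the host folder `workdir` mounted at /data inside it."""
        subprocess.run(
            ["docker", "run",
             "--rm",                      # discard the container once it exits
             "-v", f"{workdir}:/data",    # share a host folder with the container
             image, *command],
            check=True,                   # raise if the program inside fails
        )

    # Each (hypothetical) image bundles its program with the GraphicsLib
    # version it needs, so 1.3 and 1.5 never have to coexist anywhere.
    run_in_container("graphicslib-tools:1.3",
                     ["grayscale", "/data/photo.png", "/data/gray.png"],
                     "/tmp/images")
    run_in_container("graphicslib-tools:1.5",
                     ["upscale", "--to", "1080p", "/data/gray.png", "/data/final.png"],
                     "/tmp/images")

Each program sees only its own library version; the host just ferries files between the two through the shared folder.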

That, in a nutshell, is the power of containers: total mastery of the context a given program runs in, independently of whatever else is running elsewhere on the computer. Before containers, we used virtualization (running a whole new OS on top of the existing one), but that brings with it the same old problems of setup, user management, and the rest. Containers are much more lightweight, because you only virtualize the things that are specific to a program.

It is not without drawbacks, of course, but in programming, what is?

  • access to “real” parts of the system that could conceivably be shared between containers is hard and unreliable. Forget having several web servers that serve the same HTML files and write their logs to the same file. It’s not impossible, but it goes against the philosophy of containers, and while using a hammer does get screws into the wall, it’s still better to use tools as they were designed to be used
  • networking quickly becomes an issue, as every container thinks it is its own computer, with its own address, range of ports, etc. If you’re new to containers, you’ll have to pick up a little bit of networking voodoo.

Problem solved, right?

Almost, but not quite. When application developers (also called front-end devs, or user-facing devs) work their magic these days, they have to interact with servers and services more often than not. Let’s say I want to set up a todo app. I will probably want to sync my items across devices, and I will need a server for that. That server will have to have the “talk to the app” syncing parts and the storage parts, as well as some kind of user management for the groups. Potentially, we’re talking about three or more programs running server-side: the sync server, the database for the data, and maybe some process that lets users sign in with credentials from a third party like Google, Twitter, or Facebook. If those need to be containerized, and they probably do, I have no fewer than three things to configure carefully so that they can talk to each other as fast as possible. At least I don’t have to worry about which versions of the OS or libraries each of these things requires, but they are still critical components that need to be running 24/7 even if they are severely underutilized.

“Serverless” isn’t actually without servers, but it appears so

So… what if I could start those programs only when I need them to perform their specific tasks? For instance, a user launches the app on a new device: they have to log in, then use the sync server to pull their data from the database. Wouldn’t it be cool if I could launch the authentication mechanism only at that moment? They are my only user, sadly, so the sync server is only needed when they log in or create a new todo item; I could start it only if and when someone actually attempts to sync some data.

That’s the promise behind Serverless (Lambdas, Functions, etc.): there is a server running 24/7, but it’s minimal, and its job is to recognize that some process is needed and start the container that hosts it. Seen from the outside, the functionality is always available, but from the OS’s point of view, no program is running until you need it.

It’s a nascent field, but there is already some very interesting stuff going on here: for us developers, it’s akin to actual functions. The code is there, ready to be run, at all times, but it doesn’t use any memory or processing power until you call it.
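
To illustrate what “akin to actual functions” looks like in practice, here is a minimal sketch of an OpenWhisk action written in Python. An action is just a function named main that receives a dictionary of parameters and returns a dictionary; the platform decides when, and in which container, to run it. The greet name and the example parameter are mine, purely for illustration.

    # hello.py -- a minimal OpenWhisk action.
    # OpenWhisk calls the function named `main`, passing the invocation
    # parameters as a dict; whatever dict it returns becomes the JSON result.

    def main(params):
        name = params.get("name", "world")
        return {"greeting": "Hello, " + name + "!"}

    # Deployed and invoked with the wsk CLI, roughly along these lines:
    #   wsk action create greet hello.py
    #   wsk action invoke greet --param name Zoe --result
    # Nothing runs until the invocation: OpenWhisk spins up a container,
    # calls main(), hands back the result, and eventually tears the container down.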

Being the cagey paranoiac that I am, I ended up deploying and testing OpenWhisk, because I can see the source code, I can control how and where it runs, and if I screw it up it impacts no one but me. The only thing that bothers me is that, while the community is easy to reach and very friendly, the instructions are complex and assume a lot of general IT knowledge, and Linux knowledge in particular, if you want to work out the kinks of running it on anything more recent than the recommended Ubuntu 14.04.

Testing it on the approved version, I liked what I saw and promptly embarked on a journey to deploy it on a radically different flavor of Linux, to prove to myself I could hack it, and to understand the guts of this new toy better.

In the next post, I shall narrate the installation of OpenWhisk on the latest version of CentOS. Yum, new skills to be apt at (horrible puns intended).
