A Practical Use For DictionaryCoding

Those of you who read the previous blog post know that I'm a huge fan of OpenWhisk. It's one of those neat new-ish ways to do server-side stuff without having to deal with the server too much.

Quick Primer on OpenWhisk

The idea behind serverless technologies is to encapsulate code logic in actions: each action can be written in any language, takes a JSON document as a parameter, and spits out a JSON document as its result. You can chain them, and you have mechanisms to trigger them via cron jobs, URLs, or state modifications.

The example I used back in the day was a scraper that sent a mail if and when a certain page had changed:

A cron job triggered a lookup action, written in Swift, that checked the HTTP code and content. It passed its findings along to an action written in Python, which compared the contents, if needed, to cached data stored in an ElasticSearch stack (yes, I like to complicate things). That action in turn passed the data needed for a mail to a PHP action that would send the mail if needed.

It's obviously stupidly convoluted but it highlights the main advantage of using actions: you write them in the language and with the frameworks that suit your needs the best.

One other cool feature of OW is that when you don't use the actions for a while they are automatically spooled down, saving on resources.

OpenWhisk for iOS

There is a library that allows iOS applications to call the actions directly from your code, passing dictionaries and retrieving dictionaries. It's dead simple to use, and provides instant access to the serverside logic without having to go through the messy business of exposing routes, transcoding data in and out of JSON and managing weird edge cases.

For the purpose of a personal project, I am encapsulating access to some Machine Learning stuff on a server through actions that will query databases and models to spit out useful data for an app to use.

I'll simplify for the purpose of this post, but I need to call an action that takes a score (or a level) and responds with a user that is "close enough" to be a good partner:

struct MatchData : Codable {
  let name : String
  let level : Int
  let bridgeID : UUID?
}

My action is called match/player, and my call in the code looks like this:

do {
  // p is the parameter dictionary for the action, built elsewhere
  try WLAppDelegate.shared.whisk.invokeAction(
    qualifiedName: "match/player",
    parameters: p as AnyObject,
    hasResult: true,
    callback: { (result, error) in
      if let error = error {
        // deal with it [1]
      } else if let r = result?["result"] as? [String:Any] {
        // deal with it [2]
      } else if let aid = result?["activationId"] as? String {
        // deal with it [3]
      }
    })
} catch {
  // deal with it [4]
}

A little bit of explanation might be required at this point:

  • [1] is where you would deal with an error that's either internal to OW or your action
  • [2] is the probably standard case. Your action has run, and the result is available
  • [3] is a mechanism that's specific to long running actions: if it takes too long to spool your action up, run it and get the result back, OW will not block you for too long, and will give you an ID to query later for the result when it is available. Timeouts both for responding and for the maximum action running time can be tweaked in OW's configuration. Your app logic should accommodate that, which you are already doing anyway because you don't always have connectivity, right?
  • [4] is about iOS side errors, like networking issues, certificate issues and the like
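
To make the branches above a bit more concrete, here is a minimal sketch of how the callback's (result, error) pair could be folded into a single value. InvocationOutcome and classify are hypothetical names of mine, not part of the OpenWhisk client library; the only things taken from the call above are the "result" and "activationId" keys.

```swift
import Foundation

// Hypothetical type: one case per situation described in the list above.
enum InvocationOutcome {
    case failed(Error)                    // [1] OW or the action reported an error
    case finished([String: Any])          // [2] the action ran and returned a result
    case pending(activationId: String)    // [3] still running, query again later
    case unknown                          // none of the expected keys were present
}

// Hypothetical helper: applies the same checks, in the same order, as the callback.
func classify(result: [String: Any]?, error: Error?) -> InvocationOutcome {
    if let error = error {
        return .failed(error)
    }
    if let r = result?["result"] as? [String: Any] {
        return .finished(r)
    }
    if let id = result?["activationId"] as? String {
        return .pending(activationId: id)
    }
    return .unknown
}
```

The callback body then boils down to one switch over classify(result: result, error: error), and the "query later" logic for case [3] lives in exactly one place.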

That Whole JSON/Dictionary/Codable Thing

So, OW deals with JSONs, which are fine-ish when you have a ton of processing power and don't mind wasting cycles looking up keys and values. The iOS client translates them to [String:Any] dictionaries which are a little bit better but not that much.

In my code I don't really want to deal with missing keys or type mismatches, so I map the result as fast as I can to the actual struct/class:

if let r = result?["result"] as? [String:Any], let d = try? DictionaryCoding().decode(MatchData.self, from: r) {
  // yay I have d.name, d.level and d.bridgeID 
  // instead of those pesky dictionaries
}

Of course, the downside is that decoders are called twice but:

  • DictionaryCoding is blistering fast
  • As long as I pass this object around, I will never have to check keys and values again, which is a huge boost in performance
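
For comparison, here is roughly what the same mapping looks like with nothing but Foundation, going from the dictionary back to JSON bytes and then through JSONDecoder. This is a sketch of the detour, not how DictionaryCoding works internally; decodeViaJSON is a hypothetical helper and the payload is made up for the example.

```swift
import Foundation

struct MatchData: Codable {
    let name: String
    let level: Int
    let bridgeID: UUID?
}

// Hypothetical helper: dictionary -> JSON bytes -> Codable type.
// Every call pays for a full serialization followed by a full parse.
func decodeViaJSON<T: Decodable>(_ type: T.Type, from dictionary: [String: Any]) throws -> T {
    let data = try JSONSerialization.data(withJSONObject: dictionary)
    return try JSONDecoder().decode(type, from: data)
}

// Made-up payload standing in for result?["result"]
let payload: [String: Any] = ["name": "Alice", "level": 12]
let match = try decodeViaJSON(MatchData.self, from: payload)
// match.bridgeID is nil here because the key is absent and the property is optional
```

It works, but the extra trip through Data is precisely the kind of wasted cycle a direct dictionary decoder avoids.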


My front facing actions are all written in Swift, using the same Codable structures as the ones I expect in the app. That way, I can focus on the logic rather than route and coding shenanigans. OpenWhisk scales with activity from the app, and the app doesn't need a complicated networking layer.

My backend actions responsible for all the heavy lifting are written in whatever I need for that task. Python/TensorFlow, for me, but your mileage may vary.

My Workflow

It's not that I'm paranoid, it's just that I don't see why I wouldn't be careful
(unknown hero)

My entire development workflow is arrayed around tools I can install and maintain myself, insofar as it's possible. Especially the server side stuff. Students and clients ask, so here goes.

What are we collaborating on?

The first step and the most important one is documenting the expectations. We're about to embark on a multiple days/weeks/months journey together, and we hopefully agreed on what should be done. Right now we shake hands, and I'm already sure that the variations in our respective minds have already started to add up.

The first step is to get all the conception documents and all the available documentation which we validated into a shared space. Git repo, wiki, shared folder, anything that is accessible at any time to avoid the "by the way, what did we decide on X?" question when crunch time occurs. If we have already taken a decision, it goes into the shared documents. If not, we decide, and then it goes into the shared document.

Versioning / Issues

I use a self-hosted instance of GitLab, which gives me git (obviously), but also issues in list or kanban forms, and access to CI/CD (more later), as well as a wiki. My code, and my customers' data, matter to me. It's not that it's a secret, just that I want to be able to control access.

For commits and tasks, I use Git Flow, or slight variants of it. It makes for easier collaboration if it's a multi-dev project, and allows for easier tracking of issues.

Speaking of issues, a simple 4-board system, like the ones we used back before Agile was so heavily codified, works just fine:

  • to discuss
  • to do
  • doing
  • done

I guess I could split boards between various domains as well, but I use tags for that (UI, back, etc). It shows at a glance where I am on my project.

Testing, CI/CD, all that jazz

Up till this point, most people were probably nodding along, or arguing about tiny points of detail. The problem with testing (unit tests, UI tests, beta tests, etc etc etc) is that it's heavily dependent on what you are trying to build. If there's an API, then by all means, there should be unit tests on it. Otherwise...

The bottom line for me is this: if there are several people on a project, I want clearly defined ownership. It's not that I won't fix a bug in someone else's code, just that they own it and therefore have to have a reliable way of testing that my fix works.

Coding is very very artsy. After a while, you can recognize someone's coding style. Anyone who has had to dive into someone else's code knows that it's very hard to mimic exactly how the original author would have written it.

Tests solve part of that problem. My code, my tests. If you fix my code, run my tests, I'm fairly confident that you didn't wreck the whole thing. And that I won't have to spend a couple of hours figuring out what it is that you did.

At interface points too, tests are super useful. Say I do that part of the app and you do that other part of the app. If we agree on how they should communicate, we can set up tests that validate the "rules". Easy as that.

For mobile app development, the biggest hassle is getting betas in the hands of clients/testers. Thankfully, between TestFlight, fastlane, and all the others, it's not that complex. The issue there is automating the release of new betas. I used to use Jenkins, but sometimes, the stuff in GitLab itself is enough. Mileage will vary.

One thing's for certain: if you test your code manually, it's either a very very very small and simple project, or you're not actually testing it.

Chat / communications

Email sucks. I log in to my computer, see 300+ emails, and am very quickly depressed. SMS/iMessage is for urgent things.

Back in the day, I would have used IRC, for teams. Slack came along and everyone uses it. But Slack comes with a lot of unknowns. The GitLab omnibus installation comes with Mattermost, which is like Slack, but self-hosted. Done.

Coder & Codable

Working on CredentialsToken, it struck me as inconceivable that we couldn't serialize objects to dictionaries without going through JSON. After all, we had this kind of mapping in Objective-C (kind of), why not in Swift?

Thus started a drama in 3 acts, one wayyyyyyy more expository than the others.

TL;DR Gimme the code!

Obviously, someone has done it before. Swift is a few years old now and this is something a lot of people need to do (from time to time and only when absolutely needed, admittedly), right? JSON is what it is (🤢) but it's a standard, and we sometimes need to manipulate the data in memory without going through 2 conversions for everything (JSON <-> Data <-> String), right?

Open your favorite search engine and look for some Encoder class that's not JSON or Property List. I'll wait. Yea. As of this writing, there's only one, and I'm not sure what it does exactly: EmojiEncoder

So, next step is the Scouring of Stack Overflow. Plenty of questions pertaining to that problem, almost every single answer being along the lines of "look at the source code for JSONEncoder/JSONDecoder, it shouldn't be so hard to make one". But, I haven't seen anyone actually publishing one.

Looking at the source code for JSONDecoder is, however, a good idea. Let's see if it's as simple as the "it's obvious" gang makes it out to be.

Act 2: The Source

The JSONEncoder/JSONDecoder source is located here.

It's well documented and well referenced, and has to handle a ton of edge cases thanks to the formless nature of JSON itself (🤢).

To all of you who can read this 2500+ line Swift file and go "oh yea, it's obvious", congratulations, you lying bastards.

A Bit of Theory

At its heart, any parser/generator pair is usually a recursive, stack-based algorithm: let's look at a couple step-by-step examples.

Let's imagine a simple arithmetic program that needs to read text input or spit text out. First, let's look at the data structure itself. Obviously, it's not optimal, and you would need to add other operations, clump them together under an Operation super-type for maximum flexibility, etc etc.

protocol Arith {
    func evaluate() -> Double
}

struct Value : Arith {
    var number : Double
    func evaluate() -> Double {
        return number
    }
}

struct OpPlus : Arith {
    var left : Arith
    var right : Arith
    func evaluate() -> Double {
        return left.evaluate() + right.evaluate()
    }
}

let op = OpPlus(left: OpPlus(left: Value(number: 1), right: Value(number: 1)), right: OpPlus(left: Value(number: 1), right: Value(number: 1)))

op.evaluate() // 4

How would we go about printing what that might look like as user input? Because those last couple of lines are going to get our putative customers in a tizzy...

"Easy", some of you will say! Just a recursive function, defined in the protocol that would look like this:

    func print() -> String

In Value, it would be implemented thus:

    func print() -> String {
        return String(number)
    }

And in OpPlus:

    func print() -> String {
        return "(" + left.print() + " + " + right.print() + ")"
    }

The end result for the example above would be "((1.0 + 1.0) + (1.0 + 1.0))"

The stack here is implicit, it's actually the call stack. left.print() is called before returning, the result is stored on the call stack, and when it's time to assemble the final product, it is popped and used.

That's the simple part, anyone with some experience in formats will have done this a few times, especially if they needed to output some debug string in a console. Two things to keep in mind:

  • we didn't have to manage the stack
  • there is no optimization of the output (we left all the parentheses, even though they weren't strictly needed)

How would we go about doing the reverse? Start with "((1.0 + 1.0) + (1.0 + 1.0))" and build the relevant Arith structure out of it? Suddenly, all these implicit things have to become fully explicit, and a lot fewer people have done it.

Most of the developers who've grappled with this problem ended up using yacc and lex variants, which automate big parts of the parsing and make a few things implicit again. But for funsies, we'll try and think about how those things would work in an abstract (and simplified) way.

I'm a program reading that string. Here's what happens:

  • An opening parenthesis! This is the beginning of an OpPlus, I'll create a temporary one, call it o1 and put it on the stack.
  • Another... Damn. OK, I'll create a second one, call it o2, put it on the stack.
  • Ah! a number! So, this is a Value. I'll create it as v1 and put it on the stack
  • A plus sign. Cool, that means that whatever I read before is the left side of an OpPlus. What's the currently investigated operation? o2. OK then, o2.left = v1
  • Another number. It's v2
  • Closing parenthesis! Finally. So the most recent OpPlus should have whatever is on top of the stack as the right side of the plus. o2.right = v2, and now the operation is complete, so we can pop it and carry on. We remove v1 and v2 from the stack.
  • A plus sign! Really? Do I have an open OpPlus? I do! it's o1, and it means that o2 is its left side. o1.left = o2
  • and we continue like this...
(I know actual LALR engineers are screaming at the computer right now, but it's my saga, ok?)

It's not quite as easy as a recursive printing function, now, is it? This example doesn't even begin to touch on most parsing issues, such as variants, extensible white space, and malformed expressions.
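
For the curious, the walkthrough above can be sketched as an explicit-stack parser in well under a hundred lines. This is my own simplified take, under a few stated assumptions: only parentheses, "+", spaces and plain decimal numbers are accepted, every operation must be parenthesized, and numbers are attached to the open operation right away instead of sitting on the stack first. Frame, ParseError and parse are names made up for this example.

```swift
import Foundation

protocol Arith { func evaluate() -> Double }

struct Value: Arith {
    var number: Double
    func evaluate() -> Double { return number }
}

struct OpPlus: Arith {
    var left: Arith
    var right: Arith
    func evaluate() -> Double { return left.evaluate() + right.evaluate() }
}

enum ParseError: Error { case malformed(String) }

// A partially built OpPlus, waiting on the stack for its operands.
struct Frame {
    var left: Arith?
    var right: Arith?
    var seenPlus = false
}

func parse(_ input: String) throws -> Arith {
    var stack: [Frame] = []
    var result: Arith?

    // Hands a finished node to the innermost open operation,
    // or keeps it as the final result when nothing is open.
    func attach(_ node: Arith) throws {
        guard let top = stack.indices.last else {
            guard result == nil else { throw ParseError.malformed("trailing expression") }
            result = node
            return
        }
        if !stack[top].seenPlus && stack[top].left == nil {
            stack[top].left = node
        } else if stack[top].seenPlus && stack[top].right == nil {
            stack[top].right = node
        } else {
            throw ParseError.malformed("unexpected operand")
        }
    }

    let chars = Array(input)
    var i = 0
    while i < chars.count {
        let c = chars[i]
        switch c {
        case " ":
            i += 1
        case "(":
            // Opening parenthesis: a new OpPlus begins, push it.
            stack.append(Frame())
            i += 1
        case "+":
            // Plus sign: whatever came before is the left side of the open operation.
            guard let top = stack.indices.last, stack[top].left != nil, !stack[top].seenPlus else {
                throw ParseError.malformed("unexpected +")
            }
            stack[top].seenPlus = true
            i += 1
        case ")":
            // Closing parenthesis: the operation is complete, pop it and hand it to its parent.
            guard let frame = stack.popLast(), let l = frame.left, let r = frame.right else {
                throw ParseError.malformed("unexpected )")
            }
            try attach(OpPlus(left: l, right: r))
            i += 1
        default:
            // Anything else has to be a number: read it greedily.
            var j = i
            while j < chars.count, chars[j].isNumber || chars[j] == "." { j += 1 }
            guard j > i, let number = Double(String(chars[i..<j])) else {
                throw ParseError.malformed("unexpected character \(c)")
            }
            try attach(Value(number: number))
            i = j
        }
    }
    guard stack.isEmpty, let tree = result else {
        throw ParseError.malformed("unbalanced parentheses")
    }
    return tree
}
```

Feeding it the printed form of the earlier example, try parse("((1.0 + 1.0) + (1.0 + 1.0))").evaluate(), gives back 4. Note how much of the code is error handling, and this is for a grammar with a single operator.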

Why Is it Relevant?

The Encoder/Decoder paradigm of Swift 4 borrows very heavily from this concept. You "consume" input, spitting the transformed output if and when there is no error in the structure, recursively and using a stack. In the JSON implementation, you can see clearly that the *Storage classes are essentially stacks. The encode functions take items of a given structure, disassemble them, and put them on the stack, which is collapsed at the end to produce whatever it is you wanted as output, while decode functions check that items on stack match what is expected and pop them as needed to assemble the structures.

The main issue that these classes have to deal with is delegation.

The core types (String, Int, Bool, etc...) are easy enough because there aren't many ways to serialize them. Some basic types, like Date, are tricky, because they can be mapped to numbers (epoch, time since a particular date, etc) or to strings (iso8601, for instance), and have to be dealt with explicitly.

The problem lies with collections, i.e. arrays and dictionaries. You may look at JSON and think objects are dictionaries too, but it's not quite the case... 🤮

Swift solves this by differentiating 3 coding subtypes:

  • single value (probably a core type, or an object)
  • unkeyed (an array of objects) - which is a misnomer, since it has numbers as keys
  • keyed (a dictionary of objects, keyed by strings)

Any Encoder and Decoder has to manage all three kinds. The recursion part of this is that there is a high probability that a Codable object will be represented by a keyed decoder, with the property names as keys and the attached property values.

Our Value struct would probably be represented at some point by something that looks like ["number":1], and one of the simplest OpPlus by something like ["left":["number":1], "right":["number":1]]. See the recursion now? Not to mention, any property could be an array or a dictionary of Codable structures.

Essentially, you have 4 classes (more often than not, the single value is implemented in the coder itself, making it only 3 classes), that will be used to transcode our input, through the use of a stack, depending on what the input type is:

  • if it's an array, we go with the UnkeyedEncodingContainerProtocol
  • if it's a dictionary, we go with the KeyedEncodingContainerProtocol
  • if it's an object, we go with SingleValueEncodingContainerProtocol
    * if it's a core type, we stop the recursion and push a representation on the stack, or pop it from the stack
    * if it's a Codable object, we start a keyed process on the sub-data
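
To see the three kinds of containers in action, here is a small sketch with one made-up type per kind. Level, Point and Player are hypothetical, and JSONEncoder/JSONDecoder are only used as a convenient stock coder pair to exercise them.

```swift
import Foundation

// Single value container: the whole type is represented by one core value.
struct Level: Codable, Equatable {
    var raw: Int

    init(raw: Int) { self.raw = raw }

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        raw = try container.decode(Int.self)
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(raw)
    }
}

// Unkeyed container: values are written and read back in order, like an array.
struct Point: Codable, Equatable {
    var x: Double
    var y: Double

    init(x: Double, y: Double) {
        self.x = x
        self.y = y
    }

    init(from decoder: Decoder) throws {
        var container = try decoder.unkeyedContainer()
        x = try container.decode(Double.self)
        y = try container.decode(Double.self)
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.unkeyedContainer()
        try container.encode(x)
        try container.encode(y)
    }
}

// Keyed container: the compiler-generated conformance uses the property names as keys,
// then recurses into Level and Point, which pick their own container kinds.
struct Player: Codable, Equatable {
    var level: Level
    var position: Point
}

let player = Player(level: Level(raw: 3), position: Point(x: 1, y: 2))
let data = try JSONEncoder().encode(player)
let decoded = try JSONDecoder().decode(Player.self, from: data)
```

In the encoded output, level ends up as a bare number and position as a two-element array, both sitting inside an object keyed by property name: one coder, three containers, recursion all the way down.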

Said like that, it's easy enough. Is coding it easy?

Act 3: The Code

You have managed to wade through all this without having to pop pill after pill to either stay awake or diminish the planetary size of your headache? Congratulations!

So is it that easy? Yes and no. All of the above does help to follow how the code works, but there are a few caveats when writing the Codable <-> [String:Any?] classes. It's all about the delegation and the (not so) subtle difference between an object and a dictionary.

If we look at our Value structure, it is "obvious" that it is represented by something like ["number":1]. What if we have nullable properties? What do we do with [] or ["number":1,"other":27]? The class with its properties and the dictionary are fundamentally different types, even though mapping classes to dictionaries is way easier than the reverse. On the other hand, type assumptions on dictionaries are way easier than on structures. All 3 examples above are indubitably dictionaries, whereas the constraint on any of them to be "like a Value" is a lot harder.

Enter the delegation mechanism. There is no way for a generic encoder/decoder to know how many properties a structure has and what their types may be. So, the Codable type requires your data to explain the way to map your object to a keyed system, through the init(from: Decoder) initializer and the encode(to: Encoder) function.

If you've never seen them, it's because the compiler generates them automagically when all of your properties are Codable, which is usually the case if you can afford to use only simple structs (you bastard).

In essence, those functions ask you to take your properties (which have to be Codable) and provide a key to store or retrieve them. You are the one who has to ensure that the dictionary mapping makes sense.
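
For reference, this is roughly what those two requirements look like when written by hand for the Value struct from earlier. It is a sketch equivalent to what the compiler would generate; the JSON round trip at the end is only there to exercise it with a stock coder.

```swift
import Foundation

struct Value: Codable {
    var number: Double

    // The keys under which each property is stored or retrieved.
    enum CodingKeys: String, CodingKey {
        case number
    }

    init(number: Double) {
        self.number = number
    }

    // Decoding: ask for a keyed container and pull each property out by its key.
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        number = try container.decode(Double.self, forKey: .number)
    }

    // Encoding: ask for a keyed container and store each property under its key.
    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(number, forKey: .number)
    }
}

let encoded = try JSONEncoder().encode(Value(number: 1))
let roundTripped = try JSONDecoder().decode(Value.self, from: encoded)
```

Nothing in there mentions JSON: the type only talks to containers, which is what makes it possible to swap in a coder that targets dictionaries instead.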

Conclusion, Epilogue, All That Jazz

OK, so, either I'm dumb and it really was obvious, and it so happens that after 5 years no one has coded it because no one needed it, or everyone has their own "obvious" implementation and no one published it. Or I'm not that dumb and this project will serve a purpose for somebody.

There are, however, a few particularities to my implementation that stem from choices I made along the way.

Certain types are "protected", that is they aren't (de)coded using their own implementation of Codable. For instance, Date's own implementation transforms it into the number of seconds since its reference date, but given that we serialize to and from dictionaries in memory, there's no need to do it. They are considered as "core" types, even though they aren't labelled as such in the language. Those exceptions include:

  • Date / NSDate
  • Data / NSData
  • Decimal / NSDecimalNumber

Unlike JSON, they don't need to be transformed into an adjacent type, they are therefore allowed to retain their own.
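
If you want to see what Date does when left to its own Codable implementation, here is a quick check using the stock JSONEncoder, whose default date strategy defers to Date itself. It only illustrates the default behaviour that the "protected" types sidestep; it says nothing about how my coder is implemented.

```swift
import Foundation

// Two dates expressed relative to the reference date (Jan 1, 2001, 00:00:00 UTC)
let dates = [
    Date(timeIntervalSinceReferenceDate: 0),
    Date(timeIntervalSinceReferenceDate: 60)
]

// With the default strategy, each Date is written out as a plain number...
let encodedDates = try JSONEncoder().encode(dates)

// ...which we can read back as Doubles instead of Dates.
let asNumbers = try JSONDecoder().decode([Double].self, from: encodedDates)
```

asNumbers holds 0 and 60: the type information is gone, and only the reader's expectations can turn those numbers back into dates.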

The other elephant in the room is polymorphic in nature: if I allow decoding attempts of Any, or encoding attempts of Any, my functions can look wildly different:

  • decode can return a Codable, an array of Codable or a dictionary with Codable values
  • same goes for encode which should be consuming all 3 variants, plus nil or nullable parameters.

There is therefore an intermediary type to manage those cases. It's invisible from the outside in the case of decode, the function itself deciding what it's dealing with, but for encode, the function needs to return a polymorphic type, rather than an Any?.

My choice has been to use the following enumeration:

public enum CoderResult {
    case dictionary([String:Any?])
    case array([Any?])
    case single(Any)
    case `nil`
}

With attached types, you know exactly what you're getting:

public func encode<T : Encodable>(_ value: T) throws -> CoderResult { ... }

// try? needs its own parentheses, otherwise it swallows the ?? fallback as well
let r = (try? DictionaryCoder.encode(v)) ?? .nil
switch r {
    case .dictionary(let d): break // d contains the [String:Any?] representation
    case .array(let a): break // a contains the [Any?] representation
    case .single(let v): break // v is the single value, which is kind of useless but there nonetheless
    case .nil: break // no output
}

The repository is available on GitHub

We Suck As An Industry...

... primarily because we don't want to be an "industry".

XKCD's take on software
(from XKCD)

Let's face it: Computers are everywhere, and there are good reasons for that. Some bad ones as well, but that's for another time.

For a very long time, computers were what someone might call "force multipliers". It's not that you couldn't do your job without a computer, they just made it incredibly easier. Gradually, they became indispensable. Nowadays, there are very few jobs you can do without a computer.

Making these computers (and the relevant software) went from vaguely humanitarian (but mostly awesome nerdiness) to a hugely profitable business, managing the addiction of other businesses. Can you imagine a factory today that would go "look, those computer thingies are too expensive, complex, and inhumane, let's get back to skilled labor"?

And therefore, a "certified" computer for a doctor's office costs somewhere in the range of 15k, for a hefty 750% profit margin.

It's just market forces at play, supply and demand, some will say. After all, there are huge profit margins on lots of specialized tools that are indispensable. And I won't debate that. But I'll argue that we can't square the circle between being cool nerds with our beanbags and "creative environments", and being one of the most profitable businesses out there.

One of the problems is that, because there is a lot of money in our industry, we attract workers who aren't into the whole nerd culture, and that causes a clash. We have no standards, no ethical safeguards, no safety nets. We never evolved past the "computer club" mentality where everything is just "chill, dude". We never needed to, because all someone has to do if they don't feel like belonging to that particular group, is to move to another one. And for a lot of us, the job is still about being radical innovators, not purveyors of useful stuff.

Burnout is a rampant issue, bugs cost lives, the overall perceived quality of the tools decreases, but hey, we get paid for our hobby, so it's all right.

I have never seen any studies on that either, but my feeling is that because the techies don't actually want to be part of an "industry" ("we want to revolutionize the world, man"), the "jocks" and the money people rise to management positions, which skews the various discriminations our field is famous for towards the bad. I am not absolving the nerds of being awful to women. But, from experience, they tend to be that way by mistake, not by malice, whereas the people who take over for power and money reasons have more incentive to be jerks in order to amass more power or money.

It's high time we, as a profession, realize we are a business like any other, and start having standards. Quality, ethics and stability are needed in every other industry. There are safeguards and "normal rules of conduct" in automobiles, architecture/building, even fricking eating utensils manufacturing. Why is it that we continue valuing "disruption" and "bleeding edge-ness" more than safety and guarantees?


For a couple of projects, I needed a reusable username/password + token authentication system in Swift.

I like Kitura a lot, and decided to write my own plugin for that in this ecosystem.

Use it as you will, feedback appreciated.

CredentialsToken <- Here at version 0.1.0