Coder & Codable

Working on CredentialsToken, it struck me as inconceivable that we couldn't serialize objects to dictionaries without going through JSON. After all, we had this kind of mapping in Objective-C (kind of), so why not in Swift?

Thus started a drama in 3 acts, one wayyyyyyy more expository than the others.

TL;DR Gimme the code!

Act 1: The Search

Obviously, someone has done it before. Swift is a few years old now and this is something a lot of people need to do (from time to time and only when absolutely needed, admittedly), right? JSON is what it is (🤢) but it's a standard, and we sometimes need to manipulate the data in memory without going through 2 conversions for everything (JSON <-> Data <-> String), right?

Open your favorite search engine and look for some Encoder class that's not JSON or Property List. I'll wait. Yeah. As of this writing, there's only one, and I'm not sure what it does exactly: EmojiEncoder.

So, the next step is the Scouring of Stack Overflow. There are plenty of questions pertaining to that problem, and almost every single answer is along the lines of "look at the source code for JSONEncoder/JSONDecoder, it shouldn't be so hard to make one". But I haven't seen anyone actually publish one.

Looking at the source code for JSONDecoder is, however, a good idea. Let's see if it's as simple as the "it's obvious" gang makes it out to be.

Act 2: The Source

The JSONEncoder/JSONDecoder source is located here.

It's well documented and well referenced, and has to handle a ton of edge cases thanks to the formless nature of JSON itself (🤢).

To all of you who can read this 2500+ lines swift file and go "oh yea, it's obvious", congratulations, you lying bastards.

A Bit of Theory

At its heart, any parser/generator pair is usually a recursive, stack-based algorithm: let's look at a couple step-by-step examples.

Let's imagine a simple arithmetic program that needs to read text input or spit text out. First, let's look at the data structure itself. Obviously, it's not optimal, and you'd need to add other operations, clump them together under an Operation super-type for maximum flexibility, etc etc.

protocol Arith {
    func evaluate() -> Double
}

struct Value : Arith {
    var number : Double
    
    func evaluate() -> Double {
        return number
    }
}

struct OpPlus : Arith {
    var left : Arith
    var right : Arith
    
    func evaluate() -> Double {
        return left.evaluate() + right.evaluate()
    }
}

let op = OpPlus(left: OpPlus(left: Value(number: 1), right: Value(number: 1)), right: OpPlus(left: Value(number: 1), right: Value(number: 1)))

op.evaluate() // 4

How would we go about printing what that might look like as user input? Because those last couple of lines are going to get our putative customers in a tizzy...

"Easy", some of you will say! Just a recursive function, defined in the protocol that would look like this:

    func print() -> String

In Value, it would be implemented thus:

    func print() -> String {
        return String(number)
    }

And in OpPlus:

    func print() -> String {
        return "(" + left.print() + " + " + right.print() + ")"
    }

The end result for the example above would be "((1.0 + 1.0) + (1.0 + 1.0))"

The stack here is implicit: it's actually the call stack. left.print() is called before returning, its result is stored on the call stack, and when it's time to assemble the final product, it is popped and used.

That's the simple part; anyone with some experience in formats will have done this a few times, especially if they needed to output some debug string in a console. Two things to keep in mind:

  • we didn't have to manage the stack
  • there is no optimization of the output (we left all the parentheses, even though they weren't strictly needed)

How would we go about doing the reverse? Start with "((1.0 + 1.0) + (1.0 + 1.0))" and build the relevant Arith structure out of it? Suddenly, all these implicit things have to become fully explicit, and a lot fewer people have done it.

Most of the developers who've grappled with this problem ended up using yacc and lex variants, which automate big parts of the parsing and make a few things implicit again. But for funsies, we'll try and think about how those things would work in an abstract (and simplified) way.

I'm a program reading that string. Here's what happens (a rough Swift sketch follows the walkthrough):

  • An opening parenthesis! This is the beginning of an OpPlus, I'll create a temporary one, call it o1 and put it on the stack.
  • Another... Damn. OK, I'll create a second one, call it o2, put it on the stack.
  • Ah! a number! So, this is a Value. I'll create it as v1 and put it on the stack
  • A plus sign. Cool, that means that whatever I read before is the left side of an OpPlus. What's the currently investigated operation? o2. OK then, o2.left = v1
  • Another number. It's v2
  • Closing parenthesis! Finally. So the most recent OpPlus should have whatever is on top of the stack as the right side of the plus. o2.right = v2, and now the operation is complete, so we can pop it and carry on. We remove v1 and v2 from the stack.
  • A plus sign! Really? Do I have an open OpPlus? I do! it's o1, and it means that o2 is its left side. o1.left = o2
  • and we continue like this...
(I know actual LALR engineers are screaming at the computer right now, but it's my saga, ok?)
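
Here's a rough Swift sketch of that idea, reusing the Arith, Value and OpPlus types from above. It's a slightly lazier variant of the walkthrough (it only reduces when it sees a closing parenthesis), it assumes well-formed, fully parenthesized input, and it skips error handling almost entirely; Token, tokenize and parse are names made up for the occasion.

enum Token {
    case open, close, plus
    case number(Double)
}

func tokenize(_ input: String) -> [Token] {
    var tokens: [Token] = []
    var buffer = ""
    func flushNumber() {
        if let d = Double(buffer) { tokens.append(.number(d)) }
        buffer = ""
    }
    for c in input {
        switch c {
        case "(": flushNumber(); tokens.append(.open)
        case ")": flushNumber(); tokens.append(.close)
        case "+": flushNumber(); tokens.append(.plus)
        case " ": flushNumber()
        default: buffer.append(c) // digits and "."
        }
    }
    flushNumber()
    return tokens
}

func parse(_ input: String) -> Arith? {
    var stack: [Arith] = [] // the stack is explicit this time
    for token in tokenize(input) {
        switch token {
        case .number(let d):
            stack.append(Value(number: d))
        case .close:
            // a closing parenthesis reduces the two topmost operands into an OpPlus
            guard stack.count >= 2 else { return nil }
            let right = stack.removeLast()
            let left = stack.removeLast()
            stack.append(OpPlus(left: left, right: right))
        case .open, .plus:
            break // in this toy grammar, the structure is carried entirely by the parentheses
        }
    }
    return stack.count == 1 ? stack.first : nil
}

parse("((1.0 + 1.0) + (1.0 + 1.0))")?.evaluate() // 4.0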

It's not quite as easy as a recursive printing function, now, is it? And this example doesn't even begin to touch on most parsing issues, such as grammar variants, flexible whitespace, and malformed expressions.

Why Is It Relevant?

The Encoder/Decoder paradigm of Swift 4 borrows very heavily from this concept. You "consume" input and spit out the transformed output if and when there is no error in the structure, recursively and using a stack. In the JSON implementation, you can see clearly that the *Storage classes are essentially stacks. The encode functions take items of a given structure, disassemble them, and put them on the stack, which is collapsed at the end to produce whatever it is you wanted as output, while the decode functions check that items on the stack match what is expected and pop them as needed to assemble the structures.

The main issue that these classes have to deal with is delegation.

The core types (String, Int, Bool, etc.) are easy enough because there aren't many ways to serialize them. Some basic types, like Date, are tricky, because they can be mapped to numbers (epoch, time since a particular reference date, etc.) or to strings (ISO 8601, for instance), and have to be dealt with explicitly.
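
For comparison, this is exactly the choice JSONEncoder hands back to you through its (real) dateEncodingStrategy API; the snippet below is just an illustration of that decision:

import Foundation

let encoder = JSONEncoder()
encoder.dateEncodingStrategy = .iso8601              // dates as ISO 8601 strings...
// encoder.dateEncodingStrategy = .secondsSince1970  // ...or as epoch numbers
let payload = try! encoder.encode(["timestamp": Date()])
print(String(data: payload, encoding: .utf8)!)       // something like {"timestamp":"2018-06-01T12:00:00Z"}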

The problem lies with collections, i.e. arrays and dictionaries. You may look at JSON and think objects are dictionaries too, but it's not quite the case... 🤮

Swift solves this by differentiating between 3 coding subtypes:

  • single value (probably a core type, or an object)
  • unkeyed (an array of objects) - which is a misnomer, since it has numbers as keys
  • keyed (a dictionary of objects, keyed by strings)

Any Encoder and Decoder has to manage all three kinds. The recursive part is that, with high probability, a Codable object will be represented by a keyed container, with the property names as keys and the property values attached to them.

Our Value struct would probably be represented at some point by something that looks like ["number":1], and one of the simplest OpPlus by something like ["left":["number":1], "right":["number":1]]. See the recursion now? Not to mention, any property could be an array or a dictionary of Codable structures.
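
Just to visualize that shape, here's the JSON round trip I was grumbling about earlier, using throwaway Codable stand-ins (PlainValue and PlainPlus are made up for this example; OpPlus itself can't be Codable as written, since its properties are protocol-typed):

import Foundation

struct PlainValue: Codable {
    var number: Double
}

struct PlainPlus: Codable {
    var left: PlainValue
    var right: PlainValue
}

let plus = PlainPlus(left: PlainValue(number: 1), right: PlainValue(number: 1))

// Codable -> Data -> [String: Any]: the double conversion we'd like to avoid.
let data = try! JSONEncoder().encode(plus)
let dict = try! JSONSerialization.jsonObject(with: data) as? [String: Any]
print(dict ?? [:]) // ["left": ["number": 1], "right": ["number": 1]] (key order may vary)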

Essentially, you have 4 classes (more often than not, the single value is implemented in the coder itself, making it only 3 classes) that will be used to transcode our input, through the use of a stack, depending on what the input type is (a bare skeleton of the encoder side follows the list):

  • if it's an array, we go with an UnkeyedEncodingContainer
  • if it's a dictionary, we go with a KeyedEncodingContainerProtocol implementation
  • if it's an object, we go with a SingleValueEncodingContainer
    * if it's a core type, we stop the recursion and push a representation on the stack, or pop it from the stack
    * if it's a Codable object, we start a keyed process on the sub-data
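
Concretely, anything that wants to call itself an Encoder has to be able to vend all three container kinds. Here's the bare protocol surface as a skeleton only (SkeletonEncoder is a made-up name; the containers themselves, and the storage stack they push onto, are the actual work and are left out):

// Bare skeleton of a custom Encoder: only the three container factory methods.
class SkeletonEncoder: Encoder {
    var codingPath: [CodingKey] = []
    var userInfo: [CodingUserInfoKey: Any] = [:]

    func container<Key: CodingKey>(keyedBy type: Key.Type) -> KeyedEncodingContainer<Key> {
        // dictionaries and Codable objects go through here
        fatalError("left as an exercise")
    }

    func unkeyedContainer() -> UnkeyedEncodingContainer {
        // arrays go through here
        fatalError("left as an exercise")
    }

    func singleValueContainer() -> SingleValueEncodingContainer {
        // core types go through here, which is where the recursion stops
        fatalError("left as an exercise")
    }
}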

Said like that, it's easy enough. Is coding it easy?

Act 3: The Code

You have managed to wade through all this without having to pop pill after pill, either to stay awake or to diminish the planetary size of your headache? Congratulations!

So, is it that easy? Yes and no. All of the above does let you follow along with the code and understand how it works, but there are a few caveats when writing the Codable <-> [String:Any?] classes. It's all about the delegation, and the (not so) subtle difference between an object and a dictionary.

If we look at our Value structure, it is "obvious" that it is represented by something like ["number":1]. What if we have nullable properties? What do we do with [] or ["number":1,"other":27]? The class with its properties and the dictionary are fundamentally different types, even though mapping classes to dictionaries is way easier than the reverse. On the other hand, type assumptions on dictionaries are way easier than on structures. All 3 examples above are indubitably dictionaries, whereas deciding that any of them is "like a Value" is a lot harder.
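
To make that last point concrete, here's how JSONDecoder itself arbitrates those three shapes, using a throwaway ValueLike stand-in:

import Foundation

// A throwaway stand-in with the same single property as our Value.
struct ValueLike: Codable {
    var number: Double
}

let decoder = JSONDecoder()
func decodeValueLike(_ json: String) -> ValueLike? {
    return try? decoder.decode(ValueLike.self, from: json.data(using: .utf8)!)
}

decodeValueLike("{\"number\":1}")              // fine, the "obvious" shape
decodeValueLike("{\"number\":1,\"other\":27}") // also fine: the extra key is silently ignored
decodeValueLike("[]")                          // nil: wrong container kind altogether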

Enter the delegation mechanism. There is no way for a generic encoder/decoder to know how many properties a structure has and what their types may be. So, the Codable protocol requires your type to explain how to map itself to a keyed system, through the init(from: Decoder) initializer and the encode(to: Encoder) function.

If you've never seen them, it's because you can afford to use only structs, which generate them automagically (you bastard).

In essence, those functions ask you to take your properties (which have to be Codable themselves) and provide a key under which to store or retrieve them. You are the one who ensures that the dictionary mapping makes sense.
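
If you've never written them by hand, here's roughly what the compiler synthesizes for a small struct, spelled out manually (CodableValue and its properties are just for illustration):

struct CodableValue: Codable {
    var number: Double
    var label: String? // a nullable property, to show the *IfPresent variants

    enum CodingKeys: String, CodingKey {
        case number
        case label
    }

    init(number: Double, label: String? = nil) {
        self.number = number
        self.label = label
    }

    // What the compiler generates for plain structs, written out by hand:
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        number = try container.decode(Double.self, forKey: .number)
        label = try container.decodeIfPresent(String.self, forKey: .label)
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(number, forKey: .number)
        try container.encodeIfPresent(label, forKey: .label)
    }
}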

Conclusion, Epilogue, All That Jazz

OK, so, either I'm dumb and it really was obvious, and it just so happens that after 5 years no one ever coded it because no one needed it, or everyone has their own "obvious" implementation and never published it. Or I'm not that dumb, and that project will serve a purpose for somebody.

There are, however, a few particularities to my implementation that stem from choices I made along the way.

Certain types are "protected", that is, they aren't (de)coded using their own implementation of Codable. For instance, Date is normally transformed into the number of seconds since its reference date, but given that we serialize to and from dictionaries in memory, there's no need to do that. They are considered "core" types, even though they aren't labelled as such in the language. Those exceptions include:

  • Date / NSDate
  • Data / NSData
  • URL / NSURL
  • Decimal / NSDecimalNumber

Unlike with JSON, they don't need to be transformed into an adjacent type; they are therefore allowed to retain their own.

The other elephant in the room is polymorphic in nature: if I allow decoding attempts of Any, or encoding attempts of Any, my functions can look wildly different:

  • decode can return a Codable, an array of Codable or a dictionary with Codable values
  • same goes for encode, which should consume all 3 variants, plus nil or nullable parameters.

There is therefore an intermediary type to manage those cases. It's invisible from the outside in the case of decode, the function itself deciding what it's dealing with, but for encode, the function needs to return a polymorphic type, rather than an Any?.

My choice has been to use the following enumeration:

public enum CoderResult {
    case dictionary([String:Any?])
    case array([Any?])
    case single(Any)
    case `nil`
}

With associated values, you know exactly what you're getting:

public func encode<T : Encodable>(_ value: T) throws -> CoderResult { ... }

let r = (try? DictionaryCoder.encode(v)) ?? .nil
switch r {
    case .dictionary(let d): break // d contains the [String:Any?] representation
    case .array(let a): break // a contains the [Any?] representation
    case .single(let v): break // v is the single value, which is kind of useless but there nonetheless
    case .nil: break // no output
}

The repository is available on GitHub.


We Suck As An Industry...

... primarily because we don't want to be an "industry".

XKCD's take on software
(from XKCD)

Let's face it: Computers are everywhere, and there are good reasons for that. Some bad ones as well, but that's for another time.

For a very long time, computers were what someone might call "force multipliers". It's not that you couldn't do your job without a computer, they just made it incredibly easier. Gradually, they became indispensable. Nowadays, there are very few jobs you can do without a computer.

Making these computers (and the relevant software) went from vaguely humanitarian (but mostly awesome nerdiness) to a hugely profitable business, managing the addiction of other businesses. Can you imagine a factory today that would go "look, those computer thingies are too expensive, complex, and inhumane, let's get back to skilled labor"?

And therefore, a "certified" computer for a doctor's office costs somewhere in the range of 15k, for a hefty 750% profit margin.

It's just market forces at play, supply and demand, some will say. After all, there are huge profit margins on lots of specialized tools that are indispensable. And I won't debate that. But I'll argue that we can't square the circle between being cool nerds with our beanbags and "creative environments", and being one of the most profitable businesses out there.

One of the problems is that, because there is a lot of money in our industry, we attract workers who aren't into the whole nerd culture, and that causes a clash. We have no standards, no ethical safeguards, no safety nets. We never evolved past the "computer club" mentality where everything is just "chill, dude". We never needed to, because all someone has to do, if they don't feel like belonging to that particular group, is move to another one. And for a lot of us, the job is still about being radical innovators, not purveyors of useful stuff.

Burnout is a rampant issue, bugs cost lives, the overall perceived quality of the tools decreases, but hey, we get paid for our hobby, so it's all right.

I have never seen any studies on that either, but my feeling is that because the techies don't actually want to be part of an "industry" ("we want to revolutionize the world, man"), the "jocks" and the money people rise to management positions, which skews the various discriminations our field is famous for towards the bad. I am not excusing the nerds for being awful to women. But, from experience, they tend to be that way by mistake, not by malice, whereas the people who take over for power and money reasons have more incentive to be jerks in order to amass more power or money.

It's high time we, as a profession, realize we are a business like any other, and start having standards. Quality, ethics and stability are needed in every other industry. There are safeguards and "normal rules of conduct" in automobiles, architecture/building, even fricking eating-utensil manufacturing. Why is it that we continue valuing "disruption" and "bleeding edge-ness" more than safety and guarantees?


CredentialsToken

For a couple of projects, I needed a reusable username/password + token authentication system in Swift.

I like Kitura a lot, so I decided to write my own plugin for that purpose in its ecosystem.

Use it as you will, feedback appreciated

CredentialsToken <- Here at version 0.1.0


Script Kiddies

grep -r wp-login /var/log/http/ | grep 404 | wc -l
10765

That's the number of requests trying to brute-force a login/password (with automatic banning rules on the IP after 5 404s on any request that contains the word login, thanks fail2ban) since I put up the new blog thing, a little less than 3 months ago.

That's more than 120 attempts per day.

Is that blog popular? No. Is that server critical to some widely used service? No. Is there a risk? Probably, but I have a past in that domain and I like to think I take more precautions than most (now I'm gonna get hit hard for sure...).

So why do I react to it, then?

Because brute-forcing is a stupid way to hack into things: it takes a lot of resources and time (even if the effort to code such a brute-force attack is minimal). It's lazy, with a very low probability of working. So why is such an uninteresting target under a constant barrage of stupid attempts?

Because sometimes, it works. The only reason you would have (and presumably pay for) a bunch of machines using up so many resources to do something that dumb is that you hit paydirt often enough to make it worth it.

It says more about the general state of server security than about the relative intelligence of people trying to break that security, and it's chilling.


ML Is Looking Over My Shoulder

A super fascinating attempt at creating a model that uses neural networks to look for errors and/or style issues in code, in this article from Sam Gentle:

Machine learning is the kind of thing where you can get a tantalising result in a week, and then spend years of work turning it into something reliable enough to be useful. To that end, I hereby provide a tantalising result and leave the years of work as an exercise for the reader.

Obviously, like any other piece of software... 😂