Double Precision (Not)

From this list, the gist is that most languages can't process 9999999999999999.0 - 9999999999999998.0

Why do they output 2 when it should be 1? I bet most people who've never done any formal CS (a.k.a. maths and information theory) are super surprised.

Before you read the rest, ask yourself this: if all you have are zeroes and ones, how do you handle infinity?

If we fire up an interpreter that outputs the value when it's typed (like the Swift REPL), we have the beginning of an explanation:

Welcome to Apple Swift version 4.2.1 (swiftlang-1000.11.42 clang-1000.11.45.1). Type :help for assistance.
  1> 9999999999999999.0 - 9999999999999998.0
$R0: Double = 2
  2> let a = 9999999999999999.0
a: Double = 10000000000000000
  3> let b = 9999999999999998.0
b: Double = 9999999999999998
  4> a-b
$R1: Double = 2

Whew, it's not that the languages can't handle a simple subtraction, it's just that a is typed as 9999999999999999 but stored as 10000000000000000.

If we used integers, we'd have:

  5> 9999999999999999 - 9999999999999998
$R2: Int = 1
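That works because a 64-bit Int represents every whole number in its range exactly, and our operands fit with plenty of room to spare:

Int.max   // 9223372036854775807 on 64-bit platforms, comfortably bigger than 9999999999999999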

Are the decimal numbers broken? 😱

A detour through number representations

Let's look at a byte. This is the fundamental unit of data in a computer and is made of 8 bits, all of which can be 0 or 1. It ranges from 00000000 to 11111111 (0x00 to 0xff in hexadecimal, 0 to 255 in decimal, homework as to why and how it works like that due by Monday).

Put like that, I hope it's obvious that the question "yes, but how do I represent the integer 999 on a byte?" is meaningless. You can decide that 00000000 means 990 and count up from there, or you can associate arbitrary values to the 256 possible combinations and make 999 be one of them, but you can't have both the 0 - 255 range and 999. You have a finite number of possible values and that's it.

Of course, that's on 8 bits (hence the 256 color palette on old games). On 16, 32, 64 or bigger width memory blocks, you can store up to 2ⁿ different values, and that's it.
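You can ask Swift for those limits directly; the fixed-width integer types spell them out:

UInt8.max    // 255                    -> 2⁸  possible values (0...255)
UInt16.max   // 65535                  -> 2¹⁶ possible values
UInt32.max   // 4294967295             -> 2³² possible values
UInt64.max   // 18446744073709551615   -> 2⁶⁴ possible values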

The problem with decimals

While it's relatively easy to grasp the concept of infinity by looking at "how high can I count?", it's less intuitive to notice that there are also infinitely many numbers between 0 and 1 (more of them, in fact, than there are integers).

So, if we have a finite number of possible values, how do we decide which ones make the cut when talking decimal parts? The smallest? The most common? Again, as a stupid example, on 8 bits:

  • maybe we need 0.01 ... 0.99 because we're doing accounting stuff
  • maybe we need 0.015, 0.025,..., 0.995 for rounding reasons
  • We'll just encode the integer part on 8 bits (0 - 255), and the decimal part as above

But that's already 99+99 = 198 values taken up. That leaves us 58 possible values for the rest of infinity. And that's not even mentioning the totally arbitrary nature of the selection. This way of representing numbers is historically the first one and is called "fixed-point" representation. There are many ways of choosing how the decimal part behaves and a lot of headache when coding how the simple operations work, not to mention the complex ones like square roots and powers and logs.
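To make that concrete, here is a minimal fixed-point sketch in Swift (ToyFixed is made up for illustration; real fixed-point types worry a lot more about overflow and rounding):

// A toy fixed-point type: one byte for the integer part (0...255),
// one byte for the hundredths (0...99 by convention). Nothing else is representable.
struct ToyFixed {
    var units: UInt8       // 0...255
    var hundredths: UInt8  // 0...99

    var asDouble: Double {
        return Double(units) + Double(hundredths) / 100.0
    }

    // Even addition has to handle the carry from the decimal part by hand.
    static func + (lhs: ToyFixed, rhs: ToyFixed) -> ToyFixed {
        let cents = UInt16(lhs.hundredths) + UInt16(rhs.hundredths)
        let carry = cents / 100
        let totalUnits = UInt16(lhs.units) + UInt16(rhs.units) + carry   // can exceed 255 and wrap!
        return ToyFixed(units: UInt8(totalUnits & 0xff), hundredths: UInt8(cents % 100))
    }
}

let price = ToyFixed(units: 19, hundredths: 99)
let tax = ToyFixed(units: 1, hundredths: 5)
(price + tax).asDouble   // 21.04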

Floats (IEEE 754)

To make it simple for chips that perform the actual calculations, floating point numbers (that's their name) have been defined using two parameters:

  • an integer n
  • a power (of base b) p

Such that we can have n × bᵖ, for instance 15.3865 is 153865 × 10⁻⁴. The question is, how many bits can we use for the n and how many for the p.

The standard is to use 1 bit for the sign (+ or -), 23 bits for n, and 8 bits for p, which makes 32 bits total (we like powers of two), with base 2, and n actually interpreted as 1.n. That gives us ~8 million possible values for n, and powers of 2 from -126 to +127, the remaining exponent patterns being reserved for special cases like infinity and NotANumber (NaN).

$$(-1~or~1)(2^{[-126...127]})(1.[one~of~the~8~million~values])$$

In theory, we have numbers from roughly 10⁻⁴⁵ to 10³⁸ in magnitude, but some numbers can't be represented in that form. For instance, if we look at the largest number smaller than 1, it's 0.9999999404. Anything between that and 1 has to be rounded. Again, infinity can't be represented by a finite number of bits.
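Swift will happily show you those limits; these are standard library properties:

Float.greatestFiniteMagnitude   // ~3.4028235e+38, the upper end of "roughly 10³⁸"
Float.leastNonzeroMagnitude     // ~1.4e-45, the smallest positive (denormal) value
(1.0 as Float).nextDown         // 0.99999994, the largest Float below 1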

Doubles

The floats allow for "easy" calculus (by the computer at least) and are "good enough" with a precision of 7.2 decimal places on average. So when we needed more precision, someone said "hey, let's use 64 bits instead of 32!". The only thing that changes is that n now uses 52 bits and p 11 bits.

Incidentally, double refers more to double size than to double precision, even though the number of decimal places does jump to 15.9 on average.

We still have 2³² times more values to play with, and that does fill some annoying gaps in the infinity, but not all of them. Famously (and annoyingly), 0.1 doesn't work at any precision size because of the base 2. In 32-bit float, it's stored as 0.100000001490116119384765625, like this:

(1)(2⁻⁴)(1.600000023841858)
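You can verify that decomposition from the REPL; sign, exponent, and significand are standard properties on Float:

let tenth: Float = 0.1
tenth.sign          // .plus
tenth.exponent      // -4
tenth.significand   // 1.6 as displayed, really 1.600000023841858..., the "1.n" part

// Reassembling sign × 2^exponent × significand gives back the same stored value.
Float(sign: tenth.sign, exponent: tenth.exponent, significand: tenth.significand)   // 0.1, i.e. the stored 0.100000001490116...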

Similarly, beyond double size (aka doubles), we have quadruple size (aka quads), with 15 and 112 bits for the exponent and the fraction, for a total of 128 bits.

Back to our problem

Our value is 9999999999999999.0. The closest possible value encodable in double size floating point is actually 10000000000000000, which should now make some kind of sense. It is confirmed by Swift when separating the two sides of the calculation, too:

2> let a = 9999999999999999.0
a: Double = 10000000000000000

Our big brain so good at maths knows that there is a difference between these two values, and so does the computer. It's just that, using doubles, it can't store it. Using floats, a will be rounded to 10000000272564224, which isn't exactly better. Quads aren't used regularly yet, so no luck there.
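You can see the root of the problem straight from the standard library: at that magnitude, consecutive doubles are 2 apart, so there is simply no slot for an odd integer like 9999999999999999.

let a = 9999999999999999.0
a            // 1e+16, already rounded to the nearest representable Double
a.ulp        // 2.0, the gap between consecutive Doubles at this magnitude
a.nextDown   // 9999999999999998, the representable neighbour just below
a.nextUp     // 10000000000000002, the representable neighbour just above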

It's funny because this is an operation that we puny humans can do very easily, even those humans who say they suck at maths, and yet those touted computers with their billions of math operations per second can't work it out. Fair enough.

The kicker is, there is a literal infinity of examples such as this one, because trying to represent infinity in a finite number of digits is impossible.


A Practical Use For DictionaryCoding

Those of you who read the previous blog know that I'm a huge fan of OpenWhisk. It's one of those neat new-ish ways to do server side stuff without having to deal with the server too much.

Quick Primer on OpenWhisk

The idea behind serverless technologies is to encapsulate code logic in actions: each action can be written in any language, takes JSON as a parameter, and spits out JSON as a result. You can chain them, and you have mechanisms to trigger them via cron jobs, URLs, or through state modifications.

The example I used back in the day was a scraper that sent a mail if and when a certain page had changed:

cron triggered a lookup; the lookup, written in Swift, checked the HTTP code and content, then passed its findings along to an action written in Python that would compare the contents, if needed, with its cached data stored in an ElasticSearch stack (yes, I like to complicate things), before passing the data needed for a mail on to a PHP action that would send the mail if needed.

It's obviously stupidly convoluted but it highlights the main advantage of using actions: you write them in the language and with the frameworks that suit your needs the best.

One other cool feature of OW is that when you don't use the actions for a while they are automatically spooled down, saving on resources.

OpenWhisk for iOS

There is a library that allows iOS applications to call the actions directly from your code, passing dictionaries and retrieving dictionaries. It's dead simple to use, and provides instant access to the serverside logic without having to go through the messy business of exposing routes, transcoding data in and out of JSON and managing weird edge cases.

For the purpose of a personal project, I am encapsulating access to some Machine Learning stuff on a server through actions that will query databases and models to spit out useful data for an app to use.

I'll simplify for the purpose of this post, but I need to call an action that takes a score (or a level) and responds with a user that is "close enough" to be a good partner:

struct MatchData : Codable {
  let name : String
  let level : Int
  let bridgeID : UUID?
}

My action is called match/player, and my call in the code looks like this (p holds the parameter dictionary, here just a level):

do {
  let p: [String: Any] = ["level": 42]   // hypothetical example parameters for the action
  try WLAppDelegate.shared.whisk.invokeAction(
    qualifiedName: "match/player", 
    parameters: p as AnyObject, 
    hasResult: true, 
    callback: { (result, error) in
      if let error = error {
        // deal with it [1]
      } else if let r = result?["result"] as? [String:Any] {
        // deal with it [2]
      } else if let aid = result?["activationId"] as? String {
        // deal with it [3]
      }
    })
} catch {
  // deal with it [4]
}

A little bit of explanation might be required at this point:

  • [1] is where you would deal with an error that's either internal to OW or your action
  • [2] is the probably standard case. Your action has run, and the result is available
  • [3] is a mechanism that's specific to long running actions: if it takes too long to spool your action up, run it and get the result back, OW will not block you for too long, and will instead give you an ID to query later for the result when it is available. Timeouts, both for responding and for the maximum action running time, can be tweaked in OW's configuration. Your app logic should accommodate that, which you are doing already anyway because you don't always have connectivity, right?
  • [4] is about iOS side errors, like networking issues, certificate issues and the like

That Whole JSON/Dictionary/Codable Thing

So, OW deals with JSONs, which are fine-ish when you have a ton of processing power and don't mind wasting cycles looking up keys and values. The iOS client translates them to [String:Any] dictionaries which are a little bit better but not that much.

In my code I don't really want to deal with missing keys or type mismatches, so I map the result as fast as I can to the actual struct/class:

if let r = result?["result"] as? [String:Any], let d = try? DictionaryCoding().decode(MatchData.self, from: r) {
  // yay I have d.name, d.level and d.bridgeID 
  // instead of those pesky dictionaries
}
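For comparison, here is roughly what the same mapping looks like by hand, with every key and cast checked manually (a sketch, and already the short version):

// The manual equivalent, without DictionaryCoding:
if let r = result?["result"] as? [String: Any],
    let name = r["name"] as? String,
    let level = r["level"] as? Int {
    // bridgeID is optional, and UUIDs usually travel as strings in JSON-ish payloads
    let bridgeID = (r["bridgeID"] as? String).flatMap(UUID.init(uuidString:))
    let d = MatchData(name: name, level: level, bridgeID: bridgeID)
    // ...and all of this has to be kept in sync with the struct by hand, forever
}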

Of course, the downside is that decoders are called twice but:

  • DictionaryCoding is blisteringly fast
  • As long as I pass this object around, I will never have to check keys and values again, which is a huge boost in performance

Why?

My front facing actions are all written in Swift, using the same Codable structures as the ones I expect in the app. That way, I can focus on the logic rather than route and coding shenanigans. OpenWhisk scales with activity from the app, and the app doesn't need a complicated networking layer.

My backend actions responsible for all the heavy lifting are written in whatever I need for that task. Python/TensorFlow, for me, but your mileage may vary.


My Workflow

It's not that I'm paranoid, it's just that I don't see why I wouldn't be careful
(unknown hero)

My entire development workflow is built around tools I can install and maintain myself, insofar as possible. Especially the server side stuff. Students and clients ask, so here goes.

What are we collaborating on?

The first step, and the most important one, is documenting the expectations. We're about to embark on a journey of multiple days/weeks/months together, and we have hopefully agreed on what should be done. Right now we shake hands, and I'm already sure that the variations in our respective minds have started to add up.

In practice, that means getting all the conception documents and all the available documentation we validated into a shared space. Git repo, wiki, shared folder, anything that is accessible at any time, to avoid the "by the way, what did we decide on X?" question when crunch time occurs. If we have already made a decision, it goes into the shared documents. If not, we decide, and then it goes into the shared documents.

Versioning / Issues

I use a self-hosted instance of Gitlab, which gives me git (obviously), but also issues in list or kanban form, access to CI/CD (more on that later), and a wiki. My code, and my customers' data, matter to me. It's not that it's a secret, just that I want to be able to control access.

For commits and tasks, I use Git Flow, or slight variants of it. It makes for easier collaboration if it's a multi-dev project, and allows for easier tracking of issues.

Speaking of issues: Agile wasn't always as heavily codified as it is now, and a simple 4-board system still works just fine:

  • to discuss
  • to do
  • doing
  • done

I guess I could split boards between various domains as well, but I use tags for that (UI, back, etc). It shows at a glance where I am on my project.

Testing, CI/CD, all that jazz

Up till this point, most people were probably nodding along, or arguing about tiny points of detail. The problem with testing (unit tests, UI tests, beta tests, etc etc etc) is that it's heavily dependent on what you are trying to build. If there's an API, then by all means, there should be unit tests on it. Otherwise...

The bottom line for me is this: if there are several people on a project, I want clearly defined ownership. It's not that I won't fix a bug in someone else's code, just that they own it and therefore have to have a reliable way of testing that my fix works.

Coding is very, very artsy. After a while, you can recognize someone's coding style. Anyone who has had to dive into someone else's code knows that it's very hard to mimic exactly how the original author would have written it.

Tests solve part of that problem. My code, my tests. If you fix my code, run my tests, I'm fairly confident that you didn't wreck the whole thing. And that I won't have to spend a couple of hours figuring out what it is that you did.

At interface points, too, tests are super useful. Say I do this part of the app and you do that other part. If we agree on how they should communicate, we can set up tests that validate the "rules". Easy as that.

For mobile app development, the biggest hassle is getting betas into the hands of clients/testers. Thankfully, between TestFlight, fastlane, and all the others, it's not that complex. The issue there is automating the release of new betas. I used to use Jenkins, but sometimes the stuff in Gitlab itself is enough. Mileage will vary.

One thing's for certain: if you test your code manually, it's either a very very very small and simple project, or you're not actually testing it.

Chat / communications

Email sucks. I log in to my computer, see 300+ emails, and am very quickly depressed. SMS/iMessage is for urgent things.

Back in the day, I would have used IRC for teams. Slack came along and everyone uses it. But Slack comes with a lot of unknowns. The Gitlab omnibus installation comes with Mattermost, which is like Slack, but self-hosted. Done.


Coder & Codable

Working on CredentialsToken, it struck me as inconceivable that we couldn't serialize objects to dictionaries without going through JSON. After all, we had this kind of mapping in Objective-C (kind of), so why not in Swift?

Thus started a drama in 3 acts, one wayyyyyyy more expository than the others.

TL;DR Gimme the code!

Obviously, someone has done it before. Swift is a few years old now and this is something a lot of people need to do (from time to time and only when absolutely needed, admittedly), right? JSON is what it is (🤢) but it's a standard, and we sometimes need to manipulate the data in memory without going through 2 conversions for everything (JSON <-> Data <-> String), right?

Open your favorite search engine and look for some Encoder class that's not JSON or Property List. I'll wait. Yea. As of this writing, there's only one, and I'm not sure what it does exactly: EmojiEncoder

So, next step is the Scouring of Stack Overflow. Plenty of questions pertaining to that problem, almost every single answer being along the lines of "look at the source code for JSONEncoder/JSONDecoder, it shouldn't be so hard to make one". But, I haven't seen anyone actually publishing one.

Looking at the source code for JSONDecoder is, however, a good idea. Let's see if it's as simple as the "it's obvious" gang makes it out to be.

Act 2: The Source

The JSONEncoder/JSONDecoder source is located here.

It's well documented and well referenced, and has to handle a ton of edge cases thanks to the formless nature of JSON itself (🤢).

To all of you who can read this 2500+ lines swift file and go "oh yea, it's obvious", congratulations, you lying bastards.

A Bit of Theory

At its heart, any parser/generator pair is usually a recursive, stack-based algorithm: let's look at a couple step-by-step examples.

Let's imagine a simple arithmetic program that needs to read text input or spit text out. First, let's look at the data structure itself. Obviously, it's not optimal, and you would need to add other operations, clump them together under an Operation super-type for maximum flexibility, etc etc.

protocol Arith {
    func evaluate() -> Double
}

struct Value : Arith {
    var number : Double
    
    func evaluate() -> Double {
        return number
    }
}

struct OpPlus : Arith {
    var left : Arith
    var right : Arith
    
    func evaluate() -> Double {
        return left.evaluate() + right.evaluate()
    }
}

let op = OpPlus(left: OpPlus(left: Value(number: 1), right: Value(number: 1)), right: OpPlus(left: Value(number: 1), right: Value(number: 1)))

op.evaluate() // 4

How would we go about printing what that might look like as user input? Because those last couple of lines are going to get our putative customers in a tizzy...

"Easy", some of you will say! Just a recursive function, defined in the protocol that would look like this:

    func print() -> String

In Value, it would be implemented thus:

    func print() -> String {
        return String(number)
    }

And in OpPlus:

   func print() -> String {
        return "(" + left.print() + " + " + right.print() + ")"
    }

The end result for the example above would be "((1.0 + 1.0) + (1.0 + 1.0))"

The stack here is implicit: it's actually the call stack. left.print() is called before returning, the result is stored on the call stack, and when it's time to assemble the final product, it is popped and used.

That's the simple part, anyone with some experience in formats will have done this a few times, especially if they needed to output some debug string in a console. Two things to keep in mind:

  • we didn't have to manage the stack
  • there is no optimization of the output (we left all the parentheses, even though they weren't strictly needed)

How would we go about doing the reverse? Start with "((1.0 + 1.0) + (1.0 + 1.0))" and build the relevant Arith structure out of it? Suddenly, all these implicit things have to become fully explicit, and a lot fewer people have done it.

Most of the developers who've grappled with this problem ended up using yacc and lex variants, which automate big parts of the parsing and make a few things implicit again. But for funsies, we'll try and think about how those things would work in an abstract (and simplified) way.

I'm a program reading that string. Here's what happens:

  • An opening parenthesis! This is the beginning of an OpPlus, I'll create a temporary one, call it o1 and put it on the stack.
  • Another... Damn. OK, I'll create a second one, call it o2, put it on the stack.
  • Ah! a number! So, this is a Value. I'll create it as v1 and put it on the stack
  • A plus sign. Cool, that means that whatever I read before is the left side of an OpPlus. What's the currently investigated operation? o2. OK then, o2.left = v1
  • Another number. It's v2
  • Closing parenthesis! Finally. So the most recent OpPlus should have whatever is on top of the stack as the right side of the plus. o2.right = v2, and now the operation is complete, so we can pop it and carry on. We remove v1 and v2 from the stack.
  • A plus sign! Really? Do I have an open OpPlus? I do! it's o1, and it means that o2 is its left side. o1.left = o2
  • and we continue like this...
(I know actual LALR engineers are screaming at the computer right now, but it's my saga, ok?)

It's not quite as easy as a recursive printing function, now, is it? This example doesn't even begin to touch on most parsing issues, such as variants, extensible white space, and malformed expressions.
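For the curious, here is a minimal recursive-descent sketch of that reader, reusing the Arith types from above (MiniParser is made up for illustration; there's no error reporting, and the stack is, once again, just the call stack, one parseExpr call per open parenthesis):

// A toy parser for strings like "((1.0 + 1.0) + (1.0 + 1.0))", built on the Arith types above.
// No error reporting, no tolerance for malformed input: illustration only.
struct MiniParser {
    private var chars: [Character]
    private var index = 0

    init(_ input: String) { chars = Array(input) }

    private mutating func skipSpaces() {
        while index < chars.count, chars[index] == " " { index += 1 }
    }

    // expr := "(" expr "+" expr ")" | number
    mutating func parseExpr() -> Arith? {
        skipSpaces()
        guard index < chars.count else { return nil }
        if chars[index] == "(" {
            index += 1                                          // consume "("
            guard let left = parseExpr() else { return nil }    // left operand
            skipSpaces()
            guard index < chars.count, chars[index] == "+" else { return nil }
            index += 1                                          // consume "+"
            guard let right = parseExpr() else { return nil }   // right operand
            skipSpaces()
            guard index < chars.count, chars[index] == ")" else { return nil }
            index += 1                                          // consume ")"
            return OpPlus(left: left, right: right)
        }
        // Otherwise, accumulate a number literal.
        var digits = ""
        while index < chars.count, "0123456789.".contains(chars[index]) {
            digits.append(chars[index])
            index += 1
        }
        if let number = Double(digits) { return Value(number: number) }
        return nil
    }
}

var parser = MiniParser("((1.0 + 1.0) + (1.0 + 1.0))")
parser.parseExpr()?.evaluate()   // 4.0

Even in this toy form, all the bookkeeping that the printing function got for free (what am I in the middle of, what comes next) has to be spelled out.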

Why Is It Relevant?

The Encoder/Decoder paradigm of Swift 4 borrows very heavily from this concept. You "consume" input, spitting the transformed output if and when there is no error in the structure, recursively and using a stack. In the JSON implementation, you can see clearly that the *Storage classes are essentially stacks. The encode functions take items of a given structure, disassemble them, and put them on the stack, which is collapsed at the end to produce whatever it is you wanted as output, while decode functions check that items on stack match what is expected and pop them as needed to assemble the structures.

The main issue that these classes have to deal with is delegation.

The core types (String, Int, Bool, etc...) are easy enough because there aren't many ways to serialize them. Some basic types, like Date, are tricky, because they can be mapped to numbers (epoch, time since a particular date, etc) or to strings (ISO 8601, for instance), and have to be dealt with explicitly.

The problem lies with collections, i.e. arrays and dictionaries. You may look at JSON and think objects are dictionaries too, but it's not quite the case... 🤮

Swift solves this by differentiating 3 coding subtypes:

  • single value (probably a core type, or an object)
  • unkeyed (an array of objects) - which is a misnomer, since it has numbers as keys
  • keyed (a dictionary of objects, keyed by strings)

Any Encoder and Decoder has to manage all three kinds. The recursive part of this is that a Codable object will, with high probability, be represented by a keyed container, with the property names as keys and the property values attached to them.

Our Value struct would probably be represented at some point by something that looks like ["number":1], and one of the simplest OpPlus by something like ["left":["number":1], "right":["number":1]]. See the recursion now? Not to mention, any property could be an array or a dictionary of Codable structures.

Essentially, you have 4 classes (more often than not, the single value is implemented in the coder itself, making it only 3 classes) that will be used to transcode our input, through the use of a stack, depending on what the input type is (the corresponding protocol requirements are sketched right after this list):

  • if it's an array, we go with the UnkeyedEncodingContainer protocol
  • if it's a dictionary, we go with the KeyedEncodingContainerProtocol
  • if it's an object, we go with the SingleValueEncodingContainer protocol
    * if it's a core type, we stop the recursion and push a representation on the stack, or pop it from the stack
    * if it's a Codable object, we start a keyed process on the sub-data
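That dispatch mirrors what the standard library itself requires: every custom Encoder has to be able to hand out all three container kinds. This is the Encoder protocol as declared in the standard library (documentation stripped), shown here for reference only:

public protocol Encoder {
    // Where we are in the encoding tree, plus user-provided context.
    var codingPath: [CodingKey] { get }
    var userInfo: [CodingUserInfoKey: Any] { get }

    // The three container kinds discussed above.
    func container<Key>(keyedBy type: Key.Type) -> KeyedEncodingContainer<Key> where Key: CodingKey
    func unkeyedContainer() -> UnkeyedEncodingContainer
    func singleValueContainer() -> SingleValueEncodingContainer
}

The Decoder protocol is the mirror image, with throwing variants of the same three container requests.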

Said like that, it's easy enough. Is coding it easy?

Act 3: The Code

You have managed to wade through all this without having to pop pill after pill to either stay awake or diminish the planetary size of your headache? Congratulations!

So is it that easy? Yes and no. All of the above does let you follow along with how the code works, but there are a few caveats when writing the Codable <-> [String:Any?] classes. It's all about the delegation and the (not so) subtle difference between an object and a dictionary.

If we look at our Value structure, it is "obvious" that it is represented by something like ["number":1]. But what if we have nullable properties? What do we do with [] or ["number":1,"other":27]? The class with its properties and the dictionary are fundamentally different types, even though mapping classes to dictionaries is way easier than the reverse. On the other hand, type assumptions on dictionaries are way easier than on structures. All 3 examples above are indubitably dictionaries, whereas the constraint that any of them be "like a Value" is a lot harder to express.

Enter the delegation mechanism. There is no way for a generic encoder/decoder to know how many properties a structure has and what their types may be. So, the Codable type requires your data to explain the way to map your object to a keyed system, through the init(from: Decoder) initializer and the encode(to: Encoder) function.

If you've never seen them, it's because you can afford to use only structs, which generate them automagically (you bastard).

In essence, those functions ask you to take your properties (which have to be Codable themselves) and provide a key under which to store or retrieve them. You are the one ensuring that the dictionary mapping makes sense.
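To make that concrete, here is roughly what the compiler synthesizes behind the scenes, written out by hand for a made-up Player struct (the type and its properties are just an example):

struct Player: Codable {
    let name: String
    let level: Int

    // The keys under which each property is stored or retrieved.
    enum CodingKeys: String, CodingKey {
        case name
        case level
    }

    init(from decoder: Decoder) throws {
        // Ask the decoder for a keyed container, then pull each property out by key.
        let values = try decoder.container(keyedBy: CodingKeys.self)
        name = try values.decode(String.self, forKey: .name)
        level = try values.decode(Int.self, forKey: .level)
    }

    func encode(to encoder: Encoder) throws {
        // Same dance in reverse: get a keyed container, push each property under its key.
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(name, forKey: .name)
        try container.encode(level, forKey: .level)
    }
}

The CodingKeys enum is the "provide a key" part; the rest is bookkeeping against the keyed container, which is exactly the structure a dictionary-backed coder can hang on to.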

Conclusion, Epilogue, All That Jazz

OK, so either I'm dumb and it really was obvious, and it just so happens that after 5 years no one has ever coded it because no one needed it (or everyone has their own "obvious" implementation and no one published it). Or I'm not that dumb and this project will serve a purpose for somebody.

There are, however, a few particularities to my implementation that stem from choices I made along the way.

Certain types are "protected", that is, they aren't (de)coded using their own implementation of Codable. For instance, Date would normally be transformed into a number (its time interval since a reference date), but given that we serialize to and from dictionaries in memory, there's no need to do that. They are considered "core" types, even though they aren't labelled as such in the language. Those exceptions include:

  • Date / NSDate
  • Data / NSData
  • URL / NSURL
  • Decimal / NSDecimalNumber

Unlike with JSON, they don't need to be transformed into an adjacent type, so they are allowed to retain their own.

The other elephant in the room is polymorphic in nature: if I allow decoding attempts of Any, or encoding attempts of Any, my functions can look wildly different:

  • decode can return a Codable, an array of Codable or a dictionary with Codable values
  • the same goes for encode, which should consume all 3 variants, plus nil or nullable parameters.

There is therefore an intermediary type to manage those cases. It's invisible from the outside in the case of decode, the function itself deciding what it's dealing with, but for encode, the function needs to return a polymorphic type, rather than an Any?.

My choice has been to use the following enumeration:

public enum CoderResult {
    case dictionary([String:Any?])
    case array([Any?])
    case single(Any)
    case `nil`
}

With associated values, you know exactly what you're getting:

public func encode<T : Encodable>(_ value: T) throws -> CoderResult { ... }

let r = (try? DictionaryCoder.encode(v)) ?? .nil   // v being any Encodable value
switch r {
    case .dictionary(let d): break   // d contains the [String:Any?] representation
    case .array(let a): break        // a contains the [Any?] representation
    case .single(let v): break       // v is the single value, which is kind of useless but there nonetheless
    case .nil: break                 // no output
}

The repository is available on GitHub.


We Suck As An Industry...

... primarily because we don't want to be an "industry".

XKCD's take on software
(from XKCD)

Let's face it: Computers are everywhere, and there are good reasons for that. Some bad ones as well, but that's for another time.

For a very long time, computers were what someone might call "force multipliers". It's not that you couldn't do your job without a computer, they just made it incredibly easier. Gradually, they became indispensable. Nowadays, there are very few jobs you can do without a computer.

Making these computers (and the relevant software) went from vaguely humanitarian (but mostly awesome nerdiness) to a hugely profitable business, managing the addiction of other businesses. Can you imagine a factory today that would go "look, those computer thingies are too expensive, complex, and inhumane, let's get back to skilled labor"?

And therefore, a "certified" computer for a doctor's office costs somewhere in the range of 15k, for a hefty 750% profit margin.

It's just market forces at play, supply and demand, some will say. After all, there are huge profit margins on lots of specialized tools that are indispensable. And I won't debate that. But I'll argue that we can't square the circle between being cool nerds with our beanbags and "creative environments", and being one of the most profitable businesses out there.

One of the problems is that, because there is a lot of money in our industry, we attract workers who aren't into the whole nerd culture, and that causes a clash. We have no standards, no ethical safeguards, no safety nets. We never evolved past the "computer club" mentality where everything is just "chill, dude". We never needed to, because all someone has to do if they don't feel like belonging to that particular group is to move to another one. And for a lot of us, the job is still about being radical innovators, not purveyors of useful stuff.

Burnout is a rampant issue, bugs cost lives, the overall perceived quality of the tools decreases, but hey, we get paid for our hobby, so it's all right.

I have never seen any studies on this either, but my feeling is that because the techies don't actually want to be part of an "industry" ("we want to revolutionize the world, man"), the "jocks" and the money people rise to management positions, which skews the various discriminations our field is famous for towards the worse. I am not excusing the nerds for being awful to women. But, from experience, they tend to be that way by mistake, not by malice, whereas the people who take over for power and money reasons have more incentive to be jerks in order to amass more power or money.

It's high time we, as a profession, realized we are a business like any other, and started having standards. Quality, ethics and stability are needed in every other industry. There are safeguards and "normal rules of conduct" in automobiles, architecture/building, even fricking eating utensil manufacturing. Why is it that we continue valuing "disruption" and "bleeding edge-ness" more than safety and guarantees?