From Designer News:

Perhaps the biggest technical obstacle, however, is converting human understanding into robotic intelligence. The intelligence that enables human beings to drive a car is largely taken for granted, and replicating it is proving to be a bigger chore than engineers foresaw.

Color me surprised... 🙄

Two things here:

• "converting human understanding into robotic intelligence" requires knowing what understanding and intelligence are
• human brains are exceptionally good at filtering out the useless (you don't have to think about chewing, until you choke on something), whereas, for now, computers can only deal with the explicit

If you read this blog, you might have seen quite a few articles about things that look like maths. This is my classical training seeping in. But, and that's a big one, just because CS was born out of mathematics doesn't mean that in order to be good at computers, you need to be Ramanujan.

I guess that you can say that technically most of CS runs on mathematics, but it's the kind of maths that mathematicians don't like. The one where you can't use one of the most powerful tools in their arsenal: infinity.

Go back to your last somewhat high-level class involving maths. It's full of little things like "and when you go to infinity" or "when it tends to infinity" or the ever-present "..." that means "carry on to infinity".

When you do a maths proof, either you need infinity (or an infinite process), or you're dealing with a not-that-interesting problem.

#### The infinity issue

For us CS majors, infinity is a tricky concept. We like it in general, but we hate it when we deal with it. The very first case of "bad" infinity every coder faces earlier than they like is the so-called "infinite loop". Of course, it's not really infinite as it is bound to the program that's stuck in it, which probably won't run forever, but everyone agrees it's bad.

We don't do infinity. We have finite amounts of RAM, disk space, time, etc. Even the numbers we manipulate tend to have finite minima and maxima. Of course, that realization often comes at the cost of a major catastrophe, when coders forget that infinity is a thing we can't actually do, but hey...

The main thing you have to take away from all that stuff should be that if you manipulate something that grows all the time, or shrinks all the time, or splits all the time, you're in trouble. Because "all the time" tends to infinity.

So, what parts of maths should you be looking at to improve your CS?

##### The finite arithmetic stuff

Modulo arithmetic (or modular arithmetic) is a way of looking at numbers without infinity. In past articles, I've illustrated (hopefully successfully) that the integer stuff that we do in our programs is always problematic in some way. That is, it's problematic if you don't take the modulus into account:

255_{8 bits} + 1 = 0

A lot of exploits use that bug: if your "admin ID" is 0 and you have UInt16 IDs for your users, then the user who should have been given ID 65536 gets ID 0 instead and becomes admin as well. Maybe you don't plan on having that many users... Buuuuuuuuuuuuuuut...
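To make the wrap-around concrete, here is a minimal Swift sketch. The two allocator functions are hypothetical names made up for this illustration; only the `&+` operator and `addingReportingOverflow` come from the standard library.

```swift
// Hypothetical ID allocators, for illustration only.

/// Wrapping version: `&+` discards the overflow, so the arithmetic is modulo 2^16.
func nextIDWrapping(after last: UInt16) -> UInt16 {
    return last &+ 1
}

/// Checked version: reports the overflow instead of silently wrapping.
func nextIDChecked(after last: UInt16) -> UInt16? {
    let (next, overflowed) = last.addingReportingOverflow(1)
    return overflowed ? nil : next
}

// The user created right after ID 65535 is the interesting one.
print(nextIDWrapping(after: UInt16.max))
print(nextIDChecked(after: UInt16.max) as Any)
```

The checked variant forces the caller to decide what happens when the ID space is exhausted, which is usually what you want for anything security related.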

##### Complexity

Ah, the BigO notation. Even if you've never had to calculate it by hand (I have...), you've seen it in algorithm discussions.

Complexity is about evaluating the resources a program will take. Because we live in a world obsessed with speed, most of you think that complexity is about the speed efficiency of an algorithm. And, fair point, more often than not, people will advance a nice BigO notation measuring the rough estimate of the number of operations your algorithm will take:

Dijkstra \, has \, a \, complexity \, of \, O(|V|^{2})

That means that the number of steps for Dijkstra's algorithm is probably going to be in the order of magnitude of the square of the number of vertices.

That's an interesting metric, because it will tell us roughly what happens in a growth scenario:

• I write my Dijkstra algorithm for my super duper GPS app (I really should use more modern algorithms, but that's the one I know)
• Being a good developer, I test it on a fairly large case, let's say 1000 crossroads
• I time the result to 1s. Cool, good enough
• But with a larger case, say the number of crossroads in a county, the time will grow quadratically:
time(x) = (x/1000)^{2}s \newline time(1000) = 1s \newline time(10000) = 100s \, (almost \, 2 \, min) \newline time(100000) = 10000s \, (almost \, 3 \, hours)

So, okay, maybe you need to optimize your algorithm, or redo it.
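The back-of-the-envelope extrapolation above fits in a few lines of Swift. This is only a sketch: `estimatedSeconds` is a made-up helper, and it assumes the cost really does scale with the square of the input size, which is a rough model rather than a measurement.

```swift
/// Extrapolates a measured running time, assuming the cost grows with the
/// square of the input size (the rough O(n^2) model, not a benchmark).
func estimatedSeconds(for size: Double,
                      measuredSize: Double = 1_000,
                      measuredSeconds: Double = 1) -> Double {
    let ratio = size / measuredSize
    return measuredSeconds * ratio * ratio
}

// Ten times more crossroads costs a hundred times more seconds.
for size in [1_000.0, 10_000.0, 100_000.0] {
    print("\(Int(size)) crossroads: about \(estimatedSeconds(for: size)) s")
}
```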

But one aspect that tends to be overlooked is that BigO notation doesn't only measure time. Complexity also deals with space. Most data is kind of linear in terms of space complexity: you add one more item in your list, it takes up one more "chunk of space", right?

Unless your data happens to have interdependency. Take the crossroad example above... Adding one vertex (node, choice, etc.) also adds at least one edge, because otherwise your crossroad isn't connected to, you know, roads... So, the space your "map" takes is still kind of linear, but it's a multiple of your node count. And as soon as any single bit of data may in theory be connected to every other bit of data, the space it takes could grow quadratically.
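To put numbers on that, here is a small sketch. Both helpers are hypothetical, and the "one slot per vertex plus two per two-way road" accounting is just one plausible way to count an adjacency list, not the only one.

```swift
/// Largest possible number of two-way roads between `vertices` crossroads:
/// every crossroad connected to every other one, n(n-1)/2.
func maxEdges(vertices: Int) -> Int {
    return vertices * (vertices - 1) / 2
}

/// Rough storage count for an adjacency list: one slot per vertex,
/// plus two slots per undirected edge (one in each endpoint's list).
func adjacencyListSlots(vertices: Int, edges: Int) -> Int {
    return vertices + 2 * edges
}

let crossroads = 1_000
let worstCase = maxEdges(vertices: crossroads)
print("worst-case edges:", worstCase)
print("worst-case slots:", adjacencyListSlots(vertices: crossroads, edges: worstCase))
```

A real road map is nowhere near that worst case (a crossroad rarely connects more than a handful of roads), which is why it stays "kind of linear" in practice.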

##### The recursion problem

This leads me to a problem that is ours and not mathematicians'. Because of the way computers work, if you have a recursive function, say

func factorial(_ x: Int) -> Int {
    if x <= 0 { return 1 } // why not
    return x * factorial(x-1)
}


Every time factorial is called, it "saves" in a special bit of memory the state where it was to come back to it later. Here, it saves the x value before doing the new calculation, then restores it to be able to do the multiplication.

Whatever memory that state takes (it's more than just x), this simple function will take x times that memory to perform. And since we don't have infinite memory, at some point, we will have exhausted it, and our program, which is all neat and perfectly reasonable, will crash... if the modulus kink doesn't get us first.
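One way to dodge both problems is to trade the recursion for a loop and check every multiplication. The sketch below is my own variant, not a drop-in replacement for the function above: it returns `nil` on overflow, and the overflow point depends on `Int` being 64 bits wide, which is an assumption about the platform.

```swift
/// Loop-based factorial: constant stack usage, and overflow is reported
/// as `nil` instead of crashing the program.
func factorialIterative(_ x: Int) -> Int? {
    if x <= 1 { return 1 } // same "why not" convention for x <= 0
    var result = 1
    for factor in 2...x {
        let (product, overflowed) = result.multipliedReportingOverflow(by: factor)
        if overflowed { return nil }
        result = product
    }
    return result
}

print(factorialIterative(5) as Any)
print(factorialIterative(25) as Any) // too big for a 64-bit Int
```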

##### Application to the modern Web stuff

We are at a point where we will severely need to optimize our code to use less energy (looking at you, Bitcoin), but that's only one side of the problem. Because most stuff goes through the Internet nowadays, and massive calculation infrastructures are reasonably cheap, we have largely forgotten about the need to optimize for space. You can read everywhere on the web old farts like me complaining about page sizes (for reference, a typical page on this blog weighs about half a meg, with only half of that transmitted on the pipe thanks to compression - the home page of GitHub weighs a whopping 5MB, 4.3MB compressed).

Optimizing page loads means optimizing transfer times. Transfer time is a function of bandwidth and size.

So, your brand new app talks to your server using JSON files, huh? Did you stop to think that JSON grows with the number of characters it contains?

Even a big number such as 981282348613667519 takes only 8 bytes in memory. But in your favorite text format (JSON, XML, and the like), it takes 18 bytes, one per digit, which is more than twice as much. So, if you want to optimize your app's speed, a low-hanging fruit might be to look at ways to transfer as little as possible.
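You can measure the difference yourself. The sketch below compares the in-memory size of an `Int64` with the size of its decimal text; `decimalTextSize` is a made-up helper, and it only counts the digits, not the quotes, keys, and commas a real JSON payload would add on top.

```swift
/// Number of bytes needed to write `value` as decimal text (UTF-8),
/// which is what a JSON or XML payload would carry for it.
func decimalTextSize(_ value: Int64) -> Int {
    return String(value).utf8.count
}

let bigNumber: Int64 = 981282348613667519
print("binary:", MemoryLayout<Int64>.size, "bytes")
print("text:  ", decimalTextSize(bigNumber), "bytes")
```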

##### Other areas of mathematical interest for CS people

Geometry, of course. As long as you put pixels on a screen and have a coordinate system in your user inputs (touch, click, etc), you need your trigonometry.

Dealing with animations and vector graphics means having a basic grasp of Bézier curves, and a tiiiiiiiny teeeeeeeeny bit of taste for derivatives and tangents.

If, for some reason, you need to deal with audio, or any other kind of "waveform", you'll get by a lot better by knowing the basics of Fourier transformation, which, I'll freely admit, hurt my brain really badly when I was learning about them. But you only need to know what they do, not how they work, for most uses we developers have to deal with. Don't let the scary formula scare you, it's just numbers, and no one asks you to prove they work on a mathematical level. If someone is asking that of you, then I hope maths is your thing.

Any kind of modern compression leans heavily on difference stuff, and yes, it helps to have some knowledge in differential mathematics (a.k.a. looking at the evolution of the rate of change). Again, this is a biiiiiiiiiiig and scary field of mathematics to most, but CS can use a fraction of its core concepts to make things work better for us.

If you do 3D, then you live and breathe topology, as well as quaternions, and projection, and there's little I can do for you. Of course, projection stuff is kind of a database thing as well, so if you're doing heavy-duty SQL stuff, you should probably read up on that too.

Machine-learning, and all the other dataset manipulation techniques out there, rely mostly on statistics and algebra.

##### You said maths != CS

I did. In all of these cases above, the simple fact that we just take the infinity away makes it all a lot simpler in some cases, and a bit more complicated in others.

Knowing about the maths that idealize our problem gives us an interesting perspective to work with, but the fact we don't have the luxury of "etc ad infinitum" makes most of the maths we take inspiration from too broad for us to use directly. We only need special cases, we only use a subset, we only take a specific branch of a field.

It's kind of like our instinct that tells us that if we launch a ball straight up in the air, and there's some wind, it might not land exactly where it started from. If we have general knowledge of certain fields of mathematics, we can see the path we expect the ball to take more accurately. We just don't need to know the whole field of fluid dynamics and Newtonian physics to compensate for it, just their general trends.

This project allows to run OCaml programs on PIC microcontrollers.

What a blast from the past. Emmanuel Chailloux (credited as the "meta-supervisor") was my advisor on my CS master's degree, back when I did "industrial" coding (C, Java and Objective-C) during nights, and "fun" research coding in OCaml during days.

Over the past years (decades, who am I kidding?), I've seen the language go from vaguely obscure academic torture device for CS students to the de facto standard for certain types of development (including Facebook's Messenger), and I use it as a counterexample to my own students' "but you're teaching us something we'll never use in real life" kind of objections.

And now, people may start thinking it's a good tool to assail one of the last bastions of "but code should look like the machine executes it", PIC programming. So cool.

More programming languages, more ways to look at a problem, more varied solutions.

Aaaaaah the math tricks! When you know 'em, you love 'em, and when you don't, you pay for extra computing resources.

Today's math trick has to do with averages. Averages are easy, right? You take all the numbers in the list, you sum them, and then you divide by the count... Pft, that's no trick!

A_{list} = \frac{\sum_{i=0}^{list.size - 1} list[i]}{list.size}

Except... there is a little something called overflow. Let's take the case of integers, and let's assume we're working with UInt8 objects. What's the average of [233,212]? It is 222.5 which gets rounded to 223. But our good'ol summation doesn't work:

1> let v1 : UInt8 = 233
2> let v2 : UInt8 = 212
3> let sum = v1 + v2

EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)

Depending on who you ask, 233+212 either wraps around or causes an error. 255 is the maximum value, after which there is nothing. Either way, we wouldn't be happy with the wrap-around either: 233+212 = 445, which wraps to 445 - 256 = 189, which gives an average of 94 when divided by 2.
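Here is a small sketch of both behaviours in Swift. The function names are mine; `&+` is the standard wrapping addition, and the "widen to UInt16" version is the "I can use BIGGER numbers" escape hatch, which only works as long as a bigger type exists.

```swift
/// Naive average using wrapping addition: the sum is taken modulo 256,
/// so the result is wrong as soon as a + b exceeds 255.
func wrappingAverage(_ a: UInt8, _ b: UInt8) -> UInt8 {
    return (a &+ b) / 2
}

/// Same average, computed in a wider type so the sum cannot overflow.
func widenedAverage(_ a: UInt8, _ b: UInt8) -> UInt8 {
    return UInt8((UInt16(a) + UInt16(b)) / 2)
}

print("wrapping:", wrappingAverage(233, 212))
print("widened: ", widenedAverage(233, 212))
```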

Musical Interlude : "Zino, dear, I don't care, I can use BIGGER numbers!"

Yes, you can, up to a point. Most languages have a maximum integer width, and sure, you can probably find unbounded implementations of integers for your language. In Swift, you can check this out, it's really nice. BUT while it's technically possible to handle arbitrary precision, you start hitting all sorts of issues with storing that data ("I love using blobs in my database"), converting it for practical use ("My users can remember 200+ digit numbers easyyyyyyyyy"), etc. Plus, you generally don't want to replace every single Int use in your code by something coming from an external dependency, with all the headache that implies, just for the sake of type safety.

Enter the Maths (royal fanfare ♫♩🎺)

v1+v2 = ({\frac{v1}{2} + \frac{v2}{2}})*2

If I divide by two, then multiply by two, I've done nothing. In the case of integers, it's not quite true, as integer division drops the remainder, but for big numbers, it's not that bad. But what does that give us?

Well...

\frac{v1}{2} + \frac{v2}{2} \lt Int_{max}

The sum of the two halves will fit in an integer, because each is guaranteed to be smaller than half the maximum. Right? Then we can multiply the result by 2 to get the sum, maybe. But it might overflow. Good thing we are trying to get the average, because we were about to divide by two, which cancels out the multiplication.

A_{(233,212)} = ((\frac{233}{2}+\frac{212}{2})*2)/2 = \frac{233}{2}+\frac{212}{2}

Musical Interlude: "Mind => Blown"

Sure, we kind of lose some precision: integer division truncates, so 233/2 gives 116 and the calculated average is 222, rather than the 223 we'd get by rounding 222.5.
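In code, the two-halves trick is a one-liner. The second function in this sketch is a common refinement that is not part of the derivation above: it adds back the unit that gets lost when both inputs are odd, so the result is always the true average rounded down. Both function names are made up for the example.

```swift
/// Average of two bytes using the halves trick: never overflows,
/// but each division drops a possible remainder.
func halvedAverage(_ a: UInt8, _ b: UInt8) -> UInt8 {
    return a / 2 + b / 2
}

/// Refinement (an addition of mine): when both inputs are odd, the two
/// dropped halves add up to a whole unit, so put it back.
func floorAverage(_ a: UInt8, _ b: UInt8) -> UInt8 {
    return a / 2 + b / 2 + (a & b & 1)
}

print(halvedAverage(233, 212), floorAverage(233, 212))
print(halvedAverage(233, 213), floorAverage(233, 213))
```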

Anyways... Onward and upward! What can we do with a big list of numbers? We could use the same trick, and just divide wholesale. The major issue is that we would severely compound the rounding errors. Imagine we're still playing with UInt8 elements and you have 200 of them. Any of them divided by 200 would result in 0 or maybe 1. Your average wouldn't look very good.

Cue the Return of the Maths (royal fanfare ♫♩🎺)

{\sum_{i=0}^{list.size - 1} list[i]} = {\sum_{i=0}^{list.size - 2} list[i]} + list[list.size - 1]

(As in (x+y+z+t) = (x+y+z) + t)

• Okay, and?
• Let's divide by list.size, and we get the average
A_{list} = \frac{\sum_{i=0}^{list.size - 2} list[i] + list[list.size-1]}{list.size}

The top-left part looks familiar, it's almost as if it was the average of the list minus the last element... 😬

All we would need to do is to divide by list.size - 1... But if we multiply and divide by the same thing... 🤔

\frac{(list.size-1)*\frac{\sum_{i=0}^{list.size - 2} list[i]}{list.size-1} + list[list.size-1]}{list.size}

Which is

A_{list} = \frac{(list.size-1)*A_{list-last} + list[list.size-1]}{list.size}

Musical Interlude: Smells like recursion

So... The code will basically look like this:

• If the list is empty (because we're good programmers and handle edge cases), the result is 0
• If the list contains one element, the average is easy
• If the list contains two elements, we can use the divide by two trick, the rounding error shouldn't be that bad
• If the list contains more elements, we do the average by aggregate, and hope the rounding errors will be somewhat contained.

Side note on the rounding errors: the bigger the divisor, the higher the rounding error (potentially). But by doing a rolling average, we have a rounding error that worsens as we go through the list rather than being bad at every step. It's not ideal, but it's still better.
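Before worrying about overflow, it may help to see the recurrence in its most direct form, unrolled into a loop. This is a sketch for small values only: `rollingMean` is a made-up name, and the multiplication inside it is exactly the overflow risk the rest of this article works around.

```swift
/// Direct form of the recurrence
///   mean_k = ((k - 1) * mean_(k-1) + x_k) / k
/// Fine for small numbers; the multiplication can overflow for big ones.
func rollingMean(_ values: [Int]) -> Int {
    var mean = 0
    for (index, value) in values.enumerated() {
        let seen = index // how many elements are already averaged
        mean = (seen * mean + value) / (seen + 1)
    }
    return mean
}

print(rollingMean([2, 4, 9]))
print(rollingMean([10, 20, 30, 40]))
```

Each step only needs the previous mean and the count, which is why the same idea also works on streams where you never hold the whole list.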

So, let's set the stage up: I have a list of big numbers I want the average of.

2988139172152746883
4545331521850540616
5693938727954663282
5884889191787885217
3111881160526182838
8720326064806005009
8427311181199404053
7983003740783657027
2965909035096967706
1211883882534796072
5703029716464526164
8424273336993151821
774296368044414872
14130533330426236
2230589047337383318
8337015733785964014
9153431205551083918
3249272057022384528
8254667294021634003
6758234862357239854

They are all Int64 integers, which is the widest native signed integer type available (Int128 has been coming since 2017). They come from a PostgreSQL database that stores big numbers for a very good reason I won't get into.

Now, if I plug these numbers into an unbounded calculator, the average should be 5221577691680052871.55 or so I'm told.

My recursive Swift function looks like this:

func sumMean(_ input: [Int64]) -> Int64 {
    if input.count == 0 { // uninteresting
        return 0
    }
    if input.count == 1 { // easy
        return input[0]
    }

    // general trick : divide by two (will introduce rounding errors)
    if input.count == 2 {
        let i1 = input[0] / 2
        let i2 = input[1] / 2
        let mean = (i1+i2) // (/2, then *2)
        return mean
    }

    let depth = Int64(input.count) - 1
    // rolling average formula
    let last = input.last!
    var rest = [Int64](input)
    rest = rest.dropLast()

    let restMean = sumMean(rest)
    // should be (depth * restMean + last) / (depth + 1), but overflow...
    let num = (restMean/2) + ((last/2)/depth)
    let res = (num / (depth+1)) * depth * 2
    return res
}

The reason for why num and res exist is left as an exercise.

Here's the calling code and the output:

var numbers : [Int64] = [
    2988139172152746883,
    4545331521850540616,
    5693938727954663282,
    5884889191787885217,
    3111881160526182838,
    8720326064806005009,
    8427311181199404053,
    7983003740783657027,
    2965909035096967706,
    1211883882534796072,
    5703029716464526164,
    8424273336993151821,
    774296368044414872,
    14130533330426236,
    2230589047337383318,
    8337015733785964014,
    9153431205551083918,
    3249272057022384528,
    8254667294021634003,
    6758234862357239854
]

print(sumMean(numbers))
5221577691680052740

As expected, we have rounding errors creeping in. This isn't the exact mean, but it's close enough: the difference is 131.55, which is a whopping 0.0000000000000025193534936693344360660675977627565% deviation.

As a side note, ordering matters:

• the unordered list and the list sorted in ascending order yield the same error
• the list sorted in descending order yields a 169.55 error margin

Given the scale, it's not a big deal, but keep in mind that this trick is only useful for fairly large numbers in a fairly large list, not for the extremes.
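For comparison, here is a sketch of a different technique, not the rolling average above: divide every element by the count up front, and keep track of the remainders separately so nothing is lost. `exactMean` is a made-up name, and the sketch assumes non-negative values (Swift's `%` keeps the sign of the dividend, which would break the bookkeeping).

```swift
/// Mean as an exact quotient and remainder: sum == quotient * count + remainder.
/// Assumes every value is non-negative. Returns nil for an empty list.
func exactMean(_ values: [Int64]) -> (quotient: Int64, remainder: Int64)? {
    guard !values.isEmpty else { return nil }
    let count = Int64(values.count)
    var quotient: Int64 = 0
    var remainder: Int64 = 0
    for value in values {
        precondition(value >= 0, "sketch only handles non-negative values")
        quotient += value / count   // whole part of this element's share
        remainder += value % count  // leftovers, always below 2 * count here
        if remainder >= count {     // carry a full unit into the quotient
            quotient += 1
            remainder -= count
        }
    }
    return (quotient, remainder)
}

if let mean = exactMean([Int64.max, Int64.max - 1]) {
    print("mean: \(mean.quotient), remainder: \(mean.remainder) / 2")
}
```

The running quotient never exceeds the final mean, so it fits in an Int64 whenever the inputs do. The price is that you get a quotient and a remainder rather than a single rounded number, and one pass that needs to know the count in advance.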

Minor updates to better handle setting defaults (was missing) and type coercion.

Grab it while it's hot: repository