Maths != CS

If you read this blog, you might have seen quite a few articles about things that look like maths. This is my classical training seeping in. But, and that's a big one, just because CS was born out of mathematics doesn't mean that, in order to be good at computers, you need to be Ramanujan.

I guess you can say that technically most of CS runs on mathematics, but it's the kind of maths that mathematicians don't like. The kind where you can't use one of the most powerful tools in their arsenal: infinity.

Go back to your last somewhat high-level class involving maths. It's full of little things like "and when you go to infinity" or "when it tends to infinity" or the ever-present "..." that means "carry on to infinity".

When you do a maths proof, either you need infinity (or an infinite process), or you're dealing with a not-that-interesting problem.

The infinity issue

For us CS majors, infinity is a tricky concept. We like it in general, but we hate it when we have to deal with it. The very first case of "bad" infinity every coder faces, earlier than they'd like, is the so-called "infinite loop". Of course, it's not really infinite, as it is bound to the program that's stuck in it, which probably won't run forever, but everyone agrees it's bad.

We don't do infinity. We have finite amounts of RAM, disk space, time, etc... Even the numbers we manipulate tend to have finite minima and maxima. Of course, that realization often comes at the cost of a major catastrophe, when coders forget that infinity is a thing we can't actually do, but hey...

The main thing you have to take away from all that stuff should be that if you manipulate something that grows all the time, or shrinks all the time, or splits all the time, you're in trouble. Because "all the time" tends to infinity.

So, what parts of maths should you be looking at to improve your CS?

The finite arithmetic stuff

Modulo arithmetic (or modular arithmetic) is a way of looking at numbers without infinity. In past articles, I've illustrated (hopefully successfully) that the integer stuff that we do in our programs is always problematic in some way. That is, it's problematic if you don't take the modulus into account:

255_{8 \, bits} + 1 = 0

A lot of exploits use that bug: if your "admin ID" is 0 and you have UInt16 IDs for your users, then the 65536th user will be admin as well. Maybe you don't plan on having that many users... Buuuuuuuuuuuuuuut...
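A quick Python sketch of that wraparound (the `next_user_id` helper and the 16-bit ID scheme are made up for illustration):

```python
# Unsigned 8-bit wraparound: 255 + 1 rolls over to 0.
assert (255 + 1) % 256 == 0

# Hypothetical scheme: user IDs are UInt16 and the admin got ID 0.
ADMIN_ID = 0

def next_user_id(counter):
    # A UInt16 keeps only the low 16 bits, i.e. the value modulo 65536.
    return counter % 65536

# Users 1..65535 are fine; the 65536th allocation wraps back to the admin's ID.
assert next_user_id(1) == 1
assert next_user_id(65536) == ADMIN_ID
```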


The complexity stuff

Ah, the BigO notation. Even if you've never had to calculate it by hand (I have...), you've seen it in algorithm discussions.

Complexity is about evaluating the resources a program will take. Because we live in a world obsessed with speed, most of you think that complexity is about the speed efficiency of an algorithm. And, fair point, more often than not, people will quote a nice BigO notation giving a rough estimate of the number of operations your algorithm will take:

Dijkstra \, has \, a \, complexity \, of \, O(|V|^{2})

That means that the number of steps for Dijkstra's algorithm is probably going to be in the order of magnitude of the square of the number of vertices.

That's an interesting metric, because it tells us roughly what happens in a growth scenario:

  • I write my Dijkstra algorithm for my super duper GPS app (I really should use more modern algorithms, but that's the one I know)
  • Being a good developer, I test it on a fairly large case, let's say 1000 crossroads
  • I time the result to 1s. Cool, good enough
  • But with a larger case, say the number of crossroads in a county, the time will grow quadratically:
time(x) = (x/1000)^{2}s \newline time(1000) = 1s \newline time(10000) = 100s \, (almost \, 2 \, min) \newline time(100000) = 10000s \, (almost \, 3 \, hours)

So, okay, maybe you need to optimize your algorithm, or redo it.
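Here's that back-of-the-envelope scaling as a tiny Python function (the 1 s / 1000-crossroads baseline comes from the scenario above):

```python
def predicted_time(n, baseline_n=1000, baseline_s=1.0):
    # For an O(n^2) algorithm, time scales with the square of the input size.
    return baseline_s * (n / baseline_n) ** 2

print(predicted_time(10000))   # 100.0 seconds
print(predicted_time(100000))  # 10000.0 seconds
```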

But one aspect that tends to be overlooked is that BigO notation doesn't only measure time. Complexity also deals with space. Most data is kind of linear in terms of space complexity: you add one more item in your list, it takes up one more "chunk of space", right?

Unless your data happens to have interdependencies. Take the crossroads example above... Adding one vertex (node, choice, etc.) actually adds at least one edge as well, because otherwise your crossroad isn't connected to, you know, roads... So, the space your "map" takes is still kind of linear, but it's a multiple of your node count. And as soon as any single bit of data may in theory be connected to every other bit of data, the space it takes grows quadratically, not linearly.
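To put numbers on that worst case: in a simple undirected graph, every vertex can connect to every other one, which gives a quadratic edge count (a Python sketch):

```python
def max_edges(n):
    # Every vertex can pair with every other one: n * (n - 1) / 2 edges.
    return n * (n - 1) // 2

# 10x the vertices means roughly 100x the potential edges.
print(max_edges(1000))   # 499500
print(max_edges(10000))  # 49995000
```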

The recursion problem

This leads me to a problem that is ours and not mathematicians'. Because of the way computers work, if you have a recursive function, say

func factorial(_ x: Int) -> Int {
  if x <= 0 { return 1 } // why not
  return x * factorial(x - 1)
}
Every time factorial is called, it "saves" its state in a special bit of memory (the call stack) so it can come back to it later. Here, it saves the x value before doing the recursive call, then restores it to be able to do the multiplication.

Whatever memory that state takes (it's more than just x), this simple function will take x times that memory to run. And since we don't have infinite memory, at some point we will have exhausted it, and our program, which is all neat and perfectly reasonable, will crash... if the modulus quirk doesn't get us first.
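The same factorial in Python makes the crash easy to provoke: each recursive call eats a stack frame, and the runtime gives up well before actual RAM runs out. An iterative version sidesteps the problem entirely.

```python
import sys

def factorial(x):
    if x <= 0:
        return 1  # why not
    return x * factorial(x - 1)

# Lower the limit so the failure is quick and repeatable.
old_limit = sys.getrecursionlimit()
sys.setrecursionlimit(100)
try:
    factorial(1000)
except RecursionError:
    print("stack exhausted")  # prints "stack exhausted"
finally:
    sys.setrecursionlimit(old_limit)

# The iterative version uses constant stack space, no matter how big x gets.
def factorial_iter(x):
    result = 1
    for i in range(2, x + 1):
        result *= i
    return result
```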

Application to the modern Web stuff

We are at a point where we will severely need to optimize our code to use less energy (looking at you, bitcoin), but that's only one side of the problem. Because most stuff goes through the Internet nowadays, and massive calculation infrastructures are reasonably cheap, we have largely forgotten about the need to optimize for space. You can read old farts like me everywhere on the web complaining about page sizes (for reference, a typical page on this blog weighs about half a meg, with only half of that transmitted on the pipe thanks to compression - the home page of GitHub weighs a whopping 5MB, 4.3MB compressed).

Optimizing page loads means optimizing transfer times. Transfer time is a function of bandwidth and size.

So, your brand new app talks to your server using JSON files, huh? Did you stop to think that JSON grows with the number of characters it contains?

Even a big number such as 981282348613667519 takes only 8 bytes in memory. But in your favorite text format (JSON, XML, and the like), it takes 18 bytes, more than twice as much. So, if you want to optimize your app's speed, a low hanging fruit might be to look at ways to transfer as little as possible.
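A sketch of that size difference in Python (`struct` packs the number as a fixed-width 8-byte integer, `json` serializes it as decimal text):

```python
import json
import struct

n = 981282348613667519

binary = struct.pack("<q", n)  # one signed 64-bit integer: always 8 bytes
text = json.dumps(n)           # one byte per decimal digit: 18 bytes here

print(len(binary), len(text))  # 8 18
```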

Other areas of mathematical interest for CS people

Geometry, of course. As long as you put pixels on a screen and have a coordinate system in your user inputs (touch, click, etc), you need your trigonometry.

Dealing with animations and vector graphics means having a basic grasp of Bezier curves, and a tiiiiiiiny teeeeeeeeny bit of taste for derivatives and tangents.
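For the curious, evaluating a point on a cubic Bezier curve is just a weighted sum of the four control points (a Python sketch, not tied to any particular graphics API):

```python
def cubic_bezier(p0, p1, p2, p3, t):
    # Standard cubic Bezier polynomial: the four weights sum to 1 for any t in [0, 1].
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)

# The curve starts at p0 and ends at p3; the middle points only "pull" it.
print(cubic_bezier((0, 0), (0, 1), (1, 1), (1, 0), 0.5))  # (0.5, 0.75)
```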

If, for some reason, you need to deal with audio, or any other kind of "waveform", you'll get by a lot better by knowing the basics of Fourier transforms, which, I'll freely admit, hurt my brain really badly when I was learning about them. But you only need to know what they do, not how they work, for most uses we developers have to deal with. Don't let the formula scare you, it's just numbers, and no one asks you to prove it works on a mathematical level. If someone is asking that of you, then I hope maths is your thing.
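The "what they do" part fits in a dozen lines: here's a deliberately naive, O(n^2) discrete Fourier transform spotting which frequency a signal contains (real implementations use an FFT, but the idea is the same):

```python
import cmath
import math

def dft(samples):
    # Naive discrete Fourier transform: bin f measures how much of
    # frequency f (in cycles per window) the signal contains.
    n = len(samples)
    return [sum(samples[k] * cmath.exp(-2j * cmath.pi * f * k / n)
                for k in range(n))
            for f in range(n)]

# A cosine doing 2 full cycles over 8 samples...
wave = [math.cos(2 * math.pi * 2 * k / 8) for k in range(8)]
# ...shows up as energy concentrated in bin 2 (and its mirror, bin 6).
spectrum = dft(wave)
```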

Any kind of modern compression leans heavily on differences (delta encoding), and yes, it helps to have some knowledge of differential mathematics (a.k.a. looking at the evolution of the rate of change). Again, this is a biiiiiiiiiiig and scary field of mathematics to most, but CS can use a fraction of its core concepts to make things work better for us.
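The "difference stuff" in its simplest form is delta encoding: store the first value and then only the changes, which are usually small and compress well. A minimal Python sketch:

```python
def delta_encode(values):
    # Keep the first value, then the change from each sample to the next.
    return values[:1] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(deltas):
    # Undo the encoding by accumulating the changes back up.
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out

samples = [1000, 1001, 1003, 1003, 1010]
deltas = delta_encode(samples)
print(deltas)  # [1000, 1, 2, 0, 7]
assert delta_decode(deltas) == samples
```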

If you do 3D, then you live and breathe topology, as well as quaternions, and projection, and there's little I can do for you. Of course, projection stuff is kind of a database thing as well, so if you're doing heavy-duty SQL stuff, you should probably read up on that too.

Machine-learning, and all the other dataset manipulation techniques out there, rely mostly on statistics and linear algebra.

You said maths != CS

I did. In all of these cases above, the simple fact that we just take the infinity away makes it all a lot simpler in some cases, and a bit more complicated in others.

Knowing about the maths that idealize our problem gives us an interesting perspective to work with, but the fact we don't have the luxury of "etc ad infinitum" makes most of the maths we take inspiration from too broad for us to use directly. We only need special cases, we only use a subset, we only take a specific branch of a field.

It's kind of like our instinct that tells us that if we launch a ball straight up in the air, and there's some wind, it might not land exactly where it started from. If we have general knowledge of certain fields of mathematics, we can predict the path we expect the ball to take more accurately. We just don't need the whole fields of fluid dynamics and Newtonian physics to compensate, just their general trends.