It's Not Me, It's You

Nothing new here: people will do very distasteful things to track you

That said, it's in a state where I feel you can confirm the script works as advertised: it tracks the user on the site, and it sends performance data as per the above information.

I have several problems with the JavaScript language itself, but they are linked to a bias I acquired in my CS degree: I like my code legible and reproducible. JS doesn't satisfy that one bit, although recent standardization efforts (ES) and evolutions (TypeScript) do reassure me that I'm not the only dinosaur around.

What really bothers me is that this messy and arcane language (how many of you JS "fans" know how it works under the hood, or how to formally guarantee anything written in it?) is silently run on every webpage there is. You load an innocuous kitten-laden website, and a ton of invisible code runs to track, exploit, or mislead you. Fan-ta-stic.

3D Ray-Tracing

I used to do 3D. A lot. I have a few leftovers from that era of my life, and I am still knowledgeable enough to follow along with the cool stuff coming out of the race between GPU manufacturers (when they aren't competing over who mines cryptocurrencies the best 🙄).

It's always been super hard for me to explain how ray-tracing works, because it's a very visual thing, and it seemed to require a good deal of spatial awareness from the people I was trying to explain it to. More likely, it's just that I suck at explaining how ray-tracing works without showing it.

So, I was super happy to find a video that explains it all better than I ever have. Enjoy, then download Blender and have fun rendering stuff.

That Should Work

To get your degree in <insert commerce / political school name here>, there is a final exam in which you need to talk with a jury of teachers. The rule is simple: if the student is stumped or hesitates, the student fails. If the student manages to last the whole time, or manages to stump the jury or make it hesitate, the student passes.
This particular student was having a conversation about geography, and a juror thought to stump the candidate by asking "what is the depth of <insert major river here>?", to which the student, not missing a beat, answered "under which bridge?", stumping the juror.

Old student joke/legend

Programming is part of the larger tree of knowledge we call computer science. Everything we do has its roots in maths and electronics. Can you get by with shoddy reasoning and approximate "that should work" logic? Sure. But in the same way you can "get by" playing the piano using only your index fingers. Being able to play chopsticks makes you as much of a pianist as being able to copy/paste Stack Overflow answers makes you a programmer/developer.

The problem is that in my field, the end-user (or client, or "juror", or "decision maker") is incapable of distinguishing between chopsticks and Brahms, not because of a lack of interest, but because we, as a field, have become experts at stumping them. As a result, we have various policies along the lines of "everyone should learn to code" being implemented worldwide, and I cynically think the goal is mostly to stop getting milked by so-called experts who can charge you thousands of monies for the chopsticks equivalent of a website.

To me, the problem doesn't really lie with the coding part. Any science, any technical field, requires long training to become good at. Language proficiency, musical instruments, sports, dancing, driving, sailing, carpentry, mechanical engineering, etc. It's all rather well accepted that these fields require dedication and training. But somehow, programming should be "easy", or "intuitive".

That's not to say I think it should be reserved for an elite. These other fields aren't. I have friends who got extremely good at the guitar by themselves, and sports are a well-known way out of the social bog. But developers seem to be making something out of nothing. They "just" sit down and press keys on a board and presto, something appears and they get paid. It somehow seems unfair, right?

There are two aspects to this situation: the lack of nuanced understanding on the part of the person who buys the program, and the overly complicated/flaky way we programmers handle all this. I've already painted with a very broad brush what we developers feel about this whole "being an industry" thing.

So what's the issue on the other side? If you ask most customers (and students), they answer "obfuscation" or some variant of it. In short, we use jargon, technobabble, which they understand nothing of, and they feel taken advantage of when we ask for money. This covers the whole gamut from "oh cool, they seem to know what they are talking about, so I will give them all my money" to "I've been burned by smart-sounding people before, I don't trust them anymore", to "I bet I can do it myself in under two weeks", to "the niece of the mother of my friend is learning to code and she's like 12, so I'll ask her instead".

So, besides reading all of Plato's work on dialectic and how to get at the truth through questions, how does one differentiate between a $500 website and a $20000 one? Especially if they look the same?

Well, in my opinion as a teacher (someone who is paid to sprinkle knowledge about computer programming onto people), there are two important things to understand about how software is made if you want to evaluate the quality of a product:

  • Programming is exclusively about logic. The difficulty (and the price) scales with the amount of logic needed to solve whatever problem we are hired to solve
  • We very often reuse logic from other places and combine those lines of code with ours to refine the solution

Warning triggers that make me think the person is trying to sell me magic pixie dust include:

  • The usual bullshit-bingo: if they try to include as many buzzwords (AI, machine learning, cloud, big data, blockchain,...) as possible in their presentation, you have to ask very pointed questions about your problem, and how these things will help you solve it
  • If they tell you they have the perfect solution for you even though they asked no question, they are probably trying to recycle something they have which may or may not work for your issues

A word of warning though: absolute prices aren't a factor at all. In the same way that you'd quite naturally pay a whole lot more for a bespoke dinner table that is exactly the one you envision in your dreams than for one you can get in any furniture store, your solution cannot be cheaper than off-the-shelf. Expertise and tailoring cannot be free. Balking at the price when you have someone who genuinely is an expert in front of you, after they have announced their price, is somewhat insulting. How often do you go to the bakery and ask "OK, so your cake is really good, and all my friends recommend it, and I know it's made with care, but, like, $30 is way too expensive... how about $15?"

I have also left aside the question of visual design. It's not my field, I suck at it, and I think it is an expert field too, albeit more on the "do I like it?" side of the equation than the "does it work?" one when it comes to estimating its value. It's like when you buy a house: there are the foundations, the walls, and the roof, and their job is to answer the question "will I still be protected from the outside weather in 10 years?", whereas the layout, the colors of the walls, and the furniture answer the question "will I still feel good in this place in 10 years?". Thing is, with software development as well, you can change the visuals to a certain extent (up to the point where you need to move the walls, to continue with the metaphor), but it's hard to change the foundations.

DocumentDB vs MongoDB

From AWS gives open source the middle finger:

Bypassing MongoDB’s licensing by going for API compatibility, given that AWS knows exactly why MongoDB did that, was always going to be a controversial move and won’t endear the company to the open-source community.

MongoDB is hugely popular, although entirely for the wrong reasons in my mind, and it's kind of hard to scale it up without infrastructure expertise, which is why it makes sense for a company to offer some kind of a turnkey solution. Going for compatibility rather than using the original code also makes a lot of sense when you're an infrastructure-oriented business, because your own code tends to be more tailored to your specific resources.

But in terms of how-it-looks, after having repeatedly been accused of leeching off open-source, this isn't great. One of the richest services divisions out there, offloading R&D to the OSS community, then, once the concept proves to be a potential goldmine, undercutting the original?

The global trend of big companies is to acknowledge the influence of open-source in our field and give back. Some do it because they believe in it, some because they benefit from fresh (or unpaid) eyes, some because of "optics" (newest trendy term for "public relations"). I'm not sure that being branded as the only OSS-hostile name in the biz' is a wise move.

Double Precision (Not)

From this list, the gist is that most languages can't correctly compute 9999999999999999.0 - 9999999999999998.0.

Why do they output 2 when it should be 1? I bet most people who've never done any formal CS (a.k.a. maths and information theory) are super surprised.

Before you read the rest, ask yourself this: if all you have are zeroes and ones, how do you handle infinity?

If we fire up an interpreter that outputs the value when it's typed (like the Swift REPL), we have the beginning of an explanation:

Welcome to Apple Swift version 4.2.1 (swiftlang-1000.11.42 clang-1000.11.45.1). Type :help for assistance.
  1> 9999999999999999.0 - 9999999999999998.0
$R0: Double = 2
  2> let a = 9999999999999999.0
a: Double = 10000000000000000
  3> let b = 9999999999999998.0
b: Double = 9999999999999998
  4> a-b
$R1: Double = 2

Whew, it's not that these languages can't handle a simple subtraction, it's just that a is typed in as 9999999999999999 but stored as 10000000000000000.

If we used integers, we'd have:

  5> 9999999999999999 - 9999999999999998
$R2: Int = 1

Are the decimal numbers broken? 😱

A detour through number representations

Let's look at a byte. This is the fundamental unit of data in a computer and is made of 8 bits, each of which can be 0 or 1. It ranges from 00000000 to 11111111 (0x00 to 0xff in hexadecimal, 0 to 255 in decimal; homework as to why and how it works like that due by Monday).

Put like that, I hope it's obvious that the question "yes, but how do I represent the integer 999 on a byte?" is meaningless. You can decide that 00000000 means 990 and count up from there, or you can associate arbitrary values to the 256 possible combinations and make 999 be one of them, but you can't have both the 0 - 255 range and 999. You have a finite number of possible values and that's it.

Of course, that's on 8 bits (hence the 256-color palette in old games). On 16-, 32-, 64-bit or wider memory blocks, you can store up to 2ⁿ different values, and that's it.
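If you want to poke at this from Swift (a quick sketch using the standard fixed-width integer types), the compiler itself enforces those limits:

let smallest: UInt8 = .min           // 0, i.e. 00000000
let largest: UInt8 = .max            // 255, i.e. 11111111
// let impossible: UInt8 = 999       // refused at compile time: 999 doesn't fit in 8 bits
print(String(largest, radix: 2))     // "11111111"
print(String(largest, radix: 16))    // "ff"
print(UInt16.max, UInt32.max)        // 65535 4294967295: wider blocks give 2¹⁶ and 2³² possible values (zero included)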

The problem with decimals

While it's relatively easy to grasp the concept of infinity by looking at "how high can I count?", it's less intuitive to notice that there are also infinitely many numbers between 0 and 1 (even more of them than there are integers, actually).

So, if we have a finite number of possible values, how do we decide which ones make the cut when talking decimal parts? The smallest? The most common? Again, as a stupid example, on 8 bits:

  • maybe we need 0.01 ... 0.99 because we're doing accounting stuff
  • maybe we need 0.015, 0.025,..., 0.995 for rounding reasons
  • We'll just encode the integer part on 8 bits (0 - 255), and the decimal part as above

But that's already 99+99 values taken up. That leaves us 58 possible values for the rest of infinity. And that's not even mentioning the totally arbitrary nature of the selection. This way of representing numbers is historically the first one and is called "fixed-point" representation. There are many ways of choosing how the decimal part behaves, and a lot of headache when coding how the simple operations work, not to mention the complex ones like square roots and powers and logs.
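To make the idea concrete, here is a toy two-decimal fixed-point type in Swift. It's purely illustrative (the name Fixed2 and the choice of hundredths are mine, not any real library), but it shows both the appeal and the arbitrariness: exact within the chosen scale, and completely blind outside of it.

struct Fixed2 {
    // The value is stored as a whole number of hundredths: 19.99 becomes 1999
    var hundredths: Int
    init(_ value: Double) { hundredths = Int((value * 100).rounded()) }
    var value: Double { return Double(hundredths) / 100 }
    static func + (a: Fixed2, b: Fixed2) -> Fixed2 {
        var result = a
        result.hundredths += b.hundredths   // integer addition, so no rounding surprises
        return result
    }
}

let total = Fixed2(19.99) + Fixed2(1.85)
print(total.value)   // 21.84, exact to two decimals, but 0.001 simply cannot exist in this scheme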

Floats (IEEE 754)

To make it simple for chips that perform the actual calculations, floating point numbers (that's their name) have been defined using two parameters:

  • an integer n
  • a power (of base b) p

Such that we can have n × bᵖ; for instance 15.3865 is 153865 × 10⁻⁴. The question is, how many bits can we use for n and how many for p.

The standard is to use 1 bit for the sign (+ or -), 23 bits for n, and 8 for p, which adds up to 32 bits total (we like powers of two). The base is 2, and n is actually the fractional part of a 1.n significand. That gives us a range of ~8 million possible values for n, and powers of 2 from -126 to +127, a few exponent patterns being reserved for special cases like infinity and NotANumber (NaN).

(-1 or +1) × 2^[-126...127] × 1.[one of the ~8 million values]
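Swift exposes those three fields directly on Float, so you can check the layout yourself. A small sketch, using the 15.3865 example from above (printed values are what I'd expect from the shortest round-trip formatting):

let x: Float = 15.3865
print(x.sign)                                     // plus
print(x.exponent)                                 // 3 (stored biased, as exponentBitPattern = 130)
print(x.significand)                              // ≈1.9233125, the "1.n" part: 1.9233125 × 2³ = 15.3865
print(String(x.significandBitPattern, radix: 2))  // the 23 mantissa bits actually stored (leading zeroes omitted)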

In theory, that gives us magnitudes from roughly 10⁻⁴⁵ to 10³⁸, but some numbers can't be represented in that form. For instance, if we look at the largest float smaller than 1, it's 0.9999999404. Anything between that and 1 has to be rounded. Again, infinity can't be represented in a finite number of bits.
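Those limits are easy to poke at from the Swift REPL (a quick sketch; the exact digits printed may vary slightly with the formatting):

print(Float.greatestFiniteMagnitude)   // ≈3.4028235e+38, the "10³⁸" ceiling
print(Float.leastNonzeroMagnitude)     // ≈1e-45, the smallest positive (denormal) value
print(Float(1).nextDown)               // 0.99999994, the largest Float below 1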

Doubles

Floats allow for "easy" calculations (by the computer at least) and are "good enough", with a precision of about 7.2 significant decimal digits on average. So when we needed more precision, someone said "hey, let's use 64 bits instead of 32!". The only thing that changes is that n now uses 52 bits and p 11 bits.

Incidentally, "double" refers more to double size than to double precision, even though the number of significant digits does jump to about 15.9 on average.
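A quick way to see the difference in precision (a sketch, again from the REPL):

let pi32: Float  = .pi
let pi64: Double = .pi
print(pi32)   // 3.1415927, about 7 significant digits survive
print(pi64)   // 3.141592653589793, about 16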

We now have 2³² times as many values to play with, and that does fill some annoying gaps in the infinity, but not all. Famously (and annoyingly), 0.1 doesn't have an exact representation at any size, because of the base 2. As a 32-bit float, it's stored as 0.100000001490116119384765625, like this:

(+1) × 2⁻⁴ × 1.600000023841858
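The classic consequence, which you can check in pretty much any language (here a Swift sketch; String(format:) comes from Foundation):

import Foundation

print(0.1 + 0.2 == 0.3)                     // false: all three literals carry their own rounding error
print(String(format: "%.20f", 0.1))         // 0.10000000000000000555
print(String(format: "%.20f", Float(0.1)))  // 0.10000000149011611938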

Going further, after double size (aka doubles), we have quadruple size (aka quads), with 15 exponent bits and 112 significand bits, for a total of 128 bits.

Back to our problem

Our value is 9999999999999999.0. The closest value encodable in double-size floating point is actually 10000000000000000, which should now make some kind of sense. This is confirmed by Swift when separating the two sides of the calculation, too:

2> let a = 9999999999999999.0
a: Double = 10000000000000000

Our big brains, so good at maths, know that there is a difference between these two values, and so does the computer. It's just that, using doubles, it can't store it. Using floats, a would be rounded to 10000000272564224, which isn't exactly better. Quads aren't in regular use yet, so no luck there.
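If the exactness of that last digit actually matters, the way out is to not use binary floating point for it at all. Two options sketched in Swift (Decimal comes from Foundation and keeps up to 38 significant digits):

import Foundation

print(9999999999999999 - 9999999999999998)    // 1, 64-bit Ints are exact as long as the values fit

let a = Decimal(string: "9999999999999999")!
let b = Decimal(string: "9999999999999998")!
print(a - b)                                  // 1, no rounding on the way in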

It's funny, because this is an operation we puny humans can do very easily, even those of us who say they suck at maths, and yet these much-touted computers with their billions of math operations per second can't work it out. Fair enough.

The kicker is, there is a literal infinity of examples such as this one, because trying to represent infinity in a finite number of digits is impossible.