Back To School

With 10.15, my old and dependable workhorse of a machine - a souped-up Mac Pro "Cheesegrater" from 2010, upgraded in every possible way - will have to retire. Catalina, SwiftUI, and I suspect most of the ML stuff now require AVX instructions in the processor, and there is no replacement CPU I could find that would slot into the socket.

I don't consider it to be "planned obsolescence" or anything of that ilk, given that this computer has been my home office's principal workstation - and game station, mostly for Kerbal Space Program - for almost a decade. It will live on as my test Linux server, and I will slot a bunch of cheap video cards into it to run my ML farm, so it will probably see another decade of service.

However, the question of a replacement arose. You see, I'm an avid Blender enthusiast, and I often run ML stuff, which nowadays means I need a decent video card that I can upgrade. The new Mac Pro would be perfect for that, but it's on the expensive side, given that I mostly use the high-end capacity of the cards for personal projects or self-education.

I settled on a 2018 Mac mini with tons of RAM and a small 512GB internal drive. The 16TB of disks I had now live in a USB 3.1 external bay, and the video card will reside in an eGPU box. That way, if and when I need to change the Mac again, all I have to do is swap the Mac mini... hopefully.

Since the sound setup is of some importance to me (my bird/JBL setup has been with me forever), and I sometimes need to plug in old USB/FireWire stuff, I dusted off my Belkin Express Dock and plugged everything into it.

The thing is, every migration is an opportunity for change. I've been very satisfied with my OmniFocus/Tyme combo for task management, but the thing I've always wanted to do - and never could, for lack of time - was manage my GitLab issues outside of a web browser. I've been working on a couple of projects this summer, with lots of issues on the board, and I have old issues in old projects that I keep finding by accident.

As far as I can tell, there is no reliable way to sync that kind of stuff in an offline fashion. This trend has been going on for a long time, and color me a fossil, but I don't live on the web. I like having a Twitter client that still works offline; I like managing my tasks, timers, and whatnot offline if I need to, with "the web" coming in only as a sync service.

This migration (with its cavalcade of old software refusing to work, or working poorly under new management) will force me to write some software to bridge that gap (again). The web is cool and all, but I need unobtrusive, integrated, and performant tools to do my work. 78 open tabs with IFTTT/Zapier/... integrations to copy data from one service to another won't cut it.

The Engineer's Triangle

Fast, cheap, or good, pick two
(an unknown genius)

It's a well-known mantra in many fields, including - believe it or not - the project manager's handbook. Except they don't like such trivial terms, so they use schedule, cost, and scope instead.

So, why do a lot of developers feel like this doesn't apply to their work? Is it because with the wonders of CI/CD, fast and cheap are a given, and good will eventually happen on its own? But enough of the rant, let's look at the innards of computers to see why you can't write a program that ignores the triangle either.

Fast

The performance of our CPUs has more or less plateaued. We can expand the number of cores, but by and large, a single process will no longer be done in half the time two years from now, unless the developer spends some time honing its performance. GPUs have a little more legroom, but in very specific areas that are intrinsically linked to the number of cores. And the user won't (or maybe even can't) wait a few minutes for a process anymore. Gotta shave those milliseconds, friend.

Cheap

In CS terms, the cost of a program is about the resources it uses. Does running your program prevent any other process from doing anything at the same time? Does it use 4 GB of RAM just to sort the keys of a JSON file? Does it occupy 1TB on the drive? Does it max out the available threads, open files, sockets, and ports? Performance ain't just measured in units of time.

Good

This is about correctness and completeness. Does your software gracefully handle all the edge cases? Does it crash under load? Does it destroy valuable user data? Does it succumb to a poor rounding error, or a size overflow? Is it safe?

Pick the right tool for the right job

And so, it's a very, very, very hard thing to get all three in a finite amount of time, especially on the kind of timescales we work under. Sometimes we're lucky to get even one of them.

It's important to identify as soon as possible the cases you want to pursue:

  • Cheap and fast: almost nothing except maybe tools for perfectly mastered workflows (where the edge cases and the rounding errors are left to the user to worry about)
  • Fast and good: games, machine learning, scientific stuff
  • Good and cheap: pro tools (dev tools, design tools, 3d modelers, etc) where the user is informed enough to wait for a good result

[BETA] Fun With Combine

I'm an old fart, that's not in any way debatable. But being an old fart, I have done old things, like implementing a databus-like system in a program before. So when I saw Combine, I thought I'd have fun with re-implementing a databus with it.

First things first

Why would I need a databus?

If you've done some complex mobile programming, you have probably passed notifications around to signal stuff from one leaf of your logic tree to another - say, a network task that went to the background and signalled it was done, even though the view controller that spawned it died a long time ago.

Databuses solve that problem, in a way. You have a stream of "stuff", with multiple listeners on it that want to react to, say, the network going down or up, the user changing a crucial global setting, etc. And you also have multiple publishers that generate those events.

That's why we used to use notifications. Once fired, every observer would receive it, independently of their place in the logic (or visual) tree.

The goal

I wanted to have a databus that could do two things:

  • allow someone to subscribe to certain events or all of them
  • allow the events to be replayed (for debug, log, or recovery purposes)

I also decided I wanted to have operators that reminded me of C++ for some reason.

The base

Of course, for replay purposes, you need a buffer, and for a buffer, you need a type (this is Swift after all)

import Combine

// Marker protocol: anything that travels on the bus is an Event.
public protocol Event {

}

public final class EventBus {
    // Bounded replay buffer, plus the Combine subject that carries live events.
    fileprivate var eventBuffer : [Event] = []
    fileprivate var eventStream = PassthroughSubject<Event,Never>()

PassthroughSubject allows me to avoid implementing my own Publisher, and does what it says on the tin. It passes Event objects around, and fails Never.

Now, because I want to replay but not remember everything (old fart, remember), I decided to impose a maximum length to the replay buffer.

    public var bufferLength = 5 {
        didSet {
            truncate()
        }
    }
    
    fileprivate func truncate() {
        while eventBuffer.count > bufferLength {
            eventBuffer.remove(at: 0)
        }
    }

It's a standard FIFO: oldest in front, latest at the back. I will just pile them in, and truncate when necessary.

Replaying is fairly easy: you just pick the last x elements of a certain type of Event and process them again. The only issue is reversing twice: once for the cutoff, and once for the replay itself. But since it's going to be seldom used, I figured it was not a big deal.

    public func replay<T>(count: UInt = UInt.max, handler: @escaping (T) -> Void) {
        var b = [T]()
        // Walk the buffer newest-first, keeping at most `count` events of type T...
        for e in eventBuffer.reversed() {
            if b.count >= count { break }
            if let e = e as? T {
                b.append(e)
            }
        }
        // ...then reverse again so the handler sees them in chronological order.
        for e in b.reversed() {
            handler(e)
        }
    }

Yes, I could have done it more efficiently, but part of the audience is new to Swift (or any other kind of) programming, and, again, it's for demonstration purposes.
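
For the curious, a tighter version might lean on compactMap and suffix, and skip the double reversal entirely. This is just a sketch, and the compactReplay name is mine, not something used elsewhere in this post:

    // Sketch of a more compact replay: filter the buffer down to T,
    // keep only the last `count` items, and hand them over oldest-first.
    public func compactReplay<T>(count: UInt = UInt.max, handler: @escaping (T) -> Void) {
        eventBuffer
            .compactMap { $0 as? T }
            .suffix(Int(clamping: count))
            .forEach(handler)
    }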

Sending an event

That's the easy part: you just make a new event and send it. It has to conform to the Event protocol, so there's that. Oh, and I added the << operator.

    public func send(_ event: Event) {
        eventBuffer.append(event)   // remember it for replay
        truncate()
        eventStream.send(event)     // and publish it to live subscribers
    }
    static public func << (_ bus: EventBus, _ event: Event) {
        bus.send(event)
    }

From now on, I can do bus << myEvent and the event is propagated.
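
To make that concrete, here is what it could look like with a made-up NetworkStatusEvent type (the type and its field exist only for this example; anything conforming to Event works):

// Hypothetical event type, just for illustration.
struct NetworkStatusEvent : Event {
    let isOnline : Bool
}

let bus = EventBus()
bus << NetworkStatusEvent(isOnline: false)   // same as bus.send(NetworkStatusEvent(isOnline: false))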

Receiving an event

I wanted to be able to filter at subscription time, so I used the stream operator compactMap, which works exactly like its Array counterpart: if the transformation result is nil, it's not included in the output. Oh, and I added the >> operator.

    fileprivate var subscriptions = Set<AnyCancellable>()

    public func subscribe<T:Event>(_ handler: @escaping (T) -> Void) {
        eventStream.compactMap { $0 as? T }
            .sink(receiveValue: handler)
            .store(in: &subscriptions)   // dropping the cancellable would tear the subscription down
    }
    static public func >> <T:Event> (_ bus: EventBus, handler: @escaping (T) -> Void) {
        bus.subscribe(handler)
    }

The idea is that you define what kind of event you want from the block's input, and Swift should (hopefully) infer what to do.

I can now write something like

bus >> { (e : EventSubType) in
    print("We haz receifed \(e)")
}

EventSubType implements the Event protocol, and the generic type is correctly inferred.
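
To close the loop with the replay buffer, still using the made-up NetworkStatusEvent from above, a subscriber that shows up late could ask for the events it missed, something like this:

// Events sent before anyone subscribed still sit in the replay buffer.
bus << NetworkStatusEvent(isOnline: false)
bus << NetworkStatusEvent(isOnline: true)

// A latecomer replays the buffered NetworkStatusEvents, oldest first.
bus.replay { (e: NetworkStatusEvent) in
    print("replayed: online = \(e.isOnline)")
}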

The End (?)

It was actually super simple to write and test (with very high volumes, too), but I'm guessing there would be memory retention issues, as I can't figure out a way to properly unsubscribe from the bus, especially if you have self references in the block.

Then again, it's a beta and this is a toy sample. I will need to dig deeper into the memory management stuff, but at first glance, it looks like the lifetime of the blocks is exactly the lifetime of the bus, which makes it impractical in real cases. Fun stuff, though.
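
If I were to dig into it, one avenue (again, just a sketch, and not what the bus above does) would be a variant of subscribe that hands the AnyCancellable back to the caller, so that the subscriber decides how long the block lives:

    // Sketch only: return the cancellable instead of storing it in the bus,
    // so the caller can cancel (or simply drop) the subscription when done.
    public func subscribeCancellable<T:Event>(_ handler: @escaping (T) -> Void) -> AnyCancellable {
        return eventStream.compactMap { $0 as? T }.sink(receiveValue: handler)
    }

Keep the returned token around and the block stays alive; cancel it (or let it go out of scope) and the subscription is torn down, self references included.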

[Security] Tracking via Image Metadata

From Edin Jusupovic

Facebook is embedding tracking data inside photos you download

Of course they do.

[Rant] Hyperbolae

The backlash over some of the community's lack of enthusiasm for SwiftUI - mine included - was a lot milder than I thought it would be, given the current trend of everything being either absolutely the best thing that ever happened or the worst thing in history.

While that definitely surprised me in a positive way, it also made me think about the broader topic of the over-abundance of hyperbolae (or hyperboles if you really insist) in our field.

The need to generate excitement over something that is fundamentally boring information manipulation science drives me fairly bonkers, and in my opinion has some very bad side effects.

Here are a few headlines from my newsfeed (pertaining to CS):

  • "Nokia reveals 5G-ready lithium nanotube battery with 2.5X run time"
[at the end of the article]
As is commonly the case with new battery technologies, the researchers are providing no specific timetable for commercialization.
  • "AI was everywhere in 2018 and it will continue to be a major topic in 2019 as we begin to witness AI breakthroughs across businesses and society"

    No we don't. "AI" doesn't exist, and machine learning algorithms are the same as they were 20 years ago.
  • "Flutter will change everything, and Apple won't do anything about it"

    Yea, well... I guess predictions aren't that guy's forte.
  • "Apple's AR glasses arriving in 2020, iPhone will do most of the work"
[just below the title, emphasis mine]
Apple's long-rumored augmented reality headset could arrive mid-way through 2020 prominent analyst Ming-Chi Kuo believes

And that's just stuff people have thrown at me in the past few days... You can add quantum computing, superfine processor lithography, AR/VR news, etc if you feel like it. I will spare you the most outrageous ones.

OK, and?

Look, I get it. Websites that are paid through advertising need to generate traffic, and they will use every clickbait trick they can find.

My problem is that people who are supposed to be professionals in my field are heavily influenced by those headlines and by, well, influencers, who hype things. It's like everyone, including people who are supposed to actually implement those things for actual paying customers, is succumbing to mass hysteria. No wonder clients are so harsh and sometimes downright hostile when it comes to evaluating the quality of the work done.

Because "AI" is so hyped up these days, I've had people refusing to pay me for ML work, because the percentage of the predictions were "too low", based on a fluff piece Google had posted somewhere on the theoretical capabilities of their future product that will do better... As if an on-device computationally expensive model built by one guy could outperform a theoretical multi-million cloud-based computer farm, on which a couple of hundred techs will have worked on for a few years...

The five star problem

By over-hyping everything, we end up in a situation where something that's good gets "four stars", something that's good with a very good (and expensive) marketing campaign and tech support mayyyyyyy get "four and a half", and everything else gets "one star".

Is a new piece of tech like SwiftUI good? No one knows, because we haven't used it in production yet. It's interesting, for sure. It seems to be fairly performant and well-written. Does it have limitations? Of course it does. Will it serve every single purpose? Of course not. Why can't we be mildly interested, recognizing both the positives and the drawbacks?

Is ML a useful tool? Yes indeed, to the point where we can already build useful stuff with it... Is it going to replace a smart human anytime soon? Of course not.

Lately, it seems like I have to remind people all the time that computer science is a science. Is chemistry cool? Of course it is! Do you see chemists running around clamoring every goddamn week that they have found a new molecule that will save humanity? No, because despite the advances in AI (without quotes, it means something fairly different from "machine learning"), we still haven't found a way to predict the future.