Web Services And Data Exchange

I may not write web code for a living (not much backend certainly and definitely no front-end stuff, as you can see around here), but interacting with webservices to the point of sometimes having to “fix” or “enhance” them? Often enough to have an Opinion.

There is a very strong divide between web development and more “traditional” heavy client/app development : most of the time, I tell people I write code for that these are two very distinct ways of looking at code in general, and user interaction in particular. I have strong reservations about the current way webapps are rendered and interacted with on my screen, but I cannot deny the visual and overall usage quality of some of them. When I look at what is involved in displaying that blog in my browser window, from the server resources that it takes to the hundreds of megabytes of RAM to run a couple paltry JS scripts in the window, the dinosaur that I am reels back in disgust, I won’t deny it.

But I also tell my students that you use the tool that’s right for the job, and I am not blind : it works, and it works well enough for the majority of the people out there.

I just happen to be performance-minded, and nothing about the standard mysql-php-http-html-css-javascript pipeline for delivering stuff to my eyeballs is exactly that. Sure, individually, these nodes have come a long long way, but as soon as you start passing data along the chain, you stack up transformation and iteration penalties very quickly.

The point

It so happens that I wanted to do a prototype that displays isotherm-like areas on a map, completely dynamic, and based on roughly 10k points that get refreshed whenever you move the camera a bit relative to the map you’re looking at.

Basically, 10 000 x 3 numbers (latitude, longitude, and temperature) would transit from a DB to a map on a cell phone every time you moved the map by a significant factor. The web rendering on the phone was quickly abandoned, as you can imagine. So web service it is.

Because I’m not a web developer, and fairly lazy to boot, I went with something that even I could manage writing in : Silex (I was briefly tempted by Kitura but it’s not yet ready for production when involved with huge databases).

Everyone told me since forever that SOAP (and XML) was too verbose and resource intensive to use. It’s true. I kinda like the built-in capability for data verification though. But never you mind, I went with JSON like everyone else.

JSON is kind of anathema to me. It represents everything someone who’s not a developer thinks data should look like :

  • there are 4 types that cover nearly everything (dictionary, array, number, and string), plus true, false, and null
  • it’s human readable
  • it’s compact enough
  • it’s text

The 4 types thing, combined with the lack of metadata, means that it’s not a big deal for any of the pieces in the chain to swap between 1, 1.000, “1”, and “1.000”, which, to a computer, are 3 very distinct types (integer, float, and string) with hugely different behaviors.
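To make the type-swapping concrete, here is a minimal sketch in Python (picked only for illustration, it is not the stack described in this post). It decodes the four spellings of “one” with the standard json module and reports what type each one turns into. The helper name decoded_type_names is made up for this example.

```python
import json

def decoded_type_names(texts):
    """Return the name of the Python type each JSON fragment decodes to."""
    return [type(json.loads(text)).__name__ for text in texts]

# The same "one", written four ways, as it might show up in a payload.
samples = ['1', '1.000', '"1"', '"1.000"']

for text, name in zip(samples, decoded_type_names(samples)):
    print(f"{text:>8} decodes to {name}")
```

Nothing in the payload itself says which of the three types the sender meant, so every consumer has to guess or convert.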

But in practice, for my needs, it meant that my decimal numbers, like, say, a latitude of 48.8616138, gobble up a magnificent 10 bytes of data each, instead of 4 (8 if you’re using doubles). That’s only the start. Because of the structure of JSON, you must have colons and commas and quotes and keys. So for 3 floats (12 bytes, or 24 bytes for doubles), I must use :

{"lat":48.8616138,"lng":2.4442788,"w":0.7653901}

That’s the shortest possible valid form (and not really human readable anymore when you have 10k of those), and it takes 48 bytes. That’s four times as much as the 12 bytes of raw floats.
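The byte counts can be checked with a short Python sketch (Python is used only for illustration). It assumes strict JSON, where keys must be quoted, and little-endian IEEE 754 numbers for the binary side. The variable names are made up for this example.

```python
import json
import struct

point = {"lat": 48.8616138, "lng": 2.4442788, "w": 0.7653901}

# Most compact strict JSON: no whitespace at all, but keys stay quoted.
as_json = json.dumps(point, separators=(",", ":")).encode("utf-8")

# The same three values as raw little-endian numbers.
as_float32 = struct.pack("<3f", point["lat"], point["lng"], point["w"])
as_float64 = struct.pack("<3d", point["lat"], point["lng"], point["w"])

print(len(as_json), "bytes as JSON")
print(len(as_float32), "bytes as 32-bit floats")
print(len(as_float64), "bytes as 64-bit doubles")
```

One caveat: a 32-bit float only keeps about seven significant digits, so a latitude like 48.8616138 gets rounded slightly. Whether that matters depends on how precise the map needs to be.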

Furthermore

Well, for starters, the (mostly correct) assumption that if you have a web browser currently capable of loading a URL, you probably have the necessary bandwidth to load the thing – or at least that the user will understand page load times – fails miserably on a mobile app, where you have battery and cell coverage issues to deal with.

But even putting that aside, the JSON decoding of such a large dataset was using 35% of my CPU cycles. Four times the size, plus a 35% performance penalty?

Most people who write webservices don’t have datasets large enough to really understand the cost of transcoding data. The server has a 4×2.8GHz CPU with gazillions of bytes in RAM, and it doesn’t really impact them, unless they do specific tests.

At this point, I was longingly looking at my old way of running CGI stuff in C when I discovered the pack() function in PHP. Yep. Binary packing. “Normal” sized data.
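Here is a rough sketch of what binary packing looks like for this kind of payload. It uses Python’s struct module, which plays the same role as PHP’s pack(), and it is not the code that runs on my server. The layout is an assumption made for the example: a 4-byte record count followed by three little-endian 32-bit floats per point. The names pack_points and unpack_points are hypothetical.

```python
import struct

HEADER = struct.Struct("<I")   # assumed header: number of records (uint32)
RECORD = struct.Struct("<3f")  # assumed record: lat, lng, weight (float32)

def pack_points(points):
    """Serialize an iterable of (lat, lng, w) tuples into one binary blob."""
    points = list(points)
    out = bytearray(HEADER.pack(len(points)))
    for lat, lng, w in points:
        out += RECORD.pack(lat, lng, w)
    return bytes(out)

def unpack_points(blob):
    """Decode a blob produced by pack_points back into a list of tuples."""
    (count,) = HEADER.unpack_from(blob, 0)
    body = blob[HEADER.size:]
    if len(body) != count * RECORD.size:
        raise ValueError("payload length does not match the record count")
    return list(RECORD.iter_unpack(body))

# 10k identical points, just to get a feel for the payload size.
points = [(48.8616138, 2.4442788, 0.7653901)] * 10_000
blob = pack_points(points)
decoded = unpack_points(blob)
print(len(blob), "bytes for", len(decoded), "points")
```

With this layout, 10k points fit in 120,004 bytes, and decoding is a fixed-size copy per record rather than a text parse. Both ends have to agree on the layout and the byte order beforehand, since the blob carries no description of itself.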

Conclusion

Because of the large datasets and somewhat complex queries I run, I use PostgreSQL rather than MySQL or its infinite variants. It’s not as hip, but it’s rock-solid and predictable. I get my data in binary form. And my app now runs at twice the speed, without any optimization on the mobile app side (yet).

It’s not that JSON and other text formats are bad in themselves, just that they use a ton more resources than I’m ready to spend on “just” getting 30k numbers. For the other endpoints (like authentication and submitting new data), where performance isn’t really critical, they make sense. As much sense as possible given their natural handicap anyways.

But using the right tool for the right job means it goes both ways. I am totally willing to simplify backend development and make it more easily maintainable. But computers work the same way they have always done. Having 8 layers of interpretation between your code and the CPU may be acceptable sometimes, but remember that the olden ways of doing computer stuff, in binary, hex, etc, also provide a way to fairly easily improve performance : fewer layers, less transcoding, more CPU cycles for things that actually matter to your users.

  

Demodynamics

It should be clear by now: I am a geek. Aside from all the normal quirks, I’m a computer geek, which means that I dream about systems and I subconsciously try to optimize things, make them more rational if not more efficient… I’m told it’s borderline rude, sometimes.

Anyway.

There is one thing geeks and non-geeks who actually encounter large numbers of people all at once agree on: we suck at demodynamics.

Look at a school of fish or a flock of sparrows. Even though they have no brain to speak of compared to ours, you don’t see them bumping into each other, even though their speed and group density are a recipe for disaster. Now imagine telling a bunch of people “run around for a half hour, but you have to stay together as a group”. When you’re done laughing, you’ll know what I mean.

Why am I rambling about demodynamics anyway?

Well, professionally, you can draw a lot of parallels between the two following situations:

  • a group of people is supposed to run together towards a common goal without knowing the route and finding some difficulties along the way
  • a group of people is supposed to deliver a product that has been outlined in somewhat vague (from an engineer’s point of view) fashion

And you see the same kind of dynamics: people shoving, people showing off, but also people helping each other when facing a wall etc…

Yesterday, I was in the subway (but you can have similar occurrences when driving), and a couple of ladies rushed past me in a corridor, only to go half my speed ahead of me, effectively blocking me, because they were side by side.

Now, the worst part is I don’t think they even realized. They were side by side because they were chatting, and going slower for the same reason. Whoever is placed in that situation will undoubtedly sigh heavily, at the very least. But the same can be said for people who honk at you when you can’t pass the truck in front of you, etc…

As I said, people suck at demodynamics. Evaluating the right time to yield a priority you do have, in order to keep traffic flowing for everyone, including you, is a hard thing to do, since you basically can’t trust anyone around you to act with the same plan, let alone intent.

When you think about it, it’s all about two things: telegraphing your intent (and your plan), and being on the lookout for other people telegraphing their intent. That’s level zero. Then you have to know when to enforce and when to yield, and telegraph that as well.

Most people think the problem lies in the second layer. We are a competitive race, and we naturally expect our solution to be followed. But my impression is that we completely lack the understanding of level zero. It’s not that our plan is the best one… It’s that it’s the only one.

Talking about this with my friends in the business and outside of it, we kind of agreed that the people who look around for cues and avoid bumping into other people (understood in a general sense) are the ones who enjoy activities where you have to relinquish some control to have a better time: people who dance a lot, musicians, construction workers, military or military-inspired people…

In any project I take on, it is painfully obvious that if someone I depend on fails, I’m screwed. If for nothing else, that makes it my duty to help this person. To some degree, the same can be said about people “above” me. I have to point at potential problems early and help them make a decision.

Unfortunately, as with the people in the subway or on the road, it doesn’t seem to be that obvious. Here in France, we go back and forth on a mandatory class taught to all kids that’s called “civic instruction”, or whatever the name that thing might have these days. Is there any way we could make that a demodynamics course, or a dance class?

TBC

  

We Are What You Call Experts

OK, so France now has an experts board of digital something or other. Most companies I work with have hired, or will hire an expert to recommend stuff or audit stuff. And of course, even I get hired for expertizing stuff every now and again (go figure)…

As I stated before here and there, there’s something troubling about experts in what is perceived as my field. Experts in demolition or piloting, or botany, I get. These are highly specialized fields where it’s easy to spot an expert: they clearly know what they are talking about. You test them by asking them to do what they are experts in.

But in computer science, the field is so vast that it’s quite easy to pretend to be (or be mistaken for) an expert on one of the gazillions of subfields this domain has. Even family sometimes doesn’t get the fact that a coding or design expert isn’t the person best suited to repairing the printer…

Sometimes I feel like we are the doctors from the 1600s. We use jargon, we give off a vibe of a tightly knit hermetic community, and we wield an inordinate amount of power relative to what we actually do, or know. Would you go to a vet to reattach your cut finger? Or go “hey come on, you have a medical degree, you can give me meds for the heart condition I think I have!” to a cousin who’s studying to be a chiropractor? (no offense to either of these fine specializations, it’s just to illustrate a point, I wouldn’t ask a heart surgeon to remove a splinter either)

We live in a world of experts. Because of the high specialization of everything, you have to be certified, it’s harder and harder to switch fields, and the amateur sports are losing spectators. But as soon as we are talking about computers, the expert status is somewhat murky. How many times do we freelancers have to “compete” with the second cousin of the daughter’s hairdresser, who’s “making websites”? Dude, I’m an app developer, I have a score of people I trust who can build an awesome website for you, why would I know anything about web technologies? I rely on… experts… for that… Can I dabble in it and commit an atrocity that would pass in poor light for something acceptable? Sure! Should I get paid for that? Hell no, there are people way better suited for that job. Could I? Probably.

Enough ranting, how can anyone rate somebody as an expert? In computer science, diplomas are not a sure way. Look at Mike, who’s clearly an expert, yet came from journalism. Portfolios are a good indicator, but only an expert can gauge the difficulty of the thing. Publications are yet another indicator (thanks for reading this, by the way!), but with the Internet, the number of plagiarism cases is going through the roof.

“You are pretty bleak” I hear you think… But in all seriousness, I wouldn’t even know how to prove to you that I’m an expert. And by proving, I mean convincing you I know what I’m doing, without having to work for you for free to build something in an Internet-shielded room for a week. When all’s said and done, it’s just a matter of marketing myself. It helps that most people see computer science as some kind of magic, and are therefore highly susceptible to buzzwords and “hey I make loads of money in my work, that must mean I’m good, right?”. Wrong. Buzzwords are easy to acquire (read Plato’s Gorgias if you don’t believe me), and the money argument is a tautology and a self-fulfilling prophecy.

So to all of you awesome developers and designers and computer geeks out there who are really and immensely competent, yet don’t have the respect and credibility you deserve, kudos! And I’m sorry. I fear it will be a long time before there’s an objective way, accepted by most people, to finally get how good you all are. It took medicine a couple of millennia to go from “having a diploma” to “having a somewhat clearer way of discerning experts from fakers”. Hopefully it won’t be that long this time around, but I can make no promises. I’m not an expert in such matters…