[CoreData] Migrating To A New Model With An Extra Entity

As I mentioned in the previous post, I ran into an annoying problem with CoreData very recently:

Problem: the model needed to evolve, which in its simplest form meant adding an entity.

Naive solution: Well, duh. Lightweight migration will work just fine. If not, just make a mapping model, and you’ll be fine.
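(“Lightweight migration”, for reference, is just a matter of passing two options when adding the store. A minimal Swift sketch of the usual incantation, where storeURL and model stand in for your own store and model:)

    import CoreData

    // Opt into automatic (lightweight) migration when opening the store.
    // storeURL and model are placeholders for your own store and models.
    func openStore(at storeURL: URL, model: NSManagedObjectModel) throws -> NSPersistentStoreCoordinator {
        let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)
        let options: [String: Any] = [
            NSMigratePersistentStoresAutomaticallyOption: true, // migrate on open
            NSInferMappingModelAutomaticallyOption: true        // infer the mapping model
        ]
        _ = try coordinator.addPersistentStore(ofType: NSSQLiteStoreType,
                                               configurationName: nil,
                                               at: storeURL,
                                               options: options)
        return coordinator
    }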

Well, no. Every automatic migration ended in an “Error: table ZRandomNumber_WhateverRandomEntityFromYourModel already exists”.

After a lot of digging around, annoying any and every contact who remotely knew anything about CoreData, I managed to extract the actual SQL commands it was trying to execute on the SQLite database.
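(If you ever need to do the same: CoreData will print the SQL it executes when the app is launched with the standard SQLDebug argument; higher levels get chattier.)

    # Launch argument (in the Xcode scheme, or appended to the command line):
    -com.apple.CoreData.SQLDebug 1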

Guess what? It was trying to recreate some of the tables based on relationships (rightly assuming that some of them had changed or been added, since I added an entity). But one of them was getting mangled, because the migration incorrectly assumed none of the relationships preexisted.

Naive solution v2: Well, duh. Export the database to an agnostic format, then reimport it into the new model.

Yep. That actually works. But I have 140 entities and close to 300,000 rows. After 4 hours of crunching, I decided to stop the test.

Naive (and I mean REALLY naive) solution that actually works: find a way to add your new entity at the end of the alphabetical list. That way, it creates the missing relationships after having made sure everything else was kosher. I’m not even kidding: I added ZZ in front of the entity name, and everything just worked. Try it before you lose your own hair.

  

CoreData, iCloud, And “Failure”

CoreData is a very sensitive topic. Here and elsewhere, it’s a recurrent theme. Just last week I had a hair-pulling problem with it that was solved in a ridiculous manner. I’ll document it later for future reference.

This week, triggered by an article on The Verge, the spotlight came once again on the difficulties of that technology, namely that it just doesn’t work with iCloud, which by all other accounts works just fine.

It is kind of frustrating (yet completely accurate) to hear from pundits and users that iCloud just works for them for most things, especially Apple’s own products, and that CoreData-based apps work unreliably, if at all. The perception among people who haven’t actually tried to make it work is that it’s somehow the developer’s fault for not supporting it. Hence the article on The Verge, which highlights the fact that it’s not the developer’s fault. The intent is good, but it unfortunately doesn’t solve anything: it wags a finger at Apple without explaining anything.

But what is the actual problem?

CoreData is a framework for storing an application’s data in an efficient (hopefully) and compact way. It was introduced in 2005 with a very simple purpose: stopping developers from storing stuff on the user’s disk in “messy” ways. By providing a framework that would keep everything tidied up in a single (for the “messy” part) database (for the “efficient” part), Apple essentially said that CoreData was the solution to pretty much every storage ailment that plagued applications: custom file formats that could be ugly and slow, the headache of having “relationships” between parts of documents that would end up mangled or inefficient, etc.

The underlying tenet: CoreData is a simplification of storage techniques, maintained by Apple and therefore reliable. And for the most part, it is reliable and efficient.

iCloud, on the other hand, addresses another part of the storage problem: syncing. It is a service/framework meant to make the storage on every device a user owns behave like one and the same storage space. Meaning: if I create a file on device A, it is created on B and C as well. If I modify it on C, the modification is echoed on A and B without any user interaction. Behind the scenes, the service keeps track of the modifications in the storage it’s responsible for, pushes them through the network, and, based on the last modification date and some other factors, every device decides which files on disk to replace with the one “in the cloud”. The syncing problem is a hard one, because of all the fringe cases (what if I modified a file on my laptop, then closed it before it sent anything, then made another modification on my iPad? Which version is the right one? Can we merge them safely?), but for small and “atomic” files, it works well enough.
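The “last modification date and some other factors” part boils down, in its naive form, to a last-writer-wins choice. A deliberately simplistic Swift sketch of that strategy (the types here are made up for illustration):

    import Foundation

    // Hypothetical file version: just its contents and modification date.
    struct FileVersion {
        let data: Data
        let modified: Date
    }

    // Naive last-writer-wins resolution, as described above: whichever
    // copy was touched last replaces the other. No merging, no history.
    func resolve(local: FileVersion, cloud: FileVersion) -> FileVersion {
        return local.modified >= cloud.modified ? local : cloud
    }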

iCloud is a simplification of syncing techniques maintained by Apple, and therefore reliable, to keep the tune playing. And for the most part, it does work as advertised.

But when you mix the two, it doesn’t work.

When you take a look at the goals of the two technologies, you can see why it’s a hard problem to solve: CoreData aims at making a monolithic “store-it-all” file for coherence and efficiency purposes, while iCloud aims at keeping a bunch of files synchronized across multiple disks, merging them if necessary. These two goals, while not completely opposed, are at odds: ideally, iCloud should sync only the difference between two versions of a file.

But with a database file, that’s hard. It’s never just a couple of modified bytes: it’s the whole coherence-tracking metadata, plus all the objects referenced by the actual modification. Basically, if you want to be sure, you have to upload and replace the whole database. Because, once again, the goal of CoreData is to be monolithic and self-contained.

The iCloud philosophy would call for incremental change tracking to be efficient: the original database, then the modification sets, ideally in separate files. The system would then be able to sync “upwards” from any given state to the current one, by playing the sets one by one until it reaches the latest version.
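None of this exists in CoreData’s public machinery, but the idea the previous paragraph describes would look roughly like this hypothetical sketch:

    // Hypothetical change-set log: a base database plus ordered deltas.
    // These types are invented for illustration; they are not CoreData API.
    struct ChangeSet {
        let sequence: Int            // position in the log
        let payload: Data            // inserted/updated/deleted records
    }

    // Sync "upwards": starting from the last state both sides agree on,
    // replay every later change set, in order, to reach the latest version.
    func replay(from baseVersion: Int,
                log: [ChangeSet],
                apply: (ChangeSet) -> Void) {
        for changeSet in log.sorted(by: { $0.sequence < $1.sequence })
            where changeSet.sequence > baseVersion {
            apply(changeSet)
        }
    }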

As you can see, a compromise cannot be reached easily. A lot of expert developers I highly respect have imagined a number of ways to make CoreData + iCloud work. Most of them are good ideas. But are they compatible with Apple’s vision of what the user experience should be? Syncing huge files that have been partially modified isn’t a new problem, and it’s one that none of my various version control systems has satisfactorily addressed. Most of them just upload the whole thing.

Just my $.02.

  

Reviews: Average And Mean{;ness;ing}

So… Highlight has been out for a few months now.

First off, I am extremely surprised it’s getting as much attention as it does (by that, I mean more than 0; I can’t quite retire with the money it’s made me so far). So, my deep and sincere thanks to all of you who actually went and shelled out a whole dollar on this program! I honestly didn’t expect to get the few hundred downloads I got.

Back to the topic at hand: for years, I have heard, and read, how dreadful one can feel after getting a bad review. Given the ones I got, I can sincerely sympathize.

The very first thing I did when I pushed the app to the store was to set up a page explaining how the software works. And I made really sure that the main feature (a global on/off toggle) was visible in the first two lines of the app’s description.

Out of the 5 (yes… 5) reviews listed, 3 are along the lines of “hey, this program is OK, but an awesome feature would be an easy way to toggle it on/off”. Let me state again: this feature has been built into the software since 2006. And it’s in the first 2 lines of the description. And it’s stated clearly in the help page. Oh, and there’s a video demonstrating the actual principle. So… Hum… Well… Duh? And that is the basis for a bad review… Thank goodness I wasn’t planning on living off this app.

The other two are a bit harsh in terms of language, but I think the point underneath is fair: the users seem to think it’s too limited for their uses. It is indeed a simple app, with 6 “pens” and one color setting. It does nothing extremely advanced, but it works perfectly for me when I’m showing off a demo to a somewhat large audience. And there are some really clever things going on under the hood, but hey… that’s not really something users are interested in.

The most interesting part, however, is that I have included a way to contact me at pretty much every level of the help. And the interaction with people who wanted features, or reported bugs has been overwhelmingly positive.

I have no background in, and very little knowledge of, the marketing sciences. But my scientist/engineer mind can’t explain how all these statements can be true simultaneously:

  • everyone I had an exchange with loves the app
  • I am an obscure developer, making an obscure app without any buzz or marketing
  • the only reviews are bad AND, for the most part, wrong
  • I have a steady sales average week-on-week

And honestly, this really doesn’t make me want to care about reviews. Maybe if the reviews were all positive, I’d sell more stuff. But most of all, it makes me think that the whole review thing is biased.

To write a review, you need either a good reason or some time on your hands. When you’re pissed about something (albeit wrongly, since the feature you are clamoring for is actually already in there… since day one), that’s a good reason. When you feel like the seller needs a leg up or some cheering up, that’s a good reason too, but the sentiment of injustice has to be quite strong, I guess. So in order to get good reviews (and maybe max out the sales), the seller needs to put in an extraordinary amount of effort to elicit that feeling in the buyer.

I choose to treat my customers right: if I find a bug, I’ll fix it. If a user finds a bug, I’ll fix it. If a user requests a feature, I’ll talk it over with them to see if it can be a fit, and explain my decision to the requester. But an anonymous review on the store, by somebody who hasn’t even taken the time to read the 10 lines of the manual? I’ll pass, thank you.

Am I wrong to think like that?

  

The Joy Of Dependencies

Whether it’s on our favorite UNIX flavor or in our own projects, we pretty much all dread the time when we have to update the dependencies. What will be broken? How long will it take? How many platforms/users will we have to leave hanging?

We obviously depend on a number of libraries and pieces of code written in another time, for another project, and/or by somebody else. Whether it’s something big (like the OS itself — just read the forums for cries of despair), or small (OK, so, apparently, that lib takes UTF-8 strings instead of UTF-16 ones, now…), no dependency update is hassle-free and painless. None. Never.

Once upon a time, a long long time ago, in a city not that far away, I had to write tools for a small printing business. My major dependencies were the print driver (which drove a huge triple-rotor, 12-ink-barrel monstrosity), Quark XPress, QuickTime, and MacOS (Classic, but I can’t recall the exact version number).

Once it was up and running, after a lot of sweating and swearing, they didn’t dare upgrade anything. If a customer came with a newer version of their XPress document, they just told them to export it again in a more compatible format. And that was it.

Nowadays, with users being constantly informed about updates (and not trained for them), it’s up to us developers to ensure backward compatibility (what if a user upgrades our app, but not the system?), forward compatibility (try to keep up with the general thrust of evolution and prepare the code for “easy” updates), and of course to fix the existing bugs.
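In Apple-land, the backward-compatibility part translates to runtime version checks sprinkled all over the code. A minimal Swift sketch of the dance (shinyNewAPI and crustyOldWorkaround are hypothetical stand-ins):

    // One code path for users who upgraded the system,
    // a fallback for those who didn't.
    func shinyNewAPI() { /* the nice modern call */ }
    func crustyOldWorkaround() { /* what we did before */ }

    func refresh() {
        if #available(iOS 13, macOS 10.15, *) {
            shinyNewAPI()
        } else {
            crustyOldWorkaround()
        }
    }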

This obviously adds a lot of overhead to our work, which is very hard to convey. “It’s the fault of that OS maker”, or “yeah but they broke it in that version of the lib”, is something a fellow developer might accept as an excuse, but a paying customer?

And we can’t blame them! What if someone told you your car could only go on this freeway or in this town, but nowhere else, until they upgraded the road? Especially if you can’t actually see the difference…

So we’re stuck between dependencies that evolve according to their own constraints and objectives, and users who legitimately want the thing they paid for to work. Not to mention the apparently dreadful one-star review (for me, it’s mostly people who don’t read the instructions, then complain about things that are stated somewhere, but hey…): it’s a reputation do-or-die. And our code is literally filling up with workarounds for this or that version of the OS, or of the libs we depend on.

Why do I go on about this anyway? Well, two things: Yoann (@ygini on twitter) and I both had to upgrade our ports on a BSD, which is long and/or painful because each lib depends on at least 3 others, and I tried to fix a few bugs in an old project of mine that I will definitely release someday. Both took wayyyyyyyyyyy longer than any “regular user” would accept. We are talking days or weeks of down or crippled-service time here.

I am still looking for a way to make sure people who depend on me can tell the difference between a lame excuse (“they changed something, so it doesn’t work anymore”) and a good excuse (“they changed something, so it doesn’t work anymore”), and I will take any good advice on the topic.

  

Rule Of Thumb

Principles, rules and laws are essentially the same thing. I won’t bother you with a paraphrase of my recent reading list, which includes Plato, Kepler and Douglas Adams, but for a freelancer, it’s important to differentiate which is which, especially for the Other Guy.

A principle is a lighthouse on the horizon, and it’s OK to veer left and right, or even to ignore it altogether. That’s one end of the spectrum. At the other end, you have the Law, which, to quote Morpheus, will break you if you try to break it (and get caught, obviously).

There are varying degrees of rules in between, from the rule of thumb to the house rule. Which apparently is akin to law. Or so I’m told.

Moving on…

Developing a program is kind of a ninja split between the two: some rules are unbreakable, because of maths, and contracts, and stuff, and some people try to impose on us rules that can (and sometimes should) be gladly ignored. Just look at the interface designs that blatantly ignore some rule somebody somewhere decreed, and look just plain awesome. Right?

I took a roundabout way to make that point, but programmers tend to consider rules with a clear downshift on the “have to” slider.

But while computers are very attached to their governing rules, humans will go a long way to actually enforce theirs. Case in point: you’re asked to make a mockup app that will illustrate some concept or other. Sometimes that’s about as easy as making a working prototype, so we bend the Prime Beancounter Directive: we go beyond what’s asked. But it’s not what was covered in the Contract. So we don’t get paid. Or at least it’s very hard to.

So the appreciation of this particular rule was apparently wrong.

The problem is twofold: the rigidity of the rule’s expression, and whether the Other Person respects the spirit of the rule rather than the letter of it.

For the second part, it’s a lot easier to hide behind wording and you-have-to-s than to imagine what the intent of the rule is. That’s how we get “warning: hot” on coffee cups (wait, what? I specifically ordered a lukewarm boiled cup of coffee, not that seemingly delicious cup of joe!), or “do not dry your pet in it” on microwaves (I won’t even bother). As weird as it sounds, stupidity is foolproof. Adhering completely to blatantly stupid explicit rules is what makes the world tick smoothly, it seems. For more on that, see Miss Susan vs Jason in Thief of Time:

You soon learned that ‘No one is to open the door of the Stationery Cupboard’ was a prohibition that a seven year-old simply would not understand. You had to think, and rephrase it in more immediate terms, like, ‘No one, Jason, no matter what, no, not even if they thought they heard someone shouting for help, no one – are you paying attention, Jason? – is to open the door of the Stationery Cupboard, or accidentally fall on the door handle so that it opens, or threaten to steal Richenda’s teddy bear unless she opens the door of the Stationery Cupboard, or be standing nearby when a mysterious wind comes out of nowhere and blows the door open all by itself, honestly, it really did, or in any way open, cause to open, ask anyone else to open, jump up and down on the loose floorboard to open or in any other way seek to obtain entry to the Stationery Cupboard, Jason!’

Loophole. The Dreaded Word by the Rulemakers. The Golden Sesame for the Rulebreakers.

But the power of a loophole relies solely on the rule being rigid to the point of absurdity. Of course, there should be an unbreakable rule stating that no one is allowed to come to my home and take my hard-won computer for themselves. Of course there should be one making it possible to tell a power-hungry person that they overstep said power.

I guess the whole point is finding out where a rule protects a group of people from others, and also from themselves. But when breaking a rule makes something better for everyone and we still treat it as a transgression, that is the epitome of everything that’s wrong with our reasoning abilities.

And yet… I hear some of you thinking along the lines of “yeah, but if some rules should be put aside, isn’t that an argument that there should be no rules at all, at least for some people?”. Strictly respecting all the rules makes it easier to have others respect all the rules as well, right?

Wrong.

Again, I think it’s a matter of harm. If by breaking a rule you harm no one (including yourself) in any way (except maybe their ego, but that has nothing to do with anything), then the rule is stupid. And should be ignored. Like, say, going beyond expectations. Actually, breaking a stupid rule should be grounds for an award, a compensation, something stating “wow, that rule was stupid. This awesome programmer deserves a raise. And he’s so cute too… <fawns>”.

Ahem. Anyways…

So then, I hear you think from here on my spaceship: how do you know you’re doing no harm? To anyone?

Dude, the daily personal and professional interactions we have are rarely a matter of life and death for entire nations. Business laws are supposed to protect me from getting screwed over by customers with no scruples, not to prevent me from doing my job better than I’m supposed to. Fact is, most of the time, to enforce a “common sense” rule (getting paid for a job), I have to go through stupid rules first. And since the Other Guy is usually better equipped than I am to handle these first stupid hurdles, they win the race. So it spirals down: stupidity being the most efficient way, it becomes the norm. And we have to enact new rules to kind of balance out the most stupid of our actions, or to close the loophole. Oh wait, another set of stupid rules to follow!

Stupidity is recursive. Thinking is hard.

The end doesn’t justify the means. Life shouldn’t be a permanent chess game either.

  

Wall? What Wall?

The excellent Mike Lee (@bmf on twitter) has a hilarious way of handling theoretical problems: he ignores them to solve them.

In a case of life imitating art imitating life, programmer Mike Lee announced that he had written a solution to the halting problem, offering the simple explanation that, lacking a formal education in computer science, he didn’t realize it was considered unsolvable.

To solve the halting problem is to write a function that takes any function—including itself—and determines whether it will run forever or eventually stop.

The classical approach is to wait and see if the function halts, but the necessity to accept itself as input means it will end up waiting for itself.

This paradox is what makes the problem unsolvable, but Lee’s function avoids the paradox by using a different approach entirely.

“It simply returns true,” Lee explained. “Of course it halts. Everything halts. No computer is a perpetual motion machine.”
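For completeness, here is the solver in its entirety (my Swift transcription of the joke):

    // Mike Lee's halting-problem "solver", in full:
    // everything halts, therefore return true.
    func halts(_ program: () -> Void) -> Bool {
        return true  // no computer is a perpetual motion machine
    }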

That being said, the scientists vs engineers problem is an old one. Computer science started out as a branch of mathematics, and was treated as such for the longest time. When I was in college, we didn’t have any exam on an actual machine. It was all pen and paper!

Part of the problem with any major “it can’t be done” roadblock is the sacrosanct “it’s never been done before” or “such-and-such guys have said it can’t be done”. The truth, though, is that technology and knowledge make giant leaps forward these days, mostly because of people like Mike who just want to get things done.

Just remember that a few decades ago, multi-threading was science fiction. Nowadays, any programmer worth their salt can have a built-in “hang detector” to monitor whether a part of their program is stuck in an infinite loop or has exited abnormally. Hell, it’s hard to even buy a single-core machine!
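Such a hang detector, in its simplest form, is just a watchdog expecting a periodic heartbeat from the monitored code. A minimal Swift sketch (the timeout value and the threading choices are arbitrary):

    import Foundation

    // The worker must call heartbeat() regularly; if it goes quiet
    // for longer than `timeout`, we declare it hung.
    final class Watchdog {
        private var lastBeat = Date()
        private let lock = NSLock()

        init(timeout: TimeInterval, onHang: @escaping () -> Void) {
            // Poll on a background queue; cheap enough for a sketch.
            DispatchQueue.global().async { [weak self] in
                while true {
                    Thread.sleep(forTimeInterval: timeout / 2)
                    guard let self = self else { return }
                    self.lock.lock()
                    let silence = Date().timeIntervalSince(self.lastBeat)
                    self.lock.unlock()
                    if silence > timeout { onHang() }
                }
            }
        }

        func heartbeat() {
            lock.lock()
            lastBeat = Date()
            lock.unlock()
        }
    }

    // Usage: create one, then call dog.heartbeat() from the monitored loop.
    let dog = Watchdog(timeout: 2) { print("main loop seems hung") }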

I distinctly remember sitting in a theoretical computer science class, listening to a lesson on Gödel numbering. To oversimplify what I was hearing, the theorem was about how any program could be represented by a single number, however long. About 5 minutes in, I was saying in my head: “duh, it’s called compiling the program”. Had I said that out loud, though, I’d probably have gotten into a lot of trouble.
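To illustrate the “duh” moment, here is a toy version of the idea, reading a program’s bytes as the digits of one big number. (A real Gödel numbering uses prime factorizations, and a real program wouldn’t fit in a UInt64, but the principle stands:)

    // Toy Gödel numbering: treat a program's UTF-8 bytes as base-256
    // digits of a single number. Only safe for sources of 8 bytes or
    // fewer here, since we cheat with a fixed-width UInt64.
    func toyGodelNumber(_ source: String) -> UInt64 {
        var n: UInt64 = 0
        for byte in source.utf8 {
            n = n * 256 + UInt64(byte)  // shift one "digit" left, append byte
        }
        return n
    }

    print(toyGodelNumber("ret"))  // 7497076: one number, one (tiny) program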

Don’t get me wrong, though: I think mathematical analysis of computer programs is important and worthwhile. I’d love to be able to talk about optimization with a whole lot more people (like how you just don’t use an O(n³) sorting algorithm, please…). But whereas I see it as a valuable tool to prove something positive, I stop listening whenever something is deemed impossible.

Trust Mike (and to a lesser extent me) on this: if something is impossible, it’s probably because the right tools haven’t been used yet. Maybe they don’t exist. And I’m ready to acknowledge that there is a probability they won’t exist any time soon. But “never” is a long time for anything to (not) happen.

UPDATE: it seems people are linking this with the skeuomorphism rant from before. True, it does ring familiar: we do things the way we’ve always done them, because we can’t do otherwise. Right?