CoreData is a very sensitive topic. Here and elsewhere, it’s a recurring theme. Just last week I had a hair-pulling problem with it that was solved in a ridiculous manner. I’ll document it later for future reference.
This week, triggered by an article on The Verge, the spotlight fell once again on the difficulties of that technology, namely that it just doesn’t work with iCloud, which by all other accounts works just fine.
It is kind of frustrating (yet completely accurate) to hear from pundits and users that iCloud just works for them for most things, especially with Apple’s own products, while CoreData-based apps work unreliably, if at all. People who haven’t actually tried to make it work tend to assume it’s somehow the developer’s fault for not supporting it. Hence this article on The Verge, which highlights the fact that it’s not the developer’s fault. The intent is good, but unfortunately it doesn’t solve anything, since it mostly wags a finger at Apple without explaining the problem.
But what is the actual problem?
CoreData is a framework for storing an application’s data in an efficient (hopefully) and compact way. It was introduced in 2005 for a very simple purpose: stopping developers from storing stuff on the user’s disk in “messy” ways. By giving access to a framework that would help keep everything tidied up in a single (for the “messy” part) database (for the “efficient” part), Apple essentially said that CoreData was a solution to pretty much every storage ailment that plagued applications: custom file formats that could be ugly and slow, the headache of having “relationships” between parts of documents that would end up mangled or inefficient, etc.
The underlying tenet is that CoreData is a simplification of storage techniques, maintained by Apple and therefore reliable. And for the most part, it is reliable and efficient.
iCloud, on the other hand, addresses another part of the storage problem: syncing. It is a service/framework meant to make the storage on every device a user owns behave like one and the same storage space. Meaning, if I create a file on device A, it is created on B and C as well. If I modify it on C, the modification is echoed on A and B without any user interaction. Behind the scenes, the service keeps track of the modifications in the storage it’s responsible for and pushes them through the network; based on the last modification date and some other factors, every device then decides which files on disk to replace with the ones “in the cloud”. The syncing problem is a hard one because of all the fringe cases (what if I modified a file on my laptop, closed the laptop before it sent anything, then made another modification on my iPad? Which version is the right one? Can we mix them safely?), but for small and “atomic” files, it works well enough.
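To make that fringe case concrete, here is a minimal sketch of a "newest modification date wins" rule. It is not how iCloud actually decides (the "other factors" are ignored entirely), and `FileVersion` and `pick_winner` are hypothetical names made up for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileVersion:
    """One device's copy of a file (hypothetical type, for illustration only)."""
    device: str
    modified_at: float  # seconds since some shared epoch
    content: bytes


def pick_winner(local: FileVersion, remote: FileVersion) -> FileVersion:
    """Naive rule: keep whichever copy was modified last; ties keep the local copy."""
    return remote if remote.modified_at > local.modified_at else local


# The laptop edit was made first but never uploaded; the iPad edit came later.
laptop = FileVersion("laptop", modified_at=100.0, content=b"edit made on the laptop")
ipad = FileVersion("ipad", modified_at=200.0, content=b"edit made on the iPad")

winner = pick_winner(laptop, ipad)
print(winner.device)  # the later edit wins; the laptop's change is simply dropped
```

For a small, atomic file that is usually acceptable: one whole version replaces the other. Nothing in this rule can "mix" the two edits, which is exactly the question raised above.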
iCloud is a simplification of syncing techniques maintained by Apple, and therefore reliable, to keep the tune playing. And for the most part, it does work as advertised.
But when you mix the two, it doesn’t work.
When you take a look at the goals of the two technologies, you can see why it’s a hard problem to solve: CoreData aims at making a monolithic “store-it-all” file for coherence and efficiency purposes, while iCloud aims at keeping a bunch of files synchronized across multiple disks, merging them if necessary. These two goals, while not completely opposed, are at odds: ideally, iCloud should sync the difference between two files.
But with a database file, that’s hard. It’s never just a couple of bytes that are modified: it’s the whole coherence-tracking metadata, plus all the objects referenced by the actual modification. Basically, if you want to be sure, you have to upload and replace the whole database. Because, once again, the goal of CoreData is to be monolithic and self-contained.
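A rough way to see this for yourself is the sketch below. It uses a plain SQLite file as a stand-in for a CoreData store, so the schema, the table name `note` and the helper names are all assumptions made for the illustration, not CoreData's real on-disk layout. It renames a single row, then compares the file before and after.

```python
import hashlib
import os
import sqlite3
import tempfile


def snapshot(path):
    """Read the whole database file into memory."""
    with open(path, "rb") as f:
        return f.read()


def changed_pages(before, after, page_size):
    """Count fixed-size pages whose bytes differ between two snapshots."""
    total = max(len(before), len(after))
    return sum(
        before[i:i + page_size] != after[i:i + page_size]
        for i in range(0, total, page_size)
    )


def run_demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "store.sqlite")

        # Build a small database: one table, one index, 2000 rows.
        con = sqlite3.connect(path)
        con.execute("CREATE TABLE note (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
        con.execute("CREATE INDEX note_title ON note (title)")
        con.executemany(
            "INSERT INTO note (title, body) VALUES (?, ?)",
            [(f"note {i}", "x" * 200) for i in range(2000)],
        )
        con.commit()
        page_size = con.execute("PRAGMA page_size").fetchone()[0]
        con.close()
        before = snapshot(path)

        # The "small" modification: rename exactly one row.
        con = sqlite3.connect(path)
        con.execute("UPDATE note SET title = 'renamed' WHERE id = 1000")
        con.commit()
        con.close()
        after = snapshot(path)

    return {
        "file_size": len(after),
        "pages_changed": changed_pages(before, after, page_size),
        "digest_changed": hashlib.sha256(before).digest() != hashlib.sha256(after).digest(),
    }


print(run_demo())
```

A sync service that treats the store as one opaque file only sees that the digest changed, so it has to move `file_size` bytes for a one-row edit. Even a page-aware service would have to know which pages belong together to keep the database coherent on the other side.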
The iCloud philosophy would call for incremental change tracking to be efficient: the original database, then the modification sets, ideally in separate files. The system would then be able to sync “upwards” from any given state to the current one, by playing the sets one by one until it reaches the latest version.
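As a minimal sketch of that idea, assuming a store that can be modelled as a plain dictionary and change sets numbered in order (the `replay` function and the change-set layout are hypothetical, not anything Apple ships):

```python
def replay(state, change_sets, from_version=0):
    """Bring `state` (already at `from_version`) up to date by applying newer change sets in order."""
    result = dict(state)  # never mutate the caller's copy
    for change in sorted(change_sets, key=lambda c: c["version"]):
        if change["version"] <= from_version:
            continue  # this device has already applied that set
        for key, value in change["set"].items():
            result[key] = value
        for key in change["delete"]:
            result.pop(key, None)
    return result


# The original database, then the modification sets, each one a small separate unit.
base = {"note:1": "groceries", "note:2": "call mom"}
change_sets = [
    {"version": 1, "set": {"note:3": "book flights"}, "delete": []},
    {"version": 2, "set": {"note:1": "groceries + wine"}, "delete": ["note:2"]},
]

# A brand new device starts from the base and replays everything.
print(replay(base, change_sets))

# A device that already applied set 1 only needs set 2.
at_v1 = replay(base, change_sets[:1])
print(replay(at_v1, change_sets, from_version=1))
```

Each change set is small and atomic, which is the kind of file iCloud handles well. The hard part, which this sketch ignores, is what to do when two devices each produce a change set from the same starting version.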
As you can see, a compromise cannot be reached easily. A lot of expert developers I highly respect have imagined a number of ways to make CoreData+iCloud work. Most of them are good ideas. But are they compatible with Apple’s vision of what the user experience should be? Syncing huge files that have been partially modified isn’t a new problem. And it’s one that none of my various version control systems has satisfactorily addressed. Most of them just upload the whole thing.
Just my $.02.