As part of our ongoing efforts to document engineering challenges at Realm, this is an account of our recent work to support fast and seamless sharing of Realm files between multiple processes on Cocoa. A necessary feature for most iOS 8 App Extensions, including Apple Watch Extensions, it proved surprisingly difficult to implement.
Thomas Goyne details four attempts at a solution.
Whenever a write transaction is committed, we need to notify everything that has the same file open, signaling that new data is available. This lets you do very useful things, like immediately update your UI after a write is made on a background thread. When we first launched, we only supported opening a Realm file in a single process at a time, which made this quite simple. We looped over all of the RLMRealm instances for the Realm file, enqueued a notification to be processed on their associated thread, and that was it. When we added support for opening a Realm file in multiple processes at once, we wanted the multi-process case to appear identical to the user, with no loss of performance or explicit setup required. Needless to say, things became a little more complicated.
Complicated, but fairly easy to achieve: notifying other processes when an event occurred is the simplest form of inter-process communication, with countless well-known ways to achieve it. But where there is complexity, there are often complications.
A First Attempt: shared pthread condition variables
We started with the obvious solution of a shared pthread condition variable, since condition variables have exactly the semantics we needed. A pthread_cond_t can be placed in shared memory and used from multiple processes at once. We already had several shared mutexes in the .lock file, for things like ensuring that there’s only a single write transaction at a time, so initially, we just added a condition variable and went on our way.
Unfortunately for us, this only works on Darwin when the shared memory is created in a single parent process that fork()s, and not when two entirely independent processes open the shared memory.
pthread_cond_wait takes both a locked mutex and a condition variable to wait on, atomically unlocking the mutex to wait on the condition variable, then re-locking the mutex before returning when the condition variable is signaled.
POSIX tells us that waiting on a single condition variable with multiple mutexes is undefined behavior, and may have dangerous unintended effects. On Darwin, pthreads tries to help, returning an error if you attempt to wait on a single condition variable with two different mutexes. This check is done by storing a pointer to the mutex used, and verifying that everything else waiting on the condition variable passes in the same pointer. Unfortunately, this check doesn’t work with mutexes stored in shared memory — a single mutex may be mapped to different virtual addresses by each of the separate processes which share the condition variable and mutex.
We were able to circumvent the pointer check by accessing the private pthread_cond_t data structures, clearing the pointer to the last used mutex before waiting on the condition variable. Though this works with the current version of libpthreads, it had the potential to break catastrophically with new releases of iOS and OS X, so we decided not to ship it. The bug was reported as rdar://19600706.
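For readers unfamiliar with process-shared pthread primitives, the following is a minimal sketch of the approach described in this section, not the code Realm used. The shared_notifier struct, its version counter, and the function names are hypothetical. The struct is assumed to live in memory that every participating process has mapped, such as a region of the .lock file. On Darwin, this pattern runs into the mutex pointer check described above as soon as unrelated processes map the memory at different addresses.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical layout for the shared notification state; the names are
 * illustrative and not taken from Realm. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    uint64_t        version;  /* bumped on every commit */
} shared_notifier;

/* Initialise the mutex and condition variable as process-shared.
 * Call once, from whichever process creates the shared mapping. */
int shared_notifier_init(shared_notifier *n) {
    pthread_mutexattr_t mattr;
    pthread_condattr_t cattr;
    int err;

    if ((err = pthread_mutexattr_init(&mattr)) != 0) return err;
    err = pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED);
    if (err == 0) err = pthread_mutex_init(&n->mutex, &mattr);
    pthread_mutexattr_destroy(&mattr);
    if (err != 0) return err;

    if ((err = pthread_condattr_init(&cattr)) != 0) return err;
    err = pthread_condattr_setpshared(&cattr, PTHREAD_PROCESS_SHARED);
    if (err == 0) err = pthread_cond_init(&n->cond, &cattr);
    pthread_condattr_destroy(&cattr);
    if (err != 0) return err;

    n->version = 0;
    return 0;
}

/* Writer side: record the commit and wake every waiter. */
int shared_notifier_commit(shared_notifier *n) {
    int err = pthread_mutex_lock(&n->mutex);
    if (err != 0) return err;
    n->version++;
    pthread_cond_broadcast(&n->cond);
    return pthread_mutex_unlock(&n->mutex);
}

/* Reader side: block until the version differs from the one last seen.
 * pthread_cond_wait atomically releases the mutex while waiting and
 * re-acquires it before returning. */
uint64_t shared_notifier_wait(shared_notifier *n, uint64_t last_seen) {
    pthread_mutex_lock(&n->mutex);
    while (n->version == last_seen)
        pthread_cond_wait(&n->cond, &n->mutex);
    uint64_t current = n->version;
    pthread_mutex_unlock(&n->mutex);
    return current;
}
```

A writer would call shared_notifier_commit after each write transaction, while each listener loops on shared_notifier_wait, passing in the version it last observed.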
A Second Attempt: POSIX semaphore
Looking elsewhere, for our second attempt we employed a POSIX semaphore. Emulating a condition variable with a semaphore is fairly straightforward, and our initial implementation worked as hoped on OS X. However, when running inside the app sandbox, the name of the named semaphore has to be prefixed with the application group identifier. There is a very short limit on the length of semaphore names (at most 31 characters), and a standard reverse-DNS application group identifier will often be over 31 characters by itself. We also needed extra characters to include information linking it to the related file. Clearly, this was far too restrictive, so the idea was quickly dropped.
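To make the length problem concrete, here is a small sketch that builds a sandbox-style semaphore name and rejects it if it exceeds the limit. The 31-character limit comes from the text above. The make_sem_name helper, the "/" separator between the group identifier and the suffix, and the example identifiers are assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Limit on semaphore name length, as stated in the text above. */
#define SEM_NAME_LIMIT 31

/* Hypothetical helper: build "<group id>/<suffix>" into out.
 * The "/" separator is an assumption about the sandbox naming rule.
 * Returns the name length, or -1 if the name would exceed the limit
 * or would not fit in the output buffer. */
int make_sem_name(const char *group_id, const char *suffix,
                  char *out, size_t out_size) {
    size_t needed = strlen(group_id) + 1 + strlen(suffix);
    if (needed > SEM_NAME_LIMIT || needed + 1 > out_size)
        return -1;
    snprintf(out, out_size, "%s/%s", group_id, suffix);
    return (int)needed;
}
```

With a realistic reverse-DNS group identifier such as the made-up "group.io.realm.examples.extension" (33 characters), the helper fails before any per-file suffix is even added, which is the situation described above.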
A Third Attempt: libnotify
For our third attempt, we turned to libnotify, the C version of NSDistributedNotificationCenter. For once, there were no surprises! The API worked exactly as described, and a functioning solution was quick to construct. There, the good news ended. Performance fell far short of our goal: low-latency bidirectional communication between processes. At its worst, libnotify was an order of magnitude slower than a condition variable. If your use case isn’t especially latency-sensitive, we’d still recommend it as an excellent choice.
A Side Note: Mach messages
We briefly investigated using Mach messages. Two downsides quickly ruled them out:
As with POSIX named semaphores, sharing Mach ports between processes in an app sandbox requires the port name to be prefixed with the application group identifier. Mach ports don’t have a restrictive limit on the length of the name, but there was no good way to get the app group ID from a path within that app group’s container. The path doesn’t contain the ID, there’s no function to get the ID from the path, and there’s no way to find out which app groups an application is part of at runtime. We would have had to require that the user explicitly pass in the app group ID for shared files, or otherwise detail the app groups an app is part of. In short, things wouldn’t “just work”.
Mach ports can only have a single reader at a time. Ideally, we wanted multicast notifications. With multicast notifications, the process committing a write transaction doesn’t have to monitor the number of processes listening for changes (if any). This greatly simplifies the implementation, and eliminates several opportunities for problematic race conditions. With Mach ports, we would need to keep an array of port names in the shared lock file, and synchronize the active ports between every process whenever a listener was added or removed.
On our fourth attempt, we found a solution! Our answer: use a named pipe in conjunction with kqueue. We create and manage a named pipe in the directory containing the Realm file, then use kqueue to wait for data to be written to the pipe. Whenever a write transaction is committed, we write a single byte to the pipe, which wakes up everything waiting for data to read. As of Realm Swift v0.92 and Realm Objective-C v0.91, simply put the Realm file in your app group container (on iOS or in sandboxed OS X applications), and the named pipe will be automatically created in the same directory. Everything works exactly the same as when you open a Realm file on multiple threads, with no sandboxing complications, and with performance as fast as a condition variable. It’s a solution we’re very happy with!
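The general shape of a named-pipe notifier can be sketched as follows. This is an illustration under stated assumptions rather than Realm's implementation: the function names are hypothetical, and poll(2) stands in for kqueue so that the sketch also builds on systems without kqueue. On Darwin, an EVFILT_READ event registered with kevent would play the role that poll plays here. The sketch also drains the pipe after waking, which suits a single listener; supporting several listeners on one pipe needs more care than is shown.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create the named pipe if it does not exist yet, then open it.
 * O_RDWR keeps the open from blocking until a peer appears; POSIX
 * leaves O_RDWR on a FIFO undefined, so this is an assumption about
 * platform behaviour (it is supported on Linux and Darwin). */
int notifier_open(const char *fifo_path) {
    if (mkfifo(fifo_path, 0600) != 0 && errno != EEXIST)
        return -1;
    return open(fifo_path, O_RDWR | O_NONBLOCK);
}

/* Writer side: one byte per commit is enough to wake a waiter. */
int notifier_post(int fd) {
    char byte = 0;
    ssize_t written = write(fd, &byte, 1);
    /* A full pipe (EAGAIN) means a wakeup is already pending. */
    if (written == 1 || (written < 0 && errno == EAGAIN))
        return 0;
    return -1;
}

/* Listener side: wait until the pipe is readable or the timeout
 * expires. Returns 1 on notification, 0 on timeout, -1 on error. */
int notifier_wait(int fd, int timeout_ms) {
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready <= 0)
        return ready;
    /* Drain pending bytes so the next wait blocks again. */
    char buf[64];
    while (read(fd, buf, sizeof buf) > 0) {
    }
    return 1;
}
```

Because the pipe is just a file in the same directory as the data file, this approach needs no sandbox-specific naming, which is the property the paragraph above relies on.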
If you want to know more, you can see the full implementation on GitHub.
You can use Inter-Process Communication in Realm Swift v0.92 onwards (download the latest version, or read the docs), and Realm Objective-C v0.91 onwards (download the latest version, or read the docs).
About the content
This content has been published here with the express permission of the author.