
Data Consistency in an Unpredictable World

Pinterest recently completed a rewrite of the iOS app for a faster, cleaner experience. One of the goals of the re-architecture was to move to a completely immutable model layer. “Immutable models” is a term people hear a lot of these days - many apps have converted to immutability. However, immutability poses challenges in handling data consistency of a complex object graph. In this Swift Language User Group talk, Wendy Lu discusses the motivation behind the Pinterest migration and explores how the new system handles updating models, loading new information from the API, and other data integrity concerns.


Introduction (0:00)

My name’s Wendy. Today I’m going to be talking about data, and how to keep data consistent in this crazy ever changing world that we live in.

A little bit about myself: I work as an iOS developer at Pinterest and this is my first time speaking at Realm. I did give a variant of this talk at Swift Summit. If you were at Swift Summit, some of this might look familiar to you.

I want to thank Realm, first of all, for hosting this meetup. I enjoy coming here, both as a speaker and as an attendee and just meeting all you awesome iOS developers. I go to a lot of meetups like this, and conferences; this is definitely one of my favorites.

Emojicon (0:49)

I got back from a conference a couple months ago in San Francisco, called Emojicon. If you haven’t heard of it, don’t feel bad, I hadn’t heard of it either. Emojicon is described as a multi-day celebration of everything Emoji.

Even after hearing that, I had no idea what it was. It sounded awesome. I had to go check it out. Emojicon is actually a technical conference. They have talks such as “Emoji and deep learning.” I don’t know if I have the AI background for this: I’ll definitely check it out once the videos come online.

They also had a very diverse set of sponsors, such as General Electric and Panda Express. I love Panda Express, so this is perfect for me.

New Emojis (1:53)

One of the most controversial topics at Emojicon was the new iOS 10.2 emojis. We had a lot of people mourning the death of the peach butt, which they brought back in iOS 10.3. Even the most standard data sets, such as Unicode, are constantly changing and constantly updating.

Today, we’ll go over some strategies to manage that type of change and to maintain consistency.

Pinterest Growth (2:33)

In my four years working at Pinterest, we’ve grown from a team of four iOS developers to over three dozen iOS developers. It’s been great, and I love watching the team grow.

With that growth, you need to do a lot to make sure that your app is prepared for the future, and that it can withstand the growth of both your user base and your team.

Network Speed (2:55)

For example, when we started expanding internationally, we realized that not all users were on the best phones or the best networks. Speed became really important for us.

If you’ve ever used your phone on the subway or on the BART train, you know just how frustrating it can be to use an app on that type of slow or spotty network.

For users in some parts of the world, that’s as good as they get; the network that’s so frustrating to you is their status quo, at least for now.

How do we ensure that people have a good experience when using our app, regardless of what country they’re in, what device they’re on, or what type of bandwidth they have?

Network Optimization (3:41)

We started looking everywhere for speed improvements. Network usage was first: optimizing the number of bytes we send in our API requests and responses.

We also looked at app startup time, as well as doing things concurrently. For app startup time, there’s a method, application(_:didFinishLaunchingWithOptions:) in your app delegate. For us developers, that’s often the first hook that we have to run our own code in the app.


When your team gets big enough and you have enough legacy code, that method starts looking something like this:


AVPlayer.resetAVAudioSessionCategoryToDefault()

FBSDKSettings.configureForUseForApplication(application,
withLaunchOptions:launchOptions)

GSDAppIndexing.sharedInstance().registerApp(kAppStoreID)

iRate.configureForUse()

Adjust.appDidLaunch(adjustConfig)

Stripe.configureForUse()

DDLog.addLogger(DDTTYLogger.sharedInstance())
let fileLogger = DDFileLogger()
fileLogger.logFileManager.maximumNumberOfLogFiles = 3
DDLog.addLogger(fileLogger)

PIDeadlockDetector.enable()

PICrash.sharedInstance().configureForUse()

PINRemoteImageManager.configureForUse()

CBLExperienceManager.configureForUse()

NSValueTransformer.setValueTransformer(PIDateValueTransformer(), forName:kPINModelDateValueTransformerKey)

CBLDeepLinkManager.sharedManager().configureServicesWithLaunchOptions(launchOptions)

That is from our code base. We’re initializing several things. We have some third party frameworks like Stripe and Adjust, some debug tools, our crash detector, some manager classes, and various caches.

All those tools were slowing down our app startup. By moving most of the processes to a low priority background queue, we freed up resources on startup to do more crucial tasks: get that initial feed request sent off; get that response parsed, and start showing useful information to the users.

Reduced Startup Time (5:14)

Now on startup, we initialize only what we need to send and receive that initial feed request.

We moved all those other processes to a low priority concurrent queue. With that alone, we reduced our startup time by 50%. In some countries, we were able to get startup time down to a third of the original startup time.
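As a rough illustration of that split, here is a minimal sketch in current Swift syntax. The function `runStartup`, the `StartupLog` helper, and the task descriptions are hypothetical stand-ins, not Pinterest's code, and the `.utility` quality of service is an assumption about what "low priority" means here.

```swift
import Foundation

/// Runs `critical` right away on the calling thread, then hands every
/// `deferred` task to a low-priority concurrent global queue.
/// The returned group lets callers (or tests) wait for the deferred work.
@discardableResult
func runStartup(critical: () -> Void,
                deferred: [() -> Void]) -> DispatchGroup {
    critical()
    let group = DispatchGroup()
    let lowPriorityQueue = DispatchQueue.global(qos: .utility)
    for task in deferred {
        lowPriorityQueue.async(group: group, execute: task)
    }
    return group
}

/// Thread-safe log, used only to make the ordering visible in this sketch.
final class StartupLog {
    private let lock = NSLock()
    private var storage: [String] = []
    func add(_ entry: String) { lock.lock(); storage.append(entry); lock.unlock() }
    var entries: [String] { lock.lock(); defer { lock.unlock() }; return storage }
}

let startupLog = StartupLog()
let startupGroup = runStartup(
    critical: { startupLog.add("send initial feed request") },
    deferred: [
        { startupLog.add("configure crash reporter") },
        { startupLog.add("configure loggers") },
    ]
)
startupGroup.wait()   // only for this demo; a real app would not block here
```

The critical closure always runs first because it executes synchronously before anything is scheduled. The deferred tasks may finish in any order.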

We looked at doing UI work concurrently, and we moved to a framework called AsyncDisplayKit, which allows us to configure our views and lay them out off the main thread.

This is really exciting because we can take advantage of the multiple cores that most modern iOS devices have. It allows us to move our most computationally expensive code–in our case layout code–to background queues.

The Immutable Model Layer (6:07)

With our app becoming more and more concurrent and multi-threaded, we needed to make sure that our model layer, the foundation of our app, was prepared as well. We moved to a completely immutable model layer. Once models are created, they can’t be changed. This is awesome!

In a standard mutable system, with two view controllers, both referencing the same Pin, bad things can happen if View Controller One changes the Pin and View Controller Two doesn’t expect that Pin to change.

What’s more, if View Controller Two is reading from Pin while View Controller One is writing to it, then View Controller Two could potentially read an intermediate value or an invalid value, which could cause our app to crash.

An Example (6:53)

(Slide shows a chat log image.)

Here I’m in a chat with Taylor Swift and Kanye West. Taylor is totally down to be in my Swift Language User Group talk. She is amazing, so of course she would say yes. But Kanye interrupts with, “I’mma let you finish but,” and sends us a cat photo.

Taylor goes, “Wtf, Kanye?” He just sends us some more cat spam. I will have to block Kanye, as cute as these kittens are.

I call this method blockUser(users[1]) to block the user at index 1 of the users array. But before the method gets a chance to execute, another thread comes in and modifies the users array.

Maybe it gets an updated response from a server, or something, and it changes this array on a different thread. Now when blockUser is called, it retrieves the element at index 1, but gets Taylor instead of Kanye. So I end up blocking the wrong person.

Reading the wrong value could lead to potentially awkward situations that I want to avoid.

Immutable is Thread Safe (8:06)

Most of the problem of mutability lies in shared state. In an immutable system, once Pin is initialized, we know that it won’t change out from under us. This way, we can safely have multiple readers, all reading concurrently from the Pin, without worrying about reading an intermediate value or an invalid state.

Immutable models are inherently thread safe. That’s why we moved to an Immutable Model System.

Updating Models (8:36)

Since models can’t ever change, how would we update them or mutate them?

In an immutable system, models can’t be modified once they are created. Thus, the only way to update a model is to create an entirely new instance of a model object. In our code base, we have two ways to do this.

Update Model from JSON (9:02)

The first is to create an updated model from a JSON dictionary. It doesn’t have to be a JSON dictionary but that’s usually the most common case.


{
    "board" = {
        "created_at" = "Tue, 13 Aug 2013 16:38:36 +0000";
        "id" = 418131215342691718;
        "name" = "spaces";
    };
    "comment_count" = 0;
    "description" = "At the top of my wish list for this fall is a giant chunky knit wool blanket.";
    "id" = "AVpd31ttshLHlWbcG9g_Kt3uVzZHjfHNvzwT20p6YnO6qzvQnqs_Z5A";
    "image_square_url" = "https://s-media-cache-ak0.pinimg.com/b58cc94084407a39d62c83885ce4699e.jpg";
}

let pin = Pin(dictionary: pinJSON)

This is pretty straightforward: we have a JSON response that we get from the server for our Pin. A Pin on Pinterest is our most basic model, basically like a post. I can simply initialize a Pin model with this dictionary.
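To make the dictionary initializer concrete, here is a minimal sketch in current Swift. The type `FeedPin`, its property names, and the short ID are hypothetical; only the JSON keys come from the response above. It assumes that the ID is the one required field and that every other field may be absent from a partial response.

```swift
import Foundation

/// Immutable model built from a (possibly partial) JSON dictionary.
struct FeedPin {
    let id: String
    let descriptionText: String?
    let commentCount: Int?
    let imageSquareURL: URL?

    /// Fails only when the ID is missing; every other field is optional.
    init?(dictionary: [String: Any]) {
        guard let id = dictionary["id"] as? String else { return nil }
        self.id = id
        descriptionText = dictionary["description"] as? String
        commentCount = dictionary["comment_count"] as? Int
        imageSquareURL = (dictionary["image_square_url"] as? String)
            .flatMap { URL(string: $0) }
    }
}

// A partial response: no image URL was requested.
let partialPinJSON: [String: Any] = [
    "id": "123",
    "comment_count": 0,
    "description": "A giant chunky knit wool blanket.",
]
let feedPin = FeedPin(dictionary: partialPinJSON)
let pinWithoutID = FeedPin(dictionary: ["description": "no id here"])
```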

Update Model from Builder Object (9:28)

The second way is to create an updated model from a builder object. A builder object is pretty common–a mutable representation of a model that takes on all of that model’s fields.

Here is an immutable Pin with three properties: image URL, title, and board. My Pin builder will take on these exact same values for image URL, title and board.

But because my Pin builder is mutable, I can modify whatever I need. For example, I can change the title from “The Best Pin in the World” to “Meow” and I can initialize a new Pin with this new builder object.


let pin = Pin(builder:pinBuilder)
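The builder round trip can be sketched as follows. This is a simplified illustration in current Swift, not the generated Pinterest model: the three properties mirror the ones named above, and the `init(pin:)` initializer on the builder is an assumption about how the builder picks up the existing values.

```swift
import Foundation

/// Immutable model: every property is a `let`, so an instance never changes.
final class Pin {
    let imageURL: URL?
    let title: String?
    let board: String?

    init(builder: PinBuilder) {
        imageURL = builder.imageURL
        title = builder.title
        board = builder.board
    }
}

/// Mutable counterpart that carries the same fields.
final class PinBuilder {
    var imageURL: URL?
    var title: String?
    var board: String?

    init() {}

    /// Starts from the values of an existing immutable Pin.
    init(pin: Pin) {
        imageURL = pin.imageURL
        title = pin.title
        board = pin.board
    }
}

let firstBuilder = PinBuilder()
firstBuilder.title = "The Best Pin in the World"
firstBuilder.board = "spaces"
let originalPin = Pin(builder: firstBuilder)

// "Updating" means copying into a builder, editing, and creating a new Pin.
let editBuilder = PinBuilder(pin: originalPin)
editBuilder.title = "Meow"
let updatedPin = Pin(builder: editBuilder)
```

The original Pin is untouched. Anyone still holding it keeps reading "The Best Pin in the World".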

Loading and Caching (10:18)

Let’s address loading and caching data from our server. We’ve all heard a variant of this quote, “There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors.”

We already talked about off-by-one errors, so let’s talk about caching.

Our API allows us to request partial JSON models from the server, with a subset of that model’s fields. For example in the Pin feed view, we need fields such as the image URL, the description, and the user. We don’t need the full Pin information, such as recipe information until the user actually taps to navigate to this detail view.

Requesting these partial JSON models cuts down on the amount of data we’re sending over the API, and it also cuts down on backend processing time spent retrieving unnecessary fields from the server.

PINCache (11:22)

We make this work in our new immutable system by keeping a central model cache based on PINCache, which is an object cache that we built and open sourced. The keys to this cache are the unique server-specified IDs of our models.

(Slide shows PinJSON with ID, image_url, and recipe fields. Cache model has a matching ID field, a different image_url field, and a board field)

When I receive a new server response, I’ll first check the cache to see if there is a model with the same ID. If there is an existing model, I merge the new server response with the existing cache model.

In this example, both the server response and the existing cache model have an ID of 123, so my merge model will take on an ID of 123.

Both PinJSON and cache have an image_url but the image_url fields are different, so we use the more recent response.

Only the cache model has a board field, so my merge model will take on that value for board. Finally, only the new server response has a recipe field. Therefore, my model will take on that value for recipe.

Finally, we insert this merge model back into the cache. In this way, we can make sure that the cache always contains the most recent superset of all the fields that we’ve ever requested from the server.
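The merge step can be sketched like this. It is a simplified illustration in current Swift: `CachedPin` and `ModelCache` are hypothetical names, a plain dictionary stands in for the real cache, and the field values are made up to mirror the slide. It assumes that a nil field means "not included in this response".

```swift
import Foundation

/// Immutable model whose optional fields may be missing from a partial response.
struct CachedPin: Equatable {
    let id: String
    let imageURL: String?
    let board: String?
    let recipe: String?

    /// Returns a new model that prefers the fields of `newer`
    /// and falls back to this model's fields where `newer` has none.
    func merging(_ newer: CachedPin) -> CachedPin {
        precondition(id == newer.id, "only models with the same ID can be merged")
        return CachedPin(id: id,
                         imageURL: newer.imageURL ?? imageURL,
                         board: newer.board ?? board,
                         recipe: newer.recipe ?? recipe)
    }
}

/// In-memory dictionary keyed by server ID, standing in for the real cache.
final class ModelCache {
    private var storage: [String: CachedPin] = [:]
    private let lock = NSLock()

    /// Merges `incoming` with any cached model of the same ID,
    /// stores the result, and returns it.
    @discardableResult
    func insertMerging(_ incoming: CachedPin) -> CachedPin {
        lock.lock(); defer { lock.unlock() }
        let merged = storage[incoming.id]?.merging(incoming) ?? incoming
        storage[incoming.id] = merged
        return merged
    }
}

let modelCache = ModelCache()
// Already cached: has a board, no recipe.
modelCache.insertMerging(CachedPin(id: "123", imageURL: "old.jpg", board: "spaces", recipe: nil))
// New server response: newer image URL, a recipe, no board.
let mergedPin = modelCache.insertMerging(
    CachedPin(id: "123", imageURL: "new.jpg", board: nil, recipe: "soup"))
```

One limitation of this sketch: it cannot tell a field that was omitted from a field the server explicitly cleared. A real implementation would need to track which fields were present in the response.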

Merge After Initialization (12:55)

We perform this merge step each time after initializing a model, either with a dictionary or a builder.

We also have one more way to initialize the model, initWithCoder, which is basically just for NSSecureCoding conformance, but dictionary and builder are the most common.

Because we merge after every initialization, we can hold that cache as a source of truth, and we know that the cache will always contain the most recent fields.

Observing For Changes (13:28)

Now that we have a way to load data from the API and update models, we need a way to alert other parts of the app, such as our views or our view controllers, that a model they care about might have changed. These display our data, and need to be updated when a model that they depend on has changed.

Previously, we used Key-Value Observing for this. KVO does not work in an immutable system, because you can only observe on one instance of a model object, which would not be changing anyway.

NSNotification (14:05)

We switched to an NSNotification-based system to do this.

Our profile view controller is an example. (See Taylor Swift slide.) It needs to observe on its user object because an updated user object might have updated fields such as Pin count, or follower count, which need to be reflected in the UI.

Our developers interact with the following method to do this, using a helper class, called notification manager, which simply calls addObserver for the updated model.


notificationManager.addObserverForUpdatedModel(user, block:{ (NSNotification) in
// Update profile view here!
})

Under the hood, it uses the NSNotificationCenter block-based API. If you haven’t heard of the block-based API, it’s very similar to the standard selector-based API, except instead of passing in a selector, you pass in a block. That block will get executed whenever the notification is fired.

It’s a long story how we chose the block-based API. Basically, there are some subtle race conditions in notification center pre-iOS 9. These usually happen when using notification center in a multi-threaded environment, and they are, in practice, much harder to hit using the block-based API.

Block Based API (15:18)

If we take a closer look at the method signature for this API, we’ll see that it actually returns an object conforming to NSObjectProtocol.


NSNotificationCenter.defaultCenter().addObserverForName("name", object: nil, queue: nil) { note in
    // …
}

public func addObserverForName(name: String?, object obj: AnyObject?, queue: NSOperationQueue?, usingBlock block: (NSNotification) -> Void) -> NSObjectProtocol

If you dig a bit deeper, you’ll see that these objects are actually of a private class called __NSObserver. These __NSObserver objects are the actual registered listeners that notification center is using.

Unregistering an Observer (15:43)

With the standard API, to unregister an observer, you would call removeObserver on NSNotificationCenter and pass in self.


NSNotificationCenter.defaultCenter().removeObserver(self)

With the block-based API, you call removeObserver and pass in the returned observer object instead.


NSNotificationCenter.defaultCenter().removeObserver(observer)

Notification Manager Helper Class (15:57)

We do need to keep track of these returned observer objects because we’re going to be unregistering them later. This is easy, however. Remember the Notification Manager Helper Class?


notificationManager.addObserverForUpdatedModel(user, block:{ (NSNotification) in
// Update profile view here!
})

It keeps a strong reference to all of those returned observers.

Let’s take a closer look at this class.


class NotificationManager: NSObject {
    private var observerTokens: [String: AnyObject] = [:]

    deinit {
        unregisterAll()
    }
    
    func unregisterAll() {
        for token in observerTokens.values {
            NSNotificationCenter.defaultCenter().removeObserver(token)
        }
    }
}

The Notification Manager Helper Class has a dictionary called observerTokens, which maps the notification name that we registered to the returned observer object.

Because this notification manager object is meant to be used as a property of the view or the view controller that’s doing the observing, we know that the deinit of this class, or this object will get called right after the deinit of the view controller.

Because the view controller is the only one holding a reference to it, this object will be de-allocated right afterwards. In the deinit, we can loop through our observer tokens dictionary and unregister all our observers.
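Putting registration and cleanup together, a token-holding helper might look like the sketch below. It uses current Swift naming (`NotificationCenter` rather than `NSNotificationCenter`). The class name `ObserverTokenManager`, its `addObserver(forName:using:)` method, and the injected center are assumptions made for illustration, not the actual Pinterest class.

```swift
import Foundation

/// Holds the tokens returned by the block-based API and removes them on deinit.
final class ObserverTokenManager {
    private var observerTokens: [String: NSObjectProtocol] = [:]
    private let center: NotificationCenter

    init(center: NotificationCenter = .default) {
        self.center = center
    }

    /// Registers `block` for `name`, replacing any earlier registration
    /// made through this manager under the same name.
    func addObserver(forName name: String, using block: @escaping (Notification) -> Void) {
        if let existing = observerTokens[name] {
            center.removeObserver(existing)
        }
        observerTokens[name] = center.addObserver(
            forName: Notification.Name(name), object: nil, queue: nil, using: block)
    }

    func unregisterAll() {
        for token in observerTokens.values {
            center.removeObserver(token)
        }
        observerTokens.removeAll()
    }

    deinit {
        unregisterAll()
    }
}

/// Small reference box so the demo closure has something to mutate.
final class CallCounter { var value = 0 }

let center = NotificationCenter()
let counter = CallCounter()
var manager: ObserverTokenManager? = ObserverTokenManager(center: center)
manager?.addObserver(forName: "PIUser_42") { _ in counter.value += 1 }

center.post(name: Notification.Name("PIUser_42"), object: nil)  // delivered
let countWhileAlive = counter.value

manager = nil  // deinit runs and unregisters every token
center.post(name: Notification.Name("PIUser_42"), object: nil)  // not delivered
```

In an app, the manager would be a stored property of the view controller, so setting it to nil here plays the role of the view controller being deallocated.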

Our developers did not actually have to remember to manually unregister observers. As our team grew, that gave us a lot of peace of mind because even if someone forgot to unregister something, we knew that it wouldn’t cause our app to crash.

Post iOS 9, you don’t have to manually unregister, but we do support down to iOS 7 on our team. So, we had to be aware of and be conscious of unregistering our observers.

I got this idea from the More Indirection blog. If you’re interested to learn more about this, check out their blog post. There are some great ideas in there.

Not so Complicated (17:40)

This was an in-depth look at how our observer system works. However, most developers don’t need to worry about any of that. All they are interacting with is this one method notificationManager.addObserverForUpdatedModel.


notificationManager.addObserverForUpdatedModel(user, block:{ (NSNotification) in
// Update profile view here!
})

New Model Notification (17:57)

When a new model is created or updated, we can post a notification for it, by calling NotificationManager.postModelUpdatedNotificationWithObject.


let newUser = PIUser(builder:builder)
NotificationManager.postModelUpdatedNotificationWithObject(newUser)

This method will first check the cache for any object with the same ID. This cache object will contain the most recent fields received from the server. If there are any observers, it’ll post that object out to them.
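Here is a minimal sketch of the posting side in current Swift. The notification name follows the scheme described in the Q&A (class name with the ID appended); the underscore separator, the `IdentifiableModel` protocol, the `SketchUser` class, and the `postModelUpdated` function are assumptions made for illustration. The cache lookup is left out to keep the sketch short.

```swift
import Foundation

/// Anything with a unique server-specified ID.
protocol IdentifiableModel: AnyObject {
    var identifier: String { get }
}

extension IdentifiableModel {
    /// Class name with the server ID appended.
    var updateNotificationName: Notification.Name {
        Notification.Name("\(type(of: self))_\(identifier)")
    }
}

/// Immutable stand-in for a user model.
final class SketchUser: IdentifiableModel {
    let identifier: String
    let name: String

    init(identifier: String, name: String) {
        self.identifier = identifier
        self.name = name
    }
}

/// Posts the new model in the notification's `object` field.
func postModelUpdated<Model: IdentifiableModel>(_ model: Model,
                                                center: NotificationCenter = .default) {
    center.post(name: model.updateNotificationName, object: model)
}

/// Reference box so the demo observer can store what it received.
final class ReceivedUserBox { var user: SketchUser? }

let notificationCenter = NotificationCenter()
let receivedBox = ReceivedUserBox()
let oldUser = SketchUser(identifier: "42", name: "Taylor")

// Observe by name, so any later instance with the same ID reaches this block.
let userToken = notificationCenter.addObserver(
    forName: oldUser.updateNotificationName, object: nil, queue: nil
) { notification in
    receivedBox.user = notification.object as? SketchUser
}

let newUser = SketchUser(identifier: "42", name: "Taylor Swift")
postModelUpdated(newUser, center: notificationCenter)
notificationCenter.removeObserver(userToken)
```

Because the name depends only on the type and the ID, an observer registered with the old instance still hears about the new one.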

Making UI Updates (18:28)

With our updated object, we can finally go ahead and make any UI updates accordingly.

The new model object will be passed in the object field of the NSNotification, so we can simply grab that new model object and then update our UI accordingly.


notificationManager.addObserverForUpdatedModel(user, block: { [weak self] notification in
    if let user = notification.object as? PIUser {
        self?.user = user

        // Update profile view here!
        self?.titleLabel.text = user.name
        self?.imageView.setImageWithURL(user.imageURL)
    }
})

For example, here, we are setting the title label’s text and the image view’s URL. We can also do things like check that these values actually changed before setting them on the UI elements.

Wrapping Up (18:59)

That was a look at what our data model layer looks like at Pinterest, and how we maintain data consistency. If you’re interested in immutability in general, or moving your own apps to immutability, here are some additional resources.

There’s a blog post on the Pinterest engineering blog that goes over a lot of what I just talked about in greater detail. It’s a series of blog posts on our iOS app rewrite, and there’s also one more blog post on immutable models in there.

Facebook also has some really great posts on moving to immutability and how it helped speed up parts of their apps, such as newsfeed.

Finally, LinkedIn just open sourced a data consistency solution for immutable models. If you’re interested in an open source implementation, I would definitely recommend checking this out.

(See Additional Resources slide.)

Questions and Answers (19:56)

Q: This is about the merge model. When you can request partial Pins, how do you deal with having only part of the data? Do you have a ton of optionals in those classes, or how do you do that?

A: Right now, our code base is actually in Objective-C. I’ve translated most of it for this. We have a lot of properties that just don’t have data. All of our views and view controllers are built with the idea that you might not have some of these fields.

Q: About the observation technique you are using. Do you observe the objects by the server’s ID key for each of those?

A: Yes. We use the unique server specified ID. The notification name is just the class name with the ID appended.

Q: Is that ID something that’s a contract you have with the back end engineers? Is there a contract such as, “Please don’t reuse these, please don’t reset them.”

A: Yes. Most of our backend models have a unique ID that they’re stored on.

Q: Do you write any of this in-memory cache to disk? And if so, how do you restore it? How do you handle invalidation?

A: I don’t think we’re currently writing anything to disk, although we do have the ability to. PINCache is nice in the sense that it has one interface, setObject:forKey:, and you can configure it to use a disk cache as well as an in-memory portion, so you don’t have to check whether something is in memory and then write it to disk yourself. It handles all of that for you, and if the memory crashes, or runs out of space, it will write it to disk. We do have the ability to do that, but we aren’t doing anything with it right now. I can see us doing something around it in the future with persistence and offline usage.

Q: For data observation, have you considered things like ReactiveCocoa, or RxSwift, like those solutions?

A: We have considered ReactiveCocoa and RxSwift. I think at this point our team’s three dozen people, and to train that many people up on it would probably take a longer time than we want to right now. It’s definitely something that we would consider in the future.

Q: Regarding builders. For each object you have, do you have an associated builder object for it, mapping all the properties? Or do you CodeGen it?

A: We do CodeGen it. I think there’s a blog post about this on our engineering blog. All our model classes are automatically generated, so we have a Swift script that will take a JSON schema, which is a JSON representation of what a JSON server response should look like, and then it will output Objective-C model files, which is pretty cool for us.

Q: So I just realized that your code base is in Objective-C, but the example was in Swift. Have you considered making the properties ‘var,’ but declaring the object itself, wherever you’re holding it, with ‘let’? In that manner, you can mutate a copy of the object without a builder, but not the instance you’re holding.

A: Yeah that’s a good idea. I think when we move to Swift, I’ll get in touch with you.
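One way to read the questioner’s suggestion is Swift’s value semantics for structs. The sketch below is an illustration of that idea only, with a hypothetical `PinValue` type; it is not something the talk describes Pinterest as using.

```swift
/// A value type with `var` properties. Whether an instance can be
/// mutated depends on whether its holder declared it with `let` or `var`.
struct PinValue {
    var title: String
    var board: String
}

let heldPin = PinValue(title: "The Best Pin in the World", board: "spaces")
// heldPin.title = "Meow"   // would not compile: `heldPin` is a `let` constant

var editedPin = heldPin     // assignment copies the value
editedPin.title = "Meow"    // mutates only the copy
```

The copy takes the place of the builder: the holder of `heldPin` never sees the change, and the edited value is a separate instance.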

Q: For object generation, do you use a library? You have this schema, JSON schema file. Is that an in-house library that generates it, and do you use those models only for data mapping and you don’t operate with them on the UI layer? Or do you just use them throughout the app?

A: It’s something in house that we wrote. It’s pretty easy: we look at the type of each JSON schema property and map that to a corresponding Objective-C type. Then our .m files just have initWithDictionary, initWithBuilder and initWithCoder. A lot of that is boilerplate code that you would otherwise need to write for every model, so it made sense for us to have it auto-generated. I think Facebook has something similar, called Remodel, that will auto-generate model files. If you’re interested in an open source implementation, that might be a good one to look at.

Q: Do you use those models throughout the app? On the UI layer as well, or it’s only for mapping your network data to apps data?

A: Yes, we do use them on the UI. For example, for every Pin cell in our grid, we have a setPin method on it that will configure that cell. In some places, we use view models where it makes sense, but mostly we use the models directly.


About the content

This content has been published here with the express permission of the author.

Wendy Lu

Wendy is an iOS engineer and has been working at Pinterest for the last 4 years. There, she led the launch of their commerce product on mobile and has also touched everything from the data layer to the ads product. She previously spoke at Swift Summit on Apple Pay and moderates a mobile development panel at Grace Hopper.
