Safe vs Deep Integration of Realm

by Viraj Tank

Jan 10 2017

Introduction to Realm (0:36)

I work at Sociomantic Labs Berlin as an Android developer and mobile team lead. Today’s post is about the new database system, Realm. To understand why Realm was created in the first place, look at the first database option we had on Android: SQLite, which was introduced in mid-2000. It was well tested and documented, which is why Android has provided out-of-the-box support for SQLite since 1.0.

But SQLite has three major limitations:

It is a relational database, which means relationships are difficult to set up and maintain
It has a lot of boilerplate code
It has a lot of string-based operations in Java’s object world

To solve these problems, different libraries came into the picture for Android, namely wrapper libraries and object-relational mapping (ORM) libraries. Wrapper libraries like SQLBrite and SQLDelight help with the boilerplate code. ORM libraries like ORMLite and greenDAO not only help with the boilerplate code, but also offer a solution for the object world problem. They all use SQLite as the storage engine, which means they are all still relational database systems.

Realm was written as a NoSQL, non-relational database to fix the fundamental issue of having a relational database system.

How Relationships are Maintained in Realm (2:03)

Realm uses an object store to hold the data, which means that all the way from your application to the Realm storage engine, it is all objects. Relationships are maintained in Realm using a tree-like data structure. Unlike in SQLite, relationships (one-to-one and many-to-many) are expressed as fields on the model classes, which makes them really easy to set up and maintain. Suppose you want to get an object of class B that is part of class A: we first get A, then get B from it. It is as easy as accessing a variable of a class.

Other advantages (3:04)

Performance - When Realm released their first stable version, we tried benchmarking. There is a significant improvement with init and write operations, but the read operation is where Realm really shines. Read operations directly affect user experience.

Security - In SQLite, if you want to secure your data, you either have to encrypt the data you stored or you have to use a library, which means additional code, dependencies, and configurations. In Realm, encryption is built in. Realm encrypts the entire database using the AES-256 algorithm.

Documentation - On the website, Realm provides API references, detailed documentation, and hello world examples. Once you get the basics of Realm, you can go to GitHub, where they provide examples for specific advanced use cases. The Realm team is also very active on Stack Overflow if you have any questions.

Safe Integration (3:04)

Let’s create a sample application that displays a list of GitHub users. When you click on a name, it should open the user profile. We’ll use Realm in the project, so first you define the Realm plugin in your project-level Gradle file, then apply the plugin in the application-level Gradle file.

Note that the APK size increases by 5 MB after adding Realm and its native libraries. Compared to SQLite, which adds nothing because it ships as part of Android, this is a big addition. If APK file size is a deal breaker, Realm might not be a good option for you.

For the user, however, the actual number is around 800 kB. The reason lies in Realm’s internal architecture. Realm has its own engine written in C++, known as Realm Core, which is ported to different platforms and languages such as Android, Objective-C, React Native, Swift, and Xamarin, making it a cross-platform system.

On Android, the C++ engine is exposed through the Java Native Interface (JNI), with a Java API layer on top of it. As we know, different Android devices run on different processors. Realm has to provide a cross-compiled version of the native library for each architecture, and this is why Realm adds 5 MB to the APK that we upload to the Play Store. However, when the user downloads the APK, the processor is already known, which is why the APK the user installs is smaller, at around 800 kB.


Realm.init(this);
RealmConfiguration config = new RealmConfiguration
                                  .Builder()
                                  .name("gitHubDB")
                                  .encryptionKey(key)
                                  .deleteRealmIfMigrationNeeded()
                                  .build();
Realm.setDefaultConfiguration(config);

Realm realm = Realm.getDefaultInstance();

To initialize Realm, we create an instance of RealmConfiguration, which uses the builder pattern, so it is easy to set up. Realm provides a powerful set of configuration options, such as encryption and migration policies. Though Realm is a NoSQL database, it is still schema-based. This means that if a model definition changes, we either have to provide a migration or allow Realm to delete the existing data when a migration is needed.

Realm keeps a static reference to the default configuration, so later, in any part of the application where we need a database instance, we can simply ask for the default one. With only a handful of lines of code, we have a database instance up and running.

Model setup


/* used for all (realm + json + view) */
@Getter
public class GitHubUser extends RealmObject {
  @PrimaryKey
  private String login;
  private int id;
  private String avatar_url;
}

Usually in SQLite, there is a lot of boilerplate code required just to define tables, variable names, and the types. In Realm, all we have to do is extend RealmObject and Realm takes care of the rest by creating a proxy class.

Code structure

In your app, you might use patterns like Model-View-Presenter (MVP) and clean architecture. You might also use libraries like RxJava. Let’s dive into how Realm works with those. The sample application code is structured like this: a standard MVP setup, with a DataSource (or repository) layer on top of it. We have three different sources of data: cache, database, and network, each isolated by a wrapper layer.

Let’s have a quick look at each layer before we get into Realm-specific implementation.


public class View extends Fragment {
  private Presenter mPresenter;
}

A View might look like this: a Fragment that holds a reference to a Presenter. The Presenter, in turn, holds a reference to the View and a reference to a DataSource.


public class Presenter {
  private View mView;
  private DataSource mDataSource;

  public void loadData() {
    mDataSource
      .getData()
      .subscribeOn(computationalThread()) // do the work off the main thread
      .observeOn(mainThread())            // deliver the result on the main thread
      .subscribe(mData -> mView.setData(mData));
  }
}

When we bind the view, we first call loadData(), which instructs the DataSource to get the data for the view, does the processing on a computation thread, and updates the view on the main thread. Our DataSource holds a cache, a data access object (DAO), and Retrofit for the network. DataSource first asks the cache for the data.


public class DataSource {
  private Cache mCache;
  private Dao mDao;
  private Retrofit mRetrofit;

  private Observable<List<GitHubUser>> fromCache() {
    return Observable.fromCallable(() -> mCache.getData());
  }

  private Observable<List<GitHubUser>> fromRealm() {
    return Observable.fromCallable(() -> mDao.getData());
  }

  private Observable<List<GitHubUser>> fromRetrofit() {
    // Deferred so the network call runs on the subscribing thread,
    // not on whichever thread happens to build the Observable.
    return Observable.fromCallable(() -> {
      List<GitHubUser> gitHubUserList = mRetrofit.getData();
      mDao.storeData(gitHubUserList); // keep the local database up to date
      return gitHubUserList;
    });
  }
}

If the data is not in the cache, DataSource asks the database. If it is not there either, it asks the network and updates the local storage. Our DAO wrapper holds the Realm-specific implementation.
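As a brief aside before the DAO code, the lookup order described above (cache, then database, then network, with the network result written back locally) can be sketched in plain Java without RxJava or Realm. This is only an illustration under assumptions: `LayeredSource`, its in-memory lists, and the `networkCalls` counter are hypothetical stand-ins for the article's Cache, Dao, and Retrofit, and writing the network result to both the database and the cache is just one reading of "updates the local storage".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical model of the cache -> database -> network lookup order. */
final class LayeredSource {
    private final List<String> cache = new ArrayList<>();    // stand-in for Cache
    private final List<String> database = new ArrayList<>(); // stand-in for the DAO
    private final Supplier<List<String>> network;            // stand-in for Retrofit
    int networkCalls = 0;                                     // for illustration only

    LayeredSource(Supplier<List<String>> network) {
        this.network = network;
    }

    List<String> getData() {
        if (!cache.isEmpty()) {
            return Collections.unmodifiableList(new ArrayList<>(cache));
        }
        if (!database.isEmpty()) {
            return Collections.unmodifiableList(new ArrayList<>(database));
        }
        // Nothing stored locally: ask the network and remember the answer.
        networkCalls++;
        List<String> fresh = network.get();
        database.addAll(fresh); // assumption: "local storage" covers the database...
        cache.addAll(fresh);    // ...and the in-memory cache
        return Collections.unmodifiableList(new ArrayList<>(fresh));
    }
}
```

Calling `getData()` twice on a fresh instance invokes the network supplier only once; the second call is answered from the cache.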


public class Dao {
  Realm mRealm;

  public List<GitHubUser> getData() {
    return mRealm.where(GitHubUser.class).findAll();
  }

  public void storeData(List<GitHubUser> gitHubUserList) {
    mRealm.executeTransaction(mRealm1 -> mRealm1.insertOrUpdate(gitHubUserList));
  }
}

Suppose we inject Realm using Dagger. In getData(), we see our first Realm query. Realm queries are a bit different from SQLite queries: Realm uses a fluent interface, which makes it easy to use and provides a powerful set of search and filter options. For simplicity, we call findAll(), which gives us the full list of GitHub users. In storeData(), we do our first write operation. In Realm, all write operations must happen inside a transaction.

In this transaction, we simply have to call insertOrUpdate. It will first look for the primary key match, and if it finds that, then it will update the values. If not, it will create a new one.
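The insert-or-update behaviour described here can be modelled with a map keyed by the primary key. The sketch below is plain Java, not Realm code: `UserRow` and `UpsertTable` are hypothetical names, and replacing the whole row stands in for Realm updating the values of the matching object.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical row type; login plays the role of the @PrimaryKey field. */
final class UserRow {
    final String login;
    final int id;

    UserRow(String login, int id) {
        this.login = login;
        this.id = id;
    }
}

/** Hypothetical table that mimics insert-or-update by primary key. */
final class UpsertTable {
    private final Map<String, UserRow> rowsByLogin = new LinkedHashMap<>();

    /** Overwrites the row with the same login, or adds a new row if none exists. */
    void insertOrUpdate(UserRow row) {
        rowsByLogin.put(row.login, row);
    }

    UserRow find(String login) {
        return rowsByLogin.get(login);
    }

    int size() {
        return rowsByLogin.size();
    }
}
```

Storing two rows with the same login leaves a single row holding the later values, which is why a primary key is what makes repeated network refreshes safe to write.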

When we run the application, we get an exception about Realm access from an incorrect thread. This is because Realm objects can only be accessed on the thread where they were created. If you create an instance of Realm, RealmObject, or RealmResults on one thread, whether main or computation, you can neither pass it to nor access it from a different thread.

In this sample app, the problem comes from the DAO code. Because we injected Realm using Dagger, and Dagger creates the injected instance on the main thread, accessing it on a computation thread produces this exception. We are also querying the data on a computation thread, but back in the Presenter we access that data on the main thread.
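To make the thread-confinement rule concrete, here is a minimal plain-Java sketch of an object that remembers the thread it was created on and refuses access from any other thread. `ThreadConfined` is a hypothetical class written for illustration; it imitates the behaviour described above and is not how Realm implements its check.

```java
/** Hypothetical wrapper that can only be read on the thread that created it. */
final class ThreadConfined<T> {
    private final Thread owner = Thread.currentThread();
    private final T value;

    ThreadConfined(T value) {
        this.value = value;
    }

    /** Returns the value, or fails if called from a thread other than the owner. */
    T get() {
        Thread caller = Thread.currentThread();
        if (caller != owner) {
            throw new IllegalStateException(
                "Access from incorrect thread: created on '" + owner.getName()
                + "', accessed on '" + caller.getName() + "'");
        }
        return value;
    }
}
```

An instance created on the main thread can be read there as often as needed, while the same `get()` call made from a worker thread throws. That mirrors the sample app: the instance is created on one thread by injection and used on another.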

Motivation behind this limitation

Thread safety! This is not a Realm-specific problem, but a generic software problem. If one thread is reading data while another thread tries to write, edit, or delete the same data, we have a race condition. As a solution, we could impose a lock, so that as long as one thread is writing, no other thread can read or write the same data. However, this isn’t a good limitation to have on mobile, because our read operations directly affect the user experience.

To solve this problem, Realm uses a concept called MVCC: multi-version concurrency control. The idea is similar to Git, where there is a master branch and sub-branches that must be merged back into master, the single source of truth. At any point in time, Realm maintains a version number for the database, and any thread can read data from that version. If a thread performs a write operation on objects of class A and B, Realm internally creates a copy of those objects, maintaining their links (relationships) and names, and marks the result as the next version, say version two.

Until this write operation completes, other write operations are blocked, which is why all write operations in Realm need to be part of a transaction. A write operation does not block read operations, though. When the write completes, Realm sends a change notification to all the threads that are interested in the updated data.

In Realm, when we query the data, we don’t get an in-memory copy of the data. Instead, we get pointers or references to get the values. This is the reason we don’t have to query again: when we get the change notification from Realm, all we have to do is reload the view and get the updated values.
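The versioning idea can be illustrated with a small plain-Java model. It is a deliberately simplified sketch, not Realm's implementation: `VersionedStore` and `Snapshot` are hypothetical names, the whole list is copied on each write (whereas the text above describes copying only the affected objects), and listeners are called synchronously on the writing thread rather than on their own threads.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical store where readers see immutable versions and writers are serialized. */
final class VersionedStore {
    /** An immutable view of the data at one version number. */
    static final class Snapshot {
        final long version;
        final List<String> users;

        Snapshot(long version, List<String> users) {
            this.version = version;
            this.users = Collections.unmodifiableList(users);
        }
    }

    private volatile Snapshot current = new Snapshot(1, new ArrayList<>());
    private final Object writeLock = new Object();
    private final List<Consumer<Snapshot>> listeners = new CopyOnWriteArrayList<>();

    /** Reads never wait for writers: they return the latest published version. */
    Snapshot read() {
        return current;
    }

    void addChangeListener(Consumer<Snapshot> listener) {
        listeners.add(listener);
    }

    /** Only one write transaction runs at a time; committing publishes a new version. */
    void write(Consumer<List<String>> transaction) {
        Snapshot published;
        synchronized (writeLock) {
            List<String> copy = new ArrayList<>(current.users); // work on a copy
            transaction.accept(copy);
            published = new Snapshot(current.version + 1, copy);
            current = published;
        }
        for (Consumer<Snapshot> listener : listeners) {
            listener.accept(published); // change notification
        }
    }
}
```

A reader holding the version-one snapshot keeps seeing consistent data while a write is in progress, and learns about version two through its listener instead of polling.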

Solutions to the Exception (19:03)

To solve the problem, we will not inject Realm, and we will not pass RealmResults across threads. Instead, we open Realm and query the data on the computation thread, make an in-memory copy of the results, and pass that copy to the Presenter. We only need to modify the DAO wrapper layer.


public class Dao {
  public void storeData(List<GitHubUser> gitHubUserList) {
    Realm mRealm = Realm.getDefaultInstance();
    try {
      mRealm.executeTransaction(mRealm1 -> mRealm1.insertOrUpdate(gitHubUserList));
    } finally {
      mRealm.close(); // always release the instance, even if the write fails
    }
  }

  public List<GitHubUser> getData() {
    Realm mRealm = Realm.getDefaultInstance();
    try {
      // copyFromRealm returns in-memory objects that are detached from Realm
      return mRealm.copyFromRealm(mRealm.where(GitHubUser.class).findAll());
    } finally {
      mRealm.close();
    }
  }
}

We are no longer injecting Realm; instead, we open a new instance every time we need one. In storeData(), we open the instance, perform the write operation, then immediately close the instance to avoid memory leaks. Similarly, in getData(), we open the instance and perform the query, but instead of returning the results as-is, we make an in-memory copy of the data. Once we do that, it is completely safe to pass the copied results across threads without any exception.
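The open, copy, close pattern can be shown end to end in plain Java. The following is a hedged sketch rather than Realm code: `FakeStore`, `SafeDao`, and the `OPEN_HANDLES` counter are hypothetical, and a single-thread executor stands in for the computation thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical database handle that must be closed after use. */
final class FakeStore implements AutoCloseable {
    static final AtomicInteger OPEN_HANDLES = new AtomicInteger();
    private final List<String> liveRows =
        new ArrayList<>(Arrays.asList("octocat", "torvalds"));

    static FakeStore open() {
        OPEN_HANDLES.incrementAndGet();
        return new FakeStore();
    }

    /** Returns a detached copy that does not depend on this handle staying open. */
    List<String> copyOut() {
        return new ArrayList<>(liveRows);
    }

    @Override
    public void close() {
        OPEN_HANDLES.decrementAndGet();
    }
}

final class SafeDao {
    /** Opens a handle, copies the data, and closes the handle on the calling thread. */
    static List<String> getData() {
        try (FakeStore store = FakeStore.open()) {
            return store.copyOut();
        }
    }

    /** Runs getData() on a worker thread and hands the detached copy to the caller. */
    static List<String> loadOnWorker() throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<List<String>> task = SafeDao::getData;
            return worker.submit(task).get();
        } finally {
            worker.shutdown();
        }
    }
}
```

The caller of `loadOnWorker()` only ever touches the copied list, and no handle stays open after the call returns. The trade-off is the cost of copying and the loss of automatic updates.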

Deep Integration (21:26)

Suppose there’s a new requirement that allows a user to edit their ID and name on the user profile page. Once they do that, we have to update the values on our user list. There are two available options to accomplish this:

EventBus: Once the user updates the values, we send an event notification and update the view
RxJava: We can make our models reactive, so any further modification gets propagated to the view

Both solutions, however, require additional boilerplate code.

With Realm, in the first screen of the app, we do a read query and wait for change notifications. In the second screen, we do a write operation, and once we do that, Realm sends a change notification to the first screen so we can update the view, without any additional code. To achieve this, we need the auto-update and zero-copy features of Realm, which means we have to use deep integration.

In this code structure, there is no cache. The Presenter reads directly from Realm on the main thread, and Realm notifies the Presenter of any data update. When the view binds for the first time, the Presenter asks Realm for data. If data is found, it updates the view; if not, the view waits for a later update. The Presenter then asks DataSource to check the data status. DataSource first checks with Realm. If the data is found, the View is already loaded with it, and everything is fine. If not, DataSource gets the data from the network layer and updates the values in Realm. Realm then sends a change notification to the Presenter. To achieve this, we have to modify our Presenter layer.


public class Presenter {
  private View mView;
  private Realm mRealm;
  private DataSource mDataSource;

  public void bind() {
    mRealm = Realm.getDefaultInstance();

    mRealm.where(GitHubUser.class)
          .findAllAsync()                  // query without blocking the main thread
          .asObservable()                  // emits again whenever the results change
          .filter(RealmResults::isLoaded)  // skip emissions from before the query completes
          .filter(RealmResults::isValid)
          .filter(realmResults -> realmResults.size() > 0)
          .subscribe(gitHubUsers -> mView.setData(gitHubUsers));
  }

  public void loadData() {
    mDataSource
      .getData()
      .subscribeOn(getComputationalThread())
      .observeOn(getMainThread())
      .subscribe(statusValue -> {
        // React to the status here, e.g. surface a network or data error.
      });
  }

  public void unSubscribe() {
    mRealm.removeAllChangeListeners();
    mRealm.close();
  }
}

If we look at the updated Presenter code, on top of the View and DataSource references, we also have a Realm instance. When the view binds for the first time, we get a new instance of Realm. Instead of using findAll() to perform the query, we use findAllAsync() because we don’t want to block the main thread. Realm also provides out-of-the-box RxJava support, which means that, as with Retrofit, we can get the query results as an observable. Realm creates an auto-updating observable: every time there’s an update to the queried data, we get an onNext call with the updated data.

Because we are using the asynchronous method, the results we get when control first returns to us are not yet loaded, so they contain no valid data. This is why we need to apply the isLoaded and isValid filters. After that, we can apply our regular filter to check the size of the data and then update the view. We still have to ask DataSource to update the Realm data if it is not found. Instead of getting the data back, we get a statusValue in case there’s an error with the network or the data.
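The effect of the three filters can be illustrated without RxJava or Realm. In this sketch, `Emission` is a hypothetical stand-in for one delivery of the query results, with `loaded` and `valid` flags playing the roles of isLoaded and isValid. The exact sequence of emissions Realm produces is not modelled here, only the filtering logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for one emission of an asynchronous query's results. */
final class Emission {
    final boolean loaded;    // has the background query finished?
    final boolean valid;     // are the results still usable?
    final List<String> rows;

    Emission(boolean loaded, boolean valid, String... rows) {
        this.loaded = loaded;
        this.valid = valid;
        this.rows = Arrays.asList(rows);
    }
}

final class EmissionFilter {
    /** Keeps only emissions that are loaded, valid, and non-empty. */
    static List<Emission> deliverable(List<Emission> emissions) {
        List<Emission> kept = new ArrayList<>();
        for (Emission emission : emissions) {
            if (emission.loaded && emission.valid && !emission.rows.isEmpty()) {
                kept.add(emission);
            }
        }
        return kept;
    }
}
```

Given a not-yet-loaded emission, a loaded but empty one, and a loaded one containing a user, only the last reaches the view. The view is therefore never updated with placeholder or empty results.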

At the end, it is really important to do some cleanup for Realm. We remove all the ChangeListeners and close the Realm. It is very important to keep the Realm instance open as long as the view is visible. Otherwise, the queried data is no longer valid.

Summary (25:46)

How do you decide whether to use safe or deep integration?

Safe integration is easy. All Realm operations are performed on a computation thread, which is easy to test and debug. On the other hand, with deep integration, we get auto-update and zero-copy features.

Choose the integration approach that fits your requirements. At Sociomantic Labs, we use safe integration at the moment, but if the requirements change, we might move to deep integration.

Regardless of safe or deep integration, always perform your write, edit, and delete operations on a computation thread by opening a new Realm instance and closing it right after. If you are using safe integration, perform your read operation on a computation thread, but make sure you make an in-memory copy of the queried data.

About the content

This content has been published here with the express permission of the author.

Viraj Tank

Viraj works at Sociomantic Labs Berlin as an Android developer and mobile team lead. He has been working on Android since 2008. When not working on Android, he likes playing Age of Mythology.
