Extending Node.js Using C++

by Kenneth Geisshirt

Feb 1 2016

With Node.js, server-side JavaScript has become highly popular. The ecosystem around Node.js is rich, and you can find extensions for anything. Most extensions are written in JavaScript, but Node.js builds on the V8 engine, which is written in C++. This talk, by Kenneth Geisshirt at Øredev 2015, dives into the deep sea of developing extensions in C++. He’ll even tell you why you would want to write extensions in C++.

Realm is a (primarily) mobile database. We have a storage engine written in C++. My role in Realm is to bridge this C++ engine to other languages, including Node.js. This talk touches on:

The basics of V8 internals and API
How to wrap C++ classes, thus enabling you to…
Write your own extensions

Why Extensions in C++? (3:08)

JavaScript and C++ are very different languages. Although both are object oriented, JavaScript has no classes. Moreover, JavaScript is a dynamically typed language, while C++ is statically typed.

Node.js is about JavaScript; if you are a JavaScript developer, you have probably never tried C++. By writing extensions in C++, you get access to system resources: system calls, I/O devices, GPUs. You may want to use a GPU to offload computations from your CPU, or you may simply want the performance of C++, which is fast because it is a compiled language. At Realm, three quarters of our code base is written in C++: it is our way of sharing code among platforms and languages. Also, there is legacy code (often written in C, C++, or even Fortran) that you may want to use.

Demo C++ Classes (4:48)

I have two C++ classes. Person has firstname(), lastname() and birthday() methods, plus a method that turns a person into a string for printing. Book is where you store all your persons, as in an address book application; its methods let you add or look up a person, get a person with the array operator, remove a person, or check the number of persons. firstname(), lastname() and birthday() are getters and setters: you can, for example, set and get the first name of a person. These are very simple classes: if you want to add a person to a book, you pass a Person object, and a lookup can return a Person object (which is tricky to do in Node).
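The two classes described above could look roughly like the following. This is a minimal sketch, not the code from the talk: the class and method names follow the description, while the signatures, the `to_str()` name, storing the birthday as a string, and the use of `std::vector` are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical reconstruction of the demo classes (signatures are assumed).
class Person {
public:
    // Getters
    const std::string& firstname() const { return m_firstname; }
    const std::string& lastname() const { return m_lastname; }
    const std::string& birthday() const { return m_birthday; }

    // Setters (overloads that share the getter's name)
    void firstname(const std::string& v) { m_firstname = v; }
    void lastname(const std::string& v) { m_lastname = v; }
    void birthday(const std::string& v) { m_birthday = v; }

    // String form used when printing a person
    std::string to_str() const {
        std::ostringstream out;
        out << m_firstname << " " << m_lastname << " (" << m_birthday << ")";
        return out.str();
    }

private:
    std::string m_firstname, m_lastname, m_birthday;
};

class Book {
public:
    void add(const Person& p) { m_persons.push_back(p); }

    // Returns a pointer to the first match, or nullptr if nobody matches.
    Person* lookup(const std::string& firstname) {
        for (Person& p : m_persons) {
            if (p.firstname() == firstname) return &p;
        }
        return nullptr;
    }

    // Array operator with bounds checking (throws std::out_of_range).
    Person& operator[](std::size_t index) { return m_persons.at(index); }

    void remove(std::size_t index) {
        if (index >= m_persons.size()) throw std::out_of_range("no such person");
        m_persons.erase(m_persons.begin() + static_cast<std::ptrdiff_t>(index));
    }

    std::size_t size() const { return m_persons.size(); }

private:
    std::vector<Person> m_persons;
};
```

Note that `lookup` and `operator[]` hand out a Person that lives inside the Book; this is exactly the kind of cross-class return value that needs extra care once the classes are wrapped for JavaScript.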

V8 Concepts (6:35)

Node.js is built upon V8. An isolate is an isolated instance of V8; an object in one isolate cannot be moved to another one. Handles are references to JavaScript objects. The garbage collector reclaims the memory when the object, or the handle to it, is no longer in use. Local handles are allocated on the stack; their lifetime is scope-based. Persistent handles can live through more than one function call and survive a change of scope.

Functions, in most programming languages, can return a value; but in V8 you do not return an object: you set the return value using GetReturnValue().Set(). You cannot simply return an object through a local handle: a local handle belongs to its scope and is stack allocated, so the object could be reclaimed by the garbage collector when you return from the function.

In JavaScript you have a number of different “classes” or object types (String, Number, Array, Object, …). They are represented as C++ classes in the V8 API. The object is the most generic. A number can be an integer or a double.

Breaking Changes: 0.10 → 0.12 (10:10)

In February 2015, Node.js 0.12 was released, and with it came an upgraded V8 API. I had written an extension using Node 0.10; after the changes, we could no longer compile it. This talk is about 0.12+. The version numbering scheme of Node has since changed (it went from 0.12 at the beginning of the year to 5.0 last week).

Returning values from C++ to JavaScript is now different. The type names for the arguments have also changed. Isolate is new. In the old API, when you create strings you have to specify the encoding (UTF-8 is common). An extension written directly against the V8 API cannot support both 0.10 and 0.12+ (there is one attempt to alleviate that, but it is not easy: check this post).

Building Extensions (12:51)

If you want to build an extension, you have to write a number of wrapper classes (e.g. if I have a person.cpp class, I make a person_wrap.cpp file). Then, you write a binding.gyp file describing the build process: the target name (funstuff), the source code files (the classes I want to wrap and the wrapper classes), and any OS X specific settings. funstuff.cpp sets up the extension: its InitAll function calls two functions called Init, one per wrapper class, and each Init sets up its wrapper class, initializes it, and adds it to the V8 engine. There is a macro called NODE_MODULE, which sets it all up. You then run node-gyp configure, followed by node-gyp build. Next, you want to wrap a class.

Wrapping a Class (16:03)

This is the header file of BookWrap (see video for code). BookWrap inherits from node::ObjectWrap. You have to have an Init function, and a New function that creates a new object. We have to keep a reference to the object we are going to wrap (Book* m_book). There is also one of the persistent handles, holding a function: static v8::Persistent<v8::Function> Constructor. Functions in JavaScript are just objects. We have to set up this constructor; it is used when calling new on this class.

Adding a Class to V8 (17:39)

We have Init(). It creates a function template so that new will call BookWrap::New, constructing a new object. NODE_SET_PROTOTYPE_METHOD adds a method. Then, we set up the constructor. If we want indexed properties in JavaScript, we have to implement getters and setters, deleters and enumerators (which are methods in my C++ class). When I require “funstuff”, it will call this method.

Instantiate an Object (19:45)

If we want to allocate a new object of this type (Book), V8 calls the New method: it creates the wrapper object and the wrapped object, and adds them to the V8 runtime. I have to call GetReturnValue() and set the return value. I ask IsConstructCall(): in JavaScript you can call a constructor without new (as a plain function call). If I want to support that, I have to add an else branch (not included in this example), so I can only do new Book() with this module. I should probably either implement it or throw an exception saying it is not allowed. We now have a new object in the V8 runtime, and the garbage collector can take care of it. I am also asking for args.Length(); if my constructor needed to take arguments, this is where I could handle them.

Methods (22:12)

Methods are implemented in C++. However, in JavaScript you do not have methods; you have properties, which are functions.

This is the length method: I have book.length() (see video for code). I get the current scope by getting the isolate, and setting up the scope for that isolate. The function call carries a reference to the object it was called on. I unwrap it (in 95% of all your methods in a Node.js extension, this is one of the first lines you write), and I call the size method (e.g. to get the number of persons in that address book). I then create a JavaScript integer (an object within the V8 engine) with the value count, and I set the return value. I did not do argument checking. If I call length() with 200 arguments, it will still work because there is no check here.

Now consider the lookup method: I have a book, I look up a person by first name, and I return that person object. For that, I have to be able to take an object of one class and return an object of another type.

Instantiate Objects (25:35)

In my PersonWrap class I want to create a new PersonWrap object from C++, for use by the book only. I set up a call to the constructor (first lines; see video). Then I add the person, and return it. I use an EscapableHandleScope; “escapable” means that you can take your local object and move it outside the scope. Instead of just returning it, I return it through Escape (which moves it from one scope to another) to tell the garbage collector that the scope has changed. The documentation says that you should not call Escape twice on an object (although it does not say why or what will happen). This is a call to the constructor: instead of calling the constructor from JavaScript, I am now calling it from C++. It is like calling new Person, but in C++ code.

Indexed Getters and Setters (27:54)

We have book[4]. If you want to implement that, you will need indexed getters and setters. The index is a UInt32, which means that it cannot be negative (it is unsigned). In some programming languages (Python, for example), a negative index counts from the back of the array instead of from the front. You are not allowed to do that in Node.js. Also, since it is a 32-bit integer, your arrays cannot be larger than about four billion elements. I have to validate the input: if I access an element that is not there, I will get a C++ exception, which is not a good idea because the user will have no clue what went wrong (it will probably just say segmentation fault). So I validate my input, and throw a JavaScript exception if it is invalid. Then I set the return value, and return my object. Setters are similar; there is an extra parameter with the value you want to set. You can also use the delete operator in JavaScript: you can say delete, then an indexed element. Enumerators are easy; they just generate a list of all the indexes that are allowed.
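The validation step itself does not depend on V8, so it can be sketched in plain C++. In the sketch below, `GetResult` and `checked_get` are hypothetical names, and a vector of strings stands in for the book; in a real indexed getter, the error branch is where the wrapper would throw a JavaScript exception rather than touch the container.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical result type: either a value, or an error message that the
// wrapper would turn into a JavaScript exception.
struct GetResult {
    bool ok;
    std::string value;  // meaningful only when ok is true
    std::string error;  // meaningful only when ok is false
};

// Validates an unsigned 32-bit index before touching the container.
GetResult checked_get(const std::vector<std::string>& names, std::uint32_t index) {
    if (index >= names.size()) {
        // Never index past the end: report the problem to the caller instead.
        return {false, "", "Index out of range"};
    }
    return {true, names[index], ""};
}
```

Because the index is unsigned, a single comparison against the size covers every invalid case.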

Accessors (30:48)

Accessors are useful for known properties. I tried to set up named property handlers, but they do not work anymore. Very often in C++ you have a limited number of getters and setters (only the ones in your C++ class), and you only want to wrap those. The accessors are simple to use. You set them up with SetAccessor, which you call in your Init, where you set up your class.

Callbacks (32:28)

Since functions in JavaScript are objects, a callback simply arrives as an argument (here, argument zero); its type is a function object. You have to set up a call, and then call the function object. If the anonymous function returns something, you can also get the return value out of it. If you want to handle exceptions and return values at the same time, it can be tricky: you have to remember that functions (even anonymous functions) in JavaScript are just objects, and V8 has a class called Function for representing them.

Throwing Exceptions (34:07)

Throwing a C++ exception to your JavaScript user is a no-go (they will not understand it). If you want to throw an exception, the isolate has a method called ThrowException. It only sets the state of the V8 engine to “exception”; when you return from your C++ code and V8 starts executing JavaScript again, it becomes a JavaScript exception.
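The idea of never letting a C++ exception cross the language boundary can be modeled without V8. The following is only a sketch: `EngineState` is a hypothetical stand-in for the engine's pending-exception state, and `call_native` is an invented helper; a real extension would call ThrowException on the isolate where this sketch sets the flag.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the engine's "pending exception" state.
struct EngineState {
    bool has_exception = false;
    std::string message;
};

// Runs native code and converts any C++ exception into engine state,
// so that nothing propagates across the language boundary.
void call_native(EngineState& state, const std::function<void()>& body) {
    try {
        body();
    } catch (const std::exception& e) {
        // e.g. a std::out_of_range from a bounds-checked lookup
        state.has_exception = true;
        state.message = e.what();
    } catch (...) {
        state.has_exception = true;
        state.message = "unknown native error";
    }
}
```

The caller inspects the state after the native call returns, which mirrors how the exception only materializes once control is back in the engine.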

Catching Exceptions (36:10)

Sometimes you have an anonymous function, or a block, where you want to allow the user to throw a JavaScript exception, and you want to catch it in C++ and then process it. This is very useful if you have a transaction manager: for example, the user throws an exception inside a transaction, and you want to roll back. For this you have TryCatch. After a function call, you can ask whether an exception was thrown: HasCaught(). You can also rethrow that JavaScript exception from C++, back to V8. This is how you can write a Node.js extension. You do not need more than this to wrap C++ classes.
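The roll-back-and-rethrow pattern can be illustrated in plain C++, with an ordinary C++ exception standing in for the JavaScript one. This is a sketch under assumptions: `Transaction` and `run_in_transaction` are hypothetical names, and the snapshot-based rollback is chosen only to keep the example short. In an extension, the try/catch would be replaced by a TryCatch around the callback call and a HasCaught() check.

```cpp
#include <functional>
#include <vector>

// Hypothetical transaction that snapshots the data so it can be restored.
class Transaction {
public:
    explicit Transaction(std::vector<int>& data) : m_data(data), m_backup(data) {}
    void rollback() { m_data = m_backup; }

private:
    std::vector<int>& m_data;
    std::vector<int> m_backup;
};

// Runs a user callback inside a transaction. If the callback throws,
// the changes are rolled back and the exception is passed on to the caller.
void run_in_transaction(std::vector<int>& data,
                        const std::function<void(std::vector<int>&)>& callback) {
    Transaction tx(data);
    try {
        callback(data);
    } catch (...) {
        tx.rollback();
        throw;  // rethrow, as the wrapper would rethrow to the engine
    }
}
```

The caller still sees the original exception, but the data is back in the state it had before the callback ran.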

NAN (37:43)

If you want to bridge between Node.js 0.10 and 0.12+, there is Native Abstractions for Node.js, called NAN, which provides a set of macros. With them you can use the same source code for both versions.

Observations (38:26)

When you have C++ classes and you want to wrap them and use them for your extension, you do not have to wrap them one-to-one (I did that for this presentation). Moreover, since JavaScript is not a strongly typed language, you have to do input validation. If you expect something to be a string, check it manually before using it. Unit testing is essential. Also, C++ and JavaScript are both object-oriented languages, but one has classes and the other is classless: it might be awkward for JavaScript programmers to move into C++ land. Similarly, calls that cross the language barrier are hard for debuggers to follow.

Learn More (41:10)

Check out my demo. There is also Google documentation about V8, JavaScript: The Good Parts, or any modern C++ textbook.

About the content

This talk was delivered live in November 2015 at Øredev. The video was transcribed by Realm and is published here with the permission of the conference organizers.

Kenneth Geisshirt

Kenneth holds a Ph.D. in chemistry (and a B.Sc. in computer science), and in the 1990s he primarily worked on simulating chemical reactions on supercomputers. After graduating, he has been working as a software developer focusing on open-source software. Currently, he is working for Realm where he is part of the Android team. In his spare time, he has been speaking at meetups, conferences, and user groups and writing articles and books on topics related to software development and open source software.
