Would you like to make your own Echo with a Raspberry Pi and Alexa, or an Android app that recognizes your emotions? Interested in making Android apps with intelligence, such as speech recognition, face detection, natural language processing, prediction and search? You absolutely can! There are open source machine learning platforms and services now available to app developers.
In this session from 360|AnDev, Margaret Maynard-Reid provides an overview of how you can integrate these advanced machine learning algorithms with your Android apps, without any deep expertise in ML. She walks through several examples to help you get started making Android apps with human intelligence.
My name is Margaret Maynard-Reid. I’m an Android developer, and currently, I’m a consultant on the Microsoft Research Cognitive Services team. I’m also an organizer and Women Techmakers lead for GDG Seattle. In this talk, I’ll share with you my personal learnings on how to make Android apps with intelligence.
Machine Learning is Everywhere (0:50)
We hear about machine learning all the time these days. From personal assistants like Google Assistant, Amazon Alexa, Microsoft Cortana, and Apple Siri, to self-driving cars and robots, they all have machine learning in them.
As developers, we know how to write programs and rules, but sometimes the rules are hard to write by hand. This is where machine learning comes in. It’s a subset of artificial intelligence: the study of algorithms that can learn from examples and experience in data.
A typical process is to collect training data, train a model, and then use the model to predict.
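To make the collect, train, predict cycle concrete, here is a minimal sketch in plain Java. It uses a toy nearest-centroid classifier, which is my own choice for illustration and not something from the talk; the class name, the feature values, and the labels are all made up.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy nearest-centroid classifier: train on labeled points, then predict. */
final class NearestCentroid {
    private final Map<String, double[]> centroids = new HashMap<>();

    /** Training step: average the feature vectors of each label into one centroid. */
    void train(double[][] features, String[] labels) {
        Map<String, double[]> sums = new HashMap<>();
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < features.length; i++) {
            double[] sum = sums.computeIfAbsent(labels[i], k -> new double[features[0].length]);
            for (int j = 0; j < sum.length; j++) {
                sum[j] += features[i][j];
            }
            counts.merge(labels[i], 1, Integer::sum);
        }
        centroids.clear();
        for (Map.Entry<String, double[]> e : sums.entrySet()) {
            double[] mean = e.getValue().clone();
            int n = counts.get(e.getKey());
            for (int j = 0; j < mean.length; j++) {
                mean[j] /= n;
            }
            centroids.put(e.getKey(), mean);
        }
    }

    /** Prediction step: return the label whose centroid is closest to x. */
    String predict(double[] x) {
        String best = null;
        double bestDistance = Double.MAX_VALUE;
        for (Map.Entry<String, double[]> e : centroids.entrySet()) {
            double distance = 0;
            for (int j = 0; j < x.length; j++) {
                double diff = x[j] - e.getValue()[j];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Step 1: collect training data (made-up two-number features).
        double[][] features = {{1.0, 1.2}, {0.8, 1.0}, {5.0, 5.5}, {6.0, 5.0}};
        String[] labels = {"walking", "walking", "running", "running"};
        // Step 2: train a model.
        NearestCentroid model = new NearestCentroid();
        model.train(features, labels);
        // Step 3: use the model to predict on new data.
        System.out.println(model.predict(new double[] {0.9, 1.1}));
    }
}
```

Real models are far more sophisticated, but the three steps keep the same shape.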
Awareness API (3:20)
Google Play Services 9.2.1 was released in June, and the Awareness API, which was announced at Google I/O, was also released. Activity recognition, geofencing, and access to web services are not new, but the Awareness API packs these and others, for a total of seven signals, into one API.
Using the API, you can detect the time, the location, the nearby places, whether the user is running, walking, or biking, the beacons around you, whether your headphones are plugged in, and the weather and temperature.
In the past, if you’re familiar with geofencing, you could define a fence only with your geolocation. With this new Fence API, you can define a fence on any of those seven signals. You define the condition, and you can use that condition to trigger a push notification or any other action that you define in your program.
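The idea of a fence as a condition over several signals can be sketched in plain Java. This is not the Awareness API itself: `Snapshot`, `activityIs`, `hourBetween`, and `check` are hypothetical names invented for illustration, and the real API delivers fence state changes through callbacks rather than a direct check.

```java
import java.util.function.Predicate;

final class FenceSketch {
    /** Hypothetical snapshot of a few of the signals described above. */
    static final class Snapshot {
        final boolean headphonesPluggedIn;
        final String activity; // e.g. "WALKING" or "RUNNING"
        final int hourOfDay;   // 0 to 23

        Snapshot(boolean headphonesPluggedIn, String activity, int hourOfDay) {
            this.headphonesPluggedIn = headphonesPluggedIn;
            this.activity = activity;
            this.hourOfDay = hourOfDay;
        }
    }

    /** One condition per signal; conditions compose with and(), or(), negate(). */
    static final Predicate<Snapshot> HEADPHONES = s -> s.headphonesPluggedIn;

    static Predicate<Snapshot> activityIs(String activity) {
        return s -> activity.equals(s.activity);
    }

    static Predicate<Snapshot> hourBetween(int startInclusive, int endExclusive) {
        return s -> s.hourOfDay >= startInclusive && s.hourOfDay < endExclusive;
    }

    /** Runs the action only when the fence condition holds; returns whether it fired. */
    static boolean check(Predicate<Snapshot> fence, Snapshot now, Runnable action) {
        if (fence.test(now)) {
            action.run();
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Fence: headphones plugged in AND running AND in the morning.
        Predicate<Snapshot> morningRun =
                HEADPHONES.and(activityIs("RUNNING")).and(hourBetween(5, 10));
        Snapshot now = new Snapshot(true, "RUNNING", 7);
        check(morningRun, now, () -> System.out.println("Suggest the workout playlist"));
    }
}
```

Composing small conditions this way mirrors how a fence combines signals: each signal contributes one test, and the action is attached to the combined condition.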
To get started with the Awareness API, all you have to do is add Google Play Services to your build.gradle. In this case, it’s the contextmanager service. Then you go to the Google dev console, enable the Awareness API, and get the API key. To get the API key, you need your SHA-1 key as well as your package name. Then you add the API key to your manifest and you’re ready to go.
Additionally, you might need to add some permissions in your manifest. For example, if you’re working with camera or location, you need to put that permission in there. If you’re trying to get information on the places near you as well as the beacons around you, you also need to enable the two additional APIs, but you don’t need to get one more key. You can use the same key, but you need to enable those two APIs.
Mobile Vision API (6:57)
The Mobile Vision API was released with Google Play Services 7.8 last year. It’s not brand new, but in the 9.2.1 release there are a lot of feature enhancements.
It’s very easy to get started. It can detect faces, bar codes, and text. The Face API itself detects a face but does not do recognition: it will find the face inside the image, but it will not tell you whether the face belongs to person A or person B. The face can be detected from different angles, and the API can detect whether your eyes are open or closed, whether you’re blinking or smiling, and the landmarks on your face (like eyes, nose, cheeks, etc).
This is typical as you investigate other vision or face APIs. You will notice that more or less, all the APIs will detect the landmarks on the face. Here’s the example.
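Add the vision service dependency to the app’s build.gradle.
Create a detector:
// context is your Activity or application Context
FaceDetector faceDetector = new FaceDetector.Builder(context).build();
Make sure detector is operational first:
if (!faceDetector.isOperational()) {
    // let user know face detector isn’t working
    // log error etc...
}
Detect the face:
Frame frame = new Frame.Builder().setBitmap(myBitmap).build();
SparseArray<Face> faces = faceDetector.detect(frame);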
To use the face detector, all you have to do is add the vision service to your build.gradle. Then you create a faceDetector object. You also need to make sure this detector is operational before you start using it. Even though you added Google Play Services in your build.gradle, depending on your device, this detector may or may not be working.
Then, the detector will detect the faces. You get a list of faces, and for each face you use the face information to draw a rectangle on the image, along with additional information such as whether the eyes are open.
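The per-face step can be sketched without any Android classes. In the sketch below, `DetectedFace` is a hypothetical plain-Java stand-in for the information a detected face carries (top-left position, size, and an eye-open probability), and the 0.5 threshold is an assumption, not a value from the talk.

```java
import java.util.Arrays;

final class FaceBox {
    /** Hypothetical stand-in for one detected face. */
    static final class DetectedFace {
        final float left;
        final float top;
        final float width;
        final float height;
        final float leftEyeOpenProbability; // 0 to 1, or negative if not computed

        DetectedFace(float left, float top, float width, float height,
                     float leftEyeOpenProbability) {
            this.left = left;
            this.top = top;
            this.width = width;
            this.height = height;
            this.leftEyeOpenProbability = leftEyeOpenProbability;
        }
    }

    /** Returns {left, top, right, bottom}, the corners needed to draw the rectangle. */
    static float[] boundingBox(DetectedFace face) {
        return new float[] {
            face.left, face.top, face.left + face.width, face.top + face.height
        };
    }

    /** Treats the eye as open when the probability reaches the given threshold. */
    static boolean isLeftEyeOpen(DetectedFace face, float threshold) {
        return face.leftEyeOpenProbability >= threshold;
    }

    public static void main(String[] args) {
        DetectedFace[] faces = {
            new DetectedFace(10f, 20f, 100f, 120f, 0.9f),
            new DetectedFace(200f, 40f, 80f, 90f, -1f)
        };
        for (DetectedFace face : faces) {
            // In an app, these four numbers would go to the drawing call on the canvas.
            System.out.println(Arrays.toString(boundingBox(face))
                    + " leftEyeOpen=" + isLeftEyeOpen(face, 0.5f));
        }
    }
}
```

A negative probability falls below any sensible threshold, so a face whose eye state was not computed is reported as not open rather than guessed.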
Check out the video at the above timestamp for a face detector demo.
Machine Learning Services via REST APIs (11:36)
These machine learning services have pre-trained models. Companies train these models and then offer them as services to us Android developers (although not for free). To use these models, we don’t need any machine learning knowledge; it’s just a REST API call.
These are the companies that I know of that will provide these services: the Google Cloud Machine Learning APIs, the Microsoft Cognitive Services, IBM Watson, and HP Haven OnDemand. There are probably other companies that provide these services, but these are the ones I know of.
How do we get started? Most of these companies provide free trials. Some are based on a period of time (I think IBM Watson gives you a month free), and others give you a certain number of free transactions per month, or maybe per minute. A typical process is that you go to their console or platform, sign up for an account, get an API key, include the key in the Android app, and then download the SDK or include the dependency in the build.gradle file if there’s an SDK available. If not, you make the REST API call, so you have to write the code to handle the REST API call yourself.
For the REST API call, we typically send an image stream, audio, video, or text as a string via an HTTP call to some REST API endpoint, whether it’s vision, speech, language, or text, and then you get back a response which you can use in your Android app.
One example might be sending an image with a face. I send it over to a vision API, and the result comes back with the four corners of the detected face, which you can use to draw a bounding box on the image. The result might also include an emotion like, “Oh, I’m happy” or “I’m sad,” or the gender and the age; all of those responses come back, and then you can use them in your Android app.
When you make those calls, you need to consider doing things asynchronously, especially if you make several calls to different APIs. With Google Play Services, by contrast, the vision API gives you the detector: you can use the face detector, the text recognizer, or the bar code scanner. Here, you’re writing your own REST API call, so you have to handle those details yourself.
See the above timestamp for a demo of some REST API calls in action.
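As a rough sketch of what writing the call yourself involves, the code below prepares an HTTP POST with `HttpURLConnection` and sends raw image bytes. The endpoint URL and the `X-Api-Key` header name are placeholders I made up; each provider documents its own URL, authentication header, and content type.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class VisionRequest {
    // Hypothetical endpoint and header name; use the values from your provider's docs.
    static final String ENDPOINT = "https://api.example.com/vision/detect";
    static final String KEY_HEADER = "X-Api-Key";

    /** Configures a POST request for raw image bytes. No network I/O happens here. */
    static HttpURLConnection prepare(String endpoint, String apiKey) {
        try {
            HttpURLConnection connection =
                    (HttpURLConnection) URI.create(endpoint).toURL().openConnection();
            connection.setRequestMethod("POST");
            connection.setDoOutput(true); // we intend to write a request body
            connection.setRequestProperty("Content-Type", "application/octet-stream");
            connection.setRequestProperty(KEY_HEADER, apiKey);
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sends the image and returns the HTTP status code. Call this off the main thread. */
    static int send(HttpURLConnection connection, byte[] imageBytes) throws IOException {
        try (OutputStream out = connection.getOutputStream()) {
            out.write(imageBytes);
        }
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) {
        // Only prepares the request; sending needs a real endpoint and key.
        HttpURLConnection connection = prepare(ENDPOINT, "demo-key");
        System.out.println(connection.getRequestMethod() + " " + connection.getURL());
    }
}
```

Splitting preparation from sending keeps the network work in one method, which is the part that has to run on a background thread in an Android app.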
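One way to handle several calls asynchronously is sketched below with `CompletableFuture`, which is my choice for illustration (on Android it requires API level 24 or later; other concurrency tools work too). `detectFace` and `detectEmotion` are hypothetical stand-ins that return canned strings instead of doing network I/O.

```java
import java.util.concurrent.CompletableFuture;

final class AsyncCalls {
    // Stand-ins for two blocking REST calls; real ones would do network I/O.
    static String detectFace(byte[] image) {
        return "face:1";
    }

    static String detectEmotion(byte[] image) {
        return "emotion:happy";
    }

    /** Starts both calls in the background and combines their results when both finish. */
    static CompletableFuture<String> analyze(byte[] image) {
        CompletableFuture<String> face =
                CompletableFuture.supplyAsync(() -> detectFace(image));
        CompletableFuture<String> emotion =
                CompletableFuture.supplyAsync(() -> detectEmotion(image));
        return face.thenCombine(emotion, (f, e) -> f + ", " + e);
    }

    public static void main(String[] args) {
        // join() blocks until both calls are done; in an app, attach a callback instead.
        System.out.println(analyze(new byte[0]).join());
    }
}
```

Because the two calls do not depend on each other, running them concurrently means the total wait is roughly the slower of the two rather than their sum.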
Build Your Own Machine Learning Models (20:15)
To use Google Play Services or make a REST API call, you just have to be an Android developer. However, to build your own machine learning models, you do need to have machine learning knowledge.
The Google Cloud Machine Learning Platform allows you to build and train models. The Amazon Machine Learning Service is similar: it holds your hand and helps you build and train a model, input the training data, and so on. Then you can use the model in your app.
Beyond Machine Learning Services (23:21)
I have to bring up Amazon Alexa. I built an Alexa device with a Raspberry Pi, but it was too complicated to set up here. With the Raspberry Pi I had to add a speaker; the Alexa device itself is a speaker, but behind that speaker there’s a lot of machine learning, in particular speech recognition.
Since we’re talking about machine learning services, I also want to point out an interesting thing about Alexa: you can train it by building new skills. For example, you can train Alexa to play a game, run flash cards, or do other things. The voice becomes the interface. With the phone, the interface is touching with a finger. Voice as an interface is also very important.
Growing up in China, I grew up watching Astro Boy, so I love robots. I never dreamed that in my lifetime I would get to program robots, but now we can. At Google I/O, there was a session talking about Pepper the robot, a humanoid robot that can move around and detect emotion. It can speak and carry on a conversation with you. Aside from writing Android apps, you can actually program robots!
Getting started is easy. In Android Studio, where you download plugins, you can download the SDK, and you can also launch the emulator for the robot. Pepper has a tablet on his chest, so you get an emulator of the tablet as well as a robot viewer where you can see some of the robot’s gestures. Then you create a regular Android project, enable the robot project structure, and you’re ready to program robots.
Design Considerations (27:15)
Before I close, I want to talk about some of the design considerations. All of this machine learning is really cool, but you should not include it in your app just because it’s cool. You should only include the intelligence in your app because the use scenario actually calls for that intelligence.
Once you decide that you need the intelligence, when choosing an API or service, you should consider the ease of use: how much code do you have to write? If you’re using Google Play Services, you don’t have to write that much code. If you’re making a REST API call and the company provides an SDK, you also don’t need to write that much code. But if you have to write the REST API call code on your own, you have a lot more to consider.
The Awareness API packs seven signals together, so you can imagine the drain on your battery. As you may have noticed, a lot of these services use the camera to take a picture or a video, or detect the location, so you need to handle permissions properly, especially the Marshmallow runtime permissions. You need to check that you have the permission for the camera or location before you start to use that service.
Privacy is also an important consideration. The Vision and Face APIs can take a picture of my face and detect my age, my gender, and my emotions. With the Awareness API, you know whether I’m walking or running, you know my location, and you know what’s near me. This is a lot of data that you can gather about a user. You should inform your users about data privacy and be mindful of their data. For example, if you’re taking a picture of my face and sending it to some company, be upfront and let me know that this company may be keeping a picture of me.
What is next? (31:00)
I think the future is already here. You should go build an app with intelligence, make your own Echo, go program a robot, and study some machine learning. Even though most of the resources I provide here do not require machine learning knowledge, you’re going to see a trend where app developers of all stripes should probably go and learn a bit about how machine learning works.
That’s not to say we’ll become data scientists, but it will help you to have a better working relationship with a data scientist, and also help you to build better apps as you use these machine learning services.
- Google’s Vision on Machine Learning
- Machine Learning for Art
- Machine learning is not the future
- Breakthroughs in Machine Learning
- How to build a smart RasPi Bot with Cloud Vision and Speech API
- Intro to Machine Learning
- Deep Learning
- Andrew Ng’s ML course on Coursera
- Machine Learning Recipes
- Python Machine Learning by Sebastian Raschka
About the content
This talk was delivered live in July 2016 at 360 AnDev. The video was recorded, produced, and transcribed by Realm, and is published here with the permission of the conference organizers.