
Exploring Natural Language Processing

Paola Mata will introduce us to the natural language processing APIs, an underutilized but powerful set of APIs that have been updated for iOS 11, and explore the possibilities of harnessing their power to improve the user experience in apps.


Introduction

In this talk, I will cover Natural Language Processing. This year I had the pleasure of attending WWDC, and I was able to attend a session on natural language processing. Going into it, I knew little about the topic, except that it had to do with machine learning and linguistics.

What exactly is Natural Language Processing (NLP)? It’s a field in computer science, artificial intelligence, and computational linguistics. It’s concerned with the interactions between computers and humans using “natural language” (i.e. human language).

Core ML Framework

One of the most exciting announcements at WWDC was the new Core ML framework. What’s really cool about Core ML is that it allows us to incorporate machine learning into our apps without having much prior knowledge of how it works.

You can get up and running really quickly. Core ML also works alongside several domain-specific frameworks. One of these is Vision, which allows us to add high-performance image analysis and computer vision to our apps. Another is GameplayKit, which helps us architect and organize game logic in gaming apps. Last is Foundation, which is where the natural language processing APIs live.


The NLP APIs are already being used, and have been for a while, in some of the apps that we already know and love - most notably, Siri!

Who should use the APIs?

Anyone whose app handles natural language as the input or output can use the APIs. For example, it can be useful if your app consumes a feed from an API, or maybe your users generate content within your app in the form of typed text, recognized handwriting, or transcribed speech.

Once we have raw text, we want to convert it into useful information that we can then use to improve the experience between our user and the device, or between two devices.

Let’s try to understand what we mean by useful information. To do that, we have to look at the fundamental building blocks of natural language processing. This starts with the concept of Tokenization, segmenting text into a specified unit, which can be a paragraph, sentence, or a word. Tokenization allows us to accomplish other tasks including:

  • Language recognition
  • Part of speech identification - determining whether a particular word may be a noun or a verb, etc.
  • Lemmatization, which is a fancy NLP word that essentially means getting the root form of a word. For example, in English, that would mean the word without any pluralization or verb tenses, or maybe removing the possessive form from a word. Spanish is a little bit more complicated because there are a ton of irregular verbs.
  • Named entity recognition - identifying whether a word or a set of words corresponds to a person, an organization or a company, or maybe a location.

Code Examples

Overview

The NSLinguisticTagger class is a Foundation class, which means it’s available across all platforms, and it helps us with much of the language processing. The class has been around since iOS 5.0.

What’s New in NSLinguisticTagger:

  • The concept of units, which allows us to tag a specific body of text, sentence, or individual words.
  • Ability to check for available schemes, using the class function availableTagSchemes(for:language:).
  • Support for 52 additional languages.
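The second item above can be tried directly in a playground. Here is a minimal sketch (the unit and language are my own choices):

```swift
import Foundation

// Ask the tagger which schemes are available for word-level
// tagging in English (new in the iOS 11 / macOS 10.13 SDKs).
let schemes = NSLinguisticTagger.availableTagSchemes(for: .word, language: "en")

// The list includes schemes such as .lexicalClass and .lemma.
for scheme in schemes {
    print(scheme.rawValue)
}
```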

Language Identification

Now let’s take a look at the code. I played around with language identification in a playground because it is a quick way to get up and running. First, we initialize our NSLinguisticTagger with the specific tagSchemes we’re interested in; in this example, we only want the .language tagScheme. Then we set the string on the tagger, like this:

//: Playground
import Foundation

let tagger = NSLinguisticTagger(tagSchemes: [.language], options: 0)

tagger.string = "Fuimos al cine y después a tomar un helado"

let language = tagger.dominantLanguage

The string is a sentence in Spanish. We then just read dominantLanguage and get back "es", the language code for Spanish.
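If all you need is the language, iOS 11 also adds a one-shot class method, so you can skip configuring a tagger entirely. A quick sketch using the same sentence:

```swift
import Foundation

// One-shot language identification via the new class method.
let language = NSLinguisticTagger.dominantLanguage(for: "Fuimos al cine y después a tomar un helado")
// language is "es", the code for Spanish.
```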

Tagging Tokens

Tagging tokens is similar but a little bit more complex:

// TaggedToken pairs a token with its tag and its range in the original text.
typealias TaggedToken = (token: String, tag: NSLinguisticTag, range: NSRange)

private func tag(text: String, scheme: NSLinguisticTagScheme, unit: NSLinguisticTaggerUnit = .word) -> [TaggedToken] {
    let tagger = NSLinguisticTagger(tagSchemes: NSLinguisticTagger.availableTagSchemes(for: unit, language: "en"),
                                    options: 0)
    tagger.string = text

    let range = NSMakeRange(0, text.utf16.count)
    let options: NSLinguisticTagger.Options = [.omitWhitespace, .omitOther]

    var taggedTokens: [TaggedToken] = []

    tagger.enumerateTags(in: range, unit: unit, scheme: scheme, options: options) { tag, tokenRange, _ in
        guard let tag = tag else { return }

        let token = (text as NSString).substring(with: tokenRange)
        taggedTokens.append((token, tag, tokenRange))
    }
    return taggedTokens
}

We initialize our linguistic tagger with all of the available tagSchemes, since we do not know which unit might be passed into this function, and we specify the language.

For options, I want to ignore any whitespace and anything that might be identified as unknown, so I pass .omitWhitespace and .omitOther.

Then, we enumerate through each token and pass in our range, unit, scheme, and options. When the closure is called, it takes the arguments tag and tokenRange.
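To see the enumeration end to end without the helper function, here is a compact, self-contained part-of-speech sketch (the sample sentence is my own):

```swift
import Foundation

// Tag each word in a sentence with its lexical class (noun, verb, ...).
let tagger = NSLinguisticTagger(tagSchemes: [.lexicalClass], options: 0)
let text = "The quick brown fox jumps over the lazy dog"
tagger.string = text

var classes: [String: String] = [:]
let range = NSMakeRange(0, text.utf16.count)

tagger.enumerateTags(in: range, unit: .word, scheme: .lexicalClass, options: [.omitWhitespace]) { tag, tokenRange, _ in
    guard let tag = tag else { return }
    let word = (text as NSString).substring(with: tokenRange)
    classes[word] = tag.rawValue
}
// e.g. classes["fox"] and classes["dog"] come back as "Noun".
```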

Named Entity Identification

Named entity identification is not very different:

func tagNames(text: String) -> [TaggedToken] {
    let tagger = NSLinguisticTagger(tagSchemes: [.nameType], options: 0)

    tagger.string = text

    let range = NSMakeRange(0, text.utf16.count)
    let options: NSLinguisticTagger.Options = [.omitWhitespace, .omitPunctuation, .joinNames]

    let tags: [NSLinguisticTag] = [.personalName, .placeName, .organizationName]
    var taggedTokens: [TaggedToken] = []

    tagger.enumerateTags(in: range, unit: .word, scheme: .nameType, options: options) { tag, tokenRange, _ in
        // Make sure that the tag that was found is in the list of tags that we care about.
        guard let tag = tag, tags.contains(tag) else { return }

        let token = (text as NSString).substring(with: tokenRange)
        taggedTokens.append((token, tag, tokenRange))
    }
    return taggedTokens
}

In this case, we know which tagScheme we want to use, so we can specify that when we initialize the tagger. Again, we pass in the string and the range.

A difference here is that under options, we’re including .joinNames. So in text that contains, for example, personal names, the first and last name will be joined. A multi-word place name such as New York will also be joined into a single token.

In this case, we’re looking for three specific tags: .personalName, .placeName, and .organizationName. Each tag the enumeration finds is checked against that array, so only the ones relevant to my needs are kept. As before, an array of tagged tokens is returned.
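Putting the pieces together, here is a minimal standalone sketch of the same idea (the sample sentence and names are my own):

```swift
import Foundation

// Named entity sketch: .joinNames makes multi-word names such as
// "New York" come back as a single token instead of two.
let tagger = NSLinguisticTagger(tagSchemes: [.nameType], options: 0)
let text = "Tim Cook lives in New York"
tagger.string = text

let range = NSMakeRange(0, text.utf16.count)
let options: NSLinguisticTagger.Options = [.omitWhitespace, .omitPunctuation, .joinNames]
let tags: [NSLinguisticTag] = [.personalName, .placeName, .organizationName]

var names: [String] = []
tagger.enumerateTags(in: range, unit: .word, scheme: .nameType, options: options) { tag, tokenRange, _ in
    guard let tag = tag, tags.contains(tag) else { return }
    names.append((text as NSString).substring(with: tokenRange))
}
// names contains "Tim Cook" and "New York" as joined tokens.
```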

Demo App

I came up with a sample app to show several examples of this in action. The first one is a fantastic article from BuzzFeed about what happened in that elevator with Jay-Z and Solange. If you haven’t read this article, I recommend it.

The app scans the text and analyzes it for parts of speech. In the middle of the screenshot, I’m looking at verbs.

Next, we’ll look at lemmatization. In this example, I have text from a random article about why you should never trust the Rotten Tomatoes movie review site.

Here, I enter a search term, and that term is lemmatized, so it will be matched with all of the text that I input in my text view, in any form. For example, type in “movie”, and the app highlights all occurrences of “movie” in the text, including the plural form, “movies”, and the possessive form, “movie’s”.
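Under the hood, the lemma lookup is just another tag scheme. A minimal sketch of lemmatizing a sentence (the sample text is my own):

```swift
import Foundation

// Lemmatization sketch: the .lemma scheme maps each word to its
// root form, so "movies" becomes "movie" and "watched" becomes "watch".
let tagger = NSLinguisticTagger(tagSchemes: [.lemma], options: 0)
let text = "She watched two movies"
tagger.string = text

var lemmas: [String] = []
let range = NSMakeRange(0, text.utf16.count)

tagger.enumerateTags(in: range, unit: .word, scheme: .lemma, options: [.omitWhitespace]) { tag, _, _ in
    if let lemma = tag?.rawValue {
        lemmas.append(lemma)
    }
}
// lemmas contains "watch" and "movie".
```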

For anyone interested, my sample projects are on GitHub.

Benefits of Natural Language Processing APIs

  • Apple uses the same APIs, which makes the user experience consistent.
  • The actual processing is completed on the device, which ensures user privacy.
  • It’s performant.

Problems

Limits to Named Entity Identification

Named entity identification is limited. I’m in tech, so I decided to analyze an article that mentioned a lot of tech companies, including Twitter, Facebook, and BuzzFeed. For some reason, those companies were not recognized as organizations! However, others, like Squarespace, Shutterstock, and Tumblr, were.

Language identification

Language identification within the same sentence does not work especially well. I tried passing in some “Spanglish” to experiment, and found that changing one word in a sentence would affect how the other words were analyzed.

Other Projects

If you want more information about the NLP APIs, here are a few links for following up on them.

The WWDC session on natural language processing included two really good examples of how to incorporate the APIs into hypothetical apps - Winnow and Whisk.

Ayaka Nonaka had a really great talk at Realm a couple of years ago on natural language processing. She uses the APIs to train a model to identify spam. It’s a couple of years old, so it’s not using the Swift 4 APIs, but it’s pretty similar and you can follow along.

Martin Mitrevski has a great recent blog post where he uses a simple algorithm along with the NLP APIs to find key terms in some of his blog posts.


About the content

This talk was delivered live in September 2017 at try! Swift NYC. The video was recorded, produced, and transcribed by Realm, and is published here with the permission of the conference organizers.

Paola Mata

I’m Paola (not Paolo). I’m an iOS developer, social media addict, and occasional blogger based in Brooklyn.

I’m currently building awesome apps at BuzzFeed, where last year I was part of the team that launched the highly acclaimed BuzzFeed News app.

I am also actively involved in the tech community as co-founder of NYC Tech Latinas and regularly volunteer my time to promoting diversity in tech and supporting the next wave of new programmers.

When I’m not buried in code, you’ll likely find me binge-watching a sci-fi series on Netflix, lifting at the gym, or hunting down good eats.
