Value SEMANTICS (not value types!)

by Alexis Gallagher

Dec 1 2016

Who can forget 2015, the “Year of the Value Type”? Through numerous blogs and videos, the Swift community explored how value types (structs and enums) enable new, simpler, safer patterns for application architecture. Alexis Gallagher argues for an important proviso: what a lot of these talks are trying to discuss is not value types, but value semantics. Value semantics is tricky to define, but the reward is a profound, satori-like experience of enlightenment, and a better understanding of how to use Swift.

This talk from the Swift Language User Group (SLUG) will explain value semantics and provide a straightforward recipe for enforcing value semantics in Swift using value types, reference types, or a mixture of the two.

Introduction (0:00)

I’m Alexis Gallagher, and I’ll be talking about value semantics. In 2014 and 2015, there were at least half a dozen talks on value types; Apple has given three of them. One thing I noticed in almost all of the value type talks, especially the ones that got into more advanced material, is that about midway through they used the phrase “value semantics,” which was often left undefined.

I also found that the term value semantics often came up around the point where those talks got really complicated. So I formed the idea in my mind that these things were connected, and if I understood value semantics, I’d also understand why all these talks got complicated at the same point.

I think understanding value semantics in a clear way helps you understand value types much better, and gives you a much cleaner frame for understanding everything connected to value types. And value types are important, right? Because Apple is updating Foundation to use them everywhere. So it’s no longer a minor quirk of the Swift language, it’s part of understanding Foundation. It’s part of understanding the core types that we use to build all kinds of apps.

Review (4:07)

I think the easiest way to start getting into value types and value semantics is not with an abstract definition, but with a game. And I’ve invented this game, and I call it the Mutation Game.

The Mutation Game is played with two players. We have Victor the Valucist, who is the good guy, and the bad guy is Salazar the SideEffector. The game is this: Victor the Valucist is going to choose a type Foo. He’s going to set a variable of type Foo to a value, and then Salazar is going to try to change the value of that variable.

var v:Foo = Foo() // Valucist chooses Foo to
var s = v // defend the variable v
let v0 = valueOf(v)
/* { SideEffector attacks v using only s } */
let v1 = valueOf(v)
assert(v0 == v1) // v unchanged? Valucist wins!

Salazar doesn’t get to touch the variable. All he gets to use is another variable s. Salazar also gets to define a pure audit function that’s used to measure the value of v. It needs to be a pure function. It can’t just pull in a random variable.

Here’s the arena of play. This should look pretty familiar from talks we’ve seen on value types. So, in the beginning the Valucist, Victor, defines the variable v of type Foo, and then we get this important statement in the middle: var s = v. So the value of s has now been determined by the value of v.

And we take the initial value of v with the valueOf function, and we’ll save that into v0. And in that blocked out comment, that’s where Salazar can do anything he wants to do as long as he’s only using the variable s. He can’t touch the original v.

And then we take the value of v again, and we see: has it changed? or is it unchanged? If it hasn’t changed, then Victor wins. If it has changed, then Salazar wins.
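One way to make the arena executable is to wrap it in a generic function. This is only a sketch under stated assumptions: the name valucistWins, its parameter labels, and the TextBox class are invented here for illustration, and the audit function is passed in as a closure rather than being a global valueOf.

```swift
// Hypothetical harness for the Mutation Game; none of these names come from the talk.
// `audit` plays the role of valueOf and is expected to be a pure function.
// `attack` is the SideEffector's move; it is only ever handed the variable s.
func valucistWins<T, V: Equatable>(
    defending initial: T,
    audit: (T) -> V,
    attack: (inout T) -> Void
) -> Bool {
    let v = initial          // the variable being defended
    var s = v                // the SideEffector's only handle
    let v0 = audit(v)
    attack(&s)
    let v1 = audit(v)
    return v0 == v1          // true means v is unchanged, so the Valucist wins
}

// A tiny mutable class, used only to show a losing choice of type.
final class TextBox {
    var text = ""
}

let intResult = valucistWins(defending: 100, audit: { $0 }, attack: { $0 = 200 })
let boxResult = valucistWins(
    defending: TextBox(),
    audit: { $0.text },
    attack: { $0.text = "Hello, world" }
)
print(intResult, boxResult)
```

The harness keeps v as a constant inside the function, which makes it explicit that any change observed by the audit must have arrived through s.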

Mutation Game with Int (6:24)

Let’s play out the game a few times. What happens if Victor the Valucist decides to play with the type Int?

var v:Int = 100
var s = v
let v0 = valueOf(v)
s = 200 // attempted attack!
let v1 = valueOf(v)
assert(v0 == v1) // but true, Valucist wins!

If Victor plays with Int, then the Valucist wins. s cannot touch the value v. Why? Because Int is a value type.

When we say something is a value type or a reference type, in the context of Swift, this is in the context of assignment behavior, as in what happens when you assign to the variable.

Value types use assign-by-copy. So we start with a declaration that v, the variable, is pointing to a particular instance, and that instance has a value. And then when we assign to a new variable s, assign-by-copy means it behaves as if we are creating a copy of the instance.

So whatever happens later, whatever Salazar does with s, it’s only going to affect the instance that the variable s is pointing to. It can’t do anything that changes the value of v. You get the value of v by taking the variable v, going to the instance that it points to, and then looking at the value that that instance carries. This is a way to think about assign-by-copy, and this is why for a type like Int the Valucist always wins.

Mutation Game with NSMutableString (8:10)

But what if Victor is relaxed, and he goes with NSMutableString? He loses because this is a reference type, and reference types use assign-by-reference.

var v:NSMutableString = NSMutableString()
var s = v
let v0 = valueOf(v)
s.append("Hello, world") // attempted attack!
let v1 = valueOf(v)
assert(v0 == v1) // false! SideEffector wins

You can see, looking at the same kind of diagram, that the variable v points to the instance, the actual mutable string thing, and the value of that mutable string starts out as an empty string. And when you code var s = v, what you’re really saying is, “I want to have a new variable, and I want it to point to the same instance.”

So v and s now share a common instance. And then what can happen is that someone makes a change to the actual instance by, say, mutating that mutable string and filling it with “hello,” so that its value is now “hello.” Well, now the value of v is “hello,” and so is the value of s.

So Salazar, just by grabbing hold of s, can get in there and destroy v - destroy what it was supposed to mean for us.
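The sharing described here can be observed directly with Swift's identity operator ===. In this minimal sketch, the Note class is a made-up stand-in for NSMutableString so that the example runs without Foundation.

```swift
// Hypothetical mutable reference type standing in for NSMutableString.
final class Note {
    var text: String
    init(text: String) { self.text = text }
}

let v = Note(text: "")
let s = v                    // assign-by-reference: no new instance is created

let sameInstance = (v === s) // identity check: both variables refer to one instance
s.text += "hello"            // mutate through s only

let textSeenThroughV = v.text
print(sameInstance, textSeenThroughV)
```

Both variables are declared with let, yet the value seen through v still changes, because let on a class-typed variable only fixes which instance it refers to.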

Benefits of Value Types (9:35)

What are the benefits of value types? Value types prevent unintended mutation.

If I have a function and it takes an Int, and it returns an Int, and it’s a well-behaved function, I don’t need to worry that under the hood it’s doing something to the Int that I handed into it. Because the rule for the variable that’s used in the body of the function is the same kind of assignment rule that’s used for assigning a variable within a scope of code. And all this also helps with thread safety.
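A small sketch of that point about function parameters. The functions doubled and bump and the Counter class are invented for illustration; the contrast with a class parameter is added here only to show what the Int case rules out.

```swift
// A value-type parameter is an independent copy (and a constant inside the function).
func doubled(_ n: Int) -> Int {
    var local = n            // must be copied into a var before it can change
    local *= 2
    return local
}

// Hypothetical reference type, for contrast.
final class Counter {
    var count = 0
}

// A reference-type parameter shares the caller's instance.
func bump(_ c: Counter) -> Int {
    c.count += 1             // this change is visible to the caller
    return c.count
}

let original = 21
let result = doubled(original)

let counter = Counter()
_ = bump(counter)

print(original, result, counter.count)
```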

Structs versus Classes (10:38)

Should we always just use structs when we want to have this safety, and then use classes when this kind of safety isn’t important to us?

Not exactly, because what I’ve described as a benefit of value types is not a benefit you always get from a value type, and it is one you can sometimes get from a reference type. Look at what happens with UIImage, which is a class.

var v:UIImage = UIImage(named:"smile.jpg")! // init?(named:) is failable, so unwrap
var s = v
let v0 = valueOf(v)
// Hammer on s all you please. It's useless!
let v1 = valueOf(v)
assert(v0 == v1)

I define a UIImage here that represents a smile. Now we might worry that Salazar could grab s and then do things that would change the image, that he could change the colors on it or something. But because UIImage has been deliberately and carefully defined to be immutable, it can’t be done. If you look at all the properties on UIImage, they’re all read-only.
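The same effect can be had with any class whose stored properties are all constants of value-semantic types. Smile below is a hypothetical stand-in used so the sketch runs anywhere; it is not a model of how UIImage is actually implemented.

```swift
// Hypothetical immutable reference type: every stored property is a `let`
// whose own type (String, Int) has value semantics.
final class Smile {
    let name: String
    let width: Int

    init(name: String, width: Int) {
        self.name = name
        self.width = width
    }

    // "Changing" a Smile means building a new instance.
    func resized(to newWidth: Int) -> Smile {
        return Smile(name: name, width: newWidth)
    }
}

let v = Smile(name: "smile.jpg", width: 100)
var s = v                    // s and v share one instance...
s = s.resized(to: 500)       // ...but the only available move rebinds s

let widthOfV = v.width
let widthOfS = s.width
print(widthOfV, widthOfS)
```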

But not all value types are exactly safe. I could create a value type like Array that contains inside it a reference type like NSMutableString.

var v:Array<NSMutableString> = [NSMutableString()]
var s = v
let v0 = valueOf(v)
s[0].append("Hello, world")
let v1 = valueOf(v)
assert(v0 == v1) // false, SideEffector wins

Apple themselves in their talks about this problem describe it as unintended sharing. And it’s not always unintended.

Sometimes you want to do this because you want to sort of secretly maintain common storage for efficiency reasons. But then you don’t want that secretly maintained common shared storage to mess up the behavior of your types. So you put in special tricks to be sure that as soon as you would change something that’s shared, then you stop sharing it at the last minute. This is the copy-on-write optimization.
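A minimal sketch of that trick, assuming a struct that wraps a class instance as its storage. The names SharedText, TextStorage, and sharesStorage are invented for illustration; isKnownUniquelyReferenced is the standard library function commonly used for the uniqueness check.

```swift
// Hypothetical copy-on-write wrapper; the type names are invented for this sketch.
final class TextStorage {
    var text: String
    init(text: String) { self.text = text }
}

struct SharedText {
    private var storage: TextStorage

    init(_ text: String = "") {
        storage = TextStorage(text: text)
    }

    var text: String { return storage.text }

    mutating func append(_ more: String) {
        // Stop sharing at the last minute: copy only if another variable
        // still refers to the same storage.
        if !isKnownUniquelyReferenced(&storage) {
            storage = TextStorage(text: storage.text)
        }
        storage.text += more
    }

    // Exposed only so the sketch can show when storage is shared.
    func sharesStorage(with other: SharedText) -> Bool {
        return storage === other.storage
    }
}

let v = SharedText("smile")
var s = v                                      // cheap: both share one TextStorage
let sharedBefore = s.sharesStorage(with: v)
s.append("!")                                  // the copy happens here
let sharedAfter = s.sharesStorage(with: v)
print(v.text, s.text, sharedBefore, sharedAfter)
```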

What kind of type wins the Mutation Game?

Types with value semantics.

Value Semantics Defined (14:57)

The kind of type that wins the Mutation Game is a type that has value semantics. It’s something like Int, which has value semantics, but you can also make more complicated things that have value semantics. My definition here is an operational one: the Mutation Game is the defining test of value semantics.

Here’s a conceptual definition which amounts to the same thing: Value semantics amounts to being a guarantee of the independence of the value of variables.

And independence doesn’t mean structural things. What we’re talking about is whether one thing can affect another. So a type has value semantics if the only way to modify the value of a variable of that type is through the variable itself. If the only way to modify a variable’s value is through the variable itself, it’s a variable with a value semantic type.

Apple’s Definition of Value Semantics (17:57)

A definition Apple provides is that variables are logically distinct. I don’t like that, because variables are always logically distinct. I think what they’re trying to get at there is structurally distinct. But they’re not actually saying the variables are structurally distinct, they’re saying something about how the storage for them is structurally distinct.

But that’s also not what matters, because again, you can have immutable things where they’re not structurally distinct. The important thing is whether you are immune to being affected by other things.

So I’d say that the formulation they have there in the middle is about right, but I would offer a kind of modest amendment. Which is just that mutating one variable will never affect a different variable of the value semantic type. The type is value semantic if it’s immune from side effects produced by other things. Not if it’s guaranteed not to perpetrate side effects on other things.

Value semantics is about interface, whereas value type versus reference type is about implementation.

Because again, value semantics is what you want to know if I give you a type, and you want to know how it’s safe to use it. Whereas whether it’s value type or reference type is really this detail about what’s happening with memory under the hood.

The whole benefit of value semantics is it allows you to forget instances exist. Because our definition of value semantics doesn’t actually have the concept of an instance, or the concept of reference in it. It only has the concept of the variable and value. A type has value semantics, or a variable has value semantics if the only way to affect the value of that variable is through that variable. The whole concept of reference instance is gone, because when you have value semantics, you can just use this as your mental model. They’re sort of immutable perfect ideal things.

Consequences (24:24)

So I’ve offered my slightly adventurous definition of value semantics, but I would argue it’s totally consistent with everything that’s come before. Here are some consequences of this definition.

Immutable reference types have value semantics. I think this makes sense by any reasonable definition of value semantics. Yes, they’re reference types, but they have value semantics in the sense that they behave like values, and someone else can’t mess them up.
Types have value semantics, not as an absolute matter, but only relative to an access level. This is because a variable has value semantics if the only way to modify the value of the variable is through that variable. So, if a type has a file-private access modifier on something related to the type, then I can access that member from code defined in the same file, but I can’t access it from outside that file. If you think about it, that means the type might have value semantics at one access level but not at another.
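The access-level point can be sketched in a single file. Account and Ledger are hypothetical types invented for this illustration; the attack below compiles only because it lives in the same file as the fileprivate property.

```swift
// Hypothetical types for illustration.
final class Ledger {
    var entries: [String] = []
}

struct Account {
    // Visible anywhere in this file, invisible outside it.
    fileprivate let ledger = Ledger()

    var entryCount: Int { return ledger.entries.count }
}

let v = Account()
let s = v                          // copies the struct, but both copies share one Ledger

let before = v.entryCount
s.ledger.entries.append("attack")  // allowed only from code in this same file
let after = v.entryCount
print(before, after)
```

Code in any other file can read entryCount but cannot reach ledger, so from there nothing done with s can change what is observed through v. At that access level the type passes the Mutation Game; at file-private level it does not.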

Recipe (26:38)

Suppose you’re working on a primitive value type like Int. In that case, sit tight. You’re done. Those types all have value semantics by default.

In the case you’re working on a reference type - something that’s defined using class, which uses assign-by-reference as its assignment behavior - make it immutable. Use let to declare constant properties, and make sure those properties themselves have value semantic types.

Suppose you’re dealing with the more complicated case of a composite value type - for example, a struct. Either use stored properties that all themselves have value semantic types. So if I have a struct and all of its properties have value semantic types, then I’m done.

Or if you want to do something complicated with shared deep storage, then you need to handle the tricky copy-on-write case in the way that’s been described. Pick the access level at which the type’s users should see value semantics; it might be public or module-level. Restrict any mutable reference type properties to a lower access level, and then define the accessible setters and mutating functions so that they copy the mutable instance instead of mutating a shared instance.
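The three steps of this last case can be sketched as follows. Picture and PixelBuffer are hypothetical names, and this version copies unconditionally on every mutation, which is the simplest way to follow the recipe; a uniqueness check such as isKnownUniquelyReferenced could be added to avoid redundant copies.

```swift
// Hypothetical mutable reference type used as deep storage.
final class PixelBuffer {
    var pixels: [Int]
    init(pixels: [Int]) { self.pixels = pixels }
    func copy() -> PixelBuffer { return PixelBuffer(pixels: pixels) }
}

// Step 1: the type is meant to have value semantics for its public users.
public struct Picture {
    // Step 2: the mutable reference is restricted to a lower access level.
    private var buffer: PixelBuffer

    public init(pixels: [Int]) {
        buffer = PixelBuffer(pixels: pixels)
    }

    // Step 3: the accessible setter installs a fresh buffer instead of
    // mutating one that another Picture may share.
    public var pixels: [Int] {
        get { return buffer.pixels }
        set { buffer = PixelBuffer(pixels: newValue) }
    }

    // Step 3, continued: mutating functions copy before they mutate.
    public mutating func invert() {
        buffer = buffer.copy()
        buffer.pixels = buffer.pixels.map { 255 - $0 }
    }
}

let v = Picture(pixels: [0, 10, 255])
var s = v                    // shares v's buffer until s is mutated
s.invert()
s.pixels[0] = 7
print(v.pixels, s.pixels)
```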

And if you want to know if you’ve done it right, you can always just go and think about the Mutation Game test.

Q&A

Is there a relation to Erlang and the Erlang Virtual Machine? (31:41)

I think that there’s definitely a connection here. And I’d say that the concept of value semantics is a little more inclusive than immutability. Immutability you’re guaranteed. But there’s still some kind of mutation that happens and usually – I’m very familiar with Clojure because I’ve used that for building and shipping systems, and in that language the primitives that you’re given tend to be immutable. I would say that I think the copy-on-write optimization is a performance optimization to maintain value semantics while under the hood you’re doing this stuff to keep storage cheaper and make copying cheaper, but the functional programming languages like Clojure.

Is there going to be a Framework/Structure to enforce this? (34:29)

No, and I find that a bit puzzling, to be honest, because I really do think value semantics is more important than value types. It’s the thing that matters when you’re deciding how you can use a type safely. Obviously, people have an excellent working understanding of it, because all of the people at Apple and elsewhere are very carefully building types that have these copy-on-write optimizations, while carefully maintaining value semantics. But the term value semantics is very rarely used and not used consistently, even though that’s the thing people are struggling to maintain and see the value of.

About the content

This content has been published here with the express permission of the author.

Alexis Gallagher

Alexis is an independent consultant, building all sorts of systems with Swift, Clojure, bash, a heartfelt sincerity, a nagging skepticism, and the motley wisdom from his past adventures in science, finance, and comedy.
