Making implicit semantics explicit


New students in computer science often have a hard time distinguishing the syntax and the semantics of a programming language. On top of that, many also fail to recognize the numerous levels of semantics. In this post, I’ll try to highlight how making your program’s semantics more explicit helps us build better software overall.

Different semantics

If you know about formal logic, you are well aware of things having meanings on multiple levels, equality being the most typical representative of such a thing. The same applies to our code in programming – again equality is typical. Let’s start with a treacherously simple question:

What does it mean for two objects to be equal?

If you come from a more C/C++ background, then you may answer that equality simply means the same value, be that value an int or a pointer. In a language like Java, you may also know that == and equals are two different ways to decide equality. Do you think that’s it? Hopefully not, since everyone should know that changing equals implies you must also change the hashCode result; therefore, hashCode is another form of equality. Ever tried inserting things into a TreeSet with a custom Comparator and lost some objects on the way? As it turns out, when a compare method returns 0 we have *drum roll* equality.
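To make these notions tangible, here is a minimal Scala sketch. The `Person` class, its fields, and the `byAge` ordering are made up for this illustration. In Scala, `==` delegates to `equals`, `eq` compares references, and a sorted set relies on its ordering alone.

```scala
import scala.collection.immutable.TreeSet

// Hypothetical model, used only for this illustration.
final case class Person(name: String, age: Int)

object EqualityDemo {
  val alice = Person("Alice", 30)
  val bob   = Person("Bob", 30)

  // Structural equality: case classes generate equals and hashCode together.
  val sameValue: Boolean = alice == Person("Alice", 30)

  // Reference equality: two distinct instances are never `eq`.
  val sameReference: Boolean = alice eq Person("Alice", 30)

  // Ordering-based equality: a comparison result of 0 means "same element" to a TreeSet.
  val byAge: Ordering[Person] = Ordering.by((p: Person) => p.age)

  // alice and bob compare as 0 under byAge, so only one of them ends up in the set.
  val sortedByAge: TreeSet[Person] = TreeSet(alice, bob)(byAge)

  // A hash-based set uses equals/hashCode and keeps both.
  val hashed: Set[Person] = Set(alice, bob)
}
```

The two sets are built from the same elements but end up with different sizes, because each collection applies its own notion of equality.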

There is an unlimited supply of different semantics (or “meanings” if you struggle with the former) in our everyday work.

Relation to bounded contexts

In terms of domain-driven design (DDD), Eric Evans defines a bounded context, which can also be understood in terms of the semantics involved. Often, we use the same word or expression but mean different things in different contexts. Just like a “bank” means totally different things for a broker doing financial trades, a person walking through the park, or a ship captain steering through a river.

Identifying bounded contexts in your domain implies identifying the semantics of domain words within that context. In other words, having a ubiquitous language is just as important as knowing its semantics. You need both syntax and semantics to make the most of it.
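One way to picture this in code is to give each context its own namespace, so that the same word can carry a different model in each. The following is only a sketch; the object names and all fields are hypothetical and not taken from any real domain model.

```scala
// Hypothetical bounded contexts, modelled as separate namespaces.
object Trading {
  // In trading, a bank is a financial institution.
  final case class Bank(name: String, swiftCode: String)
}

object RiverNavigation {
  // On a river, a bank is the strip of land along the water.
  final case class Bank(side: String, lengthInMeters: Double)
}

object BoundedContextDemo {
  // The qualified name states which meaning of "bank" is intended.
  val counterparty = Trading.Bank("Example Bank", "EXAMPLEXXX")
  val mooring      = RiverNavigation.Bank("left", 120.0)
}
```

Neither `Bank` can be passed where the other is expected, so the compiler keeps the two meanings apart.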

Implicit semantics

The above may sound obvious, even trivial, but what happens all the time is that we, as developers, have semantics in our heads that get lost when we translate them into our requirements, designs, and code. Most of these semantics are not really lost completely though. It’s rather that they become invisible or implicit. Since we cannot see them, it’s hard to avoid breaking them, and therefore, implicit semantics are hidden champions for creating hidden bugs.

Let’s start with a very simple example to make this more concrete: We start with the broker by modelling shares, of which we can have fractions and which can have a certain market price. (Yes, this is extremely simplified for the sake of – well – simplicity)

case class Share(amount : Double, price : Double)

def calculate(all : Seq[Share]) : Double = {
  var sum = 0.0
  for (item <- all) {
    sum = sum + item.amount * item.price
  }
  sum
}

The first line just defines the model for this example. Then, we define a method to calculate the overall value of a bunch of shares. I’m sure you can think of a lot of ways in which this method may be improved, but do you understand what it’s doing?

Spoiler alert: It’s really just doing the same as computing your final score in a game of cards. Not seeing it? Keep reading then.

Making semantics explicit

Since this post is all about explicit semantics, let’s see how we can improve on the above code by focusing on the semantics.

Names

A lot of semantics is hidden in names already and I’m pretty sure you found a few problems with those names, so let’s pick better ones:

def summedValue(shares : Seq[Share]) : Double = {
  var sum = 0.0
  for (individualShare <- shares) {
    sum = sum + individualShare.amount * individualShare.price
  }
  sum
}

Nothing much happened. Just a few names changed. Doesn’t look like much of an improvement, until you try to visualize the call site. When you want to call this method, “calculate” tells you almost nothing about what this operation actually means. A “summedValue” is already much better. Does the parameter name matter? Try to think of auto-completion and how much better “shares” is for that. On the other hand, “individualShare” is used only inside the method, so it has a much smaller impact.

Derived values

We can have a certain amount of each share and it has a price. Put together, the share we hold has a certain value. This is not explicit in our model, but instead, the summation method computes the value as part of its work. Implicitly, the code already contains the meaning of the value of a share as being derived from the amount and price. Let’s make this derived value an explicit one then:

trait Valued {
  def value : Double
}

case class Share(amount : Double, price : Double) extends Valued {
  val value = amount * price
}

def summedValue(shares : Seq[Share]) : Double = {
  var sum = 0.0
  for (individualShare <- shares) {
    sum = sum + individualShare.value
  }
  sum
}

The first couple of lines show the extension of our model to explicitly include things that can be valued, with a share being one of them. Making this explicit improves our design just as much as it improves the implementation of summedValue.

Common operations

Summation of values is clearly a very common operation and we have a good intuitive understanding of its semantics. However, in our example, the summation is not happening explicitly. It is implicit in the declaration of a variable, a loop, a destructive assignment, and a return value. Since it’s 2015 you should be well aware by now that loops like the above can be written more succinctly and with clearer semantics using a functional style, so let’s take a look:

def summedValue(shares : Seq[Share]) : Double =
    shares.map(_.value).sum

Interesting. A huge amount of boilerplate code (i.e. syntax without any relevant semantics!) has vanished and we are left with a single line of code that conveys the meaning pretty well.
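As a quick sanity check, the loop and the functional version can be put side by side. This is a self-contained sketch: `summedValueLoop` is just a name chosen here for the earlier loop-based version, and the sample amounts are picked so that the arithmetic is exact in floating point.

```scala
trait Valued {
  def value: Double
}

final case class Share(amount: Double, price: Double) extends Valued {
  val value: Double = amount * price
}

object SummationDemo {
  // Imperative version: the summation is only implied by the variable, loop, and assignment.
  def summedValueLoop(shares: Seq[Share]): Double = {
    var sum = 0.0
    for (share <- shares) sum = sum + share.value
    sum
  }

  // Functional version: the word "sum" appears in the code itself.
  def summedValue(shares: Seq[Share]): Double =
    shares.map(_.value).sum

  // Sample data: 0.5 * 2.0 + 0.75 * 4.0
  val shares = Seq(Share(0.5, 2.0), Share(0.75, 4.0))
}
```

Both functions return 4.0 for the sample data; only the second one says so in its own vocabulary.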

Iterations

The funny thing is that once you start to specifically make your semantics explicit, you discover that there is always another level above what you have thought the meaning was. Sometimes it takes a while until we see it, other times we may even jump multiple levels in one step. In our example, we can see that values are retrieved from the shares and the sum is computed, but all that we need of the “Share” type once you think about it, is the “Valued” part of it. So why not just accept a bunch of Valued objects instead? Of course, “shares” doesn’t make any sense any more, so we iterate and make these improved semantics explicit again by changing the name once more.

def summedValue(valuedObjects : Seq[Valued]): Double =
    valuedObjects.map(_.value).sum

This may seem like the actual meaning of what happens isn’t as clear as above anymore. In a sense that is true, since the semantics of this method is now based on a higher abstraction level. On the other hand, the meaning is now at a level where we can apply it to a game of cards just as well.
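To see what the higher abstraction level buys us, consider a second kind of valued object next to shares. The `CashPosition` class below is purely hypothetical and only serves as an illustration.

```scala
trait Valued {
  def value: Double
}

final case class Share(amount: Double, price: Double) extends Valued {
  val value: Double = amount * price
}

// Hypothetical second kind of valued object, not part of the original model.
final case class CashPosition(balance: Double) extends Valued {
  val value: Double = balance
}

object ValuedDemo {
  // Only the Valued part of each object is needed here.
  def summedValue(valuedObjects: Seq[Valued]): Double =
    valuedObjects.map(_.value).sum

  // A mixed collection: one share worth 0.5 * 2.0 and some cash.
  val portfolio: Seq[Valued] = Seq(Share(0.5, 2.0), CashPosition(10.0))
  val total: Double = summedValue(portfolio)
}
```

The method no longer knows or cares what kind of thing it is summing up, as long as that thing has a value.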

Another application

As mentioned above, we want to apply this code to a game of cards. Consider typical playing cards which have a suit and a number. Throughout the game you acquire some of these cards, and your final score is calculated from these. To make things more interesting, the final score is actually the product of the card numbers.

case class Card(suit : Suit, number : Int) extends Valued {
  val value = number.toDouble
}

The “Suit” definition is straightforward and left as an exercise to the reader. Now this was pretty simple, and it seems like we could immediately use our “summedValue”. Well sort of. First, we have to wonder why “Valued” always needs to return a Double, as in this case, Int would be more suitable. And of course, we have not yet solved the problem of the computation requiring a product instead of a sum in this case.
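For readers who would rather skip the exercise, one possible definition of “Suit” is a sealed trait with four case objects. This is an assumption about what such a definition could look like, not the definition used for the original samples; only the names Hearts and Diamonds appear later in this post.

```scala
// One possible Suit: a closed set of four values the compiler can check exhaustively.
sealed trait Suit
case object Hearts   extends Suit
case object Diamonds extends Suit
case object Clubs    extends Suit
case object Spades   extends Suit

trait Valued {
  def value: Double
}

// Same Card as above, repeated so that the sketch compiles on its own.
final case class Card(suit: Suit, number: Int) extends Valued {
  val value: Double = number.toDouble
}

object CardDemo {
  val card = Card(Hearts, 4)
}
```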

If you haven’t read my post on mathematics, now would be a bloody fine time for that.

Reaping the benefits

Let us change Valued to include a generic type T for its value. Then of course, our sum calculation doesn’t work, but we know how to deal with that in a much more explicit way by referring to the monoid of T:

trait Monoid[T] {
  def identity : T
  def op(arg1 : T, arg2 : T) : T
}

// Cards are scored by multiplying their numbers, so the identity is 1.
implicit val intProductMonoid : Monoid[Int] = new Monoid[Int] {
  override def identity : Int = 1

  override def op(arg1: Int, arg2: Int): Int = arg1 * arg2
}

// Share values are added up, so the identity is 0.0.
implicit val doubleSumMonoid : Monoid[Double] = new Monoid[Double] {
  override def identity : Double = 0.0

  override def op(arg1: Double, arg2: Double): Double = arg1 + arg2
}

trait Valued[T] {
  def value : T
}

def overallValue[T](valuedObjects : Seq[Valued[T]])(implicit monoid : Monoid[T]) : T =
  valuedObjects.map(_.value).fold(monoid.identity)(monoid.op)

The first few lines just reiterate the Monoid and the individual definitions we need for our shares and cards. The “overallValue” is now changed to compute the result by folding the given values based on the monoid. Take a look at all the explicit semantics here. The method signature alone tells us that we have Valued objects of type T, where T needs to form a monoid, and we get a result of this type again. Note that Share and Card have to follow the change as well: Share now extends Valued[Double], and Card extends Valued[Int] with the plain number as its value.
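Put together, the whole generic version might look like the following self-contained sketch. The adapted Share and Card classes and the two-element Suit are assumptions made here so that the code compiles on its own; the wrapper object `GenericValuedDemo` is likewise just a container for the sketch.

```scala
trait Monoid[T] {
  def identity: T
  def op(arg1: T, arg2: T): T
}

trait Valued[T] {
  def value: T
}

// Share now states explicitly that its value is a Double.
final case class Share(amount: Double, price: Double) extends Valued[Double] {
  val value: Double = amount * price
}

// Minimal assumed Suit, just enough for the sample data.
sealed trait Suit
case object Hearts   extends Suit
case object Diamonds extends Suit

// Card can finally use Int for its value.
final case class Card(suit: Suit, number: Int) extends Valued[Int] {
  val value: Int = number
}

object GenericValuedDemo {
  implicit val intProductMonoid: Monoid[Int] = new Monoid[Int] {
    def identity: Int = 1
    def op(arg1: Int, arg2: Int): Int = arg1 * arg2
  }

  implicit val doubleSumMonoid: Monoid[Double] = new Monoid[Double] {
    def identity: Double = 0.0
    def op(arg1: Double, arg2: Double): Double = arg1 + arg2
  }

  // Folds the values using whichever monoid is in implicit scope for T.
  def overallValue[T](valuedObjects: Seq[Valued[T]])(implicit monoid: Monoid[T]): T =
    valuedObjects.map(_.value).fold(monoid.identity)(monoid.op)

  val shares = List(Share(0.5, 2.0), Share(0.75, 4.0))
  val cards  = List(Card(Hearts, 4), Card(Diamonds, 10))

  val shareTotal: Double = overallValue(shares) // sum of the share values
  val cardScore: Int     = overallValue(cards)  // product of the card numbers
}
```

Which monoid gets used is decided by the element type alone. If a different combination were wanted, say the sum of card numbers, a different Monoid[Int] could be passed explicitly in the second parameter list.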

At this point, we can easily apply the method to some shares and cards:

val shares = List(Share(0.5, 2), Share(0.75, 4))
val cards = List(Card(Hearts, 4), Card(Diamonds, 10))

println(overallValue(shares)) // 4.0
println(overallValue(cards))  // 40

Relation to Clean Code

In the clean code community, we have a set of values or beliefs that are deemed essential for professional software development. These include: evolvability, correctness, production efficiency and reflection (the sort you do in your mind, not in your programming language). The basic idea is that if you have different designs or techniques, you can get a sort of objective comparison of their worth by evaluating each with respect to these values. So let’s do this for making semantics explicit:

Evolvability

We have seen how far we evolved our original code. Once your semantics are more explicit, it is easier to re-use the corresponding code, as you can understand its meaning better. In a sense, the act of making semantics explicit is already evolving a program.

Correctness

In the above samples, we saw how a lot of initial code was reduced to a small piece of code in which it is much harder for bugs to hide. In fact, with the amount of explicit semantics found in the final method signature, introducing a bug is rather hard. The remaining work of defining a Valued object, or defining a concrete monoid, is simple and straightforward and there is little room left for mistakes.
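The little room that is left for mistakes in a concrete monoid can be narrowed further, because a monoid comes with laws: the identity element must be neutral and the operation must be associative. Below is a rough sketch of checking both on a handful of sample values. The helper names `hasIdentity` and `isAssociative` are invented here, and checking a few samples is of course not a proof.

```scala
trait Monoid[T] {
  def identity: T
  def op(arg1: T, arg2: T): T
}

object MonoidLaws {
  val intProductMonoid: Monoid[Int] = new Monoid[Int] {
    def identity: Int = 1
    def op(arg1: Int, arg2: Int): Int = arg1 * arg2
  }

  // A deliberately broken instance: 0 is not neutral for multiplication.
  val brokenProductMonoid: Monoid[Int] = new Monoid[Int] {
    def identity: Int = 0
    def op(arg1: Int, arg2: Int): Int = arg1 * arg2
  }

  // Identity law: combining with the identity element changes nothing.
  def hasIdentity[T](m: Monoid[T], samples: Seq[T]): Boolean =
    samples.forall(x => m.op(m.identity, x) == x && m.op(x, m.identity) == x)

  // Associativity law: the grouping of operations does not matter.
  def isAssociative[T](m: Monoid[T], samples: Seq[T]): Boolean =
    samples.forall { a =>
      samples.forall { b =>
        samples.forall { c =>
          m.op(m.op(a, b), c) == m.op(a, m.op(b, c))
        }
      }
    }

  val samples: Seq[Int] = Seq(-3, 0, 1, 2, 7)
}
```

With these samples, the product monoid passes both checks, while the broken instance fails the identity check.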

Production efficiency

As the saying goes, we read code many more times than we write it. Reading code with implicit semantics is hard: figuring out the meaning is an additional mental workload that you don’t have when the semantics have been made explicit. Even something as simple as the explicit semantics we gained by renaming is beneficial to the efficiency with which others can read, understand, and apply our code.

Reflection

This was partly taken away from you by me writing this article. Take a look at your own code and try to make your implicit semantics explicit. Take a step back then and you will realize just how much more you thought about your code and what it really means.

[The feature image is CC-BY-SA 3.0 by Enoch Lau via wikimedia commons]
