Laziness in Scala

how-to

Mar 29, 2009 | 17 mins

While playing around with a recent research-oriented project for myself (more on that later), I discovered something that I haven’t seen mentioned anywhere in the Scala universe before. (OK, not really–as you’ll see towards the end of this piece, it really is documented, but allow me my brief delusions of grandeur as I write this. They’ll get deflated quickly enough.)

So the core of the thing was a stack-oriented execution engine; essentially I’m processing commands delivered in postfix order. Since some of these commands are relational operators, there need to be two operands on the execution stack before a relational operator can fire; once there are, I want to pop them, evaluate the relational operation, and push its result (1 if true, 0 if false) back on the stack. This is pretty easily done via the following:

def compareOp(op : (Int, Int) => Boolean) =
{
  checkStack(2)
  val v1 = (execStack.pop()).asInstanceOf[Int]
  val v2 = (execStack.pop()).asInstanceOf[Int]
  val vr = op(v1, v2)
  execStack.push(if (vr) 1 else 0)
}


where “execStack” is a mutable.Stack[Any] held in an enclosing function.
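The checkStack helper isn’t shown in the snippet; as a hypothetical sketch, assuming its only job is to guard against stack underflow before the pops, it might look like:

```scala
import scala.collection.mutable.Stack

// Hypothetical sketch of the surrounding state and the checkStack
// helper the snippet assumes: verify the stack holds at least n
// operands before we start popping.
val execStack: Stack[Any] = new Stack()

def checkStack(n: Int): Unit =
  if (execStack.size < n)
    throw new IllegalStateException(
      "stack underflow: need " + n + " operands, have " + execStack.size)
```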

Interestingly enough, however, when I wrote this the first time, I wrote it like this, which is a very different sequence of operations:

def compareOp(op : (Int, Int) => Boolean) =
{
  checkStack(2)
  def v1 = (execStack.pop()).asInstanceOf[Int]
  def v2 = (execStack.pop()).asInstanceOf[Int]
  def vr = op(v1, v2)
  execStack.push(if (vr) 1 else 0)
}

See the difference? Subtle, is it not? But the actual code is significantly different, something that’s more easily seen with a much simpler (and standalone) example:

object App
{
  def main(args : Array[String]) =
  {
    import scala.collection.mutable.Stack

    var stack : Stack[Any] = new Stack()
    stack.push(12)
    stack.push(24)

    def v1 = (stack.pop()).asInstanceOf[Int]
    def v2 = (stack.pop()).asInstanceOf[Int]
    def vr = v1 + v2

    System.out.println(vr)
  }
}

When run, the console prints out “36”, as we’d well expect.

But suppose we want to look at those values of v1 and v2 along the way, perhaps as part of a logging operation, or perhaps because you’re just screwing around with some ideas in your head and you don’t want to bother to fire up an IDE with Scala support in it. So you decide to spit those values to a console:

object App
{
  def main(args : Array[String]) =
  {
    import scala.collection.mutable.Stack

    var stack : Stack[Any] = new Stack()
    stack.push(12)
    stack.push(24)

    def v1 = (stack.pop()).asInstanceOf[Int]
    def v2 = (stack.pop()).asInstanceOf[Int]

    System.out.println(v1)
    System.out.println(v2)

    def vr = v1 + v2

    System.out.println(vr)
  }
}

And then something *very* different happens; you get “24”, “12”, and then a NoSuchElementException.

If you’re like me the first time I ran into this, your first reaction is, “Eh?”. Actually, if you’re like me, when you’re programming, your profanity filters are probably at an ebb, so your first reaction is “WTF?!?”, said with great gusto and emphasis. Which has a tendency to get some strange looks when you’re at a Denny’s doing your research, I will admit. Particularly when it’s 3 AM. And the bar crowd is in full alcoholic haze and slightly nervous about the long-haired, goatee-sporting guy in his headphones, wearing his black leather jacket and swearing like a drunken sailor at his laptop. But I digress.

What is Scala doing here?

Turns out this is exactly as the language designers intended, but it’s subtle. (Or maybe it’s just subtle to me at 3AM when I’m pumped full of caffeine.)

Let’s take this a different way:

object App
{
  def main(args : Array[String]) =
  {
    import scala.collection.mutable.Stack

    var stack : Stack[Any] = new Stack()
    stack.push(12)
    stack.push(24)

    def v1 = (stack.pop()).asInstanceOf[Int]
    def v2 = (stack.pop()).asInstanceOf[Int]

    System.out.println(stack)
  }
}

When run, the console prints “Stack(12, 24)”, which *really* starts to play with your mind when you’re a little short on sleep and a little high on Diet Coke. At first glance, it looks like Scala is broken somehow–after all, those “pop” operations are supposed to modify the Stack against which they’re operating, just as the push()es do. So why is the stack convinced that it still holds the values of 12 and 24?

Because Scala hasn’t actually executed those pop()s yet.

The “def” keyword, it turns out, isn’t what I wanted here–what I wanted (and in retrospect it’s painfully obvious) was “val” instead, in order to force execution of those statements right there and capture each result in a local value (an immutable local variable). “def”, by contrast, creates a parameterless method binding whose body isn’t evaluated until the name is actually invoked. So when I previously wrote
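The contrast boils down to a tiny standalone sketch (the names here are mine, purely for illustration): a val’s right-hand side runs exactly once, at the point of definition, while a def’s right-hand side runs anew on every reference.

```scala
var evaluations = 0

val eager    = { evaluations += 1; evaluations }  // block runs here, once
def deferred = { evaluations += 1; evaluations }  // block runs at each use

// After the val definition, evaluations is 1, and re-reading eager
// doesn't re-run the block; every reference to deferred bumps the
// counter again.
```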

object App
{
  def main(args : Array[String]) =
  {
    import scala.collection.mutable.Stack

    var stack : Stack[Any] = new Stack()
    stack.push(12)
    stack.push(24)

    def v1 = (stack.pop()).asInstanceOf[Int]
    def v2 = (stack.pop()).asInstanceOf[Int]
    def vr = v1 + v2

    System.out.println(vr)
  }
}

… what in fact I was saying was this:

object App
{
  def main(args : Array[String]) =
  {
    import scala.collection.mutable.Stack

    var stack : Stack[Any] = new Stack()
    stack.push(12)
    stack.push(24)

    def v1 = (stack.pop()).asInstanceOf[Int]
    def v2 = (stack.pop()).asInstanceOf[Int]

    System.out.println(v1 + v2)
  }
}

… which is the same as:

object App
{
  def main(args : Array[String]) =
  {
    import scala.collection.mutable.Stack

    var stack : Stack[Any] = new Stack()
    stack.push(12)
    stack.push(24)

    System.out.println((stack.pop()).asInstanceOf[Int]
      + (stack.pop()).asInstanceOf[Int])
  }
}

… which, when we look back at my most recent “debugging” version of the code and substitute the “def”ed versions of v1 and v2 (and vr) where they’re used, makes the reason for the NoSuchElementException entirely clear:

object App
{
  def main(args : Array[String]) =
  {
    import scala.collection.mutable.Stack

    var stack : Stack[Any] = new Stack()
    stack.push(12)
    stack.push(24)

    System.out.println((stack.pop()).asInstanceOf[Int])
    System.out.println((stack.pop()).asInstanceOf[Int])
    System.out.println((stack.pop()).asInstanceOf[Int]
      + (stack.pop()).asInstanceOf[Int])
  }
}

Now, normally, this would probably set off all kinds of alarm bells in your head, but the reaction that went off in mine was “COOL!”, and the reasons for that revolve around the concept of “laziness”: in a functional language, we frequently don’t want to evaluate a result right away, preferring to defer its computation until it’s actually required. In fact, many functional languages—such as Haskell—take laziness to new heights, baking it directly into the language definition and assuming laziness everywhere, so much so that you have to take special steps to avoid it. There’s a variety of reasons why this is advantageous, but I’ll leave those discussions to the Haskellians of the world, like Matt Podwysocki and Simon Peyton Jones.
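Scala’s own building block for this kind of deferral is the by-name parameter (=> A), which leaves the argument expression unevaluated until, and unless, the parameter is actually referenced. A small sketch, where the `unless` helper is mine, not a library function:

```scala
// A by-name parameter defers evaluation of its argument; the body
// below is only computed when cond is false.
def unless[A](cond: Boolean)(body: => A): Option[A] =
  if (!cond) Some(body) else None

unless(true)  { sys.error("never runs") }  // the error is never raised
unless(false) { 40 + 2 }                   // body evaluates to Some(42)
```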

From a Scalist’s perspective, laziness is still a useful tool to have in your toolbox. Suppose you have a really powerful function that calculates PI to a ridiculous number of decimal places. In Java, you might be tempted to do something like this:

class MyMath
{
  public static final double PI
    = calculatePiToARidiculousNumberOfPlaces();

  private static double calculatePiToARidiculousNumberOfPlaces()
  {
    // implementation left to the reader's imagination
    // imagine it being "really cool"
  }
}

The problem with this is that if that method takes any length of time to execute, it executes during class initialization, as part of class loading; and aside from opening a window of time during which the class *could* be used before that initialization is finished (it’s subtle, and it won’t happen very often, but it can, according to older versions of the JVM Spec), the cost of that initialization is paid *regardless of whether you ever use PI*. In other words, the classic Stroustrup-ian “don’t pay for it if you don’t use it” principle is being completely tossed aside.

In Scala, using the “def” keyword here, aside from avoiding the need for the extra modifiers, eliminates that up-front cost entirely–nothing is computed until somebody actually asks for PI:

object App
{
  def PI = calculatePiToARidiculousNumberOfPlaces()

  def calculatePiToARidiculousNumberOfPlaces() =
  {
    System.out.println("Calculating PI")
    3 + 0.14
  }

  def main(args : Array[String]) =
  {
    System.out.println("Entering main")
    System.out.println("PI = " + PI)
  }
}

(In fact, you’d probably just write it without the calculating method definition, since it’s easier that way, but bear with me.)

When you run this, we see PI being calculated only after main() has been entered, proving that PI is evaluated on demand, not ahead of time as a public-static-final constant would be.

The problem with this approach is, you end up calculating PI on each access:

object App
{
  def PI = calculatePiToARidiculousNumberOfPlaces()

  def calculatePiToARidiculousNumberOfPlaces() =
  {
    System.out.println("Calculating PI")
    3 + 0.14
  }

  def main(args : Array[String]) =
  {
    System.out.println("Entering main")
    System.out.println("PI = " + PI)
    System.out.println("PI = " + PI)
    // prints twice! Not good!
  }
}

Which sort of defeats the advantage of lazy evaluation.
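What we actually want is compute-on-first-use plus caching. As a hand-rolled sketch (mine, and deliberately not thread-safe), that looks something like this:

```scala
object Memo {
  private var cached: Option[Double] = None

  // Compute on the first access, then serve the cached value thereafter.
  def PI: Double = cached match {
    case Some(v) => v
    case None =>
      System.out.println("Calculating PI")
      val v = 3 + 0.14
      cached = Some(v)
      v
  }
}
```

The first read of Memo.PI prints and computes; every later read returns the cached value silently.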

This got me wondering–in F#, we have lazy as a baked-in concept (sort of), such that when I write

#light

let sixty = lazy (30 + 30)
System.Console.WriteLine(sixty)

What I see on the console is not 60, but a Lazy type instance, which effectively defers execution until its Force() method is invoked (among other scenarios). This means I can write things like

let reallyBigList = lazy ([1..1000000000000] |> complexCalculation |> anotherComplexCalculation)

without fear of blowing the stack or heap apart, since laziness means the list won’t actually be calculated until it’s forced; we can see this from the following (from the F# interactive console):

> let sixtyWithSideEffect = lazy (printfn "Hello world"; 30+30);;
val sixtyWithSideEffect: Lazy<int>

> sixtyWithSideEffect.Force();;
Hello world
val it : int = 60

> sixtyWithSideEffect.Force();;
val it : int = 60

(Examples taken from the excellent Expert F# by Syme/Granicz/Cisternino; highly recommended, if a touch out-of-date to the current language definition. I expect Chris Smith’s Programming F#, from O’Reilly, to correct that before too long.)

It would be nice if something similar were doable in Scala. Of course, once I start looking for it, it makes itself visible, in the wonderful Odersky/Spoon/Venners book, Programming in Scala, p. 444:

You can use pre-initialized fields to simulate precisely the initialization behavior of class constructor arguments. Sometimes, however, you might prefer to let the system itself sort out how things should be initialized. This can be achieved by making your val definitions lazy. If you prefix a val definition with a lazy modifier, the initializing expression on the right-hand side will only be evaluated the first time the val is used.

[…]

This is similar to the situation where x is defined as a parameterless method, using a def. However, unlike a def a lazy val is never evaluated more than once. In fact, after the first evaluation of a lazy val the result of the evaluation is stored, to be reused when the same val is used subsequently.

Perfect! The key, then, is to define PI like so:

object App
{
  lazy val PI = calculatePiToARidiculousNumberOfPlaces()

  def calculatePiToARidiculousNumberOfPlaces() =
  {
    System.out.println("Calculating PI")
    3 + 0.14
  }

  def main(args : Array[String]) =
  {
    System.out.println("Entering main")
    System.out.println("PI = " + PI)
    System.out.println("PI = " + PI)
    // prints once! Awesome!
  }
}

That means, if I apply it to my Stack example from before, I should get the same deferred-execution properties of the “def”-based version …

def main(args : Array[String]) =
{
  import scala.collection.mutable.Stack

  var stack : Stack[Any] = new Stack()
  stack.push(12)
  stack.push(24)

  lazy val v1 = (stack.pop()).asInstanceOf[Int]
  lazy val v2 = (stack.pop()).asInstanceOf[Int]

  System.out.println(stack)
  // prints out "Stack(12, 24)"
}

… but if I go back to the version that blows up because the stack is empty, using lazy val works exactly the way I would want it to:

def main(args : Array[String]) =
{
  import scala.collection.mutable.Stack

  var stack : Stack[Any] = new Stack()
  stack.push(12)
  stack.push(24)

  lazy val v1 = (stack.pop()).asInstanceOf[Int]
  lazy val v2 = (stack.pop()).asInstanceOf[Int]

  System.out.println(v1)
  System.out.println(v2)

  lazy val vr = v1 + v2

  System.out.println(vr)
  // prints 12, 24, then 36
  // and no exception!
}

Nice.

So, it turns out that my accidental use of “def” inside the compareOp function behaves exactly the way the language designers wanted it to, which is not surprising, and that Scala provides nifty abilities to defer processing or extraction of values until called for.

Curiously, the two languages differ in how laziness is implemented; in F#, the lazy modifier defines the type to be a Lazy instance, an ordinary type that we can pass around from F# to C# and back again as necessary (in much the same way that C# defined nullable types to be instances of Nullable under the hood). We can see that from the interactive console output above, and from the fact that we call Force() on the instance to evaluate its value.
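F#’s Lazy can be approximated in Scala with a small class built on a by-name constructor parameter; a sketch (LazyCell is my name, not a library type, and it isn’t thread-safe):

```scala
class LazyCell[T](expr: => T) {
  private var cache: Option[T] = None

  // Evaluate the by-name expression on the first force, cache it,
  // and reuse the cached result on every later force.
  def force: T = cache match {
    case Some(v) => v
    case None =>
      val v = expr
      cache = Some(v)
      v
  }
}

val sixtyWithSideEffect = new LazyCell({ println("Hello world"); 30 + 30 })
sixtyWithSideEffect.force  // prints "Hello world" and yields 60
sixtyWithSideEffect.force  // yields 60 without printing again
```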

In Scala, however, there is no corresponding Lazy[T] type; instead, the compiler generates a PI() accessor that checks whether the value has already been evaluated:

public double PI();
  Code:
   0:   aload_0
   1:   getfield      #135; //Field bitmap$0:I
   4:   iconst_1
   5:   iand
   6:   iconst_0
   7:   if_icmpne     48
   10:  aload_0
   11:  dup
   12:  astore_1
   13:  monitorenter
   14:  aload_0
   15:  getfield      #135; //Field bitmap$0:I
   18:  iconst_1
   19:  iand
   20:  iconst_0
   21:  if_icmpne     42
   24:  aload_0
   25:  aload_0
   26:  invokevirtual #137; //Method calculatePiToARidiculousNumberOfPlaces:()D
   29:  putfield      #139; //Field PI:D
   32:  aload_0
   33:  aload_0
   34:  getfield      #135; //Field bitmap$0:I
   37:  iconst_1
   38:  ior
   39:  putfield      #135; //Field bitmap$0:I
   42:  getstatic     #145; //Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
   45:  pop
   46:  aload_1
   47:  monitorexit
   48:  aload_0
   49:  getfield      #139; //Field PI:D
   52:  dreturn
   53:  aload_1
   54:  monitorexit
   55:  athrow
  Exception table:
   from   to  target type
     14   48      53   any

If you look carefully at the bytecode, the implementation of PI is checking a bitmask field (!) to determine if the first bit is flipped (!) to know whether or not the value is held in the local field PI, and if not, calculate it and store it there. This means that Java developers will just need to call PI() over and over again, rather than have to know that the instance is actually a Lazy[T] on which they need to call Value or Force (such as one would from C# in the F# case). Frankly, I don’t know at this point which approach I prefer, but I’m slightly leaning towards the Scala version for now. (If only Java supported properties, then the syntax “MyMath.PI” would look like a constant, act lazily, and everything would be great.)
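Rendered back into Scala source, that bytecode is essentially the classic double-checked-locking pattern; a simplified sketch (class and field names here are mine, approximating what the compiler emits):

```scala
class LazyPi {
  private var bitmap: Int = 0        // plays the role of bitmap$0
  private var piField: Double = 0.0  // backing field for PI

  private def calculatePiToARidiculousNumberOfPlaces(): Double = 3 + 0.14

  def PI: Double = {
    // Cheap unsynchronized check first; fall into the monitor only
    // when the value might not be initialized yet.
    if ((bitmap & 1) == 0) synchronized {
      if ((bitmap & 1) == 0) {
        piField = calculatePiToARidiculousNumberOfPlaces()
        bitmap = bitmap | 1
      }
    }
    piField
  }
}
```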

(It strikes me that the F# developer looking to write something C#-accessible need only tuck the Lazy instance behind a property accessor and the problem goes away, by the way; it would just be nicer to not have to do anything special on either side, to have my laziness and Force() it, too. Pipe dream, perhaps.)

In retrospect, I could wish that Scala weren’t *quite* so subtle in its treatment of “def” vs “val”, but now that I’m aware of it, it’ll (hopefully) not bite me quite so subtly in the sensitive spots of my anatomy again.

And any experience in which you learn something is a good one, right?



Ted Neward is an independent consultant specializing in high-scale enterprise systems, working with clients ranging in size from Fortune 500 corporations to small 10-person shops. He is an authority in Java and .NET technologies, particularly in the areas of Java/.NET integration (both in-process and via integration tools like Web services), back-end enterprise software systems, and virtual machine/execution engine plumbing. He lives in the Pacific Northwest with his wife, two sons, and eight PCs.
