Sunday, May 21, 2017

Lazy Scala


This might be elementary but when a simple error trips me more than once, I blog it.

You probably know that an Iterator can only be accessed once as it's stateful. "An Iterator can only be used once because it is a traversal pointer into a collection, and not a collection in itself" from here. (Iterable just means the implementing class can generate an Iterator).

The simple mistake I made was to check the size of the iterator in a temporary log statement before mapping over it. What was a little more interesting was that other collections behave the same way if you call them via their TraversableOnce interface.

To demonstrate, say we have a Set

    val aSet = Set(1, 2, 3)

and two functions that are identical other than the type of their argument:

  def mapSet[T](xs: Set[T]): TraversableOnce[String] = {
    val mapped = xs.map(_.toString)
    println(mapped.mkString("\n"))
    mapped
  }

  def mapTraversableOnce[T](xs: TraversableOnce[T]): TraversableOnce[String] = {
    val mapped = xs.map(_.toString)
    println(mapped.mkString("\n"))
    mapped
  }

then mapTraversableOnce will return an empty iterator while mapSet will return a Set of Strings. This will come as a surprise to anybody expecting Object Oriented behaviour.

Iterator also violates the functor laws. Take two otherwise identical methods:

  def isTraversableFunctor[T, U, V](xs: Traversable[T], f: T => U, g: U => V): Boolean = {
    val lhs = xs.map(f).map(g)
    val rhs = xs.map(g compose f)
    (lhs == rhs)
  }

  def isTraversableOnceFunctor[T, U, V](xs: TraversableOnce[T], f: T => U, g: U => V): Boolean = {
    val lhs = xs.map(f).map(g)
    val rhs = xs.map(g compose f)
    (lhs == rhs)
  }

and pass them aSet. The first will say it's a functor, the second says it's not.

This is somewhat trivial as the reason it fails is that the iterator has not been materialized. "TraversableOnce's map is not a functor map, but, then again, it never pretended to be one. It's design goals specifically precluded it from being a functor map" from Daniel C. Sobral.

Laziness

Iterator comes into its own when we want access to the underlying elements lazily. But there are other ways to do this like Stream "which implements all its transformer methods lazily."

Almost all collections are eager by default. Use the view method to make them lazy. Use force on these to make them eager again.

Finally, some methods are naturally lazy. For instance, exists terminates quickly if it finds what it's looking for.


No comments:

Post a Comment