Benchmarking Scala

Microbenchmarking is controversial as the following links show:

Anatomy of a flawed microbenchmark Brian Goetz 2005
Java benchmarking article ellipticgroup 2009(?)
How do I write a correct micro-benchmark in Java? stackoverflow 2009
Google Caliper
…

Nevertheless I did write a little micro-benchmarking framework in Scala so I could experiment with the Scala language and libraries.

It allows to run little code snippets such as:

object LoopExample extends ImageReport {
  def main(args: Array[String]) {
    lineChart("Loops", 0 to 2, 0 to 100000 by 10000,
      new FunBenchmarks[Int, Int] {
        prepare {
          count => count
        }
        run("for loop", "An empty for loop.") { count =>
          for (i <- 0 to count) {
          }
        }
        run("while loop", "A while loop that counts in a variable without returning a result.") { count =>
          var i = 0
          while (i < count) {
            i += 1
          }
        }
      })
  }

This produced the image used in the last blog Scala for (i <- 0 to n) : nice but slow:

The framework also allows to create an HTML report containing multiple benchmarks suites.

The following thumbnails are from an example report showing a full run of the suite I wrote to benchmark some basic functionality of Scala (and Java).

Let's have a look at some of the more interesting results.

Note: All micro-benchmarking results should always be interpreted with a very critical mindset. Many things can go wrong when measuring a single operation over and over again. It is very easy to screw up and become meaningless results.
If you want to analyze a particular benchmark in more details, follow the details link at the end of the suite chapter. It will show a detailed statistical analysis of this particular benchmark suite.

Loops

        run("for loop", "An empty for loop.") { count =>
          for (i <- 0 to count) {
          }
        }
        run("for loop result", "A for loop that accumulates a result in a variable.") { count =>
          var result = 0
          for (i <- 0 to count) {
            result += i
          }
          result
        }
        run("while loop", "A while loop that counts in a variable without returning a result.") { count =>
          var i = 0
          while (i < count) {
            i += 1
          }
        }
        run("while loop result", "A while loop that accumulates a result in a variable.") { count =>
          var result = 0
          var i = 0
          while (i < count) {
            result += i
            i += 1
          }
          result
        }
        run("do while loop", "A do-while loop that counts in a variable without returning a result.") { count =>
          var i = 0
          do {
            i += 1
          } while (i <= count)
        }
        run("do while loop result", "A do-while loop that accumulates a result in a variable.") { count =>
          var result = 0
          var i = 0
          do {
            result += i
            i += 1
          } while (i <= count)
          result
        }

As already discussed in another blog entry, loops in Scala perform surprisingly different:

The for (i <- 1 to count) loop is significantly slower than while (i < count).

Arithmetic

Not really a surprise, but BigDecimal arithmetic is very slow compared to double arithmetic (on both charts the red line is the reference benchmark that executes in the same time).

Casts

I doubted the performance of the Scala way of casting a reference, so I compared it with the Java-like cast method:

        var any: Any = "Hello"
        var result: String = _

        run("asInstanceOf", "Casts a value using asInstanceOf.") { count =>
          for (i <- 0 to count) {
            result = any.asInstanceOf[String]
          }
        }
        run("match case", "Casts a value using pattern matching with the type.") { count =>
          for (i <- 0 to count) {
            result = any match {
              case s: String => s
              case _ => throw new IllegalArgumentException
            }
          }
        }

Happily they perform practically the same:

Immutable Map

I am really fond of immutable maps, they are easy to reason use and perform very well.

        run("contains true", "Checks that a map really contains a value.") { map =>
          map.contains(0)
        }
        run("contains false", "Checks that a map really does not contain a value.") { map =>
          map.contains(-999)
        }
        run("+", "Adds a new entry to a map.") { map =>
          map + (-1 -> "X-1")
        }
        run("-", "Removes an existing entry from a map.") { map =>
          map - 0
        }

The immutable maps with size 0 to 4 are special classes that store the key/value pairs directly in dedicated fields (testing sequentially) - therefore we see linear behaviour there.

The strong peak when adding another key/value pair to an immutable map with a size of 4 is probably because it switches to the normal scala.collection.immutable.HashMap (and creating 4 tuples) after testing all keys:

    // Implementation detail of scala.collection.immutable.Map4
    override def updated [B1 >: B] (key: A, value: B1): Map[A, B1] =
      if (key == key1)      new Map4(key1, value, key2, value2, key3, value3, key4, value4)
      else if (key == key2) new Map4(key1, value1, key2, value, key3, value3, key4, value4)
      else if (key == key3) new Map4(key1, value1, key2, value2, key3, value, key4, value4)
      else if (key == key4) new Map4(key1, value1, key2, value2, key3, value3, key4, value)
      else new HashMap + ((key1, value1), (key2, value2), (key3, value3), (key4, value4), (key, value))
    def + [B1 >: B](kv: (A, B1)): Map[A, B1] = updated(kv._1, kv._2)

HashMap

The standard Java java.util.HashMap is also interesting for Java programmers.

The measured read operations are:

        run("contains true", "Checks that a map really contains a value.") { map =>
          map.containsKey(0)
        }
        run("contains false", "Checks that a map really does not contain a value.") { map =>
          map.containsKey(-999)
        }
        run("size", "Calculates the size of a map.") { map =>
          map.size()
        }

At first glance containsKey and size() seem to be constant (as expected), but to be sure an additional benchmark with a larger n up to a 1000 was added:

Surprisingly all measured methods grow slowly with increasing size (contrary to the expected constant time behaviour) - this needs to be analyzed in detail.

A look at the details of this suite shows that actually all measured elapsed times are either 0.00000 ms or 0.00038 ms and the apparent increase with growing size is purely a statistical side effect. Obviously this benchmark is at the low end of measurement accuracy.

The measured write operations are:

        run("put", "Adds a new entry to a map.") { map =>
          map.put(-1, "X-1")
        }
        run("remove", "Removes an existing entry from a map.") { map =>
          map.remove(0)
        }
        run("clear", "Removes all entries from a map.") { map =>
          map.clear()
        }

put() and remove() are reasonably constant, albeit they grow very slowly with increasing size.

I was very surprised to see that the clear() method is not constant time. A quick look at the implementation shows that it is linear with the number of entries.

    // Implementation of java.util.HashMap.clear()
    public void clear() {
        modCount++;
        Entry[] tab = table;
        for (int i = 0; i < tab.length; i++)
            tab[i] = null;
        size = 0;
    }

ConcurrentHashMap

The behaviour of java.util.concurrent.ConcurrentHashMap is very similar to HashMap (slightly slower).

Pattern Matching

Pattern matching is a very nice and powerful feature of Scala.

This benchmark tests only the simplest case of pattern matching and compares them with a comparable if-else cascade.

        run("match 1", "Matches the 1st pattern with a literal integer value.") { seq =>
          for (value <- seq) {
            value match {
              case 1 => "one"
              case _ => "anything"
            }
          }
        }
        run("if 1", "Matches the 1st if in an if-else cascade with integer values.") { seq =>
          for (value <- seq) {
            if (value == 1) "one"
            else "anything"
          }
        }
        run("match 5", "Matches the 5th pattern with a literal integer value.") { seq =>
          for (value <- seq) {
            value match {
              case 1 => "one"
              case 2 => "two"
              case 3 => "three"
              case 4 => "four"
              case 5 => "five"
              case _ => "anything"
            }
          }
        }
        run("if 5", "Matches the 5th if in an if-else cascade with integer values.") { seq =>
          for (value <- seq) {
            if (value == 1) "one"
            else if (value == 2) "two"
            else if (value == 3) "three"
            else if (value == 4) "four"
            else if (value == 5) "five"
            else "anything"
          }
        }

As you see, simple pattern matching and the if-else cascade have comparable speed.

Loops

Arithmetic

Casts

Immutable Map

HashMap

ConcurrentHashMap

Pattern Matching

Leave a Reply Cancel reply

Recent Posts

Categories

Archives

Benchmarking Scala

Loops

Arithmetic

Casts

Immutable Map

HashMap

ConcurrentHashMap

Pattern Matching

Leave a Reply Cancel reply

Recent Posts

Categories

Tags

Archives