OOPSLA Monday

Some more notes on the tutorials I attended

Smalltalk

I was a bit disappointed in this tutorial. It never got much beyond installing an image and getting used to the syntax (which is admittedly a bit different from the curly brace syntax I grew up with). The slide deck was the old-fashioned kind where every point had its own bullet; I felt many of them could have been skimmed over with little loss. I was particularly unhappy about getting several slides at the start on why I might be interested in learning about Smalltalk. Isn't flying to Florida and paying to attend the tutorial evidence enough that I am already interested?

The image used was Pharo (apparently a fork of the Squeak project) and the meatier part of the tutorial was spent exploring the Class Browser, Test Runner, inspectors and the Monticello Browser.

The presenter, James Foster, was obviously very knowledgeable about Smalltalk but a little fuzzy on Java and the other curly brace languages he was comparing it with. When I mentioned that I had some exposure to Ruby, he responded with a money quote from Ward Cunningham: "I always expected Smalltalk to come back, I just didn't expect it to be called Ruby." Well, yes.

Parameterized Unit Testing

This should have been called "demo of the PEX tool". At least two other people who attended were annoyed that the tool could only be used with C# on Windows. Even though I do a lot of C# on Windows these days, I was somewhat concerned about the licensing terms (the choice is academic non-commercial or temporary evaluation). Java barely got a mention: apparently JUnit 4 has some support for parameterized unit tests, and Agitar has a product that can generate values.

The tool itself is impressive. You provide a unit test that takes parameters for the input values. It generates initial values (generally, the simplest possible) and uses profiler hooks to instrument the bytecode at runtime, giving an extremely detailed trace of actual execution. The tool sees which paths are available at each branch and which are taken. It then uses a constraint solver to decide what values to try in order to make execution take the other branches. (The constraint solver comes from research into theorem proving; it understands restricted domains like integer arithmetic and simple program constructs such as tuples and arrays.) Within seconds it generates thousands of runs with different test data and keeps only those that caused a new execution path to be taken.
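To make that concrete, here is a minimal sketch of what a parameterized test looks like. The Parser class is a made-up example of mine, and I am going from memory on the Microsoft.Pex.Framework attribute names (PexClass, PexMethod, PexAssume), so treat this as an illustration rather than a recipe:

    using System;
    using Microsoft.Pex.Framework;
    using Microsoft.VisualStudio.TestTools.UnitTesting;

    // Hypothetical code under test, with branches for Pex to discover.
    public static class Parser
    {
        public static int ReadDigit(string s)
        {
            if (s == null) throw new ArgumentNullException("s");
            if (s.Length == 0) return -1;
            char c = s[0];
            if (c >= '0' && c <= '9') return c - '0';
            return -1;
        }
    }

    [PexClass(typeof(Parser))]
    [TestClass]
    public partial class ParserTest
    {
        // Pex supplies values for 's'. It starts with the simplest
        // input, traces which branches the run took, then asks the
        // constraint solver for inputs that drive execution down the
        // remaining paths (empty string, non-digit first character,
        // digit first character). Each input that covers a new path
        // is kept as a concrete test.
        [PexMethod]
        public void ReadDigitResultIsInRange(string s)
        {
            PexAssume.IsNotNull(s); // rule out the exception path
            int d = Parser.ReadDigit(s);
            Assert.IsTrue(d >= -1 && d <= 9);
        }
    }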

The amount of work required to get this functional is impressive. For example, they built into the tool a formal understanding of the semantics of every single bytecode in the CLR, including what exceptions it could throw (among them arithmetic overflow, which is possible if you configure Visual Studio the right way).
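The overflow point is worth a small illustration of my own (not from the tutorial): the very same add either throws or silently wraps depending on the checked setting, and the tool's bytecode model has to account for both.

    using System;

    class OverflowDemo
    {
        static void Main()
        {
            int big = int.MaxValue;

            // With overflow checking on (the "Check for arithmetic
            // overflow/underflow" build setting, or an explicit
            // checked expression), the add instruction throws.
            try
            {
                Console.WriteLine(checked(big + 1));
            }
            catch (OverflowException)
            {
                Console.WriteLine("add overflowed");
            }

            // In an unchecked context the same add silently wraps.
            Console.WriteLine(unchecked(big + 1)); // -2147483648
        }
    }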

The question arises: if my unit test is going to get input values unknown to me, how can I make meaningful assertions about its behaviour? In the tutorial they presented some "patterns". In many cases it boils down to identifying the group properties of your code: are there inverse operations, commutative operations, and so on? If so, you can build a sequence of operations that should always produce the same result, no matter what the input. Other patterns check that state invariants are preserved: if you insert something into a collection, you expect to find it there afterwards.
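Here is a sketch of both patterns, again with my assumed Pex attribute names; the properties themselves (a round trip through Base64, and membership after insertion) are my own examples of the idea, not ones from the tutorial:

    using System;
    using System.Collections.Generic;
    using Microsoft.Pex.Framework;
    using Microsoft.VisualStudio.TestTools.UnitTesting;

    [PexClass]
    [TestClass]
    public partial class PropertyPatternsTest
    {
        // Inverse operations: decoding an encoded value must give
        // back the original, whatever bytes Pex invents.
        [PexMethod]
        public void Base64RoundTrips(byte[] data)
        {
            PexAssume.IsNotNull(data);
            string encoded = Convert.ToBase64String(data);
            byte[] decoded = Convert.FromBase64String(encoded);
            CollectionAssert.AreEqual(data, decoded);
        }

        // State invariant: whatever you insert into a collection,
        // you expect to find there afterwards.
        [PexMethod]
        public void AddedItemIsFound(List<int> list, int item)
        {
            PexAssume.IsNotNull(list);
            list.Add(item);
            Assert.IsTrue(list.Contains(item));
        }
    }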

This kind of tool looks promising as a way to alleviate the tedious task of coming up with plausible inputs, though it comes at a cost in the maintainability of the unit tests. It could be very useful for generating regression tests.

It's just a shame that this is Microsoft-only and, at that, not available for production use unless Microsoft decides to include it in their Visual Studio offering at some point in the future.