State Makes It Easy: a Test-First Parable

work safe

Just some background on how I got the the problem I’ll be discussing. Recently, the twitters where talking about the Spolskey / Martin feud over TDD and whether or not it can be useful for every class of problem. In addition there’s been a lot of talk about Katas as a form of practice. This lead me to a 2002 article by William Wake describing a test-first spreadsheet. And, Michael Feathers is writing Vi in Haskell.

So, I started the test first spreadsheet challenge in C# because it’s super easy to work in a language you know. For the record, I completed the engine but not the application. Part of the problem is building an expression parser to evaluate formulas. While I’m not proud of what I wrote, my parser works and passes all the tests. The parser is a horrible hack that has no place in polite society and I’m not sure it should be called a parser. I blindly implemented the worst thing that could possibly work. It does have a clever little pivot to control operator precedence, but it’s ugly. That was intentional as part of this exercise was to explore the Sudoku issue from the TDD dust-up (TDD can(‘t) be used for mathematically important solutions) and see if I can build a parser without paying attention to what defines a parser in my opinion of the eyes of the rest of the world.

So, that was fun, but I wanted to play with F# since it’s been a while. The spreadsheet challenge would be an interesting experiment for me in Functional programming and TDD. I remember reading about the Haskell vim project and saying, “You can do that?” So, I wanted to find out how. It went fairly smoothly until I hit the parser, then I hit a wall. Almost literally, since it was the “)” that set me back. I was ashamed of myself. I wrote a parser for my FSharp talk that was as complex or more than this. I’m a huge fan of DSLs. Am I going to have to admit that test-first is a failure? No, I just need to look a bit deeper and discover what I think of test first.

But I did it in C#, why? In C#, I scattered information about the state of the parser all over my classes. It was easy to bang your way out of a corner if you can throw state at it. F# also has state, but I was trying to be as stateless as possible, even with a spreadsheet, and I was doing pretty damn good up until this point. I took a hard look at what Test-first means, then I took some time and relearned everything I knew about parsers up until that point.

I’m avoiding terms like BDD, TDD, and Context-Specification because they have very specific meanings to very vocal people from whom I still plan on learning. What do I mean by test-first?

  • Try not to write code until you have a need.
  • Those needs are expressed as executable tests.
  • Your code should only express the needs of the tests.

Test-First agile design gets a bad wrap in some corners as “no-design” when it really is about controlling assumptions. My instincts said I’d need a parser to evaluate the formula, but I stubbornly threw them away. I was thinking about a parser framework like I built up in my FSharp talk. This is way too much and I couldn’t justify it. Assumptions here are: I need a parser framework vs. parsers will fall out. Both assumptions are wrong. A true parser is a small thing that you can test quickly and easily. I needed just enough parser to get the job done. The trick is not to ignore a solved problem, but control scope of the solution.

What is a parser? When reading the Graham Hutton Haskell book that Eric Meijer is using in his lecture series, I found a poem. A parser for a thing / is a function that from a strings / to lists of pairs / of things and strings. I’m not returning a list of things, but still. In a fit of assumption, I changed string to token-stream (I’m not a saint) and I was off. I started with a simple parser, operating on a token stream.

type expression =
  | Empty
  | Number of int
let expression stream =
  match stream with
  | [] -> Empty, stream
  | h::t -> Number( Int32.Parse(h)), t
let evaluate expr =
  match expr with
  | Empty -> failwith "Empty Expression"
  | Number(a) -> a

You’ll notice here that I have to return a tuple, the expression and the remainder of the parser. Since I’m programming in the functional style, I have to carry my state with me constantly. This means my functions start growing arguments, but I’m way more aware of the data I’m actually using. This became much more apparent as I started carrying a set of visited cells to prevent circular references.

The thing that tweaked my mind about it was how I didn’t need all the parser combinators I wrote over a year ago when I was parroting Mark Jason Dominus’ example from Higher-Order Perl. Even this time, trying to write test-first, I have far too much in the final solution. (Not Quite, I spent some of the weekend deleting code, but the point remains; I had code to delete, not refactor).

My final note; I think the test-first style is about controlling assumptions. It is okay to make some assumptions, but the purpose of the tests is to verify and document they are valid in the system you’re designing.

4 Comments

3 Comments

  1. Jon Fuller  •  Oct 19, 2009 @8:55 am

    Thanks Brian! I like your conclusions about test-first being an assumption realizer, and thus making those assumptions explicit.

    I’d been thinking about the TDD sudoku problem too, and had come to a similar conclusion about TDD not working for “mathematically important” solutions. So I guess those of us that are “mathematically impaired” will simply not ever find the solution to sudoku. :-/ (I suppose though, like you said, throw enough state at it that I could win, but that sort of defeats the purpose).

  2. Ball  •  Oct 19, 2009 @9:10 am

    Jon, part of the realization I came to was when some of the TDD proponents began to identify the Sudoku problem as a variation of a basic contraints problem. Once they had that, they then started with a small constrains engine and grew that to a Sudoku board with no issue. I wish I had the link at hand.

    The trick is, as far as I can tell, to allow yourself to do research before writing your first test. The selection of the first test is a design decision.

    If my spreadsheet’s first test was about how big it is, I’d have used a matrix to store my data. Since the first test was about returning empty when there’s no data, I ended up with a hash to store all my data. the test allowed for a non-existent cell. If you ask how big it is now, I’ll look at the keys to let you know, but it’s automatically sparse.

  3. Ball  •  Oct 19, 2009 @11:32 am

    Found the sudoku page. Same guy as the spreadsheet challenge. http://xp123.com/xplor/xp0607/index.shtml

1 Trackback

Leave a Reply

Allowed tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>