Manipulating Data With F#

I’ve been working with our corporate website for the past few days, trying to get analytics for some A/B split testing. As part of that, I got to play with some really neat FSharp features. Type providers let me use a snippet of JSON to build a type hierarchy without doing the heavy lifting myself. And recursive sequences let me consume a paginated web service as though it were a single data set.

Bringing My Customers Together

The data I’m currently interested in is, as far as I can discover, only available by asking each Hubspot contact for their list of form submissions so I can count the number of submissions to each form. But I can only ask for 100 customers at a time. Each response tells me whether I can fetch more and gives me the vid-offset to use to get to the right spot in the pagination.

let rec Contacts offset =
   seq { let data = HubspotContacts.Parse(Http.RequestString("http://api.hubapi.com/contacts/v1/lists/all/contacts/all",
                                                             // apikey not included ;)
                                                             query=["hapikey", apikey; "count", "100"; "vidOffset", offset],
                                                             headers=[Accept HttpContentTypes.Json]))
         yield! data.Contacts
         if data.HasMore then
            yield! Contacts (data.VidOffset.ToString())}

What’s happening? The seq expression creates a seq computation (think IEnumerable) and lets us write code that generates a seq. The yield! keyword takes a collection and doles out each element in turn. If the response says there is more, we yield the results of a recursive call with the new offset. This means our first call will use the offset “”.
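To see the recursion without touching the network, here is a minimal sketch that swaps the Hubspot call for an in-memory stand-in. The names Page, fetchPage, and pagedItems and the sample data are all made up for illustration. Only the seq / yield! shape mirrors the code above.

```fsharp
// Hypothetical stand-in for one page of a paginated web service response.
type Page = { Items: int list; HasMore: bool; NextOffset: int }

// Pretend data source: 7 items, served 3 at a time (made up for illustration).
let allItems = [ 1 .. 7 ]
let pageSize = 3

// Hypothetical fetch: returns the page that starts at the given offset.
let fetchPage offset =
    let pageItems = allItems |> List.skip offset |> List.truncate pageSize
    let next = offset + List.length pageItems
    { Items = pageItems; HasMore = next < List.length allItems; NextOffset = next }

// Same shape as Contacts: yield this page, then recurse only if there is more.
let rec pagedItems offset =
    seq {
        let page = fetchPage offset
        yield! page.Items
        if page.HasMore then
            yield! pagedItems page.NextOffset
    }

// The caller sees one flat sequence; pages are fetched as it is consumed.
let firstFive = pagedItems 0 |> Seq.take 5 |> Seq.toList
let everything = pagedItems 0 |> Seq.toList
printfn "%A" firstFive
printfn "%A" everything
```

Because the sequence is lazy, taking only the first five items should never ask for the third page.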

Types are your friends

FSharp type providers are wonderful. At compile time, they can inspect a sample of data and generate a type tree from it. In particular, using the FSharp.Data library I can give it a snippet of JSON and get back a strongly typed parser.

// I'm leaving a lot out of the json snippet.
// You don't need to see the details of my customer data
let [<Literal>] ExampleResponse = """ { "contacts" : [ {"form-submissions": [{ ... }]},
                                                       { ...} ],
                                        "has-more" : true,
                                        "vid-offset": 1234 } """
type HubspotContacts = JsonProvider<ExampleResponse>

Reductio ad absurdum

The only thing left for me to do is take this and map / reduce my way to a collection of form-ids and counts.

Contacts ""
// Transform from a seq of contacts to a seq of form submissions
|> Seq.map (fun c -> c.FormSubmissions)
|> Seq.concat
// Extract the form id
|> Seq.map (fun fs -> fs.FormId)
// Collect the counts of unique formIds
|> Seq.fold (fun s v -> match Map.tryFind v s with
                        | Some(count) -> Map.add v (count+1) s
                        | None -> Map.add v 1 s) Map.empty
// Report the FormId and Count
|> Map.iter (fun formId count -> printfn "%A %i" formId count)
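The counting step can be tried on its own with made-up data. This is only a sketch: the form ids below are invented, and increment is a hypothetical helper name for the fold function. It also shows Seq.countBy, a built-in that does the same grouping and counting in one call.

```fsharp
// Made-up form ids standing in for the values pulled out of the contacts.
let formIds = [ "signup"; "contact"; "signup"; "demo"; "signup"; "contact" ]

// The fold step from the pipeline: look the key up, then add or increment.
let increment counts formId =
    match Map.tryFind formId counts with
    | Some count -> Map.add formId (count + 1) counts
    | None -> Map.add formId 1 counts

let countsByFold = formIds |> Seq.fold increment Map.empty

// Seq.countBy groups and counts in one step; Map.ofSeq makes it comparable.
let countsByCountBy = formIds |> Seq.countBy id |> Map.ofSeq

countsByFold |> Map.iter (fun formId count -> printfn "%s %i" formId count)
```

Both routes should produce the same map, so the choice is mostly about whether you want to see the accumulation spelled out.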

State Makes It Easy: a Test-First Parable

Just some background on how I got to the problem I’ll be discussing. Recently, the twitters were talking about the Spolsky / Martin feud over TDD and whether or not it can be useful for every class of problem. In addition, there’s been a lot of talk about Katas as a form of practice. This led me to a 2002 article by William Wake describing a test-first spreadsheet. And Michael Feathers is writing Vi in Haskell.

So, I started the test-first spreadsheet challenge in C# because it’s super easy to work in a language you know. For the record, I completed the engine but not the application. Part of the problem is building an expression parser to evaluate formulas. While I’m not proud of what I wrote, my parser works and passes all the tests. The parser is a horrible hack that has no place in polite society, and I’m not sure it should be called a parser. I blindly implemented the worst thing that could possibly work. It does have a clever little pivot to control operator precedence, but it’s ugly. That was intentional, as part of this exercise was to explore the Sudoku issue from the TDD dust-up (TDD can(‘t) be used for mathematically important solutions) and see if I could build a parser without paying attention to what defines a parser, in my opinion or in the eyes of the rest of the world.

So, that was fun, but I wanted to play with F# since it’s been a while. The spreadsheet challenge would be an interesting experiment for me in functional programming and TDD. I remember reading about the Haskell vim project and saying, “You can do that?” So, I wanted to find out how. It went fairly smoothly until I hit the parser, then I hit a wall. Almost literally, since it was the “)” that set me back. I was ashamed of myself. I wrote a parser for my FSharp talk that was at least as complex as this one. I’m a huge fan of DSLs. Am I going to have to admit that test-first is a failure? No, I just need to look a bit deeper and discover what I think of test-first.

But I did it in C#, so why not here? In C#, I scattered information about the state of the parser all over my classes. It is easy to bang your way out of a corner if you can throw state at it. F# also has state, but I was trying to be as stateless as possible, even with a spreadsheet, and I was doing pretty damn good up until this point. I took a hard look at what test-first means, then I took some time and relearned everything I knew about parsers up until that point.

I’m avoiding terms like BDD, TDD, and Context-Specification because they have very specific meanings to very vocal people from whom I still plan on learning. What do I mean by test-first?

  • Try not to write code until you have a need.
  • Those needs are expressed as executable tests.
  • Your code should only express the needs of the tests.

Test-first agile design gets a bad rap in some corners as “no-design” when it really is about controlling assumptions. My instincts said I’d need a parser to evaluate the formula, but I stubbornly threw them away. I was thinking about a parser framework like the one I built up in my FSharp talk. That is way too much, and I couldn’t justify it. The assumptions here are: I need a parser framework vs. parsers will fall out. Both assumptions are wrong. A true parser is a small thing that you can test quickly and easily. I needed just enough parser to get the job done. The trick is not to ignore a solved problem, but to control the scope of the solution.

What is a parser? When reading the Graham Hutton Haskell book that Erik Meijer is using in his lecture series, I found a poem. A parser for things / is a function from strings / to lists of pairs / of things and strings. I’m not returning a list of things, but still. In a fit of assumption, I changed string to token-stream (I’m not a saint) and I was off. I started with a simple parser, operating on a token stream.

open System

type expression =
  | Empty
  | Number of int
let expression stream =
  match stream with
  | [] -> Empty, stream
  | h::t -> Number( Int32.Parse(h)), t
let evaluate expr =
  match expr with
  | Empty -> failwith "Empty Expression"
  | Number(a) -> a

You’ll notice here that I have to return a tuple: the expression and the remainder of the token stream. Since I’m programming in the functional style, I have to carry my state with me constantly. This means my functions start growing arguments, but I’m way more aware of the data I’m actually using. This became much more apparent as I started carrying a set of visited cells to prevent circular references.
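To make the state-carrying concrete, here is a small sketch of two parsers handing the remainder from one to the next. It is not the parser from the spreadsheet: the Expr type, the Add case, and the number and sum functions are hypothetical names invented for this illustration.

```fsharp
open System

// Hypothetical expression type, extended with addition for illustration.
type Expr =
    | Number of int
    | Add of Expr * Expr

// Parse one number; return it together with the unconsumed tokens.
let number tokens =
    match tokens with
    | (h: string) :: t -> Number(Int32.Parse h), t
    | [] -> failwith "Expected a number"

// Parse number ("+" number)*, threading the remainder through each step.
let rec sum tokens =
    let left, rest = number tokens
    match rest with
    | "+" :: afterPlus ->
        let right, remaining = sum afterPlus
        Add(left, right), remaining
    | _ -> left, rest

let rec evaluate expr =
    match expr with
    | Number n -> n
    | Add(a, b) -> evaluate a + evaluate b

// The trailing ")" is not consumed; it comes back as the remainder.
let tree, leftover = sum [ "1"; "+"; "2"; "+"; "3"; ")" ]
printfn "%d %A" (evaluate tree) leftover
```

Nothing is hidden in a field somewhere: whatever the parser has not consumed is right there in the return value for the caller to deal with.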

The thing that tweaked my mind about it was how I didn’t need all the parser combinators I wrote over a year ago when I was parroting Mark Jason Dominus’ example from Higher-Order Perl. Even this time, trying to write test-first, I have far too much in the final solution. (Not quite: I spent some of the weekend deleting code. But the point remains; I had code to delete, not refactor.)

My final note: I think the test-first style is about controlling assumptions. It is okay to make some assumptions, but the purpose of the tests is to verify and document that they are valid in the system you’re designing.

FSharp Presentation Notes!

I had a great time giving my talk tonight.  Thanks to the guys at IndyAlt.NET for listening to my prattling.  It’s been fun having an excuse to dig into FSharp again.  So, let the linking begin.

Die Roller Source Code

My FSharp Examples

Slide Deck

More information on FSharp

msdn.microsoft.com/fsharp The official site

VS Shell Where you can install FSharp

Don Syme’s Blog

Expert FSharp

HubFS the FSharp Community and its Forums

Other Great Blogs

Matthew Podwysocki

Luca Bolognese

Luis Diego Fallas

I’m Presentable!

I’ll be presenting a talk on FSharp at IndyAlt.net on the 18th of December.

If you’ve seen my talk at SEP, this will be an updated version. I’m re-ordering slides, adding new shout-outs to Eddie Izzard, expanding my example code, and slowing the explanation of my code. Also, I’m going to use fsUnit to ease into the syntax and explain my parser bit by bit. I’ll have a bigger writeup on the day of the talk. Don’t want to tip my hand too early…
