Software Development

Why F#?

If you’ve been following along with my posts over the past six months or so you can probably imagine that I’ve been asked some variation of this post’s title more than a few times. One question that I keep getting is why I chose F# over some other functional languages like Haskell, Erlang, or Scala. The problem with that question though is that it’s predicated on the assumption that I actually set out to learn a functional language. The truth is that moving to F# was more of a long but natural progression from C# rather than a conscious decision.

The story begins about two and a half years ago. I had pretty much burned out and was deep into what I can only begin to describe as a stagnation coma. I stopped attending user group meetings; I cut way back on reading; I pretty much stopped doing anything that would help me remain even remotely aware of what was going on in the industry.

It’s easy to explain how I got to this point. My employer at the time gave very little incentive to stay current. For instance, there were homegrown frameworks for nearly every aspect of the product. Who needs NHibernate (or other ORM) when you have a proprietary DAL? Why learn ASP.NET MVC when you have a proprietary system for page layout? What’s the point of diving into WPF when the entire application is Web-based? Introducing new technologies or practices was generally discouraged under the banner of being too risky.

It wasn’t until the company hired a new architect that I started to wake up. He brought a wealth of knowledge of fascinating technologies that I’d hardly heard of and his excitement reminded me of what I loved about technology. My passion for software development was reigniting and I started looking at many of the technologies that had passed me by.

The first technology that really caught my attention during this time was LINQ. I’d consider it my introduction to the wonderful world of functional programming. As cool as I thought its query syntax was, I was really interested in the method syntax. I remember reading early on (although I don’t remember where) that query syntax was only added to C# after users complained that method syntax was too cumbersome in the preview versions. I didn’t understand this sentiment at all because to me method syntax felt completely natural. Gradually I began to realize that the reason it felt so natural was because it works the way I think. Lambda expressions, higher-order functions, composability, all of these functional concepts just spoke to me.

Over time I started using more of C#’s functional aspects but found myself getting increasingly frustrated. For the longest time though I couldn’t quite pinpoint anything specific but something just didn’t feel right anymore. It wasn’t until I was mowing the lawn late on a summer afternoon when I had that “a ha!” moment that changed my life.

That afternoon my yard work podcast selection included Hanselminutes #311. In this episode Richard Minerich and Phillip Trelford were talking about a functional language called F# that had been around for a few years and was built upon the .NET framework. I’d seen a few mentions of F# here and there but before hearing this podcast I hadn’t given it much thought.

At one point the discussion turned to language productivity and Phillip remarked that writing C# feels like completing government forms in triplicate. As he elaborated I experienced a sudden burst of clarity into one of the major things that had been bothering me about C# – it’s verbosity! After listening to the rest of the podcast and hearing more about how the functional nature of the language made it less error prone and how things like default immutability helped alleviate some of the complexity of parallel programming I knew I had to give it a try.

A language that doesn’t affect the way you think about programming, is not worth knowing.
— Alan Perlis, Epigrams on Programming

Despite my love of F# most of my work is still in C# but learning F# has had an amazing impact on how I write C#. C# has been becoming more of a functional language with virtually every new release and I’ve been using many of those capabilities for a few years but the language is hardly built around them. That said, forgetting about all the times I’ve typed let instead of var or tried to define a default constructor in the class signature in the last week alone F# really has changed the way I work. I find myself writing more higher-order functions and making much better use of delegation; I’ve developed a strong preference for readonly member fields and properties; and I regularly find myself longing for some of F#’s constructs like tuples, records, discriminated unions, and pattern matching.

The truth is that for all of its strengths though I’m finding working in C# increasingly annoying especially as I continue to work with F#. In many ways, working with C# feels like interacting with a toddler. I feel like I have to hold its hand and guide it along with explicit instructions on what to do every step of the way – even if I’ve already told it something. On the other hand, F# feels like having a personal assistant. Its functional nature allows me to more or less describe what I want and it handles the details.

There are plenty of things that make me prefer F# over C# but I’d like to highlight a few in particular. I’ve already written extensively about some of these and will be writing more about others as time permits but here I’d like to look at them from a more comparative angle.

Terse Syntax

Even though F# is a functional-first language I think a great way to illustrate the language’s expressiveness is with an object-oriented. We’ll start with a simple class definition in C#.

public class CircleMeasurements
{
  private double _diameter;
  private double _area;
  private double _circumference;

  public CircleMeasurements(double diameter, double area, double circumference)
  {
    _diameter = diameter;
    _area = area;
    _circumference = circumference;
  }

  public double Diameter { get { return _diameter; } }
  public double Area { get { return _area; } }
  public double Circumference { get { return _circumference; }  }
}

// Usage
var measurements = new CircleMeasurements(5.0, 19.63495408, 15.70796327);

Look how much code was required to create a class with three properties. Even in this small example we had to tell the compiler what type to use for each value three times! I could have used auto-implemented properties to simplify it a bit but even then we still need to tell the compiler which type to use twice for each value. Now let’s take a look at the same class in F#:

type CircleMeasurements(diameter, area, circumference) =
  member x.Diameter = diameter
  member x.Area = area
  member x.Circumference = circumference

// Usage
let measurements = CircleMeasurements(5.0, 19.63495408, 15.70796327);

That’s it – four lines of code. Granted this compiles to something that resembles the C# class but we didn’t have to write any of it and thanks to the fantastic type inference system we didn’t have to tell the compiler multiple times which type to use for each value. Quite often though even this is more verbose than we actually need. Many times we can use another F# construct – a record type – to represent something we’d represent with a class in other .NET languages:

type CircleMeasurements = { Diameter : float; Area : float; Circumference : float }

// Usage
let measurements = { Diameter = 5.0; Area = 19.63495408; Circumference = 15.70796327 }

In addition to being even more concise than the corresponding class definition, record types have the added benefit of being structurally comparable so we can easily check for equality between two instances. Record types do require us to include the type annotations but we only need to explicitly tell the compiler what to use once for each value and the constructor and properties are each created implicitly.

Functional Style

I already mentioned that functional programming feels more natural to me and by design F# really shines when it comes to expressing and using functions. Traditional .NET development has always had some type of support for delegation and it has definitely improved over the years, particularly with the common generic delegate classes (e.g.: Func, Action) and lambda expressions but actually trying to use them in a more functional style is a pain. This is complicated by the fact that in some situations the C# compiler can’t infer whether a lambda expression is a delegate or an expression tree. Although in some regards I prefer the C# lambda expression syntax I definitely prefer F#’s syntactic distinction between delegates and code quotations.

While on the topic of functional programming I have to mention F#’s default immutability. Immutability is key to any functional language and has been shown to improve overall program correctness by eliminating side effects. C# has some support for immutability through readonly fields or by omitting setters from property definitions but either of these require a conscious decision to enable. Nearly everything in F# is immutable unless explicitly declared otherwise. Immutability also provides benefits when writing asynchronous code because if nothing is changing, there’s a reduced need for locking.

Discriminated Unions

If you haven’t read my post on discriminated unions yet, they’re a way to express simple object hierarchies (or even tree structures) in a concise manner. From a syntactic perspective they’re incredibly simple but trying to replicate them even for simple structures really isn’t particularly feasible in C#. Here’s a simple example:

In this example suit is another discriminated union.

type Card =
| Ace of Suit
| King of Suit
| Queen of Suit
| Jack of Suit
| ValueCard of Suit * int

Using this discriminated union we express the 7 of hearts as ValueCard(Heart, 7). For illustration of what it would take to represent this structure in C# I’m including a screenshot of ILSpy’s decompilation. Note that I’ve only included the signatures and even this five case discriminated union more than fills my screen. Just to drive the point home, fully expanded, this code is nearly 700 lines long! Granted there are a few CompilerGeneratedAttributes in there but they hardly count for a majority of the code.

Ultimately, the union type is an abstract class and each case is a nested class. The nested classes are each assigned a unique tag that’s used for type checking and code branching in some of the union type’s methods. Not included in the screenshot are implementations of several interfaces and overrides of Equals and GetHashCode.

Decompiled Card Discriminated Union

Decompiled Card Discriminated Union

Collection Types & Comprehensions

C# has made great strides in regard to initializing various collection types but it still pales in comparison to the constructs offered by F#. Don’t get me wrong, collection initializers are a nice syntactic convenience but nearly every time I use it I think how much easier it would likely be with a comprehension. LINQ can address some of these shortcomings but even convenience methods like Enumerable.Range feel limiting. Yes, I could write some additional convenience methods to address some of the shortcomings but in F# I don’t have to.

Part of the beauty of comprehensions is that they generally apply regardless of which collection type you’re creating. Although each of the examples below create F# lists they can easily be modified to create sequences or arrays instead.

// Numbers 0 - 10
> [0..10];;
val it : int list = [0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10]

// Numbers 0 - 10 by 2
> [0..2..10];;
val it : int list = [0; 2; 4; 6; 8; 10]

// Charcters 'a' - 'z'
> ['a'..'z'];;
val it : char list =
  ['a'; 'b'; 'c'; 'd'; 'e'; 'f'; 'g'; 'h'; 'i'; 'j'; 'k'; 'l'; 'm'; 'n'; 'o';
   'p'; 'q'; 'r'; 's'; 't'; 'u'; 'v'; 'w'; 'x'; 'y'; 'z']

// Unshuffled deck
> let deck =
  [ for suit in [ Heart; Diamond; Spade; Club ] do
      yield Ace(suit)
      yield King(suit)
      yield Queen(suit)
      yield Jack(suit)
      for v in 2 .. 10 do
        yield ValueCard(suit, v)
  ];;

val deck : Card list =
  [Ace Heart; King Heart; Queen Heart; Jack Heart; ValueCard (Heart,2);
   ValueCard (Heart,3); ValueCard (Heart,4); ValueCard (Heart,5);
   ValueCard (Heart,6); ValueCard (Heart,7); ValueCard (Heart,8);
   ValueCard (Heart,9); ValueCard (Heart,10); Ace Diamond; King Diamond;
   Queen Diamond; Jack Diamond; ValueCard (Diamond,2); ValueCard (Diamond,3);
   ValueCard (Diamond,4); ValueCard (Diamond,5); ValueCard (Diamond,6);
   ValueCard (Diamond,7); ValueCard (Diamond,8); ValueCard (Diamond,9);
   ValueCard (Diamond,10); Ace Spade; King Spade; Queen Spade; Jack Spade;
   ValueCard (Spade,2); ValueCard (Spade,3); ValueCard (Spade,4);
   ValueCard (Spade,5); ValueCard (Spade,6); ValueCard (Spade,7);
   ValueCard (Spade,8); ValueCard (Spade,9); ValueCard (Spade,10); Ace Club;
   King Club; Queen Club; Jack Club; ValueCard (Club,2); ValueCard (Club,3);
   ValueCard (Club,4); ValueCard (Club,5); ValueCard (Club,6);
   ValueCard (Club,7); ValueCard (Club,8); ValueCard (Club,9);
   ValueCard (Club,10)]

Pattern Matching

faced with a 20th century computer
Scotty: Computer! Computer?
He’s handed a mouse, and he speaks into it
Scotty: Hello, computer.
Dr. Nichols: Just use the keyboard.
Scotty: Keyboard. How quaint.
— Star Trek IV

The above conversation enters my mind when I’m working with C#’s branching constructs, switch statements in particular.  F#’s pattern matching may bear a slight resemblance to C# switch statements but they’re so much more powerful.  switch statements limit us to simply branching on constant values but pattern matching allows value extraction, multiple cases, and refinement constraints for more precise control with a syntax much friendlier than your common if/else statement. Furthermore, like virtually everything else in F#, pattern matches are expressions so they return a value making them ideal candidates for inline conditional bindings.

Units of Measure

Every programmer works with code that uses different units of measure. In most languages dealing with units of measure is error prone because they require discipline from the developer to ensure that the correct units are always used. There are some libraries that attempt to address the problem but to my knowledge (please correct me if I’m wrong) F# is the only one to actually include it in the type system. F#’s unit of measure support is complete enough that it can often automatically convert values between units, particularly when a conversion expression is included with the type definition.

[<Measure>] type px
[<Measure>] type inch
[<Measure>] type dot = 1 px
[<Measure>] type dpi = dot / inch

let convertToPixels (inches : float<inch>) (resolution : float<dpi>) : float<px> =
  inches * resolution

let convertToInches (pixels : float<px>) (resolution : float<dpi>) : float<inch> =
  pixels / resolution

The biggest downfall of F#’s units of measure is that they’re a feature of the type system and compiler rather than a CLR feature. As such, the compiled code doesn’t have any unit information and we can’t enforce the units in cross-language scenarios.

Object Expressions

I haven’t written about object expressions yet but they’re definitely on my backlog. Object expressions provide a way to create ad-hoc (anonymous) types based on one or more interfaces or a base class. They’re useful in a variety of scenarios like creating one-off formatters or comparers and I think they can at least supplement, if not replace some mocking libraries. To illustrate we’ll use a somewhat contrived example of a logging service.

type LogMessageType =
| DebugMessage of string
| InfoMessage of string
| ErrorMessage of string

type ILogProvider =
  abstract member Log : LogMessageType -> unit

type LogService(provider : ILogProvider) =
  let log = provider.Log
  member this.LogDebug msg = DebugMessage(msg) |> log
  member this.LogInfo msg = InfoMessage(msg) |> log
  member this.LogError msg = ErrorMessage(msg) |> log

Here we use a discriminated union to define the message types the logging provider interface can handle. The log service itself provides a slightly friendlier API than the provider interface itself. Normally if we wanted to use the log service instance we’d have to define a concrete implementation of ILogProvider but object expressions allow us to easily define one inline.

let logger =
  LogService(
    { new ILogProvider with
      member this.Log msg =
        match msg with
        | DebugMessage(m) -> printfn "DEBUG: %s" m
        | InfoMessage(m) -> printfn "INFO: %s" m
        | ErrorMessage(m) -> printfn "ERROR: %s" m
    }
  )

In our object expression we use pattern matching to detect the message type, extract the associated string, and write an appropriate message to the console.

> logger.LogDebug "message"
logger.LogInfo "message"
logger.LogError "message";;
DEBUG: message
INFO: message
ERROR: message

What’s Not So Great?

I could continue on for a while with things I like about F# but this post is already long enough as it is. That said, I think it’s only fair to list out a few of the things I don’t like about the language.

Tooling Support

Probably the biggest gripe I have is the lack of tooling support around the language. So many tools and templates that I take for granted when working with C# simply aren’t available. Things like IntelliSense are pretty complete but if you’re just getting started with F# and looking to do more than a console application or library be prepared to spend some time looking for 3rd party templates and reading blog posts.

Language Interop

On a somewhat related note, even though F# compiles to MSIL and can reference or be referenced by other .NET assemblies there are some quirks that make language interoperability a but cumbersome. For instance, using extension methods defined in C# doesn’t work as cleanly as I thought they would. When I was experimenting with MassTransit I couldn’t get the UseMsmq, VerifyMsmqConfiguration, or a number of other extension methods to appear no matter what I tried. I ultimately had to call the static methods directly.

I’ve read that this is addressed in F# 3.0 but I haven’t done enough with 3.0 yet to confirm.

Project Structure

It’s not really fair to put this under the “what’s not so great?” heading but it seemed most appropriate. This isn’t so much an issue with the language as much as it’s a big mindset shift of a similar magnitude of switching from OO to functional. The structure of an F# project is significantly different than that of a C# (or even VB) project and is something I’m still struggling with.

In C# we generally organize code into folders representing namespaces and keep one type (class) per file. F# evaluates code from top down throughout the project so file sequence is significant. Furthermore, code is organized by modules rather than type. F# does have namespaces but even then they’re usually divided across one or more files and from my experience, not grouped by folder.

Wrap Up

No matter what language you work in, programming in a functional style provides benefits. You should do it whenever it is convenient, and you should think hard about the decision when it isn’t convenient.
— John Carmack, Functional Programming in C++

In general I’ve found that the more I learn and work with F# the more I like it. I regularly find myself reaching for it as my first choice, especially when it comes to new development. Although there are a few things that I don’t like about working in F# most of them just require more diligence on my part or are easily managed. I’ve only listed a few key areas where I think F# excels but I firmly believe that their strengths far outweigh any weaknesses.

Revisiting String Distances

Back in April I wrote about three different string distance algorithms.  If you’re not familiar with string distance algorithms I encourage you to read the original post but in short, these algorithms provide different ways to determine how many steps are required to translate one string into another.  Since the algorithms I wrote about aren’t particularly complex I thought translating them to F# would be a fun exercise.

In general the code is pretty similar to its C# counterpart but I was able to take advantage of a few F# constructs, tuples and pattern matching in particular.  These examples may or may not be the best way to implement the algorithms in F# (ok, I’ll be honest, they’re probably not) but they certainly show how well suited F# is to this type of work.

Hamming Distance

The Hamming distance algorithm simply counts how many characters are different in equal length strings.

let hammingDistance (source : string) (target : string) =
  if source.Length <> target.Length then failwith "Strings must be equal length"
 
  Array.zip (source.ToCharArray()) (target.ToCharArray())
  |> Array.fold (fun acc (x, y) -> acc + (if x = y then 0 else 1)) 0

hammingDistance "abcde" "abcde" |> printfn "%i" // 0
hammingDistance "abcde" "abcdz" |> printfn "%i" // 1
hammingDistance "abcde" "abcyz" |> printfn "%i" // 2
hammingDistance "abcde" "abxyz" |> printfn "%i" // 3
hammingDistance "abcde" "awxyz" |> printfn "%i" // 4
hammingDistance "abcde" "vwxyz" |> printfn "%i" // 5

Levenshtein Distance

The Levenshtein distance algorithm counts how many characters are different between any two strings taking into account character insertions and deletions.

open System
let levenshteinDistance (source : string) (target : string) =
  let wordLengths = (source.Length, target.Length)
  let matrix = Array2D.create ((fst wordLengths) + 1) ((snd wordLengths) + 1) 0

  for c1 = 0 to (fst wordLengths) do
    for c2 = 0 to (snd wordLengths) do
      matrix.[c1, c2] <-
        match (c1, c2) with
        | h, 0 -> h
        | 0, w -> w
        | h, w ->
          let sourceChar, targetChar = source.[h - 1], target.[w - 1]
          let cost = if sourceChar = targetChar then 0 else 1
          let insertion = matrix.[h, w - 1] + 1
          let deletion = matrix.[h - 1, w] + 1
          let subst = matrix.[h - 1, w - 1] + cost
          Math.Min(insertion, Math.Min(deletion, subst))
  matrix.[fst wordLengths, snd wordLengths]

levenshteinDistance "abcde" "abcde" |> printfn "%i" // 0
levenshteinDistance "abcde" "abcdz" |> printfn "%i" // 1
levenshteinDistance "abcde" "abcyz" |> printfn "%i" // 2
levenshteinDistance "abcde" "abxyz" |> printfn "%i" // 3
levenshteinDistance "abcde" "awxyz" |> printfn "%i" // 4
levenshteinDistance "abcde" "vwxyz" |> printfn "%i" // 5
levenshteinDistance "abcdefg" "abcde" |> printfn "%i" // 2
levenshteinDistance "abcdefg" "zxyde" |> printfn "%i" // 5
levenshteinDistance "abcde" "bacde" |> printfn "%i" // 2
levenshteinDistance "abcde" "baced" |> printfn "%i" // 4

Damerau-Levenshtein Distance

The Damerau-Levenshtein algorithm builds upon the Levenshtein algorithm by taking transpositions into account.

open System
let damerauLevenshteinDistance (source : string) (target : string) =
  let wordLengths = (source.Length, target.Length)
  let matrix = Array2D.create ((fst wordLengths) + 1) ((snd wordLengths) + 1) 0

  for h = 0 to (fst wordLengths) do
    for w = 0 to (snd wordLengths) do
      matrix.[h, w] <-
        match (h, w) with
        | x, 0 -> x
        | 0, y -> y
        | x, y ->
          let sourceChar, targetChar = source.[x - 1], target.[y - 1]
          let cost = if sourceChar = targetChar then 0 else 1
          let insertion = matrix.[x, y - 1] + 1
          let deletion = matrix.[x - 1, y] + 1
          let subst = matrix.[x - 1, y - 1] + cost

          let distance = Math.Min(insertion, Math.Min(deletion, subst))

          if x > 1 && w > 1 && sourceChar = target.[y - 2] && source.[x - 2] = targetChar then
            Math.Min(distance, matrix.[x - 2, y - 2] + cost)
          else
            distance
  matrix.[fst wordLengths, snd wordLengths]

damerauLevenshteinDistance "abcde" "abcde" |> printfn "%i" // 0
damerauLevenshteinDistance "abcde" "abcdz" |> printfn "%i" // 1
damerauLevenshteinDistance "abcde" "abcyz" |> printfn "%i" // 2
damerauLevenshteinDistance "abcde" "abxyz" |> printfn "%i" // 3
damerauLevenshteinDistance "abcde" "awxyz" |> printfn "%i" // 4
damerauLevenshteinDistance "abcde" "vwxyz" |> printfn "%i" // 5
damerauLevenshteinDistance "abcdefg" "abcde" |> printfn "%i" // 2
damerauLevenshteinDistance "abcdefg" "zxyde" |> printfn "%i" // 5
// Transpositions
damerauLevenshteinDistance "abcde" "bacde" |> printfn "%i" // 1
damerauLevenshteinDistance "abcde" "baced" |> printfn "%i" // 2

F# Enumerations

I was originally going to include this discussion about F# enumerations in the discriminated unions post because their syntax is so similar but the section quickly took on a life of its own. Enumerations in F# are the same as enumerations from other .NET languages so all of the same capabilities and restrictions apply.

Defining Enumerations

Defining an enumeration in F# is much like defining a discriminated union but here we bind an integral value. Unlike in C# though, F# doesn’t automatically generate values for each item so we need to give every case a value.

type DayOfWeek =
| Sunday = 0
| Monday = 1
| Tuesday = 2
| Wednesday = 3
| Thursday = 4
| Friday = 5
| Saturday = 6

Changing Base Type

We can also change the enumeration’s underlying type by changing the suffix of each value. For example, if we wanted to change the underlying type for the DayOfWeek enumeration to byte we could do so by appending a y to each value as follows:

type DayOfWeekByte =
| Sunday = 0y
| Monday = 1y
| Tuesday = 2y
| Wednesday = 3y
| Thursday = 4y
| Friday = 5y
| Saturday = 6y

Referencing Enumeration Values

As is the case with enumerations in other .NET languages enumeration values are referenced through the enumeration name.

> DayOfWeek.Friday
val it : DayOfWeek = Friday

FlagsAttribute

Also like in other .NET languages, F# enumerations support the FlagsAttribute to indicate that each value in the enumeration represents a bit position.

open System

[<Flags>]
type Border =
| None = 0
| Right = 1
| Top = 2
| Left = 4
| Bottom = 8;;

let showBorder = Border.Bottom ||| Border.Left

To determine whether a particular flag is set we can either use bit-math or we can use the enumeration’s HasFlag method. Since I generally prefer the code I don’t have to write so I generally favor built-in approach:

> showBorder.HasFlag Border.Top;;
val it : bool = false

> showBorder.HasFlag Border.Left;;
val it : bool = true

Conversion

For those times when we only have the underlying integer value and want to convert that value to the corresponding enum value we can use the built-in generic enum function:

> enum<DayOfWeek> 5
val it : DayOfWeek = Friday

Note: The signature of the enum function above is int32 -> 'U so it can only be used with enumerations based on int32. If your enumeration is based on another data type such as with the DayOfWeekByte example above you’ll need to use the generic EnumOfValue function from the Microsoft.FSharp.Core.LanguagePrimitives module instead.

Likewise, we can get the underlying value by calling the corresponding conversion function:

int DayOfWeek.Friday
// val it : int = 5

byte DayOfWeekByte.Friday
// val it : byte = 5uy

F# Discriminated Unions

When I started learning F# discriminated unions were probably the first thing that really sold me on the language only partially because there’s nothing like them in C#.  From a traditional .NET background a quick glance at them, particularly simple ones, may lead you to believe that they’re just glorified enumerations but in reality, discriminated unions are so much more powerful.  While enumerations are limited to a single, fixed integral value but applications for discriminated unions go far beyond that.

Like enumerations, discriminated unions allow us to define values that can only be one of a specified set of named cases but that’s where the similarities end.  Discriminated unions allow us to associate zero or more data items of any type with each case.  The effect is that we can easily define tree structures, simple object hierarchies, and even leverage pattern matching.

Discriminated Unions as Object Hierarchies

I think the easiest way to get into discriminated unions is to start off with an example.  In my studies of the language I’ve become quite fond of the example I first encountered in Programming F# where discriminated unions and a list comprehension are used to define a standard, unshuffled card deck.  This example does a great job illustrating not only simple, “dataless” unions but also shows how we can associate multiple data items and even combine discriminated unions to achieve more expressive code.

We start by defining a discriminated union for each suit. The suits don’t have any data associated with them so this discriminated union simply includes the case names.

type Suit =
| Heart
| Diamond
| Spade
| Club

The second discriminated union defines the individual cards. Each card belongs to a particular suit and non-face cards will also include their denomination.

type Card =
| Ace of Suit
| King of Suit
| Queen of Suit
| Jack of Suit
| ValueCard of Suit * int

With just these two constructs we can define a value representing any card in a standard deck.  For instance, if we wanted to represent the Ace of Spades, or 7 of Hearts we’d write Ace(Spade) and ValueCard(Heart, 7), respectively.

We don’t need to include to the actual discriminated union name when referring to a case. Should there be a naming conflict we’re free to include the union name with dot syntax (Suit.Heart) or we could instruct the compiler to force a qualified name with the RequireQualifiedAccess attribute.

If you were paying attention during the Card example you probably noticed a familiar syntax. When multiple data elements are associated with a case (such as with ValueCard) we use the tuple syntax to identify each element’s data type. While this allows for a lot of flexibility it also brings along some of the drawbacks of tuples such as not being able to define names for each value. We’ll look at one way to work around this limitation a little later when we discuss pattern matching.

The tuples in discriminated unions are syntactic tuples. The types the compiler generates for each case are nested classes that derive from the discriminated union type. The data elements are exposed as read-only properties following the same conventions as tuples but there aren’t any actual tuple class instances involved.

Now that we’re armed with some basic knowledge of discriminated unions we can use a list comprehension to easily construct a full, unshuffled deck.

let deck =
  [ for suit in [ Heart; Diamond; Spade; Club ] do
      yield Ace(suit)
      yield King(suit)
      yield Queen(suit)
      yield Jack(suit)
      for v in 2 .. 10 do
        yield ValueCard(suit, v)
  ]

Option Type

No introductory discussion about discriminated unions would be complete without at least a brief mention of the option type. The option type is a predefined discriminated union (Microsoft.FSharp.Core.FSharpOption) that’s useful for indicating when a value may or may not exist (remember: null generally isn’t used in F#). There are only two cases in the option type: Some(_) and None. Because the type itself is an unconstrained generic type, its cases can be used to indicate the presence of any value.

Although options are generally referenced in pattern matching expressions there’s an associated option module that contains several useful functions including Option.get, Option.bind, and Option.exists.

Discriminated Unions as Trees

A really nice aspect of discriminated unions is that they can be self-referencing. This makes creating simple tree structures trivial. To illustrate let’s create a very rudimentary HTML structure.

type Markup =
| Element of string * Markup list
| EmptyElement of string
| Content of string

Note that the Element case has two associated pieces of data: a string (the element name), and a list of additional Markup instances. From just these three cases we can quickly represent a simple HTML document:

let htmlTree =
  Element("html",
    [
      Element("head", [ Element("title", [ Content("Very Simple HTML") ]) ])
      Element("body",
        [
          Element("article",
            [
              Element("h1", [ Content("Discriminated Unions") ])
              Element("p",
                [
                  Content("See how ")
                  Element("b", [ Content("easy") ])
                  Content(" this is?")
                ])
              Element("p", [ Content("F#"); EmptyElement("br"); Content("is fun") ])
            ])
        ])
    ])

Discriminated Unions in Pattern Matching

We’ve seen how discriminated unions can be used to easily define both object hierarchies and tree structures. Like with so many things in F# though that power is amplified when combined with pattern matching. With pattern matching we can not only can we provide conditional logic for operating against each case but we can also work around the limitation I mentioned earlier where each associated data item is assigned a less than useful name.

Let’s expand upon the Markup example from the previous section to illustrate. Here we’ll define a recursive function to convert the htmlTree value to an HTML string.

let rec toHtml m : string =
  match m with
  | Element (tag, children) ->
      let sb = System.Text.StringBuilder()
      let append (s : string) = sb.Append s |> ignore

      append (sprintf "<%s>" tag)
      List.iter (fun c -> append (c |> toHtml)) children
      append (sprintf"</%s>" tag)
      sb.ToString()
  | EmptyElement tag -> sprintf "<%s />" tag
  | Content v -> sprintf "%s" v;;

htmlTree |> toHtml
> val it : string =
  "<html><head><title>Very Simple HTML</title></head><body><article><h1>Discriminated Unions</h1><p>See how <b>easy</b> this is?</p><p>F#<br />is fun</p></article></body></html>"

In this example we have a pattern for each Markup case. Each case is covered by a pattern so there’s no need to introduce the wildcard (_) pattern here. Each pattern includes an identifier for each data item associated with the case. The identifiers allow us to refer to each item with something meaningful rather than the default name. Identifiers are also useful within when constraints to provide more granular control over what matches a given pattern. For instance if we wanted to apply different handling for p elements we could add the pattern | Element (tag, children) when tag = "p" -> ... to the beginning of the match expression.

Members in Discriminated Unions

Just as with record types, discriminated unions ultimately compile down to classes. As such, we can expand upon them with additional members such as methods. Continuing with the Markup it makes sense to keep the toHtml method with the Markup type so let’s combine them.

To convert the function to a method we only need to make a few small tweaks. First, we’ll eliminate the argument since we’ll be operating against the current instance. We’ll also add a type annotation to the delegate used by the List.iter function.

type Markup =
| Element of string * Markup list
| EmptyElement of string
| Content of string
  member x.toHtml() : string =
    match x with
    | Element (tag, children) ->
        let sb = System.Text.StringBuilder()
        let append (s : string) = sb.Append s |> ignore

        append (sprintf "<%s>" tag)
        List.iter (fun (c : Markup) -> append (c.toHtml())) children
        append (sprintf"</%s>" tag)
        sb.ToString()
    | EmptyElement tag -> sprintf "<%s />" tag
    | Content v -> sprintf "%s" v

Note that the member function is no longer marked as recursive. This is because recursion is implicitly allowed for members.

Now that the toHtml function is a method of the Markup type we can call it like we would a method on any other type:

htmlTree.toHtml()
> 
val it : string =
  "<html><head><title>Very Simple HTML</title></head><body><article><h1>Discriminated Unions</h1><p>See how <b>easy</b> this is?</p><p>F#<br />is fun</p></article></body></html>"

Wrapping Up

Discriminated unions are a highly useful construct with applications far beyond that of enumerations. Whether they’re being used to construct an object hierarchy or represent a tree structure discriminated unions can greatly enhance the clarity of your code.

Sometimes a discriminated union is bit overkill for the problem space and an enumeration is all that’s needed. In the next post we’ll explore how to define enumerations in F#.

Concatenating PDFs with F# and PdfSharp

In order to test part some new functionality I’m working on I needed to combine a bunch of somewhat arbitrary documents into one. At first I thought about just re-scanning the documents but the scanner attached to my PC is a single-sheet flatbed. I also thought about running them through our sheet-feed scanner but it’s attached to my counterpart’s PC, not shared, and I didn’t want to bother him with it. I’ve been working a bit with the PdfSharp library lately so I decided to do the programmer thing and write a script to concatenate the documents. Of course, this would be a perfect chance to flex my developing F# muscles and came up with what I think is a pretty decent solution.

Before taking a look at the script there are a few things to note:

  • The script is intended to be run from within a folder that contains the documents that you want to concatenate. This is mainly because I had a bunch of documents I wanted to combine and including the full path to each one quickly became unwieldy.
  • I don’t check that files exist or that what’s supplied is actually a PDF. PdfSharp will throw an exception in those cases
  • I needed something quick so I hard-coded the destination file name as Result.pdf.
  • fsi.CommandLineArgs includes the name of the script as the first array item. The easiest thing I could think of for getting a list that didn’t include the script name was to create a new list with Array.toList and grab the Tail.
  • The PdfPages class is built around IEnumerable rather than IEnumerable<T> so I needed to use a comprehension to build the pages list.

Script:

// PdfConcat.fsx

#r "PdfSharp.dll"

open System;
open System.IO
open PdfSharp.Pdf
open PdfSharp.Pdf.IO

let readPages (sourceFileName : string) =
  use source = PdfReader.Open(sourceFileName, PdfDocumentOpenMode.Import)
  [ for p in source.Pages -> p ]

let createDocFromPages pages =
  let targetDoc = new PdfDocument()
  pages |> List.iter (fun p -> targetDoc.Pages.Add p |> ignore)
  targetDoc

let docNames = (Array.toList fsi.CommandLineArgs).Tail
let workingDirectory = Environment.CurrentDirectory;
let targetFileName = Path.Combine(workingDirectory, "Result.pdf")

let allPages =
  [ for n in docNames -> Path.Combine(workingDirectory, n) ]
  |> List.map readPages
  |> List.concat

let doc = createDocFromPages allPages
doc.Save targetFileName
doc.Dispose()

Usage:

D:\ScannedDocuments>fsi D:\Dev\FSharp\PdfConcat\PdfConcat.fsx "07-09-2012 05;29;44PM.PDF" "07-11-2012 08;47;13PM.PDF" "07-11-2012 08;48;24PM.PDF" "07-11-2012 08;50;27PM.PDF"

CODODN Notes

I spent Saturday (Dec 8) over in Columbus, OH attending the Central Ohio Day of .NET conference. As with any multi-track conference there were plenty of good sessions to choose from but of course, I could only attend five. What follows are my raw, nearly unedited notes from the sessions I selected.

Underestimated C# Language Features

John Michael Hauck (@john_hauck | http://w8isms.blogspot.com)

Delegation

  • Anonymous functions
  • Closures
  • Automatic closures compile to classes

Yay!  ANOTHER discussion about using var or type name for declaring variables

Enumeration

  • IEnumerable<T>
  • yield return
  • IEnumerable of delegates

Deep Dive into Garbage Collection

Patrick Delancy (@patrickdelancy | http://patrickdelancy.com/)

Automatic Memory Management

  • Reference counting
  • Track & mark

Allocating

  • Application virtual memory
  • Heap is automatically reserved
  • New declarations are allocated into segments in the heap
  • Ephemeral segments
  • Clean-up tries to move longer lasting objects into other dedicated ephemeral segments
  • Longer-lasting objects are collected less frequently

GC Timing

  • When an ephemeral segment is full
  • GC.Collect() method call
  • Low memory notifications from the OS (physical memory)

Only finalization is non-deterministic

GC Process

  • Marking – Identifying objects for removal
  • Relocating – Relocating surviving pointers to older segments
  • Compacting – Moving actual object memory

Dead or Alive

Starting points:

  • Static Data is rooted in garbage collector (never out of scope)
  • Stack Root (call stack) has variable references
  • GC Handles – special cases that can be created and freed

Collector walks the tree from each point

GC Handles

Struct that has a reference to a managed object and can be passed off to unmanaged/native code

Marshalling

GCHandle.Alloc() and .Free()

Handle Types:

  • Weak – tracks the object but allows collection; Zeroed when collected; Zeroed before finalizer
  • WeakTrackResurrection – Weak but is not zeroed when collected; Can resurrect in finalizer
  • Normal – Opaque address via the handle (can’t get the address); not collected by GC; Managed object w/o managed references
  • Pinned – Normal with accessible address. Not collected or moved by GC; free quickly if needed

Generations

  • Generation 0 – newly allocated objects
  • Generation 1 – objects that survived gen 0; infrequent collection
  • Generation 2 – Long-lived objects; things allocated as large objects too; infrequent collection

Anything larger than 85K in memory is considered large and automatically placed on the large object heap.

Threshold checks consider machine resources including bitness and is always in flux (except gen 0 which is based on segment size)

Configuration

Workstation & server mode:

  • Default behavior is workstation
  • Workstation
    • Single-processor machines
    • Collection happens on triggering thread
    • Normal thread priority
  • Server
    • Multi-processor
    • Separate GC per thread processor
    • Parallel on all threads
    • Collection at highest thread priority
    • Intended to maximize GC throughput and scalability

Concurrency:

  • Foreground
    • Non-concurrent
    • All other threads are suspended until collection finishes
  • Concurrent & Background
    • Collect concurrently while app is running
    • Background (v4) is the replacement for concurrent
    • Collects concurrently while the app is running
    • Only applies to Gen 2 collection
    • Prevents Gen 0 & 1 collections while running

Latency modes

Time the user knows the system is busy

Useful especially when users will notice the collection – graphics rendering

  • Batch
    • Default when concurrency is disabled
    • Most intrusive when running
  • Interactive
    • Default when concurrency is enabled
    • Less intrusive but still accommodates the GC
  • LowLatency
    • For short-term use – impedes the GC process
    • Suppresses Gen 2 collection
    • Workstation mode only
    • Still collects under low host memory
    • Allows manual collection
  • SustainedLowLatency
    • Added in 4.5
    • Contained but longer
    • Suppresses foreground gen 2 collection
    • Only available with concurrency
    • Does not compact managed heap

Finalization

Non-deterministic

Only used with native resources (IntPtr handles)

  • To release unmanaged resources ONLY
  • Objects with a finalizer get promoted to the next generation and the next time the collector hits that generation it calls the finalizer; added to finalizaton queue
  • Finalizer runs on another thread after collection has finished and runs at the highest thread priority
  • Managed objects may already be freed
  • Flagged for finalization upon allocation
  • When troubleshooting performance check the size of the finalizer queue and look for hung finalizer threads

Disposable

Objects using managed resources that need to be released

Debug Build

  • Does some artificial rooting due to CLR optimizations

Unleashing the Power: A Lap around PowerShell

Sarah Dutkiewicz (@sadukie | http://codinggeekette.com/)

Covering PowerShell 3.0

Some Cmdlets:

  • Clear – clears the screen
  • Get-verb
  • Get-unique

Console

  • Part of Windows Management Framework 3.0 suite
    • 32/64 bit available
    • Windows 8 & Server 2012 have it installed
  • Built on CLR 4
  • Adds support for
    • Networking
    • Parallelism
    • MEF
    • Compatibility/Deployment
    • WWF
    • WCF
  • show-command cmdlet
    • allows searching for commands and executing in the console
    • often easier than get-command | more
  • Updatable help
    • Not installed by default on Win8/Server 2012
    • Update-Help cmdlet
    • Must run as administrator
    • No restart required
    • Can store to a network share
  • Language improvements
    • Simplified foreach (automatic/implicit)
      • PS C:\windows\system32> $verbs = get-verb
      • PS C:\windows\system32> $verbs.Group | Get-Unique
    • Simplified where
      • PS C:\windows\system32> $verbs | where group -eq “Data”
    • Enhanced tab completion
  • Unblock files with Unblock-File cmdlet
  • JSON, REST, & HTML Parsing support
  • Session improvements
    • Disconnected sessions on remote computers can be reconnected later w/o losing state
    • Both ends need 3.0

Integrated Scripting Environment

  • IntelliSense
  • Show-Command window
  • Unified console pane
  • Rich copy
  • Block copy
  • Snippets
  • Brace Matching

Scheduled Jobs

  • PowerShell jobs can now be integrated with task scheduler
    • Register-ScheduledJob
    • Found under Microsoft / Windows / PowerShell / ScheduledJobs

Autoloading Modules

  • PowerShell 3 automatically loads a module when one of its cmdlet is invoked

PowerShell Web Access

  • PowerShell in a browser
  • Mostly 3.0 w/ some limitations due to remotin
  • Prereqs:
    • Server 2012
  • Setup is difficult
    • Install web access
    • Authorization configuration
    • PowerShell remoting must be enabled to connect
  • Limitations:
    • Some function keys aren’t supportd
    • Only top-level progress is shown
    • Input color cannot be changed
    • Writing to console doesn’t work

Management OData IIS Extension

  • Allows RESTful OData access
  • Requires Server 2012, not Server Core

MS Script Explorer for Windows PowerShell

  • 32/64-bit platforms
  • PowerShell ISE is required
  • Downloadable (Non-standard)

Building Large Maintainable JavaScript Applications w/o a Framework

Steve Horn (@stevehorn | http://blog.stevehorn.cc/blog)

Quote from Kahn about complaining about status quo and theorizing about how things should be

(I tried to find the actual quote but couldn’t – if anyone knows where I can find it I’ll be happy to update these notes)

Assumptions for JavaScript Applications

  • Not progressive enhancement
  • JavaScript is enabled
  • Rendering of HTML templates is done on the client
  • Server side is for querying or performing work
  • The UI is the most important part of the app

Each framework is giving its own world view of how you should build an app

Code Organization

Book: JavaScript Patterns (Stoyan Stefanov)

  • How can I create modules and namespaces?

window.nmap = window.nmap || {};

Let JavaScript be JavaScript; don’t worry about public/private/etc…    

Constructor members are recreated every time the constructor is invoked

jQuery Tiny Pub/Sub http://benalman.com

Designing for Windows 8 Apps

Dan Shultz (@dshultz)

Metro/Windows 8 Style Design

  • Modern Design – Bauhaus
  • International Typographic Style – Swiss design
  • Motion Design – Cinematography

Windows 8 grid

  • Grid units
  • 20×20 pixel grids
  • Title size: 42pt
  • Title line height: 48pt
  • Page header is 5 units from the top

Responsive Design

…is the approach that suggests that design and development should respond to the user’s behavior and environment based on screen size, platform, and orientation

Techniques

  • A flexible grid
  • Media queries

Media Query Ranges

  • Mobile portrait < 479 px wide
  • Mobile landscape 480 – 767 px
  • Tablet portrait 768 – 1023 px
  • Tablet Landscape >= 1024 px

Certification Tips

  • Apps need to work in snap view to pass certification
  • Keep functionality off of the margins to prevent interference with charms and app switching
  • Use charms contracts where applicable
    • Share
    • Search
    • Picker
  • Need to include a privacy statement within settings
  • Weight functionality toward edges for higher usability
    • Center screen requires a posture change
  • Sharing & Ages
    • Limit < age 12
  • Reserved Space
  • Asset sizes
    • 100%
    • 140%
    • 180%
  • Invest in a great live tile
    • Consider different sizes

F# Record Types

In the last post we talked about tuples and how they’re a convenient construct for grouping data. For all their convenience though sometimes tuples aren’t quite powerful enough. Sometimes we want to name the values with something more descriptive than fst or snd. Sometimes we may even want to attach additional members to the type. In these cases we may be inclined to fall back on traditional OOP and create a class – OOP is one of the paradigms supported by F#, after all. In some cases a class may be appropriate but F# offers a syntactically lighter-weight alternative – record types.

Although at their core record types are classes in MSIL but in F# they differ from classes in a few important ways. First, record types have structural equality rather than reference equality. This is achieved by generating overrides for CompareTo, GetHashCode, and Equals. Next, every label is automatically exposed as a read-only property. Finally, record types give us no control over the constructor.

Defining and Creating Record Types

Defining a record type is much easier than defining a class in F#. In keeping with the circle measurements theme from the tuple post, we can create a record type to contain the same measurements as the tuple as follows:

> type CircleMeasurements = { Diameter : float; Area : float; Circumference : float; };;

type CircleMeasurements =
  {Diameter: float;
   Area: float;
   Circumference: float;}

Each of the names within the record definition (the names between the curly braces) are called labels and they must have a type annotation. If we wanted to put each label on its own line we could omit the semicolons.

Creating instances of record types is accomplished with record expressions. With basic record expressions we assign a value to each label:

> let measurements = { Diameter = 5.0; Area = 19.63495408; Circumference = 15.70796327 };;

val measurements : CircleMeasurements = {Diameter = 5.0;
                                         Area = 19.63495408;
                                         Circumference = 15.70796327;}

Just as with defining the type, we can omit the semicolons between the values if we place them each on a separate line.

Conflict Resolution

Occasionally we’ll run across multiple record types with the same labels. When this happens, the type inference engine can’t resolve the proper record type and we need to provide a little more information to the compiler to ensure that the correct type is selected. Consider a system that has both favorites and bookmarks defined as record types:

> type Favorite = { Name : string; Url : string };;

type Favorite =
  {Name: string;
   Url: string;}

> type Bookmark = { Name : string; Url : string };;

type Bookmark =
  {Name: string;
   Url: string;}

You can see that both of these record types has the labels Name and Url. If we were to create an instance as we saw above the compiler will infer that we want the Bookmark type because it was evaluated more recently than Favorite.

> let mySite = { Name = "Didactic Code"; Url = "http://davefancher.com" };;

val mySite : Bookmark = {Name = "Didactic Code";
                         Url = "http://davefancher.com";}

If we really need to use the Favorite type instead of Bookmark we need to explicitly identify the it by prefixing it on the first label:

> let mySite = { Favorite.Name = "Didactic Code"; Url = "http://davefancher.com" };;

val mySite : Favorite = {Name = "Didactic Code";
                         Url = "http://davefancher.com";}

Copy & Update

As with everything else in F#, record types are immutable by default. The implication for record types is that if we want to make a change to an existing instance we need to copy that instance and provide new values where appropriate. Record expressions not only allow creating new instances from scratch but also provide a convenient copy & update syntax. Consider the Favorite instance from the last section. If we needed to change the Url value we could use the copy & update syntax and specify mySite as the template:

> let myOtherSite = { mySite with Url = "http://www.davefancher.com" };;

val myOtherSite : Favorite = {Name = "Didactic Code";
                              Url = "http://www.davefancher.com";}

With copy & update syntax the system will copy forward any values not explicitly assigned. If we were changing multiple values we’d follow the same pattern, separating each label/value pair with a semicolon.

Mutability

If we actually do want to allow changes to a record we can easily make individual labels mutable by prefixing each label we want to make mutable with the mutable keyword. For instance, we can change the type definition for the Favorite type to allow changing the Name as follows:

> type Favorite = { mutable Name : string; Url : string };;

type Favorite =
  {mutable Name: string;
   Url: string;}

Creating an instance of the Favorite type is the same as we’ve already seen:

> let mySite = { Name = "My Blog"; Url = "http://davefancher.com" };;

val mySite : Favorite = {Name = "My Blog";
                         Url = "http://davefancher.com";}

Changing the mutable value is the same as changing any other mutable value in F#:

> mySite.Name <- "Didactic Code";;

val it : unit = ()

> mySite;;

val it : Favorite = {Name = "Didactic Code";
                     Url = "http://davefancher.com";}

Additional Members

Like classes, record types can have additional members such as methods defined on them. If we wanted to extend our Favorites type to include a method to open the associated page we could do so with a simple Visit method that invokes Process.Start.

> open System.Diagnostics;;
> type Favorite =
    { Name : string; Url : string }
    member this.Visit() = Process.Start("IExplore.exe", this.Url);;

type Favorite =
  {Name: string;
   Url: string;}
  with
    member Visit : unit -> Process
  end

Invoking the method is just like invoking a method on any other type in F#:

> let mySite = { Name = "Didactic Code"; Url = "http://davefancher.com" };;

val mySite : Favorite = {Name = "Didactic Code";
                         Url = "http://davefancher.com";}

// Remove the pipe forward to ignore to see process details
mySite.Visit() |> ignore

Pattern Matching

One area that I really haven’t devoted enough space to so far is F#’s pattern matching. I plan to explore pattern matching in much more detail in a future post so for now, if you’re not already familiar with the concept it’s enough to think of it as a much more powerful version of C#’s switch statements (or VB’s select case).

When using record types with pattern matching we need to use a record pattern. Record patterns deconstruct the record into it’s components so we can work with them individually. Record patterns don’t require us to use every value from the record either so we’re free to only include the relevant parts.

Identifying a point’s quadrant in a two-dimensional coordinate plane lets us see most of these concepts in action.

> type Point = { X : float; Y : float }

let FindQuadrant point =
  match point with
  | { X = 0.0; Y = 0.0 } -> "origin"
  | { X = 0.0 } -> "y-axis"
  | { Y = 0.0 } -> "x-axis"
  | p when p.X > 0.0 && p.Y > 0.0 -> "I"
  | p when p.X < 0.0 && p.Y > 0.0 -> "II"
  | p when p.X < 0.0 && p.Y < 0.0 -> "III"
  | p when p.X > 0.0 && p.Y < 0.0 -> "IV"
  | _ -> failwith "unknown";;

type Point =
  {X: float;
   Y: float;}
val FindQuadrant : Point -> string

In this example we see matching against both the X and Y coordinate to determine if the point is the origin. We then see matching against either the X or Y coordinate to determine if the point is on an axis. With the special cases accounted for we move on to using when conditions to check for the individual quadrants. Finally, we have a wildcard pattern that uses failwith to throw an exception.

Wrapping Up

Record types provide a concise and convenient syntax for quickly creating types with named values and optional members.

In the next post we’ll explore another one of F#’s powerful constructs, the discriminated union, and see how it can help us quickly create simple object hierarchies.