Languages

F# Basic Units of Measure

As I’ve been outlining the things I want to write about in regards to F# there are only a few topics that excite me more than units of measure.  If you’ve ever had to write code that deals with different measurement units (and who hasn’t?) this feature can quickly become one of your best friends. I really struggled with where to include this in the series but ultimately decided that it made the most sense to keep it near primitive data types since they’re so closely related.

The Case for Units of Measure

My current project at work involves a bit of image manipulation.  If we were only working with pixels this wouldn’t be an issue but we actually need to translate some values between inches and pixels.  This, of course, means we have three distinct units of measure: inches, pixels, and dots per inch for conversion.  Keeping track of which types and methods require pixels versus those that require inches can be troublesome to say the least.  We can mitigate the problem a bit through naming conventions but simply calling something widthInInches does not guarantee that the supplied value is actually in inches.

This example is pretty insignificant in terms of impact on the world but using the wrong unit of measure can have devastating consequences.  Wouldn’t it be nice if the compiler could enforce using the correct measurement?  Units of measure in F# do just that.

Before we dive in to defining units I think it’s helpful to start from an example that doesn’t have them. Consider the following:

let convertToPixels inches resolution =
	inches * resolution
	
let convertToInches pixels resolution =
	pixels / resolution

let resolution = 150.0
let pixels = convertToPixels 8.0 resolution
let inches = convertToInches pixels resolution

In this simple example we have functions to convert between inches and pixels. The parameter names guide us but there’s nothing enforcing any rules about the values. How can we be sure that 8.0 and 150.0 are actually inches and dots per inch? The answer, of course, is to define and apply some units.

Introducing Units of Measure

Units of measure are simply specially annotated types that can be associated with numeric (both signed integral and floating point) values. Defining a basic unit of measure is easy; just define an opaque (memberless) type and decorate it with the Measure attribute. For instance, we can define dpi, inches, and pixels as follows:

[<Measure>] type dpi
[<Measure>] type inch
[<Measure>] type px

If your application requires SI units (International System of Units) you can save yourself some time as these units are already defined. For F# versions prior to 3.0 you can find them in the F# PowerPack on CodePlex. In F# 3.0 they are included in the FSharp.Core assembly and found under the Microsoft.FSharp.Data.UnitSystems.SI.UnitNames and Microsoft.FSharp.Data.UnitSystems.SI.UnitSymbols namespaces.

Identifying values of a particular unit is similarly trivial:

let dpiValue = 150.0<dpi>
let inchValue = 8.0<inch>
let pxValue = 1200.0<px>

Microsoft recommends using only using units of measure with floating point expressions to avoid potential conversion problem. I’ll be following that convention throughout this article.

In the above example, each value is qualified by a particular measure. With this knowledge we can modify our conversion code to enforce that only properly qualified values can be supplied.

[<Measure>] type dpi
[<Measure>] type inch
[<Measure>] type px

let convertToPixels (inches : float<inch>) (resolution : float<dpi>) =
  float(inches * resolution) * 1.0<px>

let convertToInches (pixels : float<px>) (resolution : float<dpi>) =
  float(pixels / resolution) * 1.0<inch>

let resolution = 150.0
let pixels = convertToPixels 8.0 resolution
let inches = convertToInches pixels resolution

The function signatures have been updated with type annotations to qualify the parameters with the required unit of measure. If we attempt to run this code though the compiler will raise an error because the values being passed to the functions haven’t been qualified. To continue we just need to update the value definitions with the correct units:

let resolution = 150.0<dpi>
let pixels = convertToPixels 8.0<inch> resolution
let inches = convertToInches pixels resolution

Changing Units of Measure

Not all code is aware of measures so there will be plenty of occasions when we’ll need to either add or remove units of measure from an existing value and there are a couple of approaches to both.

Adding Units of Measure

If your data is coming from an external source you’ll need to add units to it to use it with your measure aware code. The easiest way to add units is to take the raw value and multiply it by some value with the desired unit.

> 150.0 * 1.0<dpi>
val it : float<dpi> = 150.0

Alternatively we can use a library function from the LanguagePrimitives module in FSharp.Core.

> LanguagePrimitives.FloatWithMeasure<dpi> 150.0
val it : float<dpi> = 150.0

Removing Units of Measure

If your data needs to be used by something that isn’t measure aware you’ll need to strip away the units. Like with adding units, removing units can be done with a simple mathematical operation. To easily remove units just multiply the measured value by another value of the same units.

> 150.0<dpi> / 1.0<dpi>;;
val it : float = 150.0

We can also remove units by passing the measured value to the conversion function that corresponds to it’s underlying type.

> float 150.0<dpi>;;
val it : float = 150.0

Relating Units of Measure

In the physical world there is often a correlation between different units of measures. F# allows us to express those relationships naturally. Continuing with the theme of image processing let’s look at dots per inch. So far we’ve used a unit named dpi to define dots per inch. On it’s own this is already a huge improvement over a naming convention but we can express the concept much more eloquently by adding another measure and defining relationships between them.

[<Measure>] type px
[<Measure>] type inch
[<Measure>] type dot = 1 px
[<Measure>] type dpi = dot / inch

We added in the dot measure with a formula that identifies a dot as being equivalent to a pixel (the 1 is optional, see the rules section below for more information). We also modified the dpi measure with a formula defining it as being dots divided by inches. These two changes have a profound impact on our conversion code. By defining the formulas and including a type annotation for the function return value the compiler has enough information to convert the return values from float<dpi inch> and float<px/dpi> to float<px> and float<inch>, respectively.

let convertToPixels (inches : float<inch>) (resolution : float<dpi>) : float<px> =
  inches * resolution

let convertToInches (pixels : float<px>) (resolution : float<dpi>) : float<inch> =
  pixels / resolution

Measure Formula Rules

There are some rules to keep in mind when writing unit formulas. Some highlights:

  • Positive and negative integral powers are supported
  • Spaces (or *) between measures indicate a product
  • A / character between measures indicates a quotient
  • 1 is allowed to express a dimensionless quantity or with other units

Runtime Implications

Units of measure are a feature of F#’s static type checking logic and are not included in the compiled code. The implication is that except in certain scenarios we can’t write anything to detect units of measure at runtime.

Next Steps

This article introduced units of measure and demonstrated how they can improve the quality of your code but we’ve only examined how to apply them to simple values and pass specific units to functions. In the next article in this continuing series we’ll look at some other ways to use units of measure including:

  • Adding static members to measures
  • Using generic measures
  • Defining custom measure-aware types

F# Primitives

If you’re reading this I’m assuming that you have a background in software development (.NET in particular) so I won’t do more than show the keyword to .NET type mappings and highlight a few notable items.

As a .NET language F# supports the same primitives as the traditional .NET languages.  Also like other .NET languages, F# supports suffixes on most numeric types to remove ambiguity.  Using suffixes in your code can help the type inference engine resolve types, reducing the need for explicit type annotations.

Mappings

F# Keyword .NET Type Suffix
bool System.Boolean
unit N/A
void System.Void
char System.Char
string System.String
byte System.Byte uy
sbyte System.SByte y
int16 System.Int16 s
uint16 System.UInt16 us
int (or int32) System.Int32 l (optional)
uint (or uint32) System.UInt32 u
int64 System.Int64 L
uint64 System.UInt64 UL
nativeint System.IntPtr n
unativeint System.UIntPtr un
decimal System.Decimal m (or M)
float (or double) System.Double
float32 (or single) System.Single f (or F)

Although it is not a primitive type, F# also exposes System.BigInteger via the bigint keyword for computations with integers larger than 64-bits.

Unit

You should already be familiar with most of the types listed above but there’s one type that’s specific to F#.  The unit type, denoted by (), represents the absence of an actual value but should not be confused with null or void.  The unit value is generally used where a value would be required by the language syntax but not by the program logic such as the return value for functions that are invoked for their side-effect(s) rather than their result.

Type Conversions

One important way that F# differs from other .NET languages is that it does not allow implicit type conversions because of bugs that can arise due to type conversion errors.  Instead we can explicitly convert values using the built-in conversion functions.

(float 1) + 2.0 |> printfn "%A"

In the example we convert the int value to float by passing it to the float function and add it to the existing float value.

Introducing F#

If you’ve been watching my blog over the past few months you’ve probably noticed a few posts about F#.  If you’ve spoken to me about programming and the conversation has turned to F# you’ve probably had a hard time getting me to stop talking about it.  I’ve known about F# for a few years but despite wanting to learn it and a few false starts I really hadn’t done anything but glance at it until a few months ago.

I’ve been using C# as my primary language since I started with .NET but the past few years there has been something really bothering me about the language.  It wasn’t until one afternoon while mowing the lawn and listening to Hanselminutes #311 when I heard Phillip Trelford pinpoint one of my issues in a much more eloquent and entertaining manner than I could.  He likened writing C# to completing local government forms in triplicate.  C# has a way of making us describe the same thing multiple times.  The result is that we end up writing extra code to make the compiler happy rather than solving a problem.  It was with this newly found clarity that I decided to dive into F#.

Since getting serious about learning the language I’ve taken a PluralSight course, read Programming F#, referenced Real World Functional Programming a few times, and have begun doing most of my prototyping code with it.  I still have a lot to learn but I finally feel that I’ve reached a point where I’m comfortable enough with the language that I can start sharing my love a bit.

Just in case it isn’t apparent yet, this is the first in what will be an ongoing series about F#.  I’m by no means an F# expert and my intent isn’t to provide a comprehensive reference.  Instead I’m going to introduce the language, its features, and document some of the aspects I find most interesting as I continue to learn and explore.

What is F#?

Originally developed in 2005 at Microsoft Research, Cambridge, F# is a case-sensitive, statically typed, multi-paradigm language targeting the .NET framework. F# belongs to the ML family and is heavily influenced by OCaml in particular.  Like other languages in the ML family F# makes heavy use of type inference often making explicit type annotations unnecessary.

As an ML language, F# emphasizes functional programming and as such, it sports a variety of concepts from functional languages such as first-class functions and immutability.  However, it cannot be considered purely functional because it allows for side-effects including optional mutability.  Because it targets the .NET Framework, F# code compiles to MSIL.  F# assemblies can also consume or be consumed by other .NET assemblies.

Anatomy of an F# Application

If you’re diving in to F# from a traditional .NET background like me, F# is probably going to be a bit of a shock.  F# differs from traditional .NET languages in virtually every way including project organization and programming style.

Top-Down Evaluation

In traditional .NET languages it’s standard practice to include only one type per file.  In these projects the files are almost always organized into a neatly organized folder hierarchy that mirrors the namespaces in the project.  This is most definitely not the case with F# where related types are often contained within the same file and files are evaluated from top-down in the order they appear in the project.

Top-down evaluation is a critical concept in F# in that it enables a number of language features including its powerful type inference and entry point inference capabilities.  Top-down evaluation can be a source of frustration though if you forget to properly organize new files since you can only access types defined earlier in the same file or in a file higher up in list.

Modules & Namespaces

Whether explicitly defined through code or not, each file in an F# application must be part of a module or namespace.  Modules are roughly equivalent to static classes in C# program while namespaces are organizational units just like in other .NET languages.  When a module or namespace isn’t explicitly declared, the compiler generates a default module  named after the code file.  For example, if you have a file named MyFile.fs, the compiler will generate a module named MyFile.

Whitespace

Where other languages use a variety of syntactic elements to denote code blocks (semicolons, curly braces, BEGIN, END, etc…) F# uses whitespace (see whitespace note below).  Code blocks are created by indenting the contents of the block beyond the beginning of the block.  Consider the following code:

let add x y = x + y

printfn &quot;%i&quot; (add 1 2)

In the example we define an add function that accepts two values. The body of the function is indented two spaces beyond the definition. If we were to remove the indentation the compiler would greet us with a warning about possible incorrect indentation.

Whitespace: Spaces or Tabs?
F# actually allows us to code using either an explicit syntax or a lightweight syntax.  Lightweight syntax is generally considered more readable because it lets us omit some language elements by making whitespace significant to organize code into blocks.  Since indentation level is significant for code blocks in lightweight syntax and because tabs can indicate any number of spaces F# puts a quick end to the unending debate over tabs or spaces by explicitly forbidding tabs in the language specification (section 15.1 for those interested).

Values, Bindings, & Immutability

Another major way that F# differs from traditional .NET languages is due to its functional nature.  .NET has always had some support for limited functional programming.  Even since the early days of .NET we’ve had support for delegates with anonymous functions and lambda expressions coming much later.  For the most part though functional programming in .NET has been pretty limited.  F# changes all that by focusing on functional programming rather than changes in state.  As such, all values in F# are immutable by default.  In fact it is generally a misnomer to refer to values in F# as variables.

Mutability is often the reason for subtle bugs that arise because of inadvertent changes to program state.  By enforcing immutability the likelihood of this type of error is greatly reduced.  Another implication of the immutable nature of F# is that asynchronous and parallel processing is greatly simplified since we don’t need to worry (as much) about side-effects.

In F# we bind a name to a value using the let keyword.

let name = &quot;Dave&quot;

Once we have bound a name to a value in this manner that value cannot be changed. In many cases it’s possible to “shadow” the original value by defining another binding with the same name. With shadowing, the original value still exists but is not accessible.

let name = &quot;Dave&quot;
let name = &quot;David&quot;

Immutability isn’t always desirable though.  Consider the case of a property’s backing variable.  If the property is writable we’ll generally want to change the value of its backing variable.  In cases like that we can simply include the mutable keyword in the backing variable’s definition to make it mutable.

let mutable name = &quot;Dave&quot;

Changing a mutable value is simple but note that the assignment operator is an arrow:

let mutable name = &quot;Dave&quot;
name &lt;- &quot;David&quot;

Return Values

Every expression in F# must return a value.  Because of this constraint it is assumed that the value returned from evaluating the last line of a function will be the return value so it is implicitly returned.  This is generally a great convenience but what if your function exists solely for it’s side-effect (such as printing something to the console)?

The unit type which is roughly equivalent to void in C# exists for this purpose.  To return unit from a function simply make the last line read ().

let add x y =
  printfn &quot;%i&quot; (x + y)
  ()

Sometimes we want to invoke a function for its side-effect and want to ignore the return value. If we were to forego binding the result to a name as is often the practice in other .NET languages the F# compiler will generate a warning about the ignored value. To work around the warning we can simply pipe the result (I’ll talk about piping in a future post) to the built-in ignore function.

let add x y = x + y

add 1 2 |&gt; ignore

Nulls

F# only has a limited concept of null. In most cases, null isn’t permitted but there are certain circumstances such as when interoperating with other .NET languages where it’s necessary.  If it’s absolutely necessary for an F# type to support null the AllowNullLiteral attribute can be used but it should be used sparingly and generally only when other non-null options have been exhausted.

Type Inference

I have to make a confession that will annoy some static typing purists.  I use var whenever possible in C#.  I like static typing but I like the DRY principle even more and I hate compiler inflicted repetition.  I see no reason why I should have to continually tell the compiler what data type I’m working with.  By using var to tap into the type inference engine in C# I’m able to avoid some of the repetition.  F# takes type inference in .NET to a new level.

If you’ve been paying attention to the examples, particularly those involving functions, you’ve probably noticed that the code samples haven’t had any explicit mention of types.  F#’s type inference engine gives me everything I want: static typing without the repetition.  The type inference engine is so powerful that it often isn’t necessary to explicitly annotate types even in function signatures.  It’s easily one of those features that separates the language from the pack in my eyes.

Let’s consider the add method again:

let add x y = x + y

add 1 2 |&gt; printfn &quot;%A&quot;

In this simple example the compiler is able to infer from usage that the arguments x and y are of type int. If we later decide that we need to pass float values we only need to change the type passed to the function:

let add x y = x + y

add 1.0 2.0 |&gt; printfn &quot;%A&quot;

If it isn’t clear how much impact this inference can have on readability and maintainability consider the equivalent code in C#:

Func&lt;int, int, int&gt; add = (x, y) =&gt; x + y;

Console.WriteLine(add(1, 2));

Even in this contrived example the difference is obvious. If we wanted to change the C# version to use float we’d have to make five changes instead of two!

Unfortunately, the compiler isn’t always able to infer the types. In those cases we need to turn to type annotations and give the compiler some help. Here’s the add function modified to always accept int values:

let add (x : int) (y : int) = x + y

add 1 2 |&gt; printfn &quot;%A&quot;

In this modified example, if we were to pass anything other than int values, the compiler would produce an error. We can even provide an annotation for the return type as follows:

let add (x : int) (y : int) : int = x + y

add 1 2 |&gt; Console.WriteLine

At this point we’ve reached the same level of complexity as the C# version so the gains are minimal but when a function is used consistently the compiler should have no problem inferring the correct types.

Type annotations aren’t limited to function signatures either.  We can use a similar syntax to instruct the compiler to enforce a particular type with value bindings too:

let name : string = &quot;Dave&quot;

Generally speaking though, defining a binding in this manner is seldom required. Most of the time the inference engine will determine the correct type. In the case of numeric types we can use type suffixes to give the compiler a hint about the actual type:

let int32Value = 10
let int64Value = 10L
let byteValue = 10y

Next Steps

Having only highlighted some of the very high level concepts available in F# I’ve barely scratched the surface of what it can do.  I hope this has at least piqued your interest enough to continue looking at the language and consider making it part of your toolkit.  In the coming weeks I’ll be writing more posts taking a closer look at many of the language’s features including some of the functional types, pattern matching, function currying, and object-oriented capabilities

In the mean time if you’d like to explore on your own or engage with the community here are a few resources for you:

C# 5.0 Breaking Changes

In the Language Lab section of the November 2012 issue of Visual Studio Magazine Patrick Steele highlights some of the lesser known changes to C#. Among the changes are some new attributes to help obtain caller information without having to resort to directly accessing StackFrames but that’s not what I want to call attention to. The more important part of his article are some breaking changes that anyone moving to C# 5 should be aware of.

The first of the breaking changes relate to capturing the value of an iteration variable in a lambda expression. If you’ve ever written a loop where the body contained a lambda expression that directly used the iteration variable you’ve encountered some unexpected behavior.  Consider Patrick’s example:

var computes = new List<Func<int>>();

foreach(var i in Enumerable.Range(0, 10))
{
	computes.Add(() => i * 2);
}

foreach(var func in computes)
{
	Console.WriteLine(func());
}

Without knowing the old behavior one could reasonably assume that the second loop would print out 0 – 18 (by 2s of course) but that’s not what happens. Prior to C# 5.0 deferring execution of the lambda expression to the second loop causes the expression to use the last value of i (9) so the number 18 is printed 10 times. We can observe similar behavior in LINQ as it iterates over sequences. The way to work around it was to create a state variable and capture it in a closure like in this modified example:

var computes = new List<Func<int>>();

foreach(var i in Enumerable.Range(0, 10))
{
	var state = i;
	computes.Add(() => state * 2);
}

foreach(var func in computes)
{
	Console.WriteLine(func());
}

Under C# 5.0 using a state variable is no longer necessary. The compiler will handle capturing the value of the iteration variable when it’s created.

The other breaking change relates to how named and positional arguments are handled. I typically only use explicit, ordered parameters so the old behavior never really affected me but previous versions of the compiler would evaluate named arguments before evaluating the ordered parameters. This behavior wasn’t particularly intuitive so it has been changed in C# 5.0. The only time this would really be a problem is when the expression being evaluated affected subsequent expression evaluations but since the change does affect compiler behavior it’s important to be aware of.

Using LoopingSelector and ILoopingSelectorDataSource

Custom Time EntryThis week I finished and submitted my second Windows Phone app. On one of the pages I wanted to allow users to enter a custom TimeSpan in a manner similar to entering a date or time. Of course, the SDK doesn’t directly provide the controls to replicate that experience so I turned to the LoopingSelector in the Silverlight Toolkit for Windows Phone for help. Unfortunately, documentation is pretty sparse so I’m hoping this can help someone else out.

(more…)

String Calculator TDD Kata in F#

In a recent post I mentioned that I started learning F#.  Over the past few weeks I’ve really started falling in love with the language.  Coincidentally, my team has been incorporating more test-driven development (TDD) into our process and as part of that training Roy Osherov’s TDD Kata 1 was mentioned.  I immediately thought “Hey! Let me try doing that in F# and see how it goes!”  I’ve run through the kata more than a few times now and I’m ready to share my experience with the world. (more…)

[WPF] Formatting Content Controls

Over the past few years I haven’t done much UI work and what little I did was either with a proprietary Web framework or Windows Forms so I’m more than a little behind the curve with WPF. My most recent project has me learning WPF on the fly and although it’s a daunting task and I still have plenty to learn I’m finding it pretty straightforward. Every once in a while though something works completely different than I expect and throws me for a bit of a curve.

One such example just happened yesterday. I was trying to bind and format a value to a Button‘s Content property. I was trying to bind using the same syntax as binding to the Text property on a TextBox:

<Button Content="{Binding Percentage, StringFormat={}{0:P0}}" />

I was surprised though when the value bound properly but wasn’t formatted. After some investigation I learned that controls like Button that inherit from ContentControl can’t be formatted that way. Instead we need to use the ContentStringFormat to achieve the same effect:

<Button Content="{Binding Percentage}" ContentStringFormat="P0" />

With that change the value started showing up as the percentage just like I expected. Lesson learned.

NUnit With F#

Visual F#A few weeks ago I started learning F#. As I moved from the simple examples that were easily executable in F# Interactive or LINQPad though I found myself wanting to write formal unit tests to make sure I was understanding the concepts correctly. I decided to write my tests in F# for more practice and because its syntax makes it an attractive choice but in order to keep the learning curve under control a bit I thought it best to stick with a familiar test framework – NUnit. Using NUnit with F# is pretty trivial but being so new to the language I ran into a few small obstacles.

Importing Types

In order to use NUnit we need to import the types from the NUnit.Framework namespace. Importing the types is accomplished with the open keyword which looks and behaves much like the using directive in C#.

open NUnit.Framework

Don’t get too excited yet though because the compiler is probably yelling at you.

Most F# tutorials that I’ve seen provide examples that are easily run within F# Interactive (FSI), LINQPad, or a single F# Application (exe) project. They generally only include a few types and almost never span multiple files let alone multiple assemblies or languages. This is great for introducing concepts but as soon as you move to a more complex library project (like one for testing) the compiler will politely inform you of a problem depending on where you’ve placed the declaration:

Files in libraries or multiple-file applications must begin with a namespace or module declaration, e.g. ‘namespace SomeNamespace.SubNamespace’ or ‘module SomeNamespace.SomeModule’

If you’ve placed the directive at the beginning of the file before declaring a namespace or declaration you’ll see the above error. To fix this problem I like to take the namespace route because I’m just trying to group my testing types and it follows the organization pattern I’d be using were I writing the tests in C#.

namespace StringCalculatorKata.Logic.Tests

open NUnit.Framework

[]
type StringCalculatorTests() =
  []
  member x.Add_EmptyString_ReturnsZero() =
    // Test Code Omitted
  [<TestCase("1", Result=1)>]
  [<TestCase("2", Result=2)>]
  member x.Add_SingleNumber_ReturnsThatNumber calcString =
    // Test Code Omitted

Applying Attributes

With the types imported we can identify our fixtures and tests. The good news is that we accomplish this with attributes just like with other .NET languages. The bad news is that although it’s simple, it’s not immediately obvious. If you’re reading this you should already know that TestFixtureAttribute applies to the classes that hold your tests and that TestAttribute and TestCaseAttribute identify tests. But how does that translate to F#?

One answer is to define a type and decorate it with the appropriate attributes. You’ll want to make sure you’re applying the test attributes to member bindings rather than let bindings here because let bindings compile to internal members whereas member bindings compile to public members by default.

[]
type StringCalculatorTests() =
  []
  member x.Add_EmptyString_ReturnsZero() =
    // Test Code Omitted
  [<TestCase("1", Result=1)>]
  [<TestCase("2", Result=2)>]
  member x.Add_SingleNumber_ReturnsThatNumber calcString =
    // Test Code Omitted

Were you aware though that since .NET 2.0 you could apply TestFixtureAttribute to a static class as well? If you prefer this approach you can define your fixture as a module since modules compile down to static classes.

[]
module StringCalculatorTests =
  []
  let Add_EmptyString_ReturnsZero() =
    // Test Code Omitted
  [<TestCase("1", Result=1)>]
  [<TestCase("2", Result=2)>]
  let Add_SingleNumber_ReturnsThatNumber calcString =
    // Test Code Omitted

Aside from being a module definition, notice that the attributes are applied to let bindings in the module. Module level let bindings compile to public static members by default.

Assertions

So far I’ve been careful to avoid including assertions in the sample code. Not to worry, assertions in F# are the same as they’d be in any other language but you do need to be aware of a syntactic nuance that affects many of the assertion methods.

In short, the compiler’s overload resolution mechanism requires you call them using a syntactic tuple form (syntax that looks like a tuple but isn’t) rather than the curried form that’s so common in F#. This means that instead of separating the individual parameters with spaces you need to call the method like you would in C#:

type StringCalculatorTests() =
    []
    member x.Add_EmptyString_ReturnsZero() =
        let calc = new StringCalculator()
        let result = calc.Add ""
        Assert.That(0, Is.EqualTo 0)

Of course, if you’re using TestCaseAttribute with the optional Result parameter you’ll be able to eliminate the manual assertion altogether in many cases:

[]
type StringCalculatorTests() =
  [<TestCase("1", Result=1)>]
  [<TestCase("2", Result=2)>]
  member x.Add_SingleNumber_ReturnsThatNumber calcString =
    let calc = new StringCalculator()
    calc.Add calcString

Specifications, Expression Trees, and NHibernate

For my latest project we decided to follow a domain driven approach.  It wasn’t until after we had built out much of the domain model that we decided to change course and embrace CQRS and event sourcing.  By this point though we had already developed several specifications and wanted to adapt what we could on the query side.

(more…)

String Distances

One of my first projects at Leaf has been trying to match some data read from an OCR solution against what is in the database.  As sophisticated as OCR algorithms have become though it’s still not reliable enough to guarantee 100% accurate results every time due to the number of variations in type faces, artifacts introduced through scanning or faxing the document, or any number of other factors.

Most of the documents I’ve been working with have been pretty clean and I’ve been able to get an exact match automatically.  One of my samples though, has some security features that intentionally obfuscate some of the information I care about.  This naturally makes getting an exact match difficult.  Amazingly though, the OCR result was about 80% accurate so there was still some hope.

One of my coworkers suggested that I look at some of the string distance algorithms to see if any of them could help us get closer to an exact match.  He pointed me at the Levenshtein algorithm so I took a look at that along with the Hamming and Damerau-Levenshtein algorithms.

For the uninitiated (like myself a week ago), these algorithms provide a way to determine the distance between two strings.  The distance is essentially a measurement of string similarity. In other words, they calculate how many steps are required to transform one string into another.

I want to look briefly at each of these and show some examples. Note that each of these algorithms are case sensitive but modifying them to ignore case is trivial.

(more…)