Software Development

TypeScript: Bringing Sanity to JavaScript

As much as I’d prefer to be working in F#, I’ve found myself doing quite a bit of JavaScript lately. I’m hardly a JavaScript expert but I like to think my skills are above average (although that could merely be wishful thinking). I won’t deny that JavaScript has its problems but I think that it has traditionally been maligned by things that aren’t necessarily the language’s fault; for instance, DOM inconsistencies are hardly under JavaScript’s control. Generally speaking, I really do enjoy working with JavaScript. That said, there are things such as prototypal inheritance that I always struggle with and I know I’m definitely not alone. That’s why I decided to take a look at TypeScript now that it has finally earned a 1.0 moniker.

By introducing arrow function expressions, static typing, and class/interface-based object-oriented programming, TypeScript brings sanity to JavaScript’s more troublesome areas and gently guides us to writing manageable code using familiar paradigms. Indeed, writing TypeScript feels almost the same as writing C# (with a dash of Java), which is hardly a surprise given that its lead architect is Anders Hejlsberg.

If you’re currently working primarily with the Microsoft stack, adding TypeScript to your environment is easy. Because TypeScript is a proper superset of JavaScript, all of your existing JavaScript is already valid TypeScript. What’s more, while the TypeScript compiler emits JavaScript, it also produces a .map file which allows you to debug using the original TypeScript source!

In this article, I’ll introduce a number of what I think are TypeScript’s most important features, giving a brief overview of each. This isn’t intended to be a comprehensive survey of the language but hopefully it will give you enough information to whet your appetite.

Functional Goodness

As a functional guy, I strongly believe that JavaScript is at its best when its functional nature is embraced. TypeScript includes a few features that make functional programming more accessible.

Arrow Function Expressions

Arrow function expressions are hardly the most important aspect of TypeScript but I like them so much and will be using them throughout this article so I thought it best to describe them up front.

One of the things that bothers me about JavaScript is its lack of lambda expressions. JavaScript’s anonymous functions certainly serve the same purpose but oftentimes they introduce a lot of noise that can obscure the function’s logic. Consider a simple anonymous function for calculating the area of a circle:

function (radius) { return Math.PI * Math.pow(radius, 2); };

In this example, only about 60% of the code is actually what I’d consider useful; the rest is syntactic noise. Toss this form into a call to a higher-order function and the logic is further obfuscated. This has been especially bothersome since I’ve gotten so used to lambda expressions in C# and F#. ECMAScript 6 has a proposed arrow syntax that closely resembles C#’s lambda expressions but ECMAScript 6 adoption is a long way off. Fortunately, with TypeScript you can use it today. In TypeScript, you could redefine the above function like this:

radius => Math.PI * Math.pow(radius, 2);

The arrow function syntax is equivalent to the traditional syntax (the compiler actually expands it to the traditional syntax) but the amount of useful code is closer to 90% – a significant improvement!
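
To see how that plays out inside a call to a higher-order function, here’s a small sketch (the radii array is just for illustration) comparing the two forms with Array.map:

var radii = [1, 2, 3];

// Traditional anonymous function
var areas1 = radii.map(function (radius) { return Math.PI * Math.pow(radius, 2); });

// Arrow function expression
var areas2 = radii.map(radius => Math.PI * Math.pow(radius, 2));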

Optional and Default Parameters

One aspect of JavaScript that I consider to be both a blessing and a curse is its flexibility with function parameters. On one hand, it’s nice to be able to pass any number of values to a function and access them via the arguments pseudo-array. On the other hand, if I’ve taken the time to explicitly list each parameter, I should reasonably expect that a value will be supplied for each, yet JavaScript offers no such guarantee. TypeScript enforces that the proper number of arguments is supplied. Let’s look at a simple function for formatting a typical American name:

function formatName(first : string, last : string, middle : string) {
    return last + ", " + first + " " + middle;
}

Unfortunately for our function, not everyone has a middle name so if we try to invoke the function but supply values for only first and last, the compiler will warn that it cannot select an overload. To work around this, we can define middle as either an optional parameter or a default parameter.

To make middle an optional parameter, we simply need to append a question mark to the parameter name. We can then handle the parameter appropriately within the function body as shown:

function formatName(first : string, last : string, middle? : string) {
    return last + ", " + first + (middle ? " " + middle : "");
}

Alternatively, to make middle a default parameter we simply need to replace the type annotation with an assignment as follows:

function formatName(first : string, last : string, middle = "") {
    return last + ", " + first + " " + middle;
}

The compiler infers the proper type from the assignment and we’re guaranteed to have a value (unless someone passes null, of course).
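
With either variation in place, both of the following calls compile (the names are placeholders; with the default parameter version, an omitted middle name simply becomes an empty string):

formatName("John", "Doe");             // middle omitted
formatName("John", "Doe", "Quincy");   // middle supplied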

Function Overloading

Optional and default parameters are nice for simple scenarios but for more complex scenarios we can turn to function overloading. Because JavaScript doesn’t support overloading, TypeScript’s overloading capabilities are rather limited and are quite different from what you might expect.

TypeScript’s function overloads have two parts: the overload signatures and the implementation. The caveat is that the compiler must be able to resolve each overload signature to the implementation signature. Interestingly, each overload signature can resolve to an empty signature, so I’ve (so far) found it most convenient to provide the implementation as a parameterless function and resort to the traditional JavaScript approach of using the arguments pseudo-array to determine what was supplied based on the overload signatures. To see this in action, let’s look at an overload approach to the simple name formatting function we saw in the previous section:

function formatName(first : string, last : string);
function formatName(first : string, last : string, middle : string);
function formatName() {
    var first = arguments[0];
    var last = arguments[1];
    var middle = arguments.length > 2 ? arguments[2] : undefined;
    return last + ", " + first + (middle ? " " + middle : "");
}

By virtue of the overload signatures, we can invoke the formatName function with either 2 or 3 arguments. The implementation knows to expect both the first and last names so it assigns them to local variables unconditionally. The implementation also knows that it may or may not receive a middle name so it conditionally sets that value before concatenating and returning the values.
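
For example, each of these calls resolves against one of the overload signatures, while a call matching neither is rejected by the compiler:

formatName("Ada", "Lovelace");             // resolves to the two-parameter overload
formatName("Ada", "Lovelace", "Augusta");  // resolves to the three-parameter overload
// formatName("Ada");                      // compile error: no overload accepts a single argument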

Built-in Types

With a name like “TypeScript” you probably correctly guessed that the type system plays a significant role within the language. TypeScript shares a number of common types with JavaScript. For the purposes of this article, I’ll assume you’re already familiar with the common types such as Boolean, Number, String, Null, and Undefined and focus on what TypeScript adds.

Any

JavaScript is a dynamically typed language. Although dynamic typing is convenient for simple applications, it often leads to confusion and fragility in more complex systems. TypeScript deviates from JavaScript by using static typing.

The TypeScript language reference states that static typing is optional but this seems a bit misleading. In TypeScript, all types derive from the Any type; thus, every type is compatible with Any. In this sense, the Any type is much like Object in .NET or Java, as shown next:

var x: any = 10;
x = "a";
x = true;

Here we’ve defined the variable x to be of type Any by including a type annotation in the definition and setting its initial value to 10. We then change the value to a string before finally settling on a Boolean.

Where Any differs from Object (and where I think the statement about optional static typing originates, though that’s pure speculation) is that a value of type Any can be assigned to any other type. The following code shows this principle in action:

var x: number = 10;
var y: any = "a";
x = y;

In this example we define two variables, x and y. Here, x is explicitly defined as Number while y is defined as Any. On the final line, we assign y to x and the TypeScript compiler happily complies despite the differing types.
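
For contrast, the same assignment between two specifically typed variables is rejected, which is exactly what makes Any special:

var a: number = 10;
var b: string = "a";
// a = b;   // compile error: a string value cannot be assigned to a number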

Void

The Void type is similar to the void type in C# in that it is generally used as a function’s return type to indicate that the function returns no particular value and is invoked only for its effect. Where TypeScript’s Void type differs from that of other languages is that Void is also the base type for both null and undefined (null is also a subtype of every type other than undefined), so it’s technically possible to define a variable of type Void but doing so is rather pointless.
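
Here’s a minimal sketch of Void in its usual role as a return type (logMessage is just a made-up example):

function logMessage(message : string) : void {
    console.log(message);
}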

Enumerations

TypeScript includes first-class support for enumerations through C#-like enum declarations.

Defining

TypeScript’s enumerations let you associate a series of labels with known numeric values, thus avoiding “magic numbers.” A nice thing about TypeScript enumerations is that because they’re subtypes of Number, they can be used interchangeably with Number, but the TypeScript compiler protects against assigning the wrong enumeration type. Here’s an example of a simple enumeration that defines some shapes:

enum Shapes {
    None,
    Circle,
    Triangle,
    Rectangle
}

The previous sample follows the default behavior of implicitly assigning zero-based sequential values to the labels such that None is 0, Circle is 1, and so on. Just as in C#, you can override this default behavior by explicitly providing values inline as follows:

enum Shapes {
    None = 0,
    Circle = 10,
    Triangle = 20,
    Rectangle = 30
}

Now, instead of being automatically associated with a value, each label is explicitly assigned. Should any label not have an explicitly associated value, it will implicitly be assigned the previous label’s value plus one.
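
For instance, in the following sketch Triangle carries no explicit value, so it picks up where Circle left off:

enum Shapes {
    None,             // 0 (implicit)
    Circle = 10,
    Triangle,         // 11 (continues from the previous value)
    Rectangle = 20
}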

Despite their syntactic similarities with C#, TypeScript enumerations are actually quite different constructs. The way a TypeScript enumeration is represented in JavaScript is fascinating and highlights how flexible JavaScript truly is. Here’s the JavaScript representation of the Shapes enumeration using the default (generated) values:

var Shapes;
(function (Shapes) {
    Shapes[Shapes["None"] = 0] = "None";
    Shapes[Shapes["Circle"] = 1] = "Circle";
    Shapes[Shapes["Triangle"] = 2] = "Triangle";
    Shapes[Shapes["Rectangle"] = 3] = "Rectangle";
})(Shapes || (Shapes = {}));

The first time I saw this code it took me a little while to understand what it was doing but it’s actually quite simple once you recognize that it relies heavily on the fact that the assignment operator returns a value. At its heart, initializing the enumeration involves an immediate function which dynamically creates two properties for each named value. The first property is named for the case it represents and is assigned the case value (Shapes["None"] = 0). The second property is named for the case value (via the nested assignment) and is assigned the case name (Shapes[Shapes["None"] = 0] = "None"). These dual properties establish a sort of bi-directional relationship that allows you to fairly easily retrieve either the name or value via the property indexer syntax. For instance, if you wanted to get the name "Circle" you’d write this:

Shapes[Shapes.Circle]

Computed Members

When the case values are known constants (either explicitly supplied or implicitly generated), the members are said to be constant members. Where TypeScript enumerations get really interesting is that they can exploit the underlying JavaScript’s dynamic nature such that the explicitly associated values don’t need to be constants! Instead, each case can be based on a computation so long as that computation results in a number. For instance, we could define a Circles enumeration with each label associated with different computed areas like this:

var getArea = r => Math.PI * Math.pow(r, 2);

enum Circles {
    Small = getArea(1),
    Medium = getArea(5),
    Large = getArea(10)
}

The TypeScript compiler behaves differently when resolving computed members than it does when resolving constant members. With constant members, the compiler replaces the member reference with the constant value. Conversely, computed members are left as member references.
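
Per the behavior just described, a reference to a constant member is emitted with its value inlined while a reference to a computed member remains a property access. Here’s a rough sketch of the emitted JavaScript (the exact output can vary between TypeScript versions):

var shape = 1 /* Circle */;   // constant member reference: inlined as its value
var small = Circles.Small;    // computed member reference: left as a property access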

Type Annotations and Inference

As in any language with static typing, you need a way to inform the compiler what type to expect. TypeScript achieves this through F#-like type annotations on variable declarations as shown next:

var x : number = 10;
var y : string = "ten";

Here we’ve defined two variables, x and y, and annotated them to be of types number and string, respectively. Constantly telling the compiler what type to expect gets pretty tedious (it’s one of my main gripes about C#) but fortunately, TypeScript includes a rather robust type inference system which often makes explicit annotations unnecessary. Here are the same definitions without the type annotations:

var x = 10;
var y = "ten";

While these may look like traditional JavaScript definitions, they are in fact strongly typed. What’s happened is that the compiler inferred Number and String from the respective values as evidenced by hovering over the name in Visual Studio to reveal the appropriate type names.
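
Because these definitions are strongly typed, a later assignment of the wrong kind of value is flagged by the compiler:

x = 20;          // fine: still a number
// x = "ten";    // compile error: x was inferred to be number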

TypeScript’s type inference can also work contextually on function parameters as shown in this example adapted from the TypeScript handbook:

window.onmousedown = e => console.log(e.button);

The TypeScript compiler knows about onmousedown so it is able to infer that e is of type MouseEvent. Interestingly, despite TypeScript’s typically bottom-up approach to type inference, function parameters without such context are inferred to be of type Any. Let’s take another look at the getArea function introduced in the earlier Circles example to see this in action (note that this example uses another TypeScript feature, the arrow function expression, which we covered earlier in the Functional Goodness section):

var getArea = r => Math.PI * Math.pow(r, 2);

If we were to hover over getArea in Visual Studio, we’d see that its signature is (r: any) => number, which indicates that the sole parameter, r, is of type Any and getArea’s return value is Number. Unfortunately, TypeScript’s type inference capabilities aren’t quite as strong as F#’s, so despite Math.pow requiring a Number, r is still inferred to be Any. To improve getArea’s type safety we should revise it to identify the parameter type through an annotation like this:

var getArea = (r : number) => Math.PI * Math.pow(r, 2);

Now Visual Studio will correctly report the function’s type as (r: number) => number.

Classes

JavaScript has always been an object-based language. The problem with being object-based rather than object-oriented is that developers tend to want to treat it as an object-oriented language but get frustrated when core OO concepts like inheritance and encapsulation don’t work the same way as they do in more traditional OO languages. The name JavaScript has never helped matters, either. It’s often not that these things can’t be done in JavaScript or even that they’re difficult, it’s just that JavaScript’s approach is so foreign that developers often find them confusing.

Much of that confusion stems from the fact that objects aren’t really objects in the way that traditional OO developers think of them. Instead, JavaScript objects are dictionaries (or hashes, if you prefer) with some additional syntactic support. Sure, we can define functions that serve as constructors, nest values, and even reference something in an instance via the this identifier but all we’re really doing is defining a series of key/value pairs. This is especially apparent when using object literals.

TypeScript addresses this disconnect through formal classes that somewhat resemble C# classes. For instance, here’s a simple class that represents a circle:

class Circle {
    radius : number;

    constructor(radius : number) {
        this.radius = radius;
    }

    getArea() {
        return Math.PI * Math.pow(this.radius, 2);
    }
}

The traditional OO conventions should be pretty easy to spot in the Circle class. First, we have a field named radius, next is a constructor which accepts the radius, and finally there’s a method called getArea.

Aside from following a more conventional approach, TypeScript classes ensure that the JavaScript definitions are properly established on the object. For example, the getArea method is associated with Circle’s prototype rather than each individual instance, thus facilitating reuse and inheritance.

Access Control

In TypeScript, class members are public by default but can be made private by including the private access modifier. It’s important to note, though, that this access control does not extend beyond TypeScript’s boundaries into the compiled JavaScript. Because each member is directly associated with the object, anything defined as private will still be a property of the object and thus accessible via JavaScript. It’s also important to note that TypeScript has no concept of protected members.
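
Here’s a brief sketch of the private modifier in action (the Counter class is purely illustrative):

class Counter {
    private count : number;

    constructor() {
        this.count = 0;
    }

    increment() { this.count++; }
    getCount() { return this.count; }
}

var counter = new Counter();
counter.increment();
// counter.count;   // compile error in TypeScript: count is private
// In the emitted JavaScript, however, count is still an ordinary, accessible property.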

Accessors (Properties)

Fields will often be sufficient for associating data with an object but should you require more granular control over how values are accessed or modified, you can use the get and set accessors in conjunction with a backing field. Let’s revise our Circle class to use this approach:

class Circle {
    private _radius: number;

    constructor(radius: number) {
        this._radius = radius;
    }

    get radius() { return this._radius; }
    set radius(value) { this._radius = value; }

    getArea() {
        return Math.PI * Math.pow(this._radius, 2);
    }
}

Now the Circle class has a private _radius field. Access to that field is controlled through the radius get and set accessors. If we wanted to make the radius value read-only we could omit the set accessor, though as with private members this restriction only applies within the confines of TypeScript.
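
Using the accessors then looks just like using a plain field (note that get and set accessors compile down to Object.defineProperty, so they require targeting ECMAScript 5 or later):

var circle = new Circle(5);
circle.radius = 10;              // invokes the set accessor
document.write(circle.radius);   // invokes the get accessor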

Methods

Like classes in more traditional OO languages, TypeScript classes support methods. TypeScript methods are just regular functions with a streamlined syntax. One notable aspect of defining functions as methods as opposed to fields is that the compiler attaches methods to the object’s prototype rather than an individual instance, which plays more nicely with JavaScript inheritance.

Since methods are just functions, it’s possible to overload them in the same manner as a function.
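
For instance, here’s a quick sketch of an overloaded method (the Formatter class is hypothetical):

class Formatter {
    format(value : number) : string;
    format(value : string) : string;
    format(value : any) : string {
        return typeof value === "number" ? value.toFixed(2) : value.trim();
    }
}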

Constructors

TypeScript classes are initialized through constructor functions. You can either define your own constructor function or rely on the default constructor functions provided by the compiler. We’ve already seen an example of a TypeScript constructor so I won’t repeat it here.

Automatic Constructors

If you don’t define a constructor function, the compiler will generate one automatically for you. In many cases, the automatic constructor function will have no parameters and perform no initialization actions beyond member variable initialization. However, if the class is derived from another class, the automatic constructor will accept the same parameters as the base class’ constructor and pass the arguments on to the base class before initializing member variables.

Custom Constructors

To create a custom constructor function we simply define a function named “constructor” along with any parameters we want to accept as shown in the Circle class above. Like functions and methods, it’s possible to overload constructors.

Parameter Properties

As a final note about TypeScript constructors, if you specify an access modifier (public or private) for a parameter, that parameter will automatically become a field on the class. This approach is much like F#’s (or C# 6’s) primary constructors. We’ll see an example of this a bit later when we look at interfaces.

Inheritance

As I mentioned at the beginning of the article, prototypal inheritance is something I typically mess up when coding straight JavaScript. It’s not necessarily that prototypal inheritance is hard, it’s just that I typically forget some step. Rather than relying on search engines (or my memory) to get it right, TypeScript provides a familiar, Java-like inheritance syntax.

Central to TypeScript’s inheritance model is the extends keyword which allows you to specify the base class as shown in the following sample:

class Shape {
    description: string;
    
    constructor (description : string) {
        this.description = description;
    }
    
    getArea() { return NaN; }
    toString() { return this.description; }
}

class Circle extends Shape {
    radius : number;

    constructor (radius: number) {
        super("Circle");
        this.radius = radius;
    }
    
    getArea() {
        return Math.PI * Math.pow(this.radius, 2);
    }
    
    toString() {
        var area = this.getArea();
        return super.toString() + " (" + area.toString() + ")";
    }
}

In the preceding code, the Circle class extends the Shape class and overrides both of its methods (note that TypeScript has no concept of abstract members). Because the Circle class has a custom constructor we needed to explicitly invoke Shape’s constructor. Had we used an automatic constructor, the compiler would have handled this detail for us.

Derived classes can access the base class’ public members via the super identifier as shown in the Circle class’ toString override. Because the TypeScript compiler handles the details of ensuring that the correct execution context (this) is used, we could just as easily have foregone overriding toString by including the call to getArea in the Shape class, but I thought it would be more useful to see the super identifier in action.

Static Members

Although TypeScript doesn’t include the notion of a static class (modules fill that role as we’ll see a bit later), classes can have static members. To make a member static, simply prefix the definition with the static modifier. The following snippet shows a class with a static method:

class CircleGeometry {
    static getArea(circle) {
        return Math.PI * Math.pow(circle.radius, 2);
    }
}
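
Static members are then invoked on the class itself rather than on an instance; since the circle parameter above is left untyped, any object with a radius property will do:

CircleGeometry.getArea({ radius: 5 });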

Interfaces

Interfaces in TypeScript serve much the same purpose as interfaces in traditional OO languages in that they specify a contract to which an object must adhere. It’s important to recognize that TypeScript’s interfaces are purely logical constructs. Interfaces exist only within the TypeScript sandbox to help enforce type safety; you won’t find any traces of them in the compiled JavaScript. That said, I also consider them to be one of TypeScript’s most powerful and compelling features because of how they embrace the underlying JavaScript’s dynamic nature to improve type safety, not only for code within your project but for code from outside as well.

Defining

Interface definition follows the familiar pattern from other traditional OO languages. Here’s how we could define an interface to represent an object that implements a getArea method:

interface Shape {
    getArea () : number;
}

The Shape interface defines a single member: a function named “getArea” that accepts no parameters and returns a number. Whenever we have an instance of something that implements the Shape interface, we’re guaranteed that the instance will have a getArea function.

Implementing with Classes

Implementing interfaces with classes via the implements keyword is perhaps the most familiar way interfaces are used. For example, we could define a Circle class that implements the Shape interface as follows:

class Circle implements Shape {
    constructor(public radius : number) {} // <-- note the public modifier; I told you we'd see it later!
    getArea() { return Math.PI * Math.pow(this.radius, 2); }
}

As you can see, defining a class that implements an interface is almost identical to a normal definition – all we had to do was identify the interface via the implements keyword. Should you need to implement multiple interfaces, simply delimit the names with commas.
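
For example, assuming a hypothetical Serializable interface, we could revise Circle to implement both:

interface Serializable {
    serialize() : string;
}

class Circle implements Shape, Serializable {
    constructor(public radius : number) {}
    getArea() { return Math.PI * Math.pow(this.radius, 2); }
    serialize() { return JSON.stringify({ radius: this.radius }); }
}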

Type Compatibility

What’s interesting about TypeScript interfaces is that an object’s type compatibility is based on object structure rather than explicit definition. This greatly increases the power of TypeScript’s interfaces because it means that types including object literals or externally defined types can satisfy an interface’s requirements by virtue of having the correct members. Consider a function which simply prints out a shape’s area:

function writeArea (shape : Shape) {
    document.write("The shape's area is: " + shape.getArea() + "<br />");
}

Here we’ve constrained the shape parameter to types that implement the Shape interface. While we certainly could pass along an instance of the Circle class we defined in the last section, it’s much more interesting to use an object literal! First, we’ll create the function that creates and returns the literal:

function createCircle (radius : number) {
    return { getArea: () => Math.PI * Math.pow(radius, 2) };
}

The createCircle function accepts the radius and returns an object literal with a single member: a getArea function. If we inspect createCircle’s signature, we’ll find that its return type is inferred from the object literal rather than declared as Shape. Even so, if we invoke the writeArea function, passing the result of createCircle as follows, the compiler is perfectly happy:

writeArea (createCircle(5));

The fact that the object literal includes the getArea function is enough to satisfy the compiler. If we wanted to get more explicit and further prove how the Shape interface is satisfied, we can annotate the createCircle function with a return type (rather than relying on type inference) like this:

function createCircle (radius : number) : Shape {
    return { getArea: () => Math.PI * Math.pow(radius, 2) };
}

Even with the type annotation in place, the compiler recognizes that the object literal meets the requirements defined by the interface. Used like this, object literals are conceptually similar to F#’s object expressions in that they can be created ad hoc but still satisfy the type requirements.

Describing Functions

One of my favorite TypeScript features is that interfaces can also represent function signatures! Function signatures can get quite verbose, especially when dealing with generics (something I’m not covering in this article beyond this brief glimpse). To improve maintainability, you might consider leveraging this feature. For now, we’ll stick with a simple example of how you might define an interface that represents a predicate:

interface Predicate<T> {
    (item: T): boolean;
}

Using this form we’re not defining which members must be implemented, we’re defining the arguments and return type of a function! Now, rather than repeating the signature everywhere, we can simply use the interface name instead as shown in this filter function:

function filter<T> (source : T[], callback : Predicate<T>) {
    var result = [];
    for(var i = 0; i < source.length; i++) {
        var value = source[i];
        if (callback(value)) result[result.length] = value;
    }

    return result;
}

If we wanted to use this function to filter out all of the odd numbers from an array we could write:

var evenNumbers =
    filter(
        [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144],
        i => i % 2 === 0);

Notice how we didn’t need to do anything special with the supplied function – we simply passed it as we would any other argument. The compiler even inferred the correct type parameter for the filter function and, by extension, the Predicate interface!

Modules

Modules provide a convenient mechanism for grouping related code elements while providing some degree of encapsulation and access control. The primary benefit of organizing code in this manner is to avoid polluting the global namespace with miscellaneous types and functions, thus reducing the likelihood of naming conflicts. In many ways they’re very much like modules in F# or static classes in C#. Let’s take a look at how to define a module continuing with our theme of circle geometry:

module CircleGeometry {
    export function getArea(radius : number) {
        return Math.PI * Math.pow(radius, 2);
    }
	
    export function getCircumference(radius : number) {
        return 2 * Math.PI * radius;
    }
}

The CircleGeometry module defines two functions, getArea and getCircumference. By default, module members are private so in order to access these functions from outside the module we export them using the export keyword. With the module in place, we’re free to invoke its functions with a qualified name like this:

CircleGeometry.getCircumference(5);

Despite their similarity to static classes, modules are represented quite differently in the compiled JavaScript. To see just how different modules and classes are, let’s create a functionally equivalent class with static members and compare the generated JavaScript. First, here’s the class definition:

class CircleGeometry {
    static getArea(radius : number) {
        return Math.PI * Math.pow(radius, 2); 
    }

    static getCircumference(radius : number) {
        return 2 * Math.PI * radius;
    }
}

This results in the following JavaScript:

var CircleGeometry = (function () {
    function CircleGeometry() {
    }
    CircleGeometry.getArea = function (radius) {
        return Math.PI * Math.pow(radius, 2);
    };

    CircleGeometry.getCircumference = function (radius) {
        return 2 * Math.PI * radius;
    };
    return CircleGeometry;
})();

The generated JavaScript isn’t particularly complicated; it assigns the result of an immediate function to the name CircleGeometry. The immediate function creates an empty function called CircleGeometry (which would serve as a constructor function should we want to create an instance), sets the two functions as properties of that function, and returns it. Now, compare that with what’s generated for the module we defined earlier:

var CircleGeometry;
(function (CircleGeometry) {
    function getArea(radius) {
        return Math.PI * Math.pow(radius, 2);
    }
    CircleGeometry.getArea = getArea;

    function getCircumference(radius) {
        return 2 * Math.PI * radius;
    }
    CircleGeometry.getCircumference = getCircumference;
})(CircleGeometry || (CircleGeometry = {}));

Again, the generated JavaScript isn’t very complicated but this time, the behavior is much different. Here, a variable called CircleGeometry is declared but not assigned. Next, an immediate function is defined and invoked, passing either an existing CircleGeometry object or a new object (via the assignment operator’s return value) that will expose the functions as properties. Inside the immediate function’s body we can clearly see both the getArea and getCircumference functions but, rather than being directly assigned to properties on the supplied object, the functions are defined as, well, functions! It’s only after the function definitions that they’re assigned to properties on the supplied object by name. This difference may appear subtle but it means that the immediate function for the module definition is serving as a closure, thereby hiding the actual implementations. A script certainly could reassign the properties but it wouldn’t necessarily have access to anything else defined within the closure, particularly not the private members.

Summary

As a superset of JavaScript, TypeScript is by no means a perfect language. It carries JavaScript’s legacy (both good and bad) but I see this as more of a strength than a weakness because the compiler takes advantage of many of JavaScript’s best features and wrangles them into more familiar object-oriented and functional patterns. Taken together, the language features described in this article should lead to simpler, more maintainable code, which is as important as ever as application complexity continues to increase.

More VS2013 Scroll Bar Magic

Yesterday I wrote about map mode, an exciting enhancement to Visual Studio 2013’s vertical scroll bar. If you haven’t enabled the feature yet, go do it, I’ll wait.

If you had the Productivity Power Tools extension installed prior to enabling the feature, you may have noticed that there are some extra annotations in the scroll bar. These annotations, shown in the form of vertical lines and “bubbles”, illustrate scope and nesting level.

You can control whether these annotations are displayed by changing the “Show code structure in the margin” setting under Productivity Power Tools/Other extensions in the options dialog. So far, I think they’re pretty helpful so I plan on leaving them enabled; at least for a while.

VS2013 Scroll Bar Map Mode

At Nebraska Code Camp this past weekend, Mike Douglas talked a bit about the developer productivity enhancements included in VS2013. One of the features that I’d missed until his talk was the vertical scroll bar’s map mode.

Beyond the now familiar annotations for changes, errors, and cursor position, the scroll bar’s map mode shows a low-resolution depiction of the structure of the code in the current file. This can be helpful for ascertaining the context of a particular piece of code or identifying duplicated code by observing patterns in the structure, among other things.

Perhaps just as useful is that when map mode is enabled, the scroll bar can also show a tooltip containing a preview of the code at any point on the map. To see the tooltip, simply hover over a point of interest.

I’ve only just started to use this feature but I think it’ll aid immensely in code discovery.

BinaryReader Change in .NET 4.5

BinaryReader has always had what I consider a design flaw in the way it implemented IDisposable.  Ever since the early days of .NET, the BinaryReader constructors have required a stream.  To me this means that the BinaryReader has a dependency on the stream and thus something else is responsible for that stream’s lifecycle.  The problem is that when the BinaryReader was disposed it would also close the Stream!

It took until .NET 4.5 but Microsoft finally gave us a way to address this issue in the form of a third constructor overload.  With this constructor we can now specify whether we want to close the provided stream or leave it open.

Why F#?

If you’ve been following along with my posts over the past six months or so you can probably imagine that I’ve been asked some variation of this post’s title more than a few times. One question that I keep getting is why I chose F# over some other functional languages like Haskell, Erlang, or Scala. The problem with that question though is that it’s predicated on the assumption that I actually set out to learn a functional language. The truth is that moving to F# was more of a long but natural progression from C# rather than a conscious decision.

The story begins about two and a half years ago. I had pretty much burned out and was deep into what I can only begin to describe as a stagnation coma. I stopped attending user group meetings; I cut way back on reading; I pretty much stopped doing anything that would help me remain even remotely aware of what was going on in the industry.

It’s easy to explain how I got to this point. My employer at the time gave very little incentive to stay current. For instance, there were homegrown frameworks for nearly every aspect of the product. Who needs NHibernate (or other ORM) when you have a proprietary DAL? Why learn ASP.NET MVC when you have a proprietary system for page layout? What’s the point of diving into WPF when the entire application is Web-based? Introducing new technologies or practices was generally discouraged under the banner of being too risky.

It wasn’t until the company hired a new architect that I started to wake up. He brought a wealth of knowledge of fascinating technologies that I’d hardly heard of and his excitement reminded me of what I loved about technology. My passion for software development was reigniting and I started looking at many of the technologies that had passed me by.

The first technology that really caught my attention during this time was LINQ. I’d consider it my introduction to the wonderful world of functional programming. As cool as I thought its query syntax was, I was really interested in the method syntax. I remember reading early on (although I don’t remember where) that query syntax was only added to C# after users complained that method syntax was too cumbersome in the preview versions. I didn’t understand this sentiment at all because to me method syntax felt completely natural. Gradually I began to realize that the reason it felt so natural was because it works the way I think. Lambda expressions, higher-order functions, composability, all of these functional concepts just spoke to me.

Over time I started using more of C#’s functional aspects but found myself getting increasingly frustrated. For the longest time though I couldn’t quite pinpoint anything specific but something just didn’t feel right anymore. It wasn’t until I was mowing the lawn late on a summer afternoon when I had that “a ha!” moment that changed my life.

That afternoon my yard work podcast selection included Hanselminutes #311. In this episode Richard Minerich and Phillip Trelford were talking about a functional language called F# that had been around for a few years and was built upon the .NET framework. I’d seen a few mentions of F# here and there but before hearing this podcast I hadn’t given it much thought.

At one point the discussion turned to language productivity and Phillip remarked that writing C# feels like completing government forms in triplicate. As he elaborated I experienced a sudden burst of clarity into one of the major things that had been bothering me about C# – its verbosity! After listening to the rest of the podcast and hearing more about how the functional nature of the language made it less error prone and how things like default immutability helped alleviate some of the complexity of parallel programming I knew I had to give it a try.

A language that doesn’t affect the way you think about programming, is not worth knowing.
— Alan Perlis, Epigrams on Programming

Despite my love of F# most of my work is still in C# but learning F# has had an amazing impact on how I write C#. C# has been becoming more of a functional language with virtually every new release and I’ve been using many of those capabilities for a few years but the language is hardly built around them. That said, setting aside all the times in the last week alone that I’ve typed let instead of var or tried to define a default constructor in the class signature, F# really has changed the way I work. I find myself writing more higher-order functions and making much better use of delegation; I’ve developed a strong preference for readonly member fields and properties; and I regularly find myself longing for some of F#’s constructs like tuples, records, discriminated unions, and pattern matching.

The truth is that for all of its strengths, I’m finding working in C# increasingly annoying, especially as I continue to work with F#. In many ways, working with C# feels like interacting with a toddler. I feel like I have to hold its hand and guide it along with explicit instructions on what to do every step of the way – even if I’ve already told it something. On the other hand, F# feels like having a personal assistant. Its functional nature allows me to more or less describe what I want and it handles the details.

There are plenty of things that make me prefer F# over C# but I’d like to highlight a few in particular. I’ve already written extensively about some of these and will be writing more about others as time permits but here I’d like to look at them from a more comparative angle.

Terse Syntax

Even though F# is a functional-first language, I think a great way to illustrate the language’s expressiveness is with an object-oriented example. We’ll start with a simple class definition in C#.

public class CircleMeasurements
{
  private double _diameter;
  private double _area;
  private double _circumference;

  public CircleMeasurements(double diameter, double area, double circumference)
  {
    _diameter = diameter;
    _area = area;
    _circumference = circumference;
  }

  public double Diameter { get { return _diameter; } }
  public double Area { get { return _area; } }
  public double Circumference { get { return _circumference; }  }
}

// Usage
var measurements = new CircleMeasurements(5.0, 19.63495408, 15.70796327);

Look how much code was required to create a class with three properties. Even in this small example we had to tell the compiler what type to use for each value three times! I could have used auto-implemented properties to simplify it a bit but even then we still need to tell the compiler which type to use twice for each value. Now let’s take a look at the same class in F#:

type CircleMeasurements(diameter, area, circumference) =
  member x.Diameter = diameter
  member x.Area = area
  member x.Circumference = circumference

// Usage
let measurements = CircleMeasurements(5.0, 19.63495408, 15.70796327);

That’s it – four lines of code. Granted this compiles to something that resembles the C# class but we didn’t have to write any of it and thanks to the fantastic type inference system we didn’t have to tell the compiler multiple times which type to use for each value. Quite often though even this is more verbose than we actually need. Many times we can use another F# construct – a record type – to represent something we’d represent with a class in other .NET languages:

type CircleMeasurements = { Diameter : float; Area : float; Circumference : float }

// Usage
let measurements = { Diameter = 5.0; Area = 19.63495408; Circumference = 15.70796327 }

In addition to being even more concise than the corresponding class definition, record types have the added benefit of being structurally comparable so we can easily check for equality between two instances. Record types do require us to include the type annotations but we only need to explicitly tell the compiler what to use once for each value and the constructor and properties are each created implicitly.

Functional Style

I already mentioned that functional programming feels more natural to me and by design F# really shines when it comes to expressing and using functions. Traditional .NET development has always had some type of support for delegation and it has definitely improved over the years, particularly with the common generic delegate classes (e.g.: Func, Action) and lambda expressions but actually trying to use them in a more functional style is a pain. This is complicated by the fact that in some situations the C# compiler can’t infer whether a lambda expression is a delegate or an expression tree. Although in some regards I prefer the C# lambda expression syntax I definitely prefer F#’s syntactic distinction between delegates and code quotations.

While on the topic of functional programming I have to mention F#’s default immutability. Immutability is key to any functional language and has been shown to improve overall program correctness by eliminating side effects. C# has some support for immutability through readonly fields or by omitting setters from property definitions but either of these require a conscious decision to enable. Nearly everything in F# is immutable unless explicitly declared otherwise. Immutability also provides benefits when writing asynchronous code because if nothing is changing, there’s a reduced need for locking.

Discriminated Unions

If you haven’t read my post on discriminated unions yet, they’re a way to express simple object hierarchies (or even tree structures) in a concise manner. From a syntactic perspective they’re incredibly simple but trying to replicate them even for simple structures really isn’t particularly feasible in C#. Here’s a simple example:

In this example, Suit is another discriminated union.

type Card =
| Ace of Suit
| King of Suit
| Queen of Suit
| Jack of Suit
| ValueCard of Suit * int

Using this discriminated union we express the 7 of hearts as ValueCard(Heart, 7). For an illustration of what it would take to represent this structure in C#, I’m including a screenshot of ILSpy’s decompilation. Note that I’ve only included the signatures and even this five-case discriminated union more than fills my screen. Just to drive the point home, fully expanded, this code is nearly 700 lines long! Granted there are a few CompilerGeneratedAttributes in there but they hardly account for a majority of the code.

Ultimately, the union type is an abstract class and each case is a nested class. The nested classes are each assigned a unique tag that’s used for type checking and code branching in some of the union type’s methods. Not included in the screenshot are implementations of several interfaces and overrides of Equals and GetHashCode.

Decompiled Card Discriminated Union

Collection Types & Comprehensions

C# has made great strides in regard to initializing various collection types but it still pales in comparison to the constructs offered by F#. Don’t get me wrong, collection initializers are a nice syntactic convenience, but nearly every time I use them I think how much easier it would likely be with a comprehension. LINQ can address some of these shortcomings but even convenience methods like Enumerable.Range feel limiting. Yes, I could write some additional convenience methods to address some of the shortcomings but in F# I don’t have to.

Part of the beauty of comprehensions is that they generally apply regardless of which collection type you’re creating. Although each of the examples below create F# lists they can easily be modified to create sequences or arrays instead.

// Numbers 0 - 10
> [0..10];;
val it : int list = [0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10]

// Numbers 0 - 10 by 2
> [0..2..10];;
val it : int list = [0; 2; 4; 6; 8; 10]

// Characters 'a' - 'z'
> ['a'..'z'];;
val it : char list =
  ['a'; 'b'; 'c'; 'd'; 'e'; 'f'; 'g'; 'h'; 'i'; 'j'; 'k'; 'l'; 'm'; 'n'; 'o';
   'p'; 'q'; 'r'; 's'; 't'; 'u'; 'v'; 'w'; 'x'; 'y'; 'z']

// Unshuffled deck
> let deck =
  [ for suit in [ Heart; Diamond; Spade; Club ] do
      yield Ace(suit)
      yield King(suit)
      yield Queen(suit)
      yield Jack(suit)
      for v in 2 .. 10 do
        yield ValueCard(suit, v)
  ];;

val deck : Card list =
  [Ace Heart; King Heart; Queen Heart; Jack Heart; ValueCard (Heart,2);
   ValueCard (Heart,3); ValueCard (Heart,4); ValueCard (Heart,5);
   ValueCard (Heart,6); ValueCard (Heart,7); ValueCard (Heart,8);
   ValueCard (Heart,9); ValueCard (Heart,10); Ace Diamond; King Diamond;
   Queen Diamond; Jack Diamond; ValueCard (Diamond,2); ValueCard (Diamond,3);
   ValueCard (Diamond,4); ValueCard (Diamond,5); ValueCard (Diamond,6);
   ValueCard (Diamond,7); ValueCard (Diamond,8); ValueCard (Diamond,9);
   ValueCard (Diamond,10); Ace Spade; King Spade; Queen Spade; Jack Spade;
   ValueCard (Spade,2); ValueCard (Spade,3); ValueCard (Spade,4);
   ValueCard (Spade,5); ValueCard (Spade,6); ValueCard (Spade,7);
   ValueCard (Spade,8); ValueCard (Spade,9); ValueCard (Spade,10); Ace Club;
   King Club; Queen Club; Jack Club; ValueCard (Club,2); ValueCard (Club,3);
   ValueCard (Club,4); ValueCard (Club,5); ValueCard (Club,6);
   ValueCard (Club,7); ValueCard (Club,8); ValueCard (Club,9);
   ValueCard (Club,10)]

Pattern Matching

[Faced with a 20th century computer]
Scotty: Computer! Computer?
[He’s handed a mouse, and he speaks into it]
Scotty: Hello, computer.
Dr. Nichols: Just use the keyboard.
Scotty: Keyboard. How quaint.
— Star Trek IV

The above conversation enters my mind when I’m working with C#’s branching constructs, switch statements in particular.  F#’s pattern matching may bear a slight resemblance to C# switch statements but it’s so much more powerful.  switch statements limit us to simply branching on constant values but pattern matching allows value extraction, multiple cases, and refinement constraints for more precise control, with a syntax much friendlier than your common if/else statement. Furthermore, like virtually everything else in F#, pattern matches are expressions so they return a value, making them ideal candidates for inline conditional bindings.

Units of Measure

Every programmer works with code that uses different units of measure. In most languages dealing with units of measure is error prone because they require discipline from the developer to ensure that the correct units are always used. There are some libraries that attempt to address the problem but to my knowledge (please correct me if I’m wrong) F# is the only language to actually include units of measure in the type system. F#’s unit of measure support is complete enough that it can often automatically convert values between units, particularly when a conversion expression is included with the type definition.

[<Measure>] type px
[<Measure>] type inch
[<Measure>] type dot = 1 px
[<Measure>] type dpi = dot / inch

let convertToPixels (inches : float<inch>) (resolution : float<dpi>) : float<px> =
  inches * resolution

let convertToInches (pixels : float<px>) (resolution : float<dpi>) : float<inch> =
  pixels / resolution

The biggest downfall of F#’s units of measure is that they’re a feature of the type system and compiler rather than a CLR feature. As such, the compiled code doesn’t have any unit information and we can’t enforce the units in cross-language scenarios.

Object Expressions

I haven’t written about object expressions yet but they’re definitely on my backlog. Object expressions provide a way to create ad-hoc (anonymous) types based on one or more interfaces or a base class. They’re useful in a variety of scenarios like creating one-off formatters or comparers and I think they can at least supplement, if not replace some mocking libraries. To illustrate we’ll use a somewhat contrived example of a logging service.

type LogMessageType =
| DebugMessage of string
| InfoMessage of string
| ErrorMessage of string

type ILogProvider =
  abstract member Log : LogMessageType -> unit

type LogService(provider : ILogProvider) =
  let log = provider.Log
  member this.LogDebug msg = DebugMessage(msg) |> log
  member this.LogInfo msg = InfoMessage(msg) |> log
  member this.LogError msg = ErrorMessage(msg) |> log

Here we use a discriminated union to define the message types the logging provider interface can handle. The log service itself provides a slightly friendlier API than the provider interface itself. Normally if we wanted to use the log service instance we’d have to define a concrete implementation of ILogProvider but object expressions allow us to easily define one inline.

let logger =
  LogService(
    { new ILogProvider with
      member this.Log msg =
        match msg with
        | DebugMessage(m) -> printfn "DEBUG: %s" m
        | InfoMessage(m) -> printfn "INFO: %s" m
        | ErrorMessage(m) -> printfn "ERROR: %s" m
    }
  )

In our object expression we use pattern matching to detect the message type, extract the associated string, and write an appropriate message to the console.

> logger.LogDebug "message"
logger.LogInfo "message"
logger.LogError "message";;
DEBUG: message
INFO: message
ERROR: message

What’s Not So Great?

I could continue on for a while with things I like about F# but this post is already long enough as it is. That said, I think it’s only fair to list out a few of the things I don’t like about the language.

Tooling Support

Probably the biggest gripe I have is the lack of tooling support around the language. So many tools and templates that I take for granted when working with C# simply aren’t available. Things like IntelliSense are pretty complete but if you’re just getting started with F# and looking to do more than a console application or library be prepared to spend some time looking for 3rd party templates and reading blog posts.

Language Interop

On a somewhat related note, even though F# compiles to MSIL and can reference or be referenced by other .NET assemblies there are some quirks that make language interoperability a bit cumbersome. For instance, using extension methods defined in C# doesn’t work as cleanly as I thought they would. When I was experimenting with MassTransit I couldn’t get the UseMsmq, VerifyMsmqConfiguration, or a number of other extension methods to appear no matter what I tried. I ultimately had to call the static methods directly.

I’ve read that this is addressed in F# 3.0 but I haven’t done enough with 3.0 yet to confirm.

Project Structure

It’s not really fair to put this under the “what’s not so great?” heading but it seemed most appropriate. This isn’t so much an issue with the language as it is a mindset shift of a magnitude similar to that of switching from OO to functional. The structure of an F# project is significantly different than that of a C# (or even VB) project and is something I’m still struggling with.

In C# we generally organize code into folders representing namespaces and keep one type (class) per file. F# evaluates code from top down throughout the project so file sequence is significant. Furthermore, code is organized by module rather than by type. F# does have namespaces but even then they’re usually divided across one or more files and, from my experience, not grouped by folder.

Wrap Up

No matter what language you work in, programming in a functional style provides benefits. You should do it whenever it is convenient, and you should think hard about the decision when it isn’t convenient.
— John Carmack, Functional Programming in C++

In general I’ve found that the more I learn and work with F# the more I like it. I regularly find myself reaching for it as my first choice, especially when it comes to new development. Although there are a few things that I don’t like about working in F#, most of them just require more diligence on my part or are easily managed. I’ve only listed a few key areas where I think F# excels but I firmly believe that its strengths far outweigh its weaknesses.

Revisiting String Distances

Back in April I wrote about three different string distance algorithms.  If you’re not familiar with string distance algorithms I encourage you to read the original post but in short, these algorithms provide different ways to determine how many steps are required to translate one string into another.  Since the algorithms I wrote about aren’t particularly complex I thought translating them to F# would be a fun exercise.

In general the code is pretty similar to its C# counterpart but I was able to take advantage of a few F# constructs, tuples and pattern matching in particular.  These examples may or may not be the best way to implement the algorithms in F# (ok, I’ll be honest, they’re probably not) but they certainly show how well suited F# is to this type of work.

Hamming Distance

The Hamming distance algorithm simply counts how many characters are different in equal length strings.

let hammingDistance (source : string) (target : string) =
  if source.Length <> target.Length then failwith "Strings must be equal length"
 
  Array.zip (source.ToCharArray()) (target.ToCharArray())
  |> Array.fold (fun acc (x, y) -> acc + (if x = y then 0 else 1)) 0

hammingDistance "abcde" "abcde" |> printfn "%i" // 0
hammingDistance "abcde" "abcdz" |> printfn "%i" // 1
hammingDistance "abcde" "abcyz" |> printfn "%i" // 2
hammingDistance "abcde" "abxyz" |> printfn "%i" // 3
hammingDistance "abcde" "awxyz" |> printfn "%i" // 4
hammingDistance "abcde" "vwxyz" |> printfn "%i" // 5

Levenshtein Distance

The Levenshtein distance algorithm counts how many characters are different between any two strings taking into account character insertions and deletions.

open System
let levenshteinDistance (source : string) (target : string) =
  let wordLengths = (source.Length, target.Length)
  let matrix = Array2D.create ((fst wordLengths) + 1) ((snd wordLengths) + 1) 0

  for c1 = 0 to (fst wordLengths) do
    for c2 = 0 to (snd wordLengths) do
      matrix.[c1, c2] <-
        match (c1, c2) with
        | h, 0 -> h
        | 0, w -> w
        | h, w ->
          let sourceChar, targetChar = source.[h - 1], target.[w - 1]
          let cost = if sourceChar = targetChar then 0 else 1
          let insertion = matrix.[h, w - 1] + 1
          let deletion = matrix.[h - 1, w] + 1
          let subst = matrix.[h - 1, w - 1] + cost
          Math.Min(insertion, Math.Min(deletion, subst))
  matrix.[fst wordLengths, snd wordLengths]

levenshteinDistance "abcde" "abcde" |> printfn "%i" // 0
levenshteinDistance "abcde" "abcdz" |> printfn "%i" // 1
levenshteinDistance "abcde" "abcyz" |> printfn "%i" // 2
levenshteinDistance "abcde" "abxyz" |> printfn "%i" // 3
levenshteinDistance "abcde" "awxyz" |> printfn "%i" // 4
levenshteinDistance "abcde" "vwxyz" |> printfn "%i" // 5
levenshteinDistance "abcdefg" "abcde" |> printfn "%i" // 2
levenshteinDistance "abcdefg" "zxyde" |> printfn "%i" // 5
levenshteinDistance "abcde" "bacde" |> printfn "%i" // 2
levenshteinDistance "abcde" "baced" |> printfn "%i" // 4

Damerau-Levenshtein Distance

The Damerau-Levenshtein algorithm builds upon the Levenshtein algorithm by taking transpositions into account.

open System
let damerauLevenshteinDistance (source : string) (target : string) =
  let wordLengths = (source.Length, target.Length)
  let matrix = Array2D.create ((fst wordLengths) + 1) ((snd wordLengths) + 1) 0

  for h = 0 to (fst wordLengths) do
    for w = 0 to (snd wordLengths) do
      matrix.[h, w] <-
        match (h, w) with
        | x, 0 -> x
        | 0, y -> y
        | x, y ->
          let sourceChar, targetChar = source.[x - 1], target.[y - 1]
          let cost = if sourceChar = targetChar then 0 else 1
          let insertion = matrix.[x, y - 1] + 1
          let deletion = matrix.[x - 1, y] + 1
          let subst = matrix.[x - 1, y - 1] + cost

          let distance = Math.Min(insertion, Math.Min(deletion, subst))

          if x > 1 && y > 1 && sourceChar = target.[y - 2] && source.[x - 2] = targetChar then
            Math.Min(distance, matrix.[x - 2, y - 2] + cost)
          else
            distance
  matrix.[fst wordLengths, snd wordLengths]

damerauLevenshteinDistance "abcde" "abcde" |> printfn "%i" // 0
damerauLevenshteinDistance "abcde" "abcdz" |> printfn "%i" // 1
damerauLevenshteinDistance "abcde" "abcyz" |> printfn "%i" // 2
damerauLevenshteinDistance "abcde" "abxyz" |> printfn "%i" // 3
damerauLevenshteinDistance "abcde" "awxyz" |> printfn "%i" // 4
damerauLevenshteinDistance "abcde" "vwxyz" |> printfn "%i" // 5
damerauLevenshteinDistance "abcdefg" "abcde" |> printfn "%i" // 2
damerauLevenshteinDistance "abcdefg" "zxyde" |> printfn "%i" // 5
// Transpositions
damerauLevenshteinDistance "abcde" "bacde" |> printfn "%i" // 1
damerauLevenshteinDistance "abcde" "baced" |> printfn "%i" // 2

F# Enumerations

I was originally going to include this discussion about F# enumerations in the discriminated unions post because their syntax is so similar but the section quickly took on a life of its own. Enumerations in F# are the same as enumerations from other .NET languages so all of the same capabilities and restrictions apply.

Defining Enumerations

Defining an enumeration in F# is much like defining a discriminated union but here we bind an integral value to each case. Unlike in C# though, F# doesn’t automatically generate values for each item so we need to give every case a value.

type DayOfWeek =
| Sunday = 0
| Monday = 1
| Tuesday = 2
| Wednesday = 3
| Thursday = 4
| Friday = 5
| Saturday = 6

Changing Base Type

We can also change the enumeration’s underlying type by changing the suffix on each value. For example, if we wanted to change the underlying type for the DayOfWeek enumeration to byte we could do so by appending uy (the byte suffix) to each value as follows:

type DayOfWeekByte =
| Sunday = 0uy
| Monday = 1uy
| Tuesday = 2uy
| Wednesday = 3uy
| Thursday = 4uy
| Friday = 5uy
| Saturday = 6uy

Referencing Enumeration Values

As is the case with enumerations in other .NET languages enumeration values are referenced through the enumeration name.

> DayOfWeek.Friday;;
val it : DayOfWeek = Friday

FlagsAttribute

Also like in other .NET languages, F# enumerations support the FlagsAttribute to indicate that each value in the enumeration represents a bit position.

open System

[<Flags>]
type Border =
| None = 0
| Right = 1
| Top = 2
| Left = 4
| Bottom = 8

let showBorder = Border.Bottom ||| Border.Left

To determine whether a particular flag is set we can either use bit-math or the enumeration’s HasFlag method. Since I prefer the code I don’t have to write, I generally favor the built-in approach:

> showBorder.HasFlag Border.Top;;
val it : bool = false

> showBorder.HasFlag Border.Left;;
val it : bool = true
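
For comparison, here’s what the same pair of checks might look like with bit-math; we mask the value with &&& and compare the result against the flag we’re interested in:

> (showBorder &&& Border.Top) = Border.Top;;
val it : bool = false

> (showBorder &&& Border.Left) = Border.Left;;
val it : bool = true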

Conversion

For those times when we only have the underlying integer value and want to convert that value to the corresponding enum value we can use the built-in generic enum function:

> enum<DayOfWeek> 5;;
val it : DayOfWeek = Friday

Note: The signature of the enum function above is int32 -> 'U so it can only be used with enumerations based on int32. If your enumeration is based on another data type such as with the DayOfWeekByte example above you’ll need to use the generic EnumOfValue function from the Microsoft.FSharp.Core.LanguagePrimitives module instead.
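
For example, here’s a small sketch (the binding name is just illustrative) of converting a raw byte back to the DayOfWeekByte enumeration defined earlier:

// EnumOfValue infers the target enumeration from the type annotation
let byteFriday : DayOfWeekByte = LanguagePrimitives.EnumOfValue 5uy
// val byteFriday : DayOfWeekByte = Friday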

Likewise, we can get the underlying value by calling the corresponding conversion function:

int DayOfWeek.Friday
// val it : int = 5

byte DayOfWeekByte.Friday
// val it : byte = 5uy

F# Discriminated Unions

When I started learning F#, discriminated unions were probably the first thing that really sold me on the language, partially because there’s nothing like them in C#.  From a traditional .NET background a quick glance at them, particularly simple ones, may lead you to believe that they’re just glorified enumerations but in reality, discriminated unions are so much more powerful.  While enumerations are limited to a single, fixed integral value, the applications for discriminated unions go far beyond that.

Like enumerations, discriminated unions allow us to define values that can only be one of a specified set of named cases but that’s where the similarities end.  Discriminated unions allow us to associate zero or more data items of any type with each case.  The effect is that we can easily define tree structures, simple object hierarchies, and even leverage pattern matching.

Discriminated Unions as Object Hierarchies

I think the easiest way to get into discriminated unions is to start off with an example.  In my studies of the language I’ve become quite fond of the example I first encountered in Programming F# where discriminated unions and a list comprehension are used to define a standard, unshuffled card deck.  This example does a great job of illustrating not only simple, “dataless” unions but also how we can associate multiple data items and even combine discriminated unions to achieve more expressive code.

We start by defining a discriminated union for each suit. The suits don’t have any data associated with them so this discriminated union simply includes the case names.

type Suit =
| Heart
| Diamond
| Spade
| Club

The second discriminated union defines the individual cards. Each card belongs to a particular suit and non-face cards will also include their denomination.

type Card =
| Ace of Suit
| King of Suit
| Queen of Suit
| Jack of Suit
| ValueCard of Suit * int

With just these two constructs we can define a value representing any card in a standard deck.  For instance, if we wanted to represent the Ace of Spades or the 7 of Hearts we’d write Ace(Spade) and ValueCard(Heart, 7), respectively.

We don’t need to include the discriminated union name when referring to a case. Should there be a naming conflict we’re free to include the union name with dot syntax (Suit.Heart) or we could instruct the compiler to require a qualified name with the RequireQualifiedAccess attribute.

If you were paying attention during the Card example you probably noticed a familiar syntax. When multiple data elements are associated with a case (such as with ValueCard) we use the tuple syntax to identify each element’s data type. While this allows for a lot of flexibility it also brings along some of the drawbacks of tuples such as not being able to define names for each value. We’ll look at one way to work around this limitation a little later when we discuss pattern matching.

The tuples in discriminated unions are syntactic tuples. The types the compiler generates for each case are nested classes that derive from the discriminated union type. The data elements are exposed as read-only properties following the same conventions as tuples but there aren’t any actual tuple class instances involved.

Now that we’re armed with some basic knowledge of discriminated unions we can use a list comprehension to easily construct a full, unshuffled deck.

let deck =
  [ for suit in [ Heart; Diamond; Spade; Club ] do
      yield Ace(suit)
      yield King(suit)
      yield Queen(suit)
      yield Jack(suit)
      for v in 2 .. 10 do
        yield ValueCard(suit, v)
  ]
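
As a quick sanity check, the comprehension yields the 13 cards per suit we’d expect, for 52 in total:

deck |> List.length |> printfn "%i" // 52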

Option Type

No introductory discussion about discriminated unions would be complete without at least a brief mention of the option type. The option type is a predefined discriminated union (Microsoft.FSharp.Core.FSharpOption) that’s useful for indicating when a value may or may not exist (remember: null generally isn’t used in F#). There are only two cases in the option type: Some(_) and None. Because the type itself is an unconstrained generic type, its cases can be used to indicate the presence of any value.

Although options are generally referenced in pattern matching expressions there’s an associated option module that contains several useful functions including Option.get, Option.bind, and Option.exists.
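
As a brief illustration, here’s a small sketch (tryDivide is just a hypothetical helper) that produces an option and then consumes it both with pattern matching and with the Option module:

// Returns None rather than throwing when the divisor is zero
let tryDivide x y =
  if y = 0 then None else Some (x / y)

match tryDivide 10 2 with
| Some quotient -> printfn "Quotient: %i" quotient
| None -> printfn "Cannot divide by zero"

// The Option module works against the same values
tryDivide 10 0 |> Option.exists (fun q -> q > 3) // false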

Discriminated Unions as Trees

A really nice aspect of discriminated unions is that they can be self-referencing. This makes creating simple tree structures trivial. To illustrate let’s create a very rudimentary HTML structure.

type Markup =
| Element of string * Markup list
| EmptyElement of string
| Content of string

Note that the Element case has two associated pieces of data: a string (the element name), and a list of additional Markup instances. From just these three cases we can quickly represent a simple HTML document:

let htmlTree =
  Element("html",
    [
      Element("head", [ Element("title", [ Content("Very Simple HTML") ]) ])
      Element("body",
        [
          Element("article",
            [
              Element("h1", [ Content("Discriminated Unions") ])
              Element("p",
                [
                  Content("See how ")
                  Element("b", [ Content("easy") ])
                  Content(" this is?")
                ])
              Element("p", [ Content("F#"); EmptyElement("br"); Content("is fun") ])
            ])
        ])
    ])

Discriminated Unions in Pattern Matching

We’ve seen how discriminated unions can be used to easily define both object hierarchies and tree structures. As with so many things in F#, though, that power is amplified when combined with pattern matching. With pattern matching, not only can we provide conditional logic for operating against each case but we can also work around the limitation I mentioned earlier where each associated data item is assigned a less than useful name.

Let’s expand upon the Markup example from the previous section to illustrate. Here we’ll define a recursive function to convert the htmlTree value to an HTML string.

let rec toHtml m : string =
  match m with
  | Element (tag, children) ->
      let sb = System.Text.StringBuilder()
      let append (s : string) = sb.Append s |> ignore

      append (sprintf "<%s>" tag)
      List.iter (fun c -> append (c |> toHtml)) children
      append (sprintf "</%s>" tag)
      sb.ToString()
  | EmptyElement tag -> sprintf "<%s />" tag
  | Content v -> sprintf "%s" v;;

> htmlTree |> toHtml;;
val it : string =
  "<html><head><title>Very Simple HTML</title></head><body><article><h1>Discriminated Unions</h1><p>See how <b>easy</b> this is?</p><p>F#<br />is fun</p></article></body></html>"

In this example we have a pattern for each Markup case, so there’s no need to introduce the wildcard (_) pattern. Each pattern includes an identifier for each data item associated with the case. The identifiers allow us to refer to each item with something meaningful rather than the default name. Identifiers are also useful within when constraints to provide more granular control over what matches a given pattern. For instance, if we wanted to apply different handling for p elements we could add the pattern | Element (tag, children) when tag = "p" -> ... to the beginning of the match expression, as sketched below.
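
Here’s a small hypothetical variation (the describe function is purely illustrative) that singles out p elements with a when constraint:

let describe m =
  match m with
  | Element (tag, children) when tag = "p" -> sprintf "paragraph with %i children" (List.length children)
  | Element (tag, _) -> sprintf "element: %s" tag
  | EmptyElement tag -> sprintf "empty element: %s" tag
  | Content v -> sprintf "content: %s" v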

Members in Discriminated Unions

Just as with record types, discriminated unions ultimately compile down to classes. As such, we can expand upon them with additional members such as methods. Continuing with the Markup example, it makes sense to keep toHtml with the Markup type, so let’s combine them.

To convert the function to a method we only need to make a few small tweaks. First, we’ll eliminate the argument since we’ll be operating against the current instance. We’ll also add a type annotation to the anonymous function passed to List.iter so the compiler can resolve the toHtml call.

type Markup =
| Element of string * Markup list
| EmptyElement of string
| Content of string
  member x.toHtml() : string =
    match x with
    | Element (tag, children) ->
        let sb = System.Text.StringBuilder()
        let append (s : string) = sb.Append s |> ignore

        append (sprintf "<%s>" tag)
        List.iter (fun (c : Markup) -> append (c.toHtml())) children
        append (sprintf "</%s>" tag)
        sb.ToString()
    | EmptyElement tag -> sprintf "<%s />" tag
    | Content v -> sprintf "%s" v

Note that the member function is no longer marked as recursive. This is because recursion is implicitly allowed for members.

Now that the toHtml function is a method of the Markup type we can call it like we would a method on any other type:

> htmlTree.toHtml();;
val it : string =
  "<html><head><title>Very Simple HTML</title></head><body><article><h1>Discriminated Unions</h1><p>See how <b>easy</b> this is?</p><p>F#<br />is fun</p></article></body></html>"

Wrapping Up

Discriminated unions are a highly useful construct with applications far beyond that of enumerations. Whether they’re being used to construct an object hierarchy or represent a tree structure discriminated unions can greatly enhance the clarity of your code.

Sometimes a discriminated union is a bit of overkill for the problem space and an enumeration is all that’s needed. In the next post we’ll explore how to define enumerations in F#.

Concatenating PDFs with F# and PdfSharp

In order to test some new functionality I’m working on I needed to combine a bunch of somewhat arbitrary documents into one. At first I thought about just re-scanning the documents but the scanner attached to my PC is a single-sheet flatbed. I also thought about running them through our sheet-feed scanner but it’s attached to my counterpart’s PC, not shared, and I didn’t want to bother him with it. I’ve been working a bit with the PdfSharp library lately so I decided to do the programmer thing and write a script to concatenate the documents. Of course, this was a perfect chance to flex my developing F# muscles and I came up with what I think is a pretty decent solution.

Before taking a look at the script there are a few things to note:

  • The script is intended to be run from within a folder that contains the documents that you want to concatenate. This is mainly because I had a bunch of documents I wanted to combine and including the full path to each one quickly became unwieldy.
  • I don’t check that files exist or that what’s supplied is actually a PDF. PdfSharp will throw an exception in those cases.
  • I needed something quick so I hard-coded the destination file name as Result.pdf.
  • fsi.CommandLineArgs includes the name of the script as the first array item. The easiest thing I could think of for getting a list that didn’t include the script name was to create a new list with Array.toList and grab the Tail.
  • The PdfPages class is built around IEnumerable rather than IEnumerable<T> so I needed to use a comprehension to build the pages list.

Script:

// PdfConcat.fsx

#r "PdfSharp.dll"

open System
open System.IO
open PdfSharp.Pdf
open PdfSharp.Pdf.IO

let readPages (sourceFileName : string) =
  use source = PdfReader.Open(sourceFileName, PdfDocumentOpenMode.Import)
  [ for p in source.Pages -> p ]

let createDocFromPages pages =
  let targetDoc = new PdfDocument()
  pages |> List.iter (fun p -> targetDoc.Pages.Add p |> ignore)
  targetDoc

let docNames = (Array.toList fsi.CommandLineArgs).Tail
let workingDirectory = Environment.CurrentDirectory
let targetFileName = Path.Combine(workingDirectory, "Result.pdf")

let allPages =
  [ for n in docNames -> Path.Combine(workingDirectory, n) ]
  |> List.map readPages
  |> List.concat

let doc = createDocFromPages allPages
doc.Save targetFileName
doc.Dispose()

Usage:

D:\ScannedDocuments>fsi D:\Dev\FSharp\PdfConcat\PdfConcat.fsx "07-09-2012 05;29;44PM.PDF" "07-11-2012 08;47;13PM.PDF" "07-11-2012 08;48;24PM.PDF" "07-11-2012 08;50;27PM.PDF"

CODODN Notes

I spent Saturday (Dec 8) over in Columbus, OH attending the Central Ohio Day of .NET conference. As with any multi-track conference there were plenty of good sessions to choose from but of course, I could only attend five. What follows are my raw, nearly unedited notes from the sessions I selected.

Underestimated C# Language Features

John Michael Hauck (@john_hauck | http://w8isms.blogspot.com)

Delegation

  • Anonymous functions
  • Closures
  • Automatic closures compile to classes

Yay!  ANOTHER discussion about using var or type name for declaring variables

Enumeration

  • IEnumerable<T>
  • yield return
  • IEnumerable of delegates

Deep Dive into Garbage Collection

Patrick Delancy (@patrickdelancy | http://patrickdelancy.com/)

Automatic Memory Management

  • Reference counting
  • Track & mark

Allocating

  • Application virtual memory
  • Heap is automatically reserved
  • New declarations are allocated into segments in the heap
  • Ephemeral segments
  • Clean-up tries to move longer lasting objects into other dedicated ephemeral segments
  • Longer-lasting objects are collected less frequently

GC Timing

  • When an ephemeral segment is full
  • GC.Collect() method call
  • Low memory notifications from the OS (physical memory)

Only finalization is non-deterministic

GC Process

  • Marking – Identifying objects for removal
  • Relocating – Relocating surviving pointers to older segments
  • Compacting – Moving actual object memory

Dead or Alive

Starting points:

  • Static Data is rooted in garbage collector (never out of scope)
  • Stack Root (call stack) has variable references
  • GC Handles – special cases that can be created and freed

Collector walks the tree from each point

GC Handles

Struct that has a reference to a managed object and can be passed off to unmanaged/native code

Marshalling

GCHandle.Alloc() and .Free()

Handle Types:

  • Weak – tracks the object but allows collection; Zeroed when collected; Zeroed before finalizer
  • WeakTrackResurrection – Weak but is not zeroed when collected; Can resurrect in finalizer
  • Normal – Opaque address via the handle (can’t get the address); not collected by GC; Managed object w/o managed references
  • Pinned – Normal with accessible address. Not collected or moved by GC; free quickly if needed

Generations

  • Generation 0 – newly allocated objects
  • Generation 1 – objects that survived gen 0; infrequent collection
  • Generation 2 – Long-lived objects; things allocated as large objects too; infrequent collection

Anything larger than 85K in memory is considered large and automatically placed on the large object heap.

Threshold checks consider machine resources, including bitness, and are always in flux (except gen 0, which is based on segment size)

Configuration

Workstation & server mode:

  • Default behavior is workstation
  • Workstation
    • Single-processor machines
    • Collection happens on triggering thread
    • Normal thread priority
  • Server
    • Multi-processor
    • Separate GC per thread processor
    • Parallel on all threads
    • Collection at highest thread priority
    • Intended to maximize GC throughput and scalability

Concurrency:

  • Foreground
    • Non-concurrent
    • All other threads are suspended until collection finishes
  • Concurrent & Background
    • Collect concurrently while app is running
    • Background (v4) is the replacement for concurrent
    • Collects concurrently while the app is running
    • Only applies to Gen 2 collection
    • Prevents Gen 0 & 1 collections while running

Latency modes

Time the user knows the system is busy

Useful especially when users will notice the collection – graphics rendering

  • Batch
    • Default when concurrency is disabled
    • Most intrusive when running
  • Interactive
    • Default when concurrency is enabled
    • Less intrusive but still accommodates the GC
  • LowLatency
    • For short-term use – impedes the GC process
    • Suppresses Gen 2 collection
    • Workstation mode only
    • Still collects under low host memory
    • Allows manual collection
  • SustainedLowLatency
    • Added in 4.5
    • Contained but longer
    • Suppresses foreground gen 2 collection
    • Only available with concurrency
    • Does not compact managed heap

Finalization

Non-deterministic

Only used with native resources (IntPtr handles)

  • To release unmanaged resources ONLY
  • Objects with a finalizer get promoted to the next generation and the next time the collector hits that generation it calls the finalizer; added to the finalization queue
  • Finalizer runs on another thread after collection has finished and runs at the highest thread priority
  • Managed objects may already be freed
  • Flagged for finalization upon allocation
  • When troubleshooting performance check the size of the finalizer queue and look for hung finalizer threads

Disposable

Objects using managed resources that need to be released

Debug Build

  • Does some artificial rooting due to CLR optimizations

Unleashing the Power: A Lap around PowerShell

Sarah Dutkiewicz (@sadukie | http://codinggeekette.com/)

Covering PowerShell 3.0

Some Cmdlets:

  • Clear – clears the screen
  • Get-verb
  • Get-unique

Console

  • Part of Windows Management Framework 3.0 suite
    • 32/64 bit available
    • Windows 8 & Server 2012 have it installed
  • Built on CLR 4
  • Adds support for
    • Networking
    • Parallelism
    • MEF
    • Compatibility/Deployment
    • WWF
    • WCF
  • show-command cmdlet
    • allows searching for commands and executing in the console
    • often easier than get-command | more
  • Updatable help
    • Not installed by default on Win8/Server 2012
    • Update-Help cmdlet
    • Must run as administrator
    • No restart required
    • Can store to a network share
  • Language improvements
    • Simplified foreach (automatic/implicit)
      • PS C:\windows\system32> $verbs = get-verb
      • PS C:\windows\system32> $verbs.Group | Get-Unique
    • Simplified where
      • PS C:\windows\system32> $verbs | where group -eq "Data"
    • Enhanced tab completion
  • Unblock files with Unblock-File cmdlet
  • JSON, REST, & HTML Parsing support
  • Session improvements
    • Disconnected sessions on remote computers can be reconnected later w/o losing state
    • Both ends need 3.0

Integrated Scripting Environment

  • IntelliSense
  • Show-Command window
  • Unified console pane
  • Rich copy
  • Block copy
  • Snippets
  • Brace Matching

Scheduled Jobs

  • PowerShell jobs can now be integrated with task scheduler
    • Register-ScheduledJob
    • Found under Microsoft / Windows / PowerShell / ScheduledJobs

Autoloading Modules

  • PowerShell 3 automatically loads a module when one of its cmdlet is invoked

PowerShell Web Access

  • PowerShell in a browser
  • Mostly 3.0 w/ some limitations due to remoting
  • Prereqs:
    • Server 2012
  • Setup is difficult
    • Install web access
    • Authorization configuration
    • PowerShell remoting must be enabled to connect
  • Limitations:
    • Some function keys aren’t supported
    • Only top-level progress is shown
    • Input color cannot be changed
    • Writing to console doesn’t work

Management OData IIS Extension

  • Allows RESTful OData access
  • Requires Server 2012, not Server Core

MS Script Explorer for Windows PowerShell

  • 32/64-bit platforms
  • PowerShell ISE is required
  • Downloadable (Non-standard)

Building Large Maintainable JavaScript Applications w/o a Framework

Steve Horn (@stevehorn | http://blog.stevehorn.cc/blog)

Quote from Kahn about complaining about status quo and theorizing about how things should be

(I tried to find the actual quote but couldn’t – if anyone knows where I can find it I’ll be happy to update these notes)

Assumptions for JavaScript Applications

  • Not progressive enhancement
  • JavaScript is enabled
  • Rendering of HTML templates is done on the client
  • Server side is for querying or performing work
  • The UI is the most important part of the app

Each framework is giving its own world view of how you should build an app

Code Organization

Book: JavaScript Patterns (Stoyan Stefanov)

  • How can I create modules and namespaces?

window.nmap = window.nmap || {};

Let JavaScript be JavaScript; don’t worry about public/private/etc…    

Constructor members are recreated every time the constructor is invoked

jQuery Tiny Pub/Sub http://benalman.com

Designing for Windows 8 Apps

Dan Shultz (@dshultz)

Metro/Windows 8 Style Design

  • Modern Design – Bauhaus
  • International Typographic Style – Swiss design
  • Motion Design – Cinematography

Windows 8 grid

  • Grid units
  • 20×20 pixel grids
  • Title size: 42pt
  • Title line height: 48pt
  • Page header is 5 units from the top

Responsive Design

…is the approach that suggests that design and development should respond to the user’s behavior and environment based on screen size, platform, and orientation

Techniques

  • A flexible grid
  • Media queries

Media Query Ranges

  • Mobile portrait < 479 px wide
  • Mobile landscape 480 – 767 px
  • Tablet portrait 768 – 1023 px
  • Tablet Landscape >= 1024 px

Certification Tips

  • Apps need to work in snap view to pass certification
  • Keep functionality off of the margins to prevent interference with charms and app switching
  • Use charms contracts where applicable
    • Share
    • Search
    • Picker
  • Need to include a privacy statement within settings
  • Weight functionality toward edges for higher usability
    • Center screen requires a posture change
  • Sharing & Ages
    • Limit < age 12
  • Reserved Space
  • Asset sizes
    • 100%
    • 140%
    • 180%
  • Invest in a great live tile
    • Consider different sizes