Functional C#: Fluent Interfaces and Functional Method Chaining

This is adapted from a talk I’ve been refining a bit. I’m pretty happy with it overall but please let me know what you think in the comments.

Update: I went to correct a minor issue in a code sample and WordPress messed up the code formatting. Even after reverting to the previous version I still found issues with escaped quotes and casing changes on some generic type definitions. I’ve tried to fix the problems but I may have missed a few spots. I apologize for any odd formatting issues.

I’ve been developing software professionally for 15 years or so. Like many of today’s enterprise developers, much of my career has been spent with object-oriented languages, but when I discovered functional programming a few years ago it changed the way I think about code at the most fundamental levels. As such I no longer think about problems in terms of object hierarchies, encapsulation, or associated behavior. Instead I think in terms of independent functions and the data upon which they operate in order to produce the desired result.

Functional programming allowed me to recognize that virtually everything we do in software development follows the same basic patterns no matter how obscured they are by the layers of object-oriented complexity. Larry O’Brien summarized these patterns well when he wrote in a November 2013 SD Times article that functional programming is a sequence of FARTS:

  • Filters
  • Assignments
  • Reductions
  • Transformations
  • Slices

To me, the sequence of FARTS describes programming business applications in general quite well regardless of the development paradigm. It just happens that in FP these patterns typically have nice, simple, yet descriptive names such as map or filter.
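As a rough illustration, each of the FARTS has a close counterpart among LINQ's standard query operators. The mapping in the comments below is one reasonable reading rather than O’Brien's own, and the FartsDemo class and its sample order totals are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FartsDemo
{
    // Hypothetical sample data; any sequence of numbers would do.
    public static readonly decimal[] OrderTotals =
        { 12.50m, 250.00m, 8.75m, 99.99m, 410.10m };

    public static decimal TotalOfTwoLargestWithTax(
        IEnumerable<decimal> totals,
        decimal taxRate)
    {
        return totals
            .Where(t => t >= 10m)            // Filter: keep totals of at least 10
            .Select(t => t * (1 + taxRate))  // Transformation: add tax to each total
            .OrderByDescending(t => t)
            .Take(2)                         // Slice: keep only the two largest
            .Sum();                          // Reduction: collapse to a single value
    }

    public static void Main()
    {
        // Assignment: bind the result of the whole pipeline to a name.
        var total = TotalOfTwoLargestWithTax(OrderTotals, 0.10m);
        Console.WriteLine(total);
    }
}
```

With the sample data and a 10% tax rate, the two largest taxed totals are 451.11 and 275.00, so the pipeline yields 726.11.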

This insight has also led me to see most things we do in OO as unnecessarily complex. We bloat our software with things like DI frameworks and ORMs because OO languages don’t traditionally let us easily change program behavior by simply swapping out functions. And because we’re so concerned with managing state in an OO application we’ve devised design patterns such as flyweight and bridge. WTF is a flyweight, anyway?

The very nature of functional programming is such that once it takes hold of you, you start taking certain things for granted. For me, one such thing is pipelining. Pipelining is so named because it lets data flow from one function to another without the need to manage state via imperative code. Pipelining is idiomatic in F# but it’s also possible to achieve a similar effect in C# through method chaining and, by extension, fluent interfaces. Unfortunately, object-orientation’s very nature makes it such that these concepts are architectural patterns and must be designed into the types with which they’re used rather than a natural part of the language.

Take LINQ for instance. LINQ is built entirely around these concepts and that’s one of the key reasons why it’s so easy to work with. Imagine trying to work with LINQ in a traditional, imperative manner where we had to declare or reassign variables at every individual step.
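To make that concrete, here is a sketch of one small query written twice: once with a named variable at every step, and once as a chain. The LinqStyles class and the sample names are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqStyles
{
    // Step by step: every stage needs its own variable.
    public static List<string> StepByStep(IEnumerable<string> names)
    {
        IEnumerable<string> nonEmpty = Enumerable.Where(names, n => n.Length > 0);
        IEnumerable<string> upper = Enumerable.Select(nonEmpty, n => n.ToUpperInvariant());
        IEnumerable<string> sorted = Enumerable.OrderBy(upper, n => n);
        return Enumerable.ToList(sorted);
    }

    // Chained: each call returns a sequence that the next call consumes.
    public static List<string> Chained(IEnumerable<string> names)
    {
        return names
            .Where(n => n.Length > 0)
            .Select(n => n.ToUpperInvariant())
            .OrderBy(n => n)
            .ToList();
    }

    public static void Main()
    {
        var names = new[] { "carol", "", "alice", "bob" };
        Console.WriteLine(string.Join(", ", Chained(names)));
    }
}
```

Both methods produce the same list; the chained form simply has no intermediate names to invent or keep track of.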

As much as I’d prefer working in F# all the time the nature of the industry is still such that object-orientation is king. But wouldn’t it be nice to extend these chaining capabilities throughout C# and realize some of the benefits of pipelining across the language? Using the very same techniques that LINQ uses, we can do this quite easily.

Fluency

Let’s start by looking at an existing type that exposes what I consider to be a broken fluent interface. A fluent interface is a specialized, self-referencing form of method chaining where the context is maintained through the chain. The type I want to look at is one of the oldest classes in the .NET Framework – the StringBuilder.

I’ve written about extending the StringBuilder before. This section is derived from that work.

StringBuilder is a fantastic class and has served us very well since the earliest days of .NET. One really surprising aspect of the class is that it actually exposes a fluent interface. If you were to inspect its methods you’d see that almost every one returns StringBuilder.

[Image: StringBuilder Append functions]

See! How cool is that? The fluent interface means that we don’t always need to write code like this:

private static string BuildSelectBox(
    IDictionary<int, string> options,
    string id,
    bool includeUnknown)
{
    var html = new StringBuilder();
    html.AppendFormat("<select id=\"{0}\" name=\"{0}\">", id);
    html.AppendLine();

    if (includeUnknown)
    {
        html.AppendLine("\t<option>Unknown</option>");
    }

    foreach (var opt in options)
    {
        html.AppendFormat("\t<option value=\"{0}\">{1}</option>", opt.Key, opt.Value);
        html.AppendLine();
    }

    html.AppendLine("</select>");

    return html.ToString();
}

This code is a pretty typical example of working with the StringBuilder – even the MSDN examples follow this pattern. By using the fluent interface we can break away from this pattern…or can we? Let’s start refactoring to use the fluent interface and see how far we get before running into trouble.

We can easily convert the first three lines.

var html =
    new StringBuilder()
        .AppendFormat("<select id=\"{0}\" name=\"{0}\">", id)
        .AppendLine();

This pattern also exists in the foreach loop so we can take care of that one now, too.

html
    .AppendFormat("\t<option value=\"{0}\">{1}</option>", opt.Key, opt.Value)
    .AppendLine();

Before going any further let’s eliminate one of the things that really annoys me when using the StringBuilder. How silly is it that appending a formatted string with a new line requires us to either include the new line in the format string or call AppendLine with no arguments as I’ve done here? Why not combine the methods with an extension method?

Let’s start building out an extension class and our first convenience method, AppendFormattedLine.

I’d typically host this class in the System.Text namespace to ensure that the extension methods are automatically imported whenever we use the StringBuilder.

public static class StringBuilderExtensions
{
    public static StringBuilder AppendFormattedLine(
        this StringBuilder @this,
        string format,
        params object[] args)
    {
        return @this.AppendFormat(format, args).AppendLine();
    }
}

Here we’ve simply leveraged the StringBuilder’s existing fluent interface to chain the AppendFormat and AppendLine methods.

With AppendFormattedLine in place we can simplify the aforementioned snippets. The first becomes:

var html =
    new StringBuilder()
        .AppendFormattedLine("<select id=\"{0}\" name=\"{0}\">", id);

And the second becomes:

html.AppendFormattedLine("\t<option value=\"{0}\">{1}</option>", opt.Key, opt.Value);

It’s at this point where the StringBuilder’s built-in fluent interface breaks down. Rather than continuing to chain the various methods we’re forced to break out into imperative code to conditionally append a string and also to append strings iteratively.

That we have to revert to imperative code for these tasks isn’t particularly surprising given how StringBuilder predates all of the language features that make such chaining feasible. In the early days of C# and .NET we really had but one option for making it work: define a class to wrap the StringBuilder (remember, StringBuilder is sealed so we can’t derive from it) and expose methods that would accept either an interface implementation or a classic delegate. Of course, doing so would require us to expose all of the existing StringBuilder methods so we wouldn’t lose functionality. Defining such a class would be time-consuming and error prone.

Today is a different story. Because we now have generics, lambda expressions, and extension methods, it’s trivial to extend the StringBuilder’s functionality with a few higher-order extension methods to provide us with the functionality we desire.

Let’s start with one to conditionally append a string:

public static StringBuilder AppendLineWhen(
    this StringBuilder @this,
    Func<bool> predicate,
    string value)
{
    return
        predicate()
            ? @this.AppendLine(value)
            : @this;
}

The AppendLineWhen method takes three parameters:

  1. The StringBuilder used in the chain
  2. A Func<bool> representing the condition upon which appending the string is predicated
  3. The value to append

Some people question naming the StringBuilder “@this” but I think it’s acceptable because if the extension method were an instance member we’d be referring to the instance as “this” anyway. By using the prefixed identifier we keep things consistent.

The method body is simple: if the predicate evaluates to true we invoke AppendLine on the supplied StringBuilder and return the instance via the fluent interface; otherwise we just return the supplied StringBuilder.

Now we can refactor our conditional append to use AppendLineWhen. After refactoring it becomes:

html
    .AppendLineWhen(
        () => includeUnknown,
        "\t<option>Unknown</option>");

We’ll come back and simplify the chain in a moment but let’s first look at how to include iterative appends in the chain. To accomplish this we’ll use a similar approach by defining another higher-order extension method. Let’s start by defining the method but keeping the imperative approach for the body.

public static StringBuilder AppendSequence<T>(
    this StringBuilder @this,
    IEnumerable<T> sequence,
    Func<StringBuilder, T, StringBuilder> fn)
{
    foreach(var item in sequence)
    {
        fn(@this, item);
    }

    return @this;
}

Just as with the AppendLineWhen method we have three parameters:

  1. The StringBuilder
  2. The sequence we want to append
  3. A function that defines how to append each item

In this version we simply invoke the formatting function for each item in the sequence and return the supplied StringBuilder. Let’s refactor the original code to use AppendSequence to see how this looks.

html
    .AppendSequence(
        options,
        (sb, opt) =>
            sb.AppendFormattedLine("\t<option value=\"{0}\">{1}</option>", opt.Key, opt.Value));

This approach allows us the most control over appending each item in the sequence. By defining the behavior through a lambda expression that accepts both the StringBuilder instance and each item we can nest a chain for more complex formatting scenarios.

The imperative body certainly works but the purpose of this post is to move from imperative code to more functional code so let’s go back and see what we can do on that front.

Remember from before that functional programming is a sequence of FARTS. It could be argued that the AppendSequence method is transforming the sequence but I prefer to think of it as reducing the sequence for this exercise. What we’re doing is reducing the items into the single StringBuilder instance. (Technically we’re folding the sequence which involves both a transformation and reduction but that’s a discussion for another day.)

Let’s look at the function that AppendSequence wants. It’s defined as Func<StringBuilder, T, StringBuilder>. This says that the function must accept a StringBuilder and an instance of T and will return a StringBuilder. This signature can be further generalized to Func<T, U, T> which, not coincidentally, corresponds with the delegate accepted by an existing LINQ function: Aggregate.

Aggregate is to LINQ what reduce is to functional programmers. The overload we care about here accepts a delegate with signature Func<TAccumulate, TSource, TAccumulate>. This means that the function we’re passing to AppendSequence can also be passed to Aggregate so we can easily refactor AppendSequence to use a simple LINQ expression like this:

public static StringBuilder AppendSequence<T>(
    this StringBuilder @this,
    IEnumerable<T> sequence,
    Func<StringBuilder, T, StringBuilder> fn)
{
    return sequence.Aggregate(@this, fn);
}
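To see that Aggregate overload on its own, the sketch below folds a few words into a seed value twice: once into an int and once into a StringBuilder, which is the shape AppendSequence relies on. The AggregateDemo class and the sample words are illustrative only.

```csharp
using System;
using System.Linq;
using System.Text;

public static class AggregateDemo
{
    public static int TotalLength(string[] words)
    {
        // The seed is 0, so TAccumulate is int; each step adds one word's length.
        return words.Aggregate(0, (total, word) => total + word.Length);
    }

    public static string Concatenate(string[] words)
    {
        // The seed is a StringBuilder, so TAccumulate is StringBuilder. Append
        // returns the same instance, which makes the lambda fit
        // Func<StringBuilder, string, StringBuilder>.
        return words
            .Aggregate(new StringBuilder(), (sb, word) => sb.Append(word).Append(';'))
            .ToString();
    }

    public static void Main()
    {
        var words = new[] { "filter", "map", "reduce" };
        Console.WriteLine(TotalLength(words));  // 15
        Console.WriteLine(Concatenate(words));  // filter;map;reduce;
    }
}
```

In both cases the accumulator is threaded from one step to the next, so no loop variable or mutable local is needed at the call site.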

So now that we’ve defined a few extension methods to add conditional and iterative appending to StringBuilder’s fluent interface let’s go back and actually chain everything together. We can even go as far as reducing the entire method to a single method chain as follows:

private static string BuildSelectBox(
    IDictionary<int, string> options,
    string id,
    bool includeUnknown)
{
    return
        new StringBuilder()
            .AppendFormattedLine("<select id=\"{0}\" name=\"{0}\">", id)
            .AppendLineWhen(
                () => includeUnknown,
                "\t<option>Unknown</option>")
            .AppendSequence(
                options,
                (sb, opt) => sb.AppendFormattedLine("\t<option value=\"{0}\">{1}</option>", opt.Key, opt.Value))
            .AppendLine("</select>")
            .ToString();
}

By combining every append into a single chain we’ve successfully eliminated the html variable and instead let the original StringBuilder flow through the chain, ultimately returning the constructed string.
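For anyone who wants to run the chain, the pieces from this section can be gathered into one file roughly as follows. The extension methods mirror the ones defined above; the SelectBoxDemo class, the sample options, and the choice of SortedDictionary (used only to make the output order predictable) are assumptions of this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class StringBuilderExtensions
{
    public static StringBuilder AppendFormattedLine(
        this StringBuilder @this, string format, params object[] args)
    {
        return @this.AppendFormat(format, args).AppendLine();
    }

    public static StringBuilder AppendLineWhen(
        this StringBuilder @this, Func<bool> predicate, string value)
    {
        return predicate() ? @this.AppendLine(value) : @this;
    }

    public static StringBuilder AppendSequence<T>(
        this StringBuilder @this,
        IEnumerable<T> sequence,
        Func<StringBuilder, T, StringBuilder> fn)
    {
        return sequence.Aggregate(@this, fn);
    }
}

public static class SelectBoxDemo
{
    public static string BuildSelectBox(
        IDictionary<int, string> options, string id, bool includeUnknown)
    {
        return new StringBuilder()
            .AppendFormattedLine("<select id=\"{0}\" name=\"{0}\">", id)
            .AppendLineWhen(() => includeUnknown, "\t<option>Unknown</option>")
            .AppendSequence(
                options,
                (sb, opt) => sb.AppendFormattedLine(
                    "\t<option value=\"{0}\">{1}</option>", opt.Key, opt.Value))
            .AppendLine("</select>")
            .ToString();
    }

    public static void Main()
    {
        // Hypothetical sample input for the demo.
        var options = new SortedDictionary<int, string> { { 1, "One" }, { 2, "Two" } };
        Console.Write(BuildSelectBox(options, "numbers", true));
    }
}
```

Run as written, this prints the opening select tag, a tab-indented Unknown option, one tab-indented option per dictionary entry, and the closing select tag, each on its own line.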

Before we move on, let’s revisit the AppendLineWhen method. The way it’s written restricts us to conditionally appending a line. If we wanted to conditionally execute any of the other append methods like Append or even AppendSequence we’d have to create additional extension methods. These would, of course, each follow the exact same pattern so let’s scratch that approach and replace it with one that’s a bit more flexible.

public static StringBuilder When(
    this StringBuilder @this,
    Func<bool> predicate,
    Func<StringBuilder, StringBuilder> fn)
{
    return
        predicate()
            ? fn(@this)
            : @this;
}

Here we’ve renamed the method from AppendLineWhen to When and rather than invoking AppendLine we’re invoking a function that both accepts and returns a StringBuilder. This allows us to conditionally nest additional StringBuilder chains. Refactoring the code to use this new extension method results in the following:

private static string BuildSelectBox(
    IDictionary<int, string> options,
    string id,
    bool includeUnknown)
{
    return
        new StringBuilder()
            .AppendFormattedLine("<select id=\"{0}\" name=\"{0}\">", id)
            .When(
                () => includeUnknown,
                sb => sb.AppendLine("\t<option>Unknown</option>"))
            .AppendSequence(
                options,
                (sb, opt) => sb.AppendFormattedLine("\t<option value=\"{0}\">{1}</option>", opt.Key, opt.Value))
            .AppendLine("</select>")
            .ToString();
}

You can, of course, further generalize the When extension method to make it apply to any type but I’ll leave that as an exercise for you (hint: make When generic).
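One possible shape for that generic version is sketched below. It is only an assumption about what the exercise is after: the ConditionalExtensions and WhenDemo names are hypothetical, and the Greeting method exists just to show When applied to two different types in the same chain.

```csharp
using System;
using System.Text;

public static class ConditionalExtensions
{
    // Hypothetical generic form: T can be any type, not just StringBuilder.
    public static T When<T>(this T @this, Func<bool> predicate, Func<T, T> fn)
    {
        return predicate() ? fn(@this) : @this;
    }
}

public static class WhenDemo
{
    public static string Greeting(string name, bool shout)
    {
        return new StringBuilder("Hello, ")
            .Append(name)
            .When(() => shout, sb => sb.Append('!'))        // T is StringBuilder here
            .ToString()
            .When(() => shout, s => s.ToUpperInvariant());  // T is string here
    }

    public static void Main()
    {
        Console.WriteLine(Greeting("world", false)); // Hello, world
        Console.WriteLine(Greeting("world", true));  // HELLO, WORLD!
    }
}
```

Because the function must return the same type it receives, this form keeps the chain's type stable; changing the type mid-chain is a different operation.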

Generalized Method Chaining

Now that we’ve seen how powerful fluent interfaces can be by extending the StringBuilder let’s start looking at how we can carry this concept to other parts of C#, even across types. One simple thing we can do is work around the language’s statement-centric nature.

Functional languages are typically expression-based. That is, everything returns a value – even language features that object-oriented programmers wouldn’t typically associate with return values. For instance, in C# if..else is a statement whereas in F# it’s an expression. So why is this distinction so important?

In C#, this distinction is the reason we need both the Action and Func generic delegate types (and their ilk) rather than simply having Func. It’s also the difference between using various concepts inline rather than breaking into an imperative style like we did with the StringBuilder before adding the extension methods.
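A small sketch of that split: a computation that returns a value fits Func and can be nested inside other expressions, while one that returns nothing needs Action and can only appear as a statement. The DelegateKinds class and the delegate names are illustrative.

```csharp
using System;

public static class DelegateKinds
{
    // Produces a value, so it is a Func.
    public static readonly Func<int, int> Square = n => n * n;

    public static void Main()
    {
        // Returns void, so it has to be an Action.
        // Func<string, void> is not a legal type in C#.
        Action<string> print = s => Console.WriteLine(s);

        // Square(4) yields a value and can sit inside a larger expression;
        // print(...) yields nothing, so the chain has to end there.
        print((Square(4) + 1).ToString());
    }
}
```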

C# already has a few workarounds such as the conditional operator which I generally prefer for simple assignments because of its inline nature. Consider the following assignment:

string posOrNeg;

if (value > 0)
    posOrNeg = "positive";
else
    posOrNeg = "negative";

Here we conditionally assign posOrNeg to a string depending on whether value is positive or negative but it requires us to define posOrNeg then assign it in two steps. I’d prefer to do it in a single step like this:

var posOrNeg =
    value > 0
        ? "positive"
        : "negative";

The expression-based approach is much more streamlined and eliminates some of the repetition of the imperative approach. The other benefit is that it also allows us to take advantage of C#’s type inference capabilities. But what about other statements where C# doesn’t have a built-in expression-based alternative? How can we handle those situations? By wrapping them in higher-order functions, of course!

One statement that I’ve found difficult to work with is the using statement. I’ve found that the majority of the time I’ve needed a using statement, it was to get some value from the associated IDisposable instance. Just as with the if..else statement, this approach requires defining the variable, entering the using block, and assigning the value within the block. Consider this code which simply reads the contents of a stream into a byte array:

I’ve written about a functional version of the using statement before. This section is derived from that work.

byte[] buffer;

using (var stream = StreamFactory.GetStream())
{
    buffer = new byte[stream.Length];
    stream.Read(buffer, 0, (int)stream.Length);
}

By wrapping the using statement within a higher-order function we can achieve the same result as replacing if..else with the conditional operator. Let’s define that function as a static method of a class we’ll call Disposable.

Note that we’re not defining an extension method because we’re not extending anything. Instead we’re simply defining a function that will handle the lifecycle of an IDisposable instance.

public static class Disposable
{
    public static TResult Using<TDisposable, TResult>
    (
      Func<TDisposable> factory,
      Func<TDisposable, TResult> fn)
      where TDisposable : IDisposable
    {
        using (var disposable = factory())
        {
            return fn(disposable);
        }
    }
}

This helper function, Disposable.Using, accepts two delegates: one to create the IDisposable instance and another to act upon it within the using block. To make this type safe we’ve also constrained the TDisposable type parameter to IDisposable rather than using IDisposable directly. Now let’s look at how Disposable.Using can streamline the imperative version.

var buffer =
    Disposable
        .Using(
            () => StreamFactory.GetStream(),
            stream =>
            {
                var b = new byte[stream.Length];
                stream.Read(b, 0, (int)stream.Length);
                return b;
            });

Wait – I thought this was supposed to streamline things but all it looks like we’ve done is add some complexity! Well, we’re not quite done yet but we’ve already achieved two things. First, we’ve combined the variable definition and the assignment. The other thing is that we’ve allowed including using blocks within a method chain. I’ll revisit the latter in a bit but first let’s see how we can further simplify this.

There’s one thing we can do without adding any additional code. Notice that Disposable.Using’s first parameter is Func<TDisposable> and that StreamFactory.GetStream satisfies that. Rather than using a lambda expression we can simply use StreamFactory.GetStream as the first argument like this:

var buffer =
    Disposable
        .Using(
            StreamFactory.GetStream,
            stream =>
            {
                var b = new byte[stream.Length];
                stream.Read(b, 0, (int)stream.Length);
                return b;
            });

That’s a bit better but I still don’t like that lambda block. There’s a simple thing we can do to replace it with a basic lambda expression. Before I reveal the solution, let’s look at what that block is really doing. It:

  1. Defines and initializes a byte array to hold the stream contents
  2. Loads the stream contents into the byte array using a side-effecting function
  3. Returns the byte array

We could simply refactor the lambda body to a new named method but that seems overkill for something that’s only going to be used here. Think of it this way: would you really define a named method for filtering or sorting a sequence in LINQ? Probably not. What we really want is a more functional way to populate the byte array and return it. Not surprisingly we can borrow a pattern from functional programming to achieve just that. What’s more is that we can define it in such a manner that it’s applicable in more general cases. Let’s take a look at that function which we’ll place in a class called FunctionalExtensions.

public static class FunctionalExtensions
{
    public static T Tee<T>(this T @this, Action<T> action)
    {
        action(@this);
        return @this;
    }
}

The Tee extension method takes its name from the corresponding UNIX command which is used in command pipelines to cause a side-effect with a given input and return the original value. Here, our side-effect is populating a byte array but we could just as easily use Tee for logging or anything else for that matter. Here’s Tee in action with our Using method:

var buffer =
    Disposable
        .Using(
            StreamFactory.GetStream,
            stream => new byte[stream.Length].Tee(b => stream.Read(b, 0, (int)stream.Length)));

There, isn’t that better? By introducing Tee we’ve eliminated the lambda block so we can let the byte array flow through the delegate. We could take this even further by extending Stream with a ReadToBuffer method and simply passing that method to Tee but I think the point is clear.
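For the curious, one way that ReadToBuffer idea might look is sketched below. The post never defines ReadToBuffer, so its signature and its read loop are assumptions; the loop is there because Stream.Read is allowed to return fewer bytes than requested. A MemoryStream stands in for StreamFactory.GetStream so the sample runs on its own.

```csharp
using System;
using System.IO;
using System.Text;

public static class Disposable
{
    public static TResult Using<TDisposable, TResult>(
        Func<TDisposable> factory,
        Func<TDisposable, TResult> fn)
        where TDisposable : IDisposable
    {
        using (var disposable = factory())
        {
            return fn(disposable);
        }
    }
}

public static class FunctionalExtensions
{
    public static T Tee<T>(this T @this, Action<T> action)
    {
        action(@this);
        return @this;
    }
}

public static class StreamExtensions
{
    // Hypothetical helper: keeps reading until the buffer is full
    // or the stream reports that it has no more data.
    public static void ReadToBuffer(this Stream @this, byte[] buffer)
    {
        var offset = 0;
        while (offset < buffer.Length)
        {
            var read = @this.Read(buffer, offset, buffer.Length - offset);
            if (read == 0) break; // end of stream
            offset += read;
        }
    }
}

public static class ReadToBufferDemo
{
    public static byte[] ReadAll(Func<Stream> factory)
    {
        return Disposable.Using(
            factory,
            // The method group stream.ReadToBuffer converts to Action<byte[]>.
            stream => new byte[stream.Length].Tee(stream.ReadToBuffer));
    }

    public static void Main()
    {
        var bytes = Encoding.UTF8.GetBytes("hello");
        var buffer = ReadAll(() => new MemoryStream(bytes));
        Console.WriteLine(Encoding.UTF8.GetString(buffer));
    }
}
```

With the helper in place the lambda passed to Using no longer mentions offsets or lengths beyond sizing the array.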

Of course, the nature of programming is such that once we have a value we probably want to do something with it. In fact, that’s the whole idea behind pipelining. Why should we stop here? Why not ~~steal~~ borrow yet another idea from the functional world?

Thus far we haven’t discussed why we’re reading the stream. For the purposes of this discussion let’s assume that the stream contains a sequence of line-break delimited strings we want to pass to the BuildSelectBox method that we refactored during the StringBuilder part of this post. The remainder of this method might look like this:

var options =
    Encoding
        .UTF8
        .GetString(buffer)
        .Split(new[] { Environment.NewLine, }, StringSplitOptions.RemoveEmptyEntries)
        .Select((s, ix) => Tuple.Create(ix, s))
        .ToDictionary(k => k.Item1, v => v.Item2);

return BuildSelectBox(options, "strings", true);

Here we’re:

  1. converting the byte array which we obtained by reading the stream to a UTF8 string
  2. splitting the resulting string to an array
  3. transforming the array to a sequence of tuples containing the original string and index
  4. transforming the tuples to a dictionary
  5. building the select box from the dictionary

There’s clearly already a bunch of chaining going on here beginning with the call to GetString but by introducing just one more extension method, Map, we can combine each part of this method into a single chain. Here’s the Map method in its entirety:

public static TResult Map<TSource, TResult>(
  this TSource @this,
  Func<TSource, TResult> fn)
{
    return fn(@this);
}

Notice how similar Map is to Tee. In both cases we’re accepting a value and applying a delegate to that value. Unlike Tee, Map requires two type parameters because it’s going to transform, or map, one value to the result of the function. Let’s go ahead and work this into the method body.

return
    Disposable
        .Using(
            StreamFactory.GetStream,
            stream => new byte[stream.Length].Tee(b => stream.Read(b, 0, (int)stream.Length)))
        .Map(Encoding.UTF8.GetString)
        .Split(new[] { Environment.NewLine, }, StringSplitOptions.RemoveEmptyEntries)
        .Select((s, ix) => Tuple.Create(ix, s))
        .ToDictionary(k => k.Item1, v => v.Item2)
        .Map(o => BuildSelectBox(o, "strings", true));

Through the magic of higher-order extension methods we’ve managed to transform our original imperative code into a single chain with clearly delineated operations and no variables exposed outside of the code where they’re actually meaningful.
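To see Map and Tee side by side outside the stream example, here is a minimal sketch on plain values. The temperature conversion, the MapTeeDemo class, and the log list are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class FunctionalExtensions
{
    public static T Tee<T>(this T @this, Action<T> action)
    {
        action(@this);
        return @this;
    }

    public static TResult Map<TSource, TResult>(
        this TSource @this, Func<TSource, TResult> fn)
    {
        return fn(@this);
    }
}

public static class MapTeeDemo
{
    public static string Describe(int celsius, List<string> log)
    {
        return celsius
            .Map(c => c * 9 / 5 + 32)              // int -> int: the value changes
            .Tee(f => log.Add("fahrenheit=" + f))  // side effect only: the value passes through
            .Map(f => f + " F");                   // int -> string: the type changes
    }

    public static void Main()
    {
        var log = new List<string>();
        Console.WriteLine(Describe(100, log));      // 212 F
        Console.WriteLine(string.Join(", ", log));  // fahrenheit=212
    }
}
```

Tee hands back exactly what it was given, which is why it needs only one type parameter, whereas each Map is free to return something of a different type.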

I should point out here that C#’s type inference capabilities aren’t quite as powerful as I’d like. In many cases it’s adequate but I’ve found that there are times such as when including NHibernate’s ISession.Get in a chain that I have to be explicit with the type parameters on Map because C# won’t treat the value I’m passing in as object.

Before we conclude let’s look at how we can leverage one more powerful functional programming concept: partial application.

Partial Application

Functional languages often employ a technique called currying which involves converting a multi-parameter function into a chain of functions where each parameter is applied individually. This is in stark contrast with object-oriented languages where the arguments are typically applied simultaneously. Much has been written about currying so I won’t go into much detail about it here (I cover it in more detail in my book). Instead I want to focus on what currying enables: partial application.

In a curried function we can apply the first n parameters which results in a new function we can invoke later. Consider this stereotypical example of a curried add function using F# syntax.

let add x y = x + y

This may look like it accepts two parameters but its signature is:

int -> int -> int

Essentially this means that add is a function that accepts an int and returns another function which accepts int and returns int. Here’s where the power comes from – if we find ourselves constantly adding five to something else we could leverage partial application and make a new addFive function like this (again using F# syntax):

let addFive = add 5

Once that function is defined we can invoke addFive supplying only one parameter.

Thanks to delegates we can achieve something similar in C# although it’s not nearly as clean as in F# so I recommend using it sparingly. That is, I’d leverage it in simple scenarios but avoid complex partial application chains unless you want your method signatures to start looking like Common Lisp with parentheses replaced with angle brackets. That said, let’s see partial application in C# in action by making another change to the BuildSelectBox method.

private static Func<IDictionary<int, string>, string> BuildSelectBox(
    string id,
    bool includeUnknown)
{
    return options =>
        new StringBuilder()
            .AppendFormattedLine("<select id=\"{0}\" name=\"{0}\">", id)
            .When(
                () => includeUnknown,
                sb => sb.AppendLine("\t<option>Unknown</option>"))
            .AppendSequence(
                options,
                (sb, opt) => sb.AppendFormattedLine("\t<option value=\"{0}\">{1}</option>", opt.Key, opt.Value))
            .AppendLine("</select>")
            .ToString();
}

There are three changes here:

  1. The return type is now Func<IDictionary<int, string>, string>
  2. The options parameter has been removed
  3. The method body now returns a delegate

With these changes BuildSelectBox is no longer building the select box – it’s building a closure around the id and includeUnknown parameters and returning that closure as a delegate. You can think of these changes as essentially rearranging the parameters such that options is now the final parameter in a chain. To fully apply BuildSelectBox we’d need to invoke it like this:

BuildSelectBox("strings", true)(options);

This line highlights the fact that BuildSelectBox truly is returning a delegate. So how does this affect our stream reading chain? Well, because BuildSelectBox now returns a delegate whose signature just happens to correspond with that of the final call to Map, we can eliminate the lambda expression and just let the dictionary naturally flow into the delegate returned by BuildSelectBox like this:

return
    Disposable
        .Using(
            StreamFactory.GetStream,
            stream => new byte[stream.Length].Tee(b => stream.Read(b, 0, (int)stream.Length)))
        .Map(Encoding.UTF8.GetString)
        .Split(new[] { Environment.NewLine, }, StringSplitOptions.RemoveEmptyEntries)
        .Select((s, ix) => Tuple.Create(ix, s))
        .ToDictionary(k => k.Item1, v => v.Item2)
        .Map(BuildSelectBox("strings", true));
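The add and addFive functions from the F# snippets can be approximated in the same way. The sketch below is one possible translation using nested Func delegates; the CurryDemo class is hypothetical, and the nested generic type on Add gives a taste of why the signatures get noisy quickly.

```csharp
using System;

public static class CurryDemo
{
    // Curried form: a function that takes x and returns a function that takes y.
    public static readonly Func<int, Func<int, int>> Add = x => y => x + y;

    // Partial application: fix the first argument and get back a new function.
    public static readonly Func<int, int> AddFive = Add(5);

    public static void Main()
    {
        Console.WriteLine(AddFive(3));  // 8
        Console.WriteLine(Add(2)(3));   // 5
    }
}
```

The second call in Main has the same shape as BuildSelectBox("strings", true)(options): one invocation to supply the early arguments and another to supply the last.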

So there you have it – effective pipelining in C#. Pipelining is a powerful technique that lets data flow through the system in a declarative manner. By leveraging higher-order extension methods we can improve upon existing fluent interfaces or provide chaining capabilities throughout C#. We can even use higher-order static methods to convert some of the language’s statements into expressions thus extending the chaining capabilities even further. This approach doesn’t get us to the capabilities of what “true” functional languages like F# provide but it certainly makes working in the often imperative world of C# a bit more bearable to those of us used to having those tools at our disposal.

5 comments

  1. Great article! I just wanted to note I think the lines:
    .AppendLine(“”)
    .ToString();
    Should be:
    .AppendLine(“”)
    .ToString();
    I’m so used to the flow of html it threw me a bit at first :)

  2. Well it looks like it cut off my tags. I think the last AppendLine closing “option” should actually be “select”.

    1. Good catch! I’ve updated the post to reflect the correct tag. Naturally WordPress then messed up my formatting so that turned into quite a job.

  3. Flyweight is a relatively uncommon pattern in modern software. In software history it used to be a very important pattern because you had kilobytes of RAM in the entire system. You had to store things in as few bytes as possible. The term flyweight is a reference to the boxing weight class that is for fighters that barely weigh 100lbs.

    The simplest way to view flyweight is to take the string “a” and the string “aaaa”. You could easily replace “a” with the bit 1, so instead of using 160-640 bits of memory to store the string “aaaa” you could store just 1111 as 4 bits.

    This pattern still has modern usages; it is frequently used in word processors and as a bedrock in modern languages as a method to do string interning. In many modern languages, if you create the string “aaaa” in your application it will exist exactly once in memory.
