Specifications, Expression Trees, and NHibernate

For my latest project we decided to follow a domain-driven approach. It wasn’t until after we had built out much of the domain model that we decided to change course and embrace CQRS and event sourcing. By that point, though, we had already developed several specifications and wanted to adapt what we could on the query side.

The Specification Pattern

In its most basic form the specification pattern itself can be realized with a simple interface with a single method:

public interface ISpecification<T>
{
	bool IsSatisfiedBy(T obj);
}
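
As a minimal sketch of what a concrete implementation of this interface might look like, assume a hypothetical Listing class with a Bedrooms property. Both Listing and MinimumBedroomsSpecification are names invented for this illustration rather than code from the project.

```csharp
using System;

public interface ISpecification<T>
{
	bool IsSatisfiedBy(T obj);
}

// Hypothetical domain class, used only for this illustration.
public class Listing
{
	public int Bedrooms { get; set; }
}

// Hypothetical specification: satisfied when a listing has at least
// the requested number of bedrooms.
public class MinimumBedroomsSpecification : ISpecification<Listing>
{
	private readonly int _minBedrooms;

	public MinimumBedroomsSpecification(int minBedrooms)
	{
		_minBedrooms = minBedrooms;
	}

	public bool IsSatisfiedBy(Listing obj)
	{
		return obj.Bedrooms >= _minBedrooms;
	}
}

public static class Program
{
	public static void Main()
	{
		var spec = new MinimumBedroomsSpecification(3);

		Console.WriteLine(spec.IsSatisfiedBy(new Listing { Bedrooms = 4 })); // True
		Console.WriteLine(spec.IsSatisfiedBy(new Listing { Bedrooms = 2 })); // False
	}
}
```

The business rule lives in one small class, so callers only ever ask whether an object satisfies it.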

To increase flexibility of the specifications we followed the composite variation where specifications can be chained together with some logical operators. We started with a base class to simplify implementation a bit:

public abstract class CompositeSpecificationBase<T> : ISpecification<T>
{
	private readonly ISpecification<T> _leftExpr;
	private readonly ISpecification<T> _rightExpr;

	protected CompositeSpecificationBase(
		ISpecification<T> left,
		ISpecification<T> right)
	{
		_leftExpr = left;
		_rightExpr = right;
	}

	public ISpecification<T> Left { get { return _leftExpr; } }
	public ISpecification<T> Right { get { return _rightExpr; } }

	public abstract bool IsSatisfiedBy(T obj);
}

To combine two specifications with an And operation we have the AndSpecification<T> class:

public class AndSpecification<T> : CompositeSpecificationBase<T>
{
	public AndSpecification(
		ISpecification<T> left,
		ISpecification<T> right)
		: base(left, right)
	{
	}

	public override bool IsSatisfiedBy(T obj)
	{
		return Left.IsSatisfiedBy(obj) && Right.IsSatisfiedBy(obj);
	}
}

…and its Or counterpart:

public class OrSpecification<T> : CompositeSpecificationBase<T>
{
	public OrSpecification(
		ISpecification<T> left,
		ISpecification<T> right)
		: base(left, right)
	{
	}

	public override bool IsSatisfiedBy(T obj)
	{
		return Left.IsSatisfiedBy(obj) || Right.IsSatisfiedBy(obj);
	}
}

Of course, there were times when we needed to negate a specification, so we have a specification for that too:

public class NegatedSpecification<T> : ISpecification<T>
{
	private readonly ISpecification<T> _inner;

	public NegatedSpecification(ISpecification<T> inner)
	{
		_inner = inner;
	}

	public ISpecification<T> Inner
	{
		get { return _inner; }
	}

	public bool IsSatisfiedBy(T obj)
	{
		return !_inner.IsSatisfiedBy(obj);
	}
}

To simplify creating these objects we included some extension methods for the ISpecification<T> interface:

public static class ISpecificationExtensions
{
	public static ISpecification<T> And<T>(
		this ISpecification<T> left,
		ISpecification<T> right)
	{
		return new AndSpecification<T>(left, right);
	}

	public static ISpecification<T> Or<T>(
		this ISpecification<T> left,
		ISpecification<T> right)
	{
		return new OrSpecification<T>(left, right);
	}

	public static ISpecification<T> Negate<T>(this ISpecification<T> inner)
	{
		return new NegatedSpecification<T>(inner);
	}
}
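
To see how the extension methods read at the call site, here is a condensed, self-contained sketch. PredicateSpecification<T> is a hypothetical helper that wraps a delegate purely to keep the example short, and the composite classes are trimmed versions of the ones above (Or is omitted for brevity).

```csharp
using System;

public interface ISpecification<T>
{
	bool IsSatisfiedBy(T obj);
}

// Hypothetical helper that wraps a delegate so the example stays short.
public class PredicateSpecification<T> : ISpecification<T>
{
	private readonly Func<T, bool> _predicate;
	public PredicateSpecification(Func<T, bool> predicate) { _predicate = predicate; }
	public bool IsSatisfiedBy(T obj) { return _predicate(obj); }
}

public class AndSpecification<T> : ISpecification<T>
{
	private readonly ISpecification<T> _left;
	private readonly ISpecification<T> _right;

	public AndSpecification(ISpecification<T> left, ISpecification<T> right)
	{
		_left = left;
		_right = right;
	}

	public bool IsSatisfiedBy(T obj)
	{
		return _left.IsSatisfiedBy(obj) && _right.IsSatisfiedBy(obj);
	}
}

public class NegatedSpecification<T> : ISpecification<T>
{
	private readonly ISpecification<T> _inner;
	public NegatedSpecification(ISpecification<T> inner) { _inner = inner; }
	public bool IsSatisfiedBy(T obj) { return !_inner.IsSatisfiedBy(obj); }
}

public static class SpecificationExtensions
{
	public static ISpecification<T> And<T>(this ISpecification<T> left, ISpecification<T> right)
	{
		return new AndSpecification<T>(left, right);
	}

	public static ISpecification<T> Negate<T>(this ISpecification<T> inner)
	{
		return new NegatedSpecification<T>(inner);
	}
}

public static class Program
{
	public static void Main()
	{
		ISpecification<int> isPositive = new PredicateSpecification<int>(n => n > 0);
		ISpecification<int> isEven = new PredicateSpecification<int>(n => n % 2 == 0);

		// Reads left to right: positive AND NOT even.
		var positiveAndOdd = isPositive.And(isEven.Negate());

		Console.WriteLine(positiveAndOdd.IsSatisfiedBy(3));  // True
		Console.WriteLine(positiveAndOdd.IsSatisfiedBy(4));  // False
		Console.WriteLine(positiveAndOdd.IsSatisfiedBy(-3)); // False
	}
}
```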

As you can see, the specification pattern lends itself nicely to working with in-memory objects but what about including them as criteria in a query? Obviously we’d rather do our filtering within the database than bring back a massive result set to filter within our application so we looked at ways to extend the pattern.

In Domain-Driven Design: Tackling Complexity in the Heart of Software (the blue book), Eric Evans describes an approach that involves extending the pattern to include an asSQL() method that produces the SQL appropriate for the specification. This approach can certainly work but, as Evans notes, it’s far from ideal and involves duplicating logic for both in-memory objects and database queries, so we decided to try another approach.

Expression Trees

Introduced in .NET 3.5, expression trees give us a hierarchical structure that we can use to represent language-level code. Many of the LINQ providers use expression trees (via IQueryable<T>), but in most cases the expression trees are completely transparent to us because we generally define them with lambda expressions that the compiler converts to an expression tree rather than a delegate.
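
The difference between the two conversions is easiest to see side by side. The following is a small sketch (the variable names are arbitrary) that assigns the same lambda to a delegate type and to an Expression<TDelegate>, inspects the resulting tree, and then compiles it:

```csharp
using System;
using System.Linq.Expressions;

public static class Program
{
	public static void Main()
	{
		// Assigned to a delegate type, the lambda is compiled to executable code.
		Func<int, bool> isEvenDelegate = n => n % 2 == 0;

		// Assigned to Expression<TDelegate>, the same lambda becomes a data structure.
		Expression<Func<int, bool>> isEvenExpr = n => n % 2 == 0;

		Console.WriteLine(isEvenDelegate(4));             // True
		Console.WriteLine(isEvenExpr.Body.NodeType);      // Equal
		Console.WriteLine(isEvenExpr.Parameters[0].Name); // n

		// The tree has to be compiled before it can be invoked.
		Func<int, bool> compiled = isEvenExpr.Compile();
		Console.WriteLine(compiled(3));                   // False
	}
}
```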

Adapting the specification pattern to use expression trees is pretty straightforward. The most important change is to expand ISpecification<T> to expose an expression tree through a property.

Note: You’ll need to import the System.Linq.Expressions namespace to bring in the expression tree classes.

public interface ISpecification<T>
{
	Expression<Func<T, bool>> SpecExpression { get; }
	bool IsSatisfiedBy(T obj);
}

I don’t really like the name “SpecExpression” but it was preferable to aliasing Expression everywhere to avoid the naming conflict between the property and the class.

Due to how the expression trees will be used by both queries and the IsSatisfiedBy method we found it helpful to introduce a SpecificationBase<T> class to encapsulate some common code.

public abstract class SpecificationBase<T> : ISpecification<T>
{
	private Func<T, bool> _compiledExpression;

	private Func<T, bool> CompiledExpression
	{
		get { return _compiledExpression ?? (_compiledExpression = SpecExpression.Compile()); }
	}

	public abstract Expression<Func<T, bool>> SpecExpression { get; }

	public bool IsSatisfiedBy(T obj)
	{
		return CompiledExpression(obj);
	}
}

To avoid code duplication all of the specification logic is encapsulated within the individual expression trees. An expression tree isn’t executable code though so in order to use that logic in the IsSatisfiedBy method we need to compile the tree before it can be invoked. Compiling an expression tree is an expensive operation so rather than compiling it every time IsSatisfiedBy is called, we encapsulate the compilation within the private CompiledExpression property and cache the result.
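
To illustrate the caching, here is a self-contained sketch that repeats the interface and base class and adds a hypothetical PositiveNumberSpecification that counts how often its expression tree is requested. The counter exists only to make the behavior observable; it is not part of the pattern.

```csharp
using System;
using System.Linq.Expressions;

public interface ISpecification<T>
{
	Expression<Func<T, bool>> SpecExpression { get; }
	bool IsSatisfiedBy(T obj);
}

public abstract class SpecificationBase<T> : ISpecification<T>
{
	private Func<T, bool> _compiledExpression;

	private Func<T, bool> CompiledExpression
	{
		get { return _compiledExpression ?? (_compiledExpression = SpecExpression.Compile()); }
	}

	public abstract Expression<Func<T, bool>> SpecExpression { get; }

	public bool IsSatisfiedBy(T obj)
	{
		return CompiledExpression(obj);
	}
}

// Hypothetical specification that counts how often its expression tree is requested.
public class PositiveNumberSpecification : SpecificationBase<int>
{
	public int ExpressionRequests { get; private set; }

	public override Expression<Func<int, bool>> SpecExpression
	{
		get
		{
			ExpressionRequests++;
			return n => n > 0;
		}
	}
}

public static class Program
{
	public static void Main()
	{
		var spec = new PositiveNumberSpecification();

		Console.WriteLine(spec.IsSatisfiedBy(5));   // True
		Console.WriteLine(spec.IsSatisfiedBy(-1));  // False

		// The tree was requested and compiled on the first call only.
		Console.WriteLine(spec.ExpressionRequests); // 1
	}
}
```

Note that the lazy initialization here is not synchronized, so two threads racing on the first call could each compile the tree once. The result is the same either way, which is why it may be acceptable for a simple case like this.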

With this new base class in place we can modify the remaining classes to use it. CompositeSpecificationBase<T> only needs to derive from SpecificationBase<T> and drop its abstract IsSatisfiedBy declaration, so I won’t repeat the definition here. The remaining classes take slightly more work.

The IsSatisfiedBy implementation should be removed from each class because it is fully defined in SpecificationBase<T>, and we need to implement the SpecExpression property for each. NegatedSpecification<T> must also derive from SpecificationBase<T> rather than implementing ISpecification<T> directly. Each implementation follows a similar pattern that builds a new lambda expression from the supplied expression(s). For brevity I’ll only show the SpecExpression implementations rather than the full class definitions.

// AndSpecification
public override Expression<Func<T, bool>> SpecExpression
{
	get
	{
		var objParam = Expression.Parameter(typeof(T), "obj");

		var newExpr = Expression.Lambda<Func<T, bool>>(
			Expression.AndAlso(
				Expression.Invoke(Left.SpecExpression, objParam),
				Expression.Invoke(Right.SpecExpression, objParam)
			),
			objParam
		);

		return newExpr;
	}
}

// OrSpecification
public override Expression<Func<T, bool>> SpecExpression
{
	get
	{
		var objParam = Expression.Parameter(typeof(T), "obj");

		var newExpr = Expression.Lambda<Func<T, bool>>(
			Expression.OrElse(
				Expression.Invoke(Left.SpecExpression, objParam),
				Expression.Invoke(Right.SpecExpression, objParam)
			),
			objParam
		);

		return newExpr;
	}
}

// NegatedSpecification
public override Expression<Func<T, bool>> SpecExpression
{
	get
	{
		var objParam = Expression.Parameter(typeof(T), "obj");

		var newExpr = Expression.Lambda<Func<T, bool>>(
			Expression.Not(
				Expression.Invoke(_inner.SpecExpression, objParam)
			),
			objParam
		);

		return newExpr;
	}
}

The first thing each of these properties does is define the parameter that the new expression will accept. The same parameter is passed to the nested expressions when they are invoked via the Invoke method.

After defining the parameter we define the new expression bodies. Each implementation uses a different expression method: AndAlso, OrElse, and Not. AndAlso performs a conditional And operation that only evaluates the second expression if the first evaluates to true. OrElse is its short-circuiting Or counterpart: it only evaluates the second expression if the first evaluates to false. Not performs a logical negation.
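
The same composition can be tried outside the specification classes. This sketch uses a hypothetical static And helper that applies the AndAlso/Invoke technique to two plain lambda expressions and then compiles the result to check its behavior:

```csharp
using System;
using System.Linq.Expressions;

public static class Program
{
	// Hypothetical helper: combines two predicates into one lambda using
	// the same AndAlso/Invoke approach as the AndSpecification property.
	public static Expression<Func<T, bool>> And<T>(
		Expression<Func<T, bool>> left,
		Expression<Func<T, bool>> right)
	{
		var objParam = Expression.Parameter(typeof(T), "obj");

		return Expression.Lambda<Func<T, bool>>(
			Expression.AndAlso(
				Expression.Invoke(left, objParam),
				Expression.Invoke(right, objParam)
			),
			objParam
		);
	}

	public static void Main()
	{
		Expression<Func<int, bool>> isPositive = n => n > 0;
		Expression<Func<int, bool>> isEven = n => n % 2 == 0;

		var combined = And(isPositive, isEven);
		var predicate = combined.Compile();

		Console.WriteLine(combined.Body.NodeType); // AndAlso
		Console.WriteLine(predicate(4));           // True
		Console.WriteLine(predicate(-4));          // False
		Console.WriteLine(predicate(3));           // False
	}
}
```

Because the body is an AndAlso node wrapping two Invoke nodes, anything that consumes the tree (a compiler or a LINQ provider) has to understand invocation expressions. That is one reason a provider may reject a tree that otherwise looks simple.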

All that’s left now is to define some actual specifications and see how to chain them together. For illustration purposes let’s assume that we’re working with a real estate application. We could have a Listing class that includes some properties indicating the price, whether there’s a basement, number of bedrooms, number of bathrooms, and street address among other things.

We can easily define some specifications that can be combined to form more complex queries while still encapsulating the logic.

public class HasBasementSpecification : SpecificationBase<Listing>
{
	public override Expression<Func<Listing, bool>> SpecExpression
	{
		get { return l => l.HasBasement; }
	}
}

This first specification simply identifies whether a listing has a basement, so it doesn’t require any additional information. Checking price ranges needs something slightly more complex. A single specification could suffice, but treating the minimum and maximum as separate specifications that can be combined is trivial and arguably less complex: if you don’t want to specify part of the range, you don’t need to write extra code to account for it.

public class MinimumPriceSpecification : SpecificationBase<Listing>
{
	private readonly double _minPrice;

	public MinimumPriceSpecification(double minPriceInclusive)
	{
		_minPrice = minPriceInclusive;
	}

	public double MinimumPriceInclusive
	{
		get { return _minPrice; }
	}

	public override Expression<Func<Listing, bool>> SpecExpression
	{
		get { return p => p.Price >= _minPrice; }
	}
}

public class MaximumPriceSpecification : SpecificationBase<Listing>
{
	private readonly double _maxPrice;

	public MaximumPriceSpecification(double maxPriceInclusive)
	{
		_maxPrice = maxPriceInclusive;
	}

	public double MaximumPriceInclusive
	{
		get { return _maxPrice; }
	}

	public override Expression<Func<Listing, bool>> SpecExpression
	{
		get { return p => p.Price <= _maxPrice; }
	}
}

These should be pretty self-explanatory by this point. Just remember that the lambda expression within the SpecExpression property compiles to an expression tree and not a delegate.

Chaining these specifications together is trivial thanks to the extension methods we created earlier.

var spec =
	new HasBasementSpecification()
		.And(
			new MinimumPriceSpecification(175000)
				.And(
					new MaximumPriceSpecification(225000)
				)
		);

We’re now free to either use this composite specification with an in-memory collection or hand off its expression to an ORM that supports expression trees.
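
Both uses can be sketched without a database. The example below is condensed and makes a few assumptions: the interface and CompositeSpecificationBase<T> are folded away to save space, the Listing properties and sample data are invented, and LINQ to Objects’ AsQueryable() stands in for an ORM’s queryable source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical Listing with only the properties this example needs.
public class Listing
{
	public string Address { get; set; }
	public double Price { get; set; }
	public bool HasBasement { get; set; }
}

public abstract class SpecificationBase<T>
{
	private Func<T, bool> _compiled;
	public abstract Expression<Func<T, bool>> SpecExpression { get; }

	public bool IsSatisfiedBy(T obj)
	{
		return (_compiled ?? (_compiled = SpecExpression.Compile()))(obj);
	}
}

public class HasBasementSpecification : SpecificationBase<Listing>
{
	public override Expression<Func<Listing, bool>> SpecExpression
	{
		get { return l => l.HasBasement; }
	}
}

public class MaximumPriceSpecification : SpecificationBase<Listing>
{
	private readonly double _maxPrice;
	public MaximumPriceSpecification(double maxPriceInclusive) { _maxPrice = maxPriceInclusive; }

	public override Expression<Func<Listing, bool>> SpecExpression
	{
		get { return l => l.Price <= _maxPrice; }
	}
}

public class AndSpecification<T> : SpecificationBase<T>
{
	private readonly SpecificationBase<T> _left;
	private readonly SpecificationBase<T> _right;

	public AndSpecification(SpecificationBase<T> left, SpecificationBase<T> right)
	{
		_left = left;
		_right = right;
	}

	public override Expression<Func<T, bool>> SpecExpression
	{
		get
		{
			var objParam = Expression.Parameter(typeof(T), "obj");
			return Expression.Lambda<Func<T, bool>>(
				Expression.AndAlso(
					Expression.Invoke(_left.SpecExpression, objParam),
					Expression.Invoke(_right.SpecExpression, objParam)),
				objParam);
		}
	}
}

public static class Program
{
	public static void Main()
	{
		var listings = new List<Listing>
		{
			new Listing { Address = "1 Elm St", Price = 200000, HasBasement = true },
			new Listing { Address = "2 Oak Ave", Price = 250000, HasBasement = true },
			new Listing { Address = "3 Pine Rd", Price = 180000, HasBasement = false }
		};

		var spec = new AndSpecification<Listing>(
			new HasBasementSpecification(),
			new MaximumPriceSpecification(225000));

		// In-memory filtering through the compiled delegate.
		var viaDelegate = listings.Where(spec.IsSatisfiedBy).ToList();

		// Filtering through the expression tree; AsQueryable() is only a stand-in for an ORM.
		var viaExpression = listings.AsQueryable().Where(spec.SpecExpression).ToList();

		Console.WriteLine(viaDelegate.Single().Address);   // 1 Elm St
		Console.WriteLine(viaExpression.Single().Address); // 1 Elm St
	}
}
```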

Using Expression Trees with NHibernate

It just so happens that NHibernate exposes several methods that work with expression trees. One thing of note though is that most of the methods will fail if we try to pass anything of any complexity to them so we need to make sure we’re using the correct one. For instance, if we try to use our expression tree with Restrictions.Where NHibernate will throw an exception stating that it couldn’t determine the member type. Similarly, we also can’t use the old NHibernate.Linq extension either. So what option does that leave us?

In short, we need to use the Query<T> method included in NHibernate 3 which has expanded support for expression trees.

var query = _session.Query<Listing>().Where(spec.SpecExpression);

NHibernate will take our expression tree and use it to build an appropriate where clause.

I haven’t tried this approach with more complex expression trees involving multiple mapped objects so I’m not sure how well it will work in those situations but I’ve been really happy with it for simple filtering. I like how it fully encapsulates the business rules while still providing an expressive way to build out complex criteria.

This approach also lends itself well to providing users with a mechanism for storing and retrieving common searches. Although expression trees aren’t serializable there’s nothing preventing us from serializing specifications but I’ll leave that discussion for another day :)

4 comments

  1. What kind of memory profile does this approach give when scaled up? I’ve had experience using a similar approach with EF and not NHibernate, so it’s probably a lot different. How long is the hit when you first cache something? Do the benefits of easy coding outweigh the performance benefits of more bare-bones approaches like Dapper, Massive, or PetaPoco?

    I enjoyed the article, though it got me thinking. Thanks for writing it.

    1. I’ve honestly not tried it against EF or looked at the other approaches so I can’t really compare them effectively from a performance standpoint. It would definitely be something to look into. Some things I really like about this specification/expression tree approach though are:

      • strong typing via the lambda expressions – no magic strings
      • it’s completely database agnostic
      • specifications are easy to serialize

      The only performance issue I’ve had using NH so far isn’t so much with NH as it is Fluent NH (thinking of just going back to XML configuration). I’ve had no issues of note so far but then again, I’m still working against a small data set.

      I really can’t foresee this approach affecting application performance too much though since the specification classes themselves are small. The biggest impact I can see would be using the specifications against in-memory collections but caching the compilation result mitigates that somewhat.

  2. Hi Dave,

    Very interesting article! I’m currently beginning to look into data virtualization for an N-Tier system. (We’ll have to provide some kind of filterable data grid with hundreds of thousands of entities behind it.)
    I’ve tried out Bea’s approach as described here: http://bea.stollnitz.com/blog/?p=411. It works very well, but I would prefer a different solution for filtering than building SQL Where strings. It seems to me that your specification/expression tree approach would be just the way to go. But since I’m pretty inexperienced in the field of N-Tier architecture, I’m having trouble coming up with an adequate design for this. It would be awesome if you could give me some short feedback about my ideas so far.
    Here’s what I have:
    – Let users specify filter conditions in the UI
    – Build specifications in the ViewModel and pass them to virtualized list, which is the DataContext of the grid
    – Whenever the list needs to load a new page of data, make appropriate WCF service call passing the specification to it
    – Translate specifications to expression trees in the data access layer.
    – Pass expression to IQueryable repository method to get a page of data

    My questions are:
    – Will NHibernate (or any ORM, which exposes an IQueryable method for data retrieval) be able to generate the appropriate SQL for filtering the result set in the DB instead of loading an unfiltered set into memory and filter it there?
    – I will have to compile the expression in the service for each page that I load, right?
    – Should I move the responsibility for expression generation out of the specification classes?

    All right, sorry to bother you with my little worries at such length. Any feedback would be much appreciated. And thank you again for your great article.

    Cheers,
    Alex
