.NET

Topics specific to .NET development.

Fun with Code Diagnostic Analyzers

A few days ago I posted an article detailing how to construct a code diagnostic analyzer and code fix provider to detect if..else statements that simply assign a variable or return a value and replace the statements with a conditional operator. You know, the kind of things that code diagnostic analyzers and code fix providers are intended for. As I was developing those components I got to thinking about what kind of fun I could have while abusing the feature. More specifically, I wondered whether could I construct a code diagnostic analyzer such that it would highlight every line of C# code as a warning and recommend using F# instead.

It turns out, it’s actually really easy. The trick is to register a syntax tree action rather than a syntax node action and always report the diagnostic rule at the tree’s root location. For an extra bit of fun, I also set the diagnostic rule’s help link to fsharp.org so that clicking the link in the error list directs the user to that site.

Here’s the analyzer’s code listing in its entirety:

using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;
using System.Collections.Immutable;

namespace UseFSharpAnalyzer
{
    [DiagnosticAnalyzer(LanguageNames.CSharp)]
    public class UseFSharpAnalyzerAnalyzer : DiagnosticAnalyzer
    {
        internal static DiagnosticDescriptor Rule =
            new DiagnosticDescriptor(
                "UseFSharpAnalyzer",
                "Use F#",
                "You're using C#; Try F# instead!",
                "Language Choice",
                DiagnosticSeverity.Warning,
                isEnabledByDefault: true,
                helpLink: "http://fsharp.org");

        public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics { get { return ImmutableArray.Create(Rule); } }

        public override void Initialize(AnalysisContext context)
        {
            context.RegisterSyntaxTreeAction(AnalyzeTree);
        }

        private static void AnalyzeTree(SyntaxTreeAnalysisContext context)
        {
            var rootLocation = context.Tree.GetRoot().GetLocation();
            var diag = Diagnostic.Create(Rule, rootLocation);

            context.ReportDiagnostic(diag);
        }
    }
}

When applied to a C# file, the result is as follows:

Use F# Analyzer

This is clearly an example of what not to do with code analyzers but it was fun to put together and see the result nonetheless. If you’ve thought of any other entertaining uses for code analyzers, I’d love to hear about them!

Indianapolis F# Developers Coding Dojo

There’s a new developer group in town! The inaugural meeting of the Indianapolis F# Developers group is upon us. On Tuesday, November 18th I’ll be leading the group through the Digit Recognizer dojo from Community for F#!

Since I suspect we’ll have some people new to the language, we’ll spend some time talking about some language concepts that are critical to successfully completing the exercise. Make sure you bring a laptop because after the introduction we’ll then split into a few groups (pairs, probably) and work through the problem together. After some time we’ll come back together as a group to talk about the problem and discuss some of the ways that F# lets you focus on the task at hand.

If F#, functional programming, and/or machine learning are of interest to you, please register for the event on our meetup page. We’ll be meeting in the 20 person meeting room at Launch Fishers. The meetup starts at 7:00 PM. We’d love for you to join us!

I Can Analyze Code, And So Can You

[12 December 2014 – Update 1] Upon setting up a new VM I realized that I missed a prerequisite when writing this post. The Visual Studio 2015 Preview SDK is also required. This extension includes the VSIX project subtype required by the Diagnostic and Code Fix template used for the example. I’ve included a note about installing the SDK in the prerequisites section below.

[12 December 2014 – Update 2] The github project has moved under the .NET Analyzers organization. This organization is collecting diagnostics, code fixes, and refactorings like to showcase the capabilities of this technology. The project link has been updated accordingly.

Over the past several years, Microsoft has been hard at work on the .NET Compiler Platform (formerly Roslyn). Shipping with Visual Studio 2015 the .NET Compiler Platform includes a complete rewrite of the C# and Visual Basic compilers intended to bring features that developers have come to expect from modern compilers. In addition to the shiny new compilers the .NET Compiler Platform introduces a rich set of APIs that we can harness to build custom code diagnostic analyzers and code fix providers thus allowing us to detect and correct issues in our code.

When trying to decide on a useful demonstration for these features I thought of one of my coding pet peeves: using an if..else statement to conditionally set a variable or return a value. I prefer treating these scenarios as an expression via the conditional (ternary) operator and I often find myself refactoring these patterns in legacy code. It certainly would be nice to automate that process. It turns out that this is exactly the type of task at which diagnostic analyzers and code fix providers excel and we’ll walk through the process of creating such components in this post. (more…)

C# 6.0 – Expression-bodied Members

[7/30/2015] This article was written against a pre-release version of C# 6.0. Be sure to check out the list of my five favorite C# 6.0 features for content written against the release!

Expression-bodied members are another set of features that simply add some syntactic convenience to C# to continue streamlining type definitions. C# type definitions have always been a bit tedious, requiring us to provide curly braces and explicit returns even when the function consists of only a single line. Lambda expressions have spoiled us by abstracting away most of the syntax that’s typically associated with functions and finally making delegation feel like a natural part of the language. Expression-bodied members extend the convenience of lambda expressions to type members… to an extent.

According to the Visual Studio “14” CTP3 notes, expression-bodied members can be used for methods, user-defined operators, type conversions, and read-only properties including indexers (although expression-bodied indexers don’t work properly in the CTP).

To demonstrate the feature, I’ll define a custom RgbColor class and using expression-bodied members for both a hexadecimal formatting method and a custom negation operator.

public class RgbColor(int r, int g, int b)
{
  public int Red { get; } = r;
  public int Green { get; } = g;
  public int Blue { get; } = b;

  public string ToHex() =>
    string.Format("#{0:X2}{1:X2}{2:X2}", Red, Green, Blue);

  public static RgbColor operator - (RgbColor color) =>
    new RgbColor(
      color.Red ^ 0xFF,
      color.Green ^ 0xFF,
      color.Blue ^ 0xFF
    );
}

As convenient as expression-bodied members appear on the surface, they seem rather limited in their utility since their very nature requires expressions. This fact is compounded by the fact that C# isn’t an expression-based language; if it were (as F# is) we could get away with defining more complex bodies including some intermediate values such as a regular expression match result. As it stands today it’s not possible to do this without first defining that logic in a separate method that can be invoked from the expression-bodied member but that seems to defeat the purpose. Perhaps this would be a good use for the semicolon operator since we can’t even include that logic within curly braces as we could with a lambda expression.

C# 6.0 – Params IEnumerable

[7/30/2015] This article was written against a pre-release version of C# 6.0 and didn’t make it into the final product. Be sure to check out the list of my five favorite C# 6.0 features for content written against the release!

One of the items currently listed as “planned” on the Language Feature Implementation Status page is params IEnumerable. This is one of those features that I’ve thought could be nice but I’ve never really felt strongly about it one way or another mainly because I avoid params arrays whenever possible. I think the only time I actually use them consistently is with String.Format and its cousin methods. Before diving into the new feature, let’s revisit params arrays so I can better explain why I avoid them and why I’m so apathetic about this proposed feature.

The idea behind params arrays is that they let us pass an arbitrary number of arguments to a method by defining said method such that its final parameter is an array modified by the params keyword. When we invoke the function, the compiler implicitly creates an array from the final parameters and is then passed to the method. In the method body, we typically access those values through the array. The following function shows this in action:

string MyStringFormat(string format, params string[] tokens)
{
  return
    Regex.Replace(
      format,
      @"\{(?<index>\d+)\}",
      m => tokens[Int32.Parse(m.Groups["index"].Value)].ToString());
}

We could then invoke this function as follows:

var format = "[{0}] Hello, {1}.";
var time = DateTime.Now.ToShortTimeString();
var name = "Guest";

var str = MyStringFormat(format, time, name);
// [10:35 PM] Hello, Guest.

Back in the early days of the .NET Framework, before we had collection initializers, the above was far more convenient than using an array directly. Now that collection initializers let us define arrays inline, we can simply write this:

var str = MyStringFormat(format, new [] { time, name });
// [10:35 PM] Hello, Guest.

Arguably, the former is a bit cleaner than the later because of the inline array definition but I have two problems with it. First, it masks the fact that an array is involved, and second, params arrays can really wreak havoc on overload resolution if you’re not careful with them. Assume we need to overload the MyStringFormat method to accept another string that will control some formatting options. We can’t tack the string on to the end of the parameter list because params arrays must appear last so we insert it at the front like this:

Note: I’m using string constants here for illustration purposes. A less contrived example would likely use an enum, thus avoiding the problem.

static class StringTransform
{
  public const string Uppercase = "Uppercase";
  public const string Lowercase = "Lowercase";
}

string MyStringFormat(string transformation, string format, params string[] tokens)
{
  var str = MyStringFormat(format, tokens);
  
  switch (transformation)
  {
    case StringTransform.Uppercase: return str.ToUpper();
    case StringTransform.Lowercase: return str.ToLower();
  }      

  return str;
}

Now we can invoke this new overload as follows:

var str = MyStringFormat(StringTransform.Uppercase, format, time, name);

…and the result would be as we would expect:

[12:30 AM] HELLO, GUEST.

But now that we have two overloads (each accepting an indeterminate number of strings) the compiler cannot always properly determine which overload to invoke. Let’s see what happens when we supply only three arguments as we did in the first example.

var str = MyStringFormat(format, time, name);

Now the result is:

12:30 AM

What happened? The compiler recognized that there are multiple candidate overloads so it took a greedy approach, selecting the overload with the most explicit parameters. As a result, each of the arguments were essentially shifted to the left of what we expected. In other words, what should have been the format parameter was treated as the transformation parameter, the first of the token values was treated as the format parameter, and so on. To force the compiler to use the correct overload, we can just be explicit with the array as we saw in the second example above:

var str = MyStringFormat(format, new [] { time, name });

So what does all of this have to do with params IEnumerable? Params IEnumerable is an extension of the core idea; rather than treating the last arguments as an array within the method, we’d treat them as a sequence. That is, in our MyStringFormat method, we’d define the final parameter as params IEnumerable<string> tokens rather than params string[] tokens.

This change makes sense from a language perspective especially since params arrays were implemented long before IEnumerable<T> was a central concept within the framework. But, as the May 21st design notes state, when the method is used as a params method (rather than by passing an explicit sequence), the compiler will still generate an array at the call site. For all other cases, passing a sequence will behave no differently than if the parameter had been defined without the params modifier.

Given that:

  • it’s so easy to define sequences inline
  • using params arrays can easily confuse the overload resolver
  • params IEnumerable will only hide the sequence’s details from the method implementation but otherwise have no effect on the code

…I don’t see myself ever using this feature. I’ll likely continue designing methods to accept IEnumerable<T> without including the params modifier to avoid the problems and be more specific about what the method is expected to do.

C# 6.0 – Semicolon Operator

I first read about the proposed semicolon operator a few weeks ago and, to be honest, I’m a bit surprised by the desire for it in C# if for no reason other than that C# isn’t an expression-based language. To me, this feature feels like another attempt to shoehorn functional approaches into an object-oriented language. If I understand the feature correctly, the idea is to allow variables to be defined within and scoped to an expression. The following snippet adapted from the language feature implementation status page shows the operator in action:

var result = (var x = Foo(); Write(x); x * x);

In this code, everything within the parentheses constitutes a single expression. The expression invokes Foo, assigns the result to x, passes x to the Write function, then returns the square of x, and ultimately assigns the square to result. Because x is scoped to the expression, it is not visible outside of the parenthesis. I think this seems a bit awkward in C# and what’s more, I don’t know what value it adds that functions don’t already give us. I haven’t really decided whether the above example is more readable or maintainable than if we’d defined a function called WriteAndSquare.

Interestingly, this capability already exists in F# (albeit in a slightly more verbose form) which isn’t really all that surprising since F# is an expression-based language.

let result =
  let x = Foo()
  printfn "%i" x
  x * x

Even in F# though, I think I’d still prefer factoring out the expression into a function.

C# 6.0 – Null Propagation Operator

[7/30/2015] This article was written against a pre-release version of C# 6.0 but is still relevant. Be sure to check out the list of my five favorite C# 6.0 features for content written against the release!

One of the most common complaints about C# (especially among the F# crowd) is the amount of code required to properly deal with null values. Consider the following type definitions and invocation chain:

Type Definitions

class PriceBreak(int min, int max, double price)
{
  public int MinQuantity { get; } = min;
  public int MaxQuantity { get; } = max;
  public double Price { get; } = price;
}

class Product(IList<PriceBreak> priceBreaks)
{
  public IList<PriceBreak> PriceBreaks { get; } = priceBreaks;
}

Invocation Chain

var minPrice = product.PriceBreaks[0].Price;

As it’s written the preceding invocation chain is full of problems (we’ll ignore the potential ArgumentOutOfRangeException for this discussion). In fact, it can throw a NullReferenceException when any of the following values are null:

  • Product
  • Product.PriceBreaks
  • Product.PriceBreaks[0]

To properly avoid the exception we need to explicitly check for null. One possible approach would look like this:

double minPrice = 0;

if (product != null
    && product.PriceBreaks != null
    && product.PriceBreaks[0] != null)
{
  minPrice = product.PriceBreaks[0].Price;
}

Now we can be certain that we won’t encounter a NullReferenceException but we’ve greatly increased the complexity of the code. Rather than a single line, we have 5 that actually do something and of that, less than half are providing any value beyond making sure we don’t see an exception. C# 6.0’s null propagation operator (?) gives us the safety of checking for nulls while avoiding the verbose ceremony. The result is much cleaner and closely resembles the invocation chain shown earlier:

var minPrice = product?.PriceBreaks?[0]?.Price;

Introducing the null propagation operator allows us to chain each call while short-circuiting any nulls encountered along the way. In this case, the return type is double? (INullable) because anything preceding Price can be null. As such, we can either defer checking for null until we try to use the value or we can simply use the null coalescing operator with a default to coerce the type conversion as follows:

var minPrice = product?.PriceBreaks?[0]?.Price ?? 0;

One limitation of the null propagation operator is that it can’t work directly with delegate invocation. For instance, the following code is invalid:

var result = myDelegate?("someArgument");

Fortunately, there’s a workaround; rather than invoking the delegate directly as shown above, we can use invoke it via the Invoke method like this:

var result = myDelegate?.Invoke("someArgument");

The null propagation operator should eliminate a great deal of null reference checking code although I still believe F#’s approach of virtually eliminating null is superior. Honestly, my biggest concern about the null propagation operator is that it could encourage Law of Demeter violations but that’s a discussion for another day.