My development team is working to implement and enforce more formal development processes than we have used in the past. Part of this effort involves deciding which unit test framework to use going forward. Traditionally we have used NUnit, and it has worked well for our needs, but now that we’re implementing Visual Studio Team System we also have MSTest available. This has sparked a bit of a debate as to whether we should stick with NUnit or migrate to MSTest. As we examine the capabilities of each framework and weigh their respective advantages and disadvantages, I’ve come to realize that the decision is a philosophical matter.
MSTest has a bit of a bad reputation. The general consensus seems to be that MSTest sucks. A few weeks ago I would have thoroughly agreed with that assessment, but recently I’ve come to reconsider that position. The problem isn’t that MSTest sucks; it’s that MSTest follows a different paradigm from other frameworks as to what a test framework should provide.
My favorite feature of NUnit is its rich, expressive syntax. I especially like NUnit’s constraint-based assertion model. By comparison, MSTest’s assertion model is limited, even restrictive if you’re used to the rich model offered by NUnit. Consider the following “classic” assertions from both frameworks:
| | NUnit | MSTest |
|---|---|---|
| Equality/Inequality | Assert.AreEqual(e, a)<br>Assert.AreNotEqual(e, a)<br>Assert.Greater(a, e)<br>Assert.LessOrEqual(a, e) | Assert.AreEqual(e, a)<br>Assert.AreNotEqual(e, a)<br>Assert.IsTrue(a > e)<br>Assert.IsTrue(a <= e) |
| Boolean Values | Assert.IsTrue(a)<br>Assert.IsFalse(a) | Assert.IsTrue(a)<br>Assert.IsFalse(a) |
| Reference | Assert.AreSame(e, a)<br>Assert.AreNotSame(e, a) | Assert.AreSame(e, a)<br>Assert.AreNotSame(e, a) |
| Null | Assert.IsNull(a)<br>Assert.IsNotNull(a) | Assert.IsNull(a)<br>Assert.IsNotNull(a) |

e – expected value, a – actual value
They’re similar, aren’t they? Each of the assertions listed is functionally equivalent across the two frameworks, but notice how the Greater and LessOrEqual assertions are handled in MSTest. MSTest doesn’t provide assertion methods for these cases; instead it relies on evaluating boolean expressions to express the condition. This difference, above all else, defines the divergence in philosophy between the two frameworks. So why is this important?
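To make the difference concrete, here is the same test written against both frameworks. This is just an illustrative sketch; DiscountCalculator and the test names are invented for the example.

```csharp
// DiscountCalculator.cs -- a hypothetical system under test
public class DiscountCalculator
{
    public decimal Apply(decimal total, decimal rate)
    {
        return total - (total * rate);
    }
}

// NUnitExample.cs -- the comparison is named by the assertion method itself
using NUnit.Framework;

[TestFixture]
public class DiscountCalculatorNUnitTests
{
    [Test]
    public void Applying_A_Discount_Reduces_The_Total()
    {
        decimal total = new DiscountCalculator().Apply(100m, 0.10m);

        Assert.Greater(100m, total); // reads as "assert 100 is greater than total"
        Assert.AreEqual(90m, total);
    }
}

// MSTestExample.cs -- the comparison lives inside a boolean expression
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class DiscountCalculatorMSTests
{
    [TestMethod]
    public void Applying_A_Discount_Reduces_The_Total()
    {
        decimal total = new DiscountCalculator().Apply(100m, 0.10m);

        Assert.IsTrue(total < 100m); // the condition must be parsed, not read
        Assert.AreEqual(90m, total);
    }
}
```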
Readability
Unit tests should be readable. In unit tests we often break established conventions and/or violate the coding standards we use in our product code. We sacrifice brevity in naming with Really_Long_Snake_Case_Names_So_They_Can_Be_Read_In_The_Test_Runner_By_Non_Developers. We sacrifice DRY to keep code together. All of these things are done in the name of readability.
The Readability Debate
Argument 1: A rich assertion model can unnecessarily complicate a suite of tests particularly when multiple developers are involved.
Rich assertion models make it possible to assert the same condition in a variety of ways, resulting in a lack of consistency. Readability naturally falls out of a weak assertion model because the guesswork of deciphering which form of an assertion is being used is removed.
Argument 2: With a rich model there is no guesswork because assertions are literally spelled out as explicitly as they can be.
Assert.Greater(a, e) doesn’t require a mental context shift from English to parsing an expression. The spelled-out statement of intent is naturally more readable for developers and non-developers alike.
My Position
I strongly agree with argument 2. When I’m reading code I derive as much meaning from the method name as I can before examining the arguments. “Greater” conveys more contextual information than “IsTrue.” When I see “IsTrue” I immediately need to ask “What’s true?” then delve into an argument which could be anything that returns a boolean value. In any case I still need to think about what condition is supposed to be true.
NUnit takes expressiveness to another level with its constraint-based assertions. The table below lists the same assertions as the table above when written as constraint-based assertions.
| | NUnit (constraint-based) |
|---|---|
| Equality/Inequality | Assert.That(a, Is.EqualTo(e))<br>Assert.That(a, Is.Not.EqualTo(e))<br>Assert.That(a, Is.GreaterThan(e))<br>Assert.That(a, Is.LessThanOrEqualTo(e)) |
| Boolean Values | Assert.That(a, Is.True)<br>Assert.That(a, Is.False) |
| Reference | Assert.That(a, Is.SameAs(e))<br>Assert.That(a, Is.Not.SameAs(e)) |
| Null | Assert.That(a, Is.Null)<br>Assert.That(a, Is.Not.Null) |

e – expected value, a – actual value
Constraint-based assertions are virtually indistinguishable from English. To me this is about as readable as code can be.
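For instance, a complete constraint-based test (reusing the hypothetical DiscountCalculator from the earlier sketch) reads almost like a specification:

```csharp
using NUnit.Framework;

[TestFixture]
public class DiscountCalculatorConstraintTests
{
    [Test]
    public void Discounted_Total_Satisfies_The_Business_Rule()
    {
        DiscountCalculator calculator = new DiscountCalculator();
        decimal total = calculator.Apply(100m, 0.10m);

        // Each assertion reads nearly as an English sentence.
        Assert.That(calculator, Is.Not.Null);
        Assert.That(total, Is.EqualTo(90m));
        Assert.That(total, Is.GreaterThan(0m));
        Assert.That(total, Is.LessThanOrEqualTo(100m));
    }
}
```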
Even the frameworks with a weak assertion model provide multiple ways of accomplishing the same task. Is it not true that Assert.AreEqual(e, a) is functionally equivalent to Assert.IsTrue(e == a)? Is it not also true that Assert.AreNotEqual(e, a) is functionally equivalent to Assert.IsTrue(e != a)? Since virtually all assertions ultimately boil down to ensuring that some condition is true and throwing an exception when that condition is not true, shouldn’t weak assertion models be limited to little more than Assert.IsTrue(a)?
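To illustrate the point (this is a sketch, not any framework’s actual implementation), the entire “weak” model can be reduced to a single boolean check, with the “rich” assertions acting as nothing more than named wrappers around it:

```csharp
using System;

// Hypothetical sketch: every assertion reduces to "evaluate a condition,
// throw if it is false."
public class AssertionFailedException : Exception
{
    public AssertionFailedException(string message) : base(message) { }
}

public static class SketchAssert
{
    // The whole "weak" model in one method.
    public static void IsTrue(bool condition, string message)
    {
        if (!condition)
            throw new AssertionFailedException(message);
    }

    // The "rich" assertions add nothing but a name; the name is the point.
    public static void AreEqual(object expected, object actual)
    {
        IsTrue(Equals(expected, actual),
               string.Format("Expected: {0} but was: {1}", expected, actual));
    }

    public static void Greater(IComparable arg1, IComparable arg2)
    {
        IsTrue(arg1.CompareTo(arg2) > 0,
               string.Format("Expected: greater than {0} but was: {1}", arg2, arg1));
    }
}
```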
Clearly there are other considerations beyond readability when deciding on a unit test framework, but given that much of a framework’s power comes from its assertion model, readability is among the most important. To me, an expressive assertion model is just as important as the tools associated with the framework.
Your thoughts?
Dave, that is such a minor part of why MSTest is awful.
The test runner is awful because:
It is awful because it copies all files into its own directory before testing them, eating up time and space.
It is awful because when doing this copy it does not automatically include any “content” deployment items, forcing you to jump through all sorts of hoops to be able to run tests that use those.
It is awful because the useless .testsettings and .vsmdi files constantly get messed up, fail with cryptic errors, and cause you to spend hours trying to figure out what went wrong.
It is awful because there is no way to turn off “run tests on multiple threads,” which forces tests to run simultaneously. If you’ve got tests that access a resource like a database file that is locked while in use, you are up the creek.
And then there’s MSTest itself, which is awful:
It is awful because it flat out refuses to allow you any way of customizing it. Check out xUnit’s SubSpec; that flexibility is possible simply because the Fact attribute has a virtual method on it. Yet MSTest’s TestMethod is sealed, and from what I hear it never will be unsealed.
It is awful because it makes assumptions about your testing style. There isn’t a single correct style, and if you’re doing it right you’re using the right one for each situation.
It is awful because it encourages fixture-per-class, which is a beginner paradigm.
It is awful because of cryptic errors and false negatives.
Overall the gap between something like NUnit (or even xUnit) and MSTest is smaller than the gap between something like Git/Mercurial and TFS, but it’s also far more annoying. I spend a good 60-70% of my time writing code and immediately using the test framework to try it out. Every single moment the test framework forces me to deal with or even think about its limitations is amplified a hundredfold.
Consider that even Microsoft’s DevDiv does not use MSTest.
Thanks for your feedback. You make plenty of valid points, but that’s really not what this post was about. I was focusing only on the code aspects of the test framework itself rather than getting into the nuances of the individual test runners.
I do have a few responses to some of your points, though. To be clear, I’m not defending MSTest but rather stating some observations.
“It is awful because it copies all files into its own directory before testing them eating up time and space.”
NUnit also makes shadow copies. Yes, it’s configurable but it’s enabled by default.
“… if you’ve got tests that access a resource like a database file that is locked while in use you are up the creek.”
If you have tests that access a resource like a database you’re doing integration testing, not unit testing. This is why things like mocks, fakes, and DI exist.
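As a minimal sketch of what I mean (all interface and class names here are hypothetical), injecting a fake in place of the database keeps the test a true unit test with no shared resource to lock:

```csharp
using NUnit.Framework;

public interface ICustomerRepository
{
    string GetName(int customerId);
}

public class GreetingService
{
    private readonly ICustomerRepository _repository;

    // The dependency is injected, so tests can substitute a fake.
    public GreetingService(ICustomerRepository repository)
    {
        _repository = repository;
    }

    public string Greet(int customerId)
    {
        return "Hello, " + _repository.GetName(customerId);
    }
}

// The fake never touches a database, so there is nothing to lock.
public class FakeCustomerRepository : ICustomerRepository
{
    public string GetName(int customerId)
    {
        return "Alice";
    }
}

[TestFixture]
public class GreetingServiceTests
{
    [Test]
    public void Greet_Uses_The_Customer_Name()
    {
        GreetingService service = new GreetingService(new FakeCustomerRepository());
        Assert.That(service.Greet(42), Is.EqualTo("Hello, Alice"));
    }
}
```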
“…it encourages fixture-per-class which is a beginner paradigm”
If you’re doing true unit testing this model makes sense since you’re testing the smallest piece possible.
“…Every single moment the test framework forces me to deal or even think about its limitations is amplified…”
I definitely agree. That’s part of why I prefer things like TFS to SVN/Git/Mercurial/etc… because when I’m in VS it stays out of the way. The same holds true for MSTest in that I can just natively run the test rather than having to jump out to an external test runner or mess with a plug-in.
Thanks for responding. In the spirit of the Internet I’m going to bang my fists and huff and puff, but I actually think that we’ve got a fairly well-reasoned and decent discussion going. (Also, if I don’t respond at least once I can’t click the “Notify me of follow-up comments” link below.)
1) NUnit – I did not know that; it’s interesting that I’ve never had that issue. Perhaps it’s because I’ve only ever run it via TestDriven.Net or Gallio Icarus, which automatically configure it not to make copies? In any case, this is an aspect of the runner, not the framework.
2.1) Would that the only tests useful to developers were strict-definition unit tests, but that’s sadly just not the case. Very commonly I will start work on a feature by writing a couple of “smoke tests” against the acceptance criteria that treat the middle tier and a test database as a unit. I will TDD the individual components from there, but this VERY common paradigm needs to be supported.
2.2) Say I’m working on a component that parses XML. I will tend to write my test XML in a “TestFixtureName_Data.xml” file set to deploy as Content. The test itself then reads in the file and passes the XElement into the system under test. This is perfectly within the definition of “unit test” and has plenty of advantages (syntax coloring in the XML, for example).
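A minimal sketch of that arrangement (the parser, the Order type, and the file contents are all hypothetical):

```csharp
using System.Xml.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical system under test.
public class Order
{
    public int Number { get; set; }
}

public class OrderParser
{
    public Order Parse(XElement root)
    {
        return new Order { Number = (int)root.Attribute("number") };
    }
}

[TestClass]
public class OrderParserTests
{
    // OrderParserTests_Data.xml (deployed as Content): <order number="1234" />
    [TestMethod]
    [DeploymentItem("OrderParserTests_Data.xml")]
    public void Parses_The_Order_Number()
    {
        XElement data = XElement.Load("OrderParserTests_Data.xml");

        Order order = new OrderParser().Parse(data);

        Assert.AreEqual(1234, order.Number);
    }
}
```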
3) Common misunderstanding. A unit is NOT a class. A unit is a “unit of functionality”. That can be a class, or a set of related classes, or even a (bad idea) method. This is a natural result of the TDD process; it’s the third step! Say I test-drive out a component that does a bubble-sort. It passes all my tests (yippie). Now I go to refactor. I move some methods around, and I notice that some of the code would be better off abstracted into its own class – am I then obligated to write tests just for this new class? No! I already have tests covering the darn feature, the extract-class refactoring is an implementation detail.
4) TFS stays out of the way right up until you’re working on a slow connection, or you lose connectivity, or someone edits a binary file in the same branch that you’re working in, or the server goes down, or the domain controller goes down, or you want to make a quick branch to try out some ideas on a feature. Oh…or you want to make changes in an external editor because VS is just that damn slow, and god forbid you do this while in off-line mode. These are problems our team faces daily (I loathe TFS far more than I loathe MSTest, and that’s saying a lot). There are good reasons for .Net devs to have a distaste for Git but have you actually worked with Hg? It’s a dream in comparison.
As for MSTest, most of our devs already run ReSharper. I prefer CodeRush, and yes, I have to install the TestDriven.Net plugin, but that’s a very simple one-time affair. After that it runs tests MUCH faster, lets me navigate to the error location MUCH faster, and even prints out the test name correctly (something R# doesn’t even do), all without giving me tons of information that I simply don’t need.
In conclusion: a very good-natured “You’re wrong, sir”.
Huff and puff away, but you’re not convincing me of anything. I’ve never once said that MSTest doesn’t suck, only that it doesn’t suck as much as people seem to think it does. Anyway, regarding your new points:
2.1) I suppose 2 + 2 = 5 for very large values of 2, depending on what your definition of “is” is. One reason DI has become so popular is that it allows us to work with smaller units by swapping out dependencies (such as a DAL) with a mock, fake, or stub. This technique allows for more granular tests and easier failure simulation.
2.2) You’re not testing the XML, you’re testing something in the parser. Does it really matter if you’re loading it from a file or an inline XML string as far as the test is concerned?
3) Not once did I claim that a unit is a class or even that it’s a method, just that the model makes sense. This is particularly true when defining the smallest unit possible, which is often a method. If the unit has some dependency you can either include it in the unit if appropriate OR use DI and fake the dependency to better isolate where a problem may lie.
Not surprisingly I don’t agree with your assessment of TDD. TDD says to write a test, watch it fail, write code to make it pass, watch it pass, refactor, repeat. At no point does TDD say “don’t use a fake dependency to make the test pass”. TDD also doesn’t say don’t define additional units/tests upon the discovery of a dependency. There is nothing wrong with testing a SendMail method and injecting a fake mailer to simulate success or failure.
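A minimal sketch of that fake-mailer scenario (all names hypothetical):

```csharp
using System;
using NUnit.Framework;

public interface IMailer
{
    void Send(string to, string body); // throws on failure
}

// The fake lets a test dial in success or failure on demand.
public class FakeMailer : IMailer
{
    public bool ShouldFail;

    public void Send(string to, string body)
    {
        if (ShouldFail)
            throw new InvalidOperationException("simulated mailer failure");
    }
}

public class Notifier
{
    private readonly IMailer _mailer;

    public Notifier(IMailer mailer)
    {
        _mailer = mailer;
    }

    public bool SendMail(string to, string body)
    {
        try
        {
            _mailer.Send(to, body);
            return true;
        }
        catch (InvalidOperationException)
        {
            return false;
        }
    }
}

[TestFixture]
public class NotifierTests
{
    [Test]
    public void SendMail_Reports_Failure_From_The_Mailer()
    {
        Notifier notifier = new Notifier(new FakeMailer { ShouldFail = true });
        Assert.That(notifier.SendMail("a@example.com", "hi"), Is.False);
    }
}
```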
4) I’ve used SVN for years and never really disliked it, except for when I needed to do a cleanup, or Tortoise froze the UI, or I needed to do an update over a slow connection, or I needed to move a file within a project, or… If you’ve read my Everyday TFS post you know that I hated TFS initially, but then I changed my mindset and found a few tools that overcame virtually all of its shortcomings. I’ll also be the first to admit that the power tools should be included out of the box, just as I’ll admit that TFS has its problems (particularly offline, though TFS 2010 is better about it).
The fact of the matter is that SVN/Git/Mercurial and TFS are targeting different groups. SVN/Git/Mercurial are great for disconnected and/or distributed environments but in the typical corporate network TFS is generally a better fit. Good luck committing anything with SVN if your VCS server is down.
In short, if you have problems with version control servers or domain controllers going down you have bigger problems than your VCS.
TL;DR – different approach != wrong