Improving my Test driven development

I’ve just put a new system live that I’ve been developing over the past few weeks. Basically it’s a document approval system. One set of users fills in a form, more users approve or reject it, and finally one team signs it off and takes the document to the next system. Very simple.

I started well and designed a framework that I thought was nicely extensible. I wanted to avoid having to create the backend ever again, and simply plug UI and workflow approval definitions on top.

I decided that the UI would be pointless to test (initially) and so created unit tests for the backend. By using Linq to SQL I discovered that I could create a blank database every time I ran the tests as part of an assembly-level setup method, which considerably reduced my mocking needs. Okay, it removed them. I built a complete set of unit tests to cover the interactions as I saw them, going through the entire lifecycle of every process I could think of.

Given such a good start, I have to admit that my final result is not impressive. As I moved into the UI layer I stopped running the tests (we don’t use CruiseControl). We discovered that there wasn’t just one big document, but that it actually made more sense as two. This required quite a big change to the Engine and Database layers, but unfortunately the unit tests stayed off.

As the application was small it was incredibly easy to check that the new design worked as planned without using the tests. I am used to an automated build and test runner doing this for me, so I just carried on. Now as I come to run the tests again I find I can’t remember what they were supposed to do.

Enter Behaviour Driven Design or BDD. BDD seems to be TDD with a new name and a better way of constructing your tests to make them readable. It follows a grammar that allows business types to get involved.

As a <role> I want to <goal> so that <motivation>.

Given <assumption> when <action> then <test>
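To make the grammar concrete, here is a minimal sketch of how a Given/When/Then sentence might map onto a test for the approval system described above. ApprovalDocument, DocumentState and ApprovalSpecs are hypothetical stand-ins invented for this illustration, not the real classes, and no particular BDD tool is assumed.

```csharp
using System;

// Hypothetical stand-in for the approval system's document; not the real class.
public enum DocumentState { Submitted, Approved, Rejected }

public class ApprovalDocument
{
    public DocumentState State { get; private set; }

    public ApprovalDocument() { State = DocumentState.Submitted; }

    public void Approve() { State = DocumentState.Approved; }
    public void Reject() { State = DocumentState.Rejected; }
}

public static class ApprovalSpecs
{
    // As an approver I want to approve a submitted form so that it can be signed off.
    public static bool Given_a_submitted_document_When_it_is_approved_Then_its_state_is_Approved()
    {
        // Given <assumption>
        ApprovalDocument document = new ApprovalDocument();

        // When <action>
        document.Approve();

        // Then <test>
        return document.State == DocumentState.Approved;
    }
}
```

The point is only that the method name and the three comments read like the specification, so the intent of the test is still obvious when you come back to it weeks later.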

At this point I’ve only read about it, but Scott Bellware’s article seems to be the best. There is also tool support, but none of it seems to go from specification to code.

Update: Dan North’s article is better http://dannorth.net/introducing-bdd

Have language features changed the way we work for the better?

Craig Murphy dropped a little coding challenge on his blog the other day. This prompted me to write the responses as below. I’ve tried to target each one towards using a particular technology as the framework has developed.

Have a look and tell me which is your preferred answer, or propose a new one (but again try to target a language feature). I’d love to know why you chose that language feature over another.

Option 1. With Linq

private const string alphabet = "abcdefghijklmnopqrstuvwxyz";

static void Main(string[] args)
{
    char c = Console.ReadKey().KeyChar;

    var myChars = from character in alphabet
                  where character <= c
                  select character;

    var fullRow = myChars.Skip(1).Reverse().Concat(myChars);

    var myRows = (from character in myChars
                  select ConvertToSpaces(character, fullRow));

    var reverseRows = myRows.Reverse().Skip(1);
    myRows = myRows.Concat(reverseRows);

    Console.WriteLine(Environment.NewLine);//Skip the ReadKey line
    foreach (string line in myRows)
        Console.WriteLine(line);

    Console.ReadKey();
}

private static string ConvertToSpaces(char wanted, IEnumerable<char> characters)
{
    return new string(
        (from character in characters
         select character == wanted ? character : ' ').ToArray());
}

Option 2. With Generics

class Program
{
    private const string alphabet = "abcdefghijklmnopqrstuvwxyz";

    static void Main(string[] args)
    {
        char c = Console.ReadKey().KeyChar;

        List<char> myChars = new List<char>();
        foreach (char character in alphabet)
        {
            if (character <= c)
                myChars.Add(character);
        }

        List<char> temp = new List<char>(Reverse<char>(AllButFirst<char>(myChars)));
        temp.AddRange(myChars);
        string fullRow = new string(temp.ToArray());

        List<string> myRows = new List<string>();
        foreach (char character in myChars)
            myRows.Add(ConvertToSpaces(character, fullRow));

        IEnumerable<string> reverseRows = AllButFirst<string>(Reverse<string>(myRows));
        myRows.AddRange(reverseRows);

        Console.WriteLine(Environment.NewLine);//Skip the ReadKey line
        foreach (string line in myRows)
            Console.WriteLine(line);

        Console.ReadKey();
    }

    private static IEnumerable<T> Reverse<T>(IEnumerable<T> myChars)
    {
        Stack<T> reverse = new Stack<T>();
        foreach (T character in myChars)
            reverse.Push(character);

        while (reverse.Count > 0)
            yield return reverse.Pop();
    }

    private static IEnumerable<T> AllButFirst<T>(IEnumerable<T> myChars)
    {
        bool first = true;
        foreach (T character in myChars)
        {
            if (first)
                first = false;
            else
                yield return character;
        }
    }

    private static string ConvertToSpaces(char wanted, IEnumerable<char> characters)
    {
        StringBuilder result = new StringBuilder();
        foreach (char character in characters)
            result.Append(character == wanted ? character : ' ');

        return result.ToString();
    }
}

Option 3. Using IEnumerable

class ProgramDNv2
{
    private const string alphabet = "abcdefghijklmnopqrstuvwxyz";

    static void Main(string[] args)
    {
        DiamondChars diamond = new DiamondChars();
        diamond.chars = GetChars(Console.ReadKey().KeyChar);

        foreach (string line in diamond)
            Console.WriteLine(line);

        Console.ReadKey();
    }

    private static string GetChars(char MidChar)
    {
        string chars = alphabet.Substring(0, alphabet.IndexOf(MidChar) + 1);
        return chars;
    }

    private class DiamondChars : IEnumerable<String>
    {
        public string chars;

        #region IEnumerable<string> Members

        public IEnumerator<string> GetEnumerator()
        {
            int index = 0;
            foreach (char character in chars)
            {
                yield return MakeLine(character, index);
                index++;
            }

            index--;
            //chars.Reverse() is the Linq extension method, so this needs System.Linq
            foreach (char character in chars.Reverse())
            {
                if (index < chars.Length - 1)
                    yield return MakeLine(character, index);
                index--;
            }
        }

        #endregion

        private string MakeLine(char character, int index)
        {
            string line = new string(chars[index], 1).PadLeft(chars.Length - index);
            if (index > 0)
                return line + new string(chars[index], 1).PadLeft(index * 2);
            else
                return line;
        }

        #region IEnumerable Members

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }

        #endregion
    }
}

Option 4. Simple recursion

    class ProgramDNv1
    {
        static void Main(string[] args)
        {
            string input = "abcde";
            StringBuilder result = new StringBuilder();
            BuildDiamond(0, input, result);
            Console.WriteLine(result);
        }

        private static void BuildDiamond(int index, string input, StringBuilder result)
        {
            string line = new string(input[index], 1).PadLeft(input.Length - index);
            string line2 = new string(input[index], 1).PadLeft(index * 2);

            result.Append(line);
            if (index > 0)
                result.Append(line2);
            result.Append(Environment.NewLine);

            if (index < input.Length - 1)
            {
                BuildDiamond(index + 1, input, result);

                result.Append(line);
                if (index > 0)
                    result.Append(line2);
                result.Append(Environment.NewLine);
            }
        }
    }

Now add a quick comment below telling me your preferred implementation and briefly why. Thanks.

Providing extensible classes: An alternative to inheritance and events

I have a list that is cached. When I come to refresh the list I don’t want to just throw it away and start again, so I use the following.

private void UpdateChildren(IEnumerable<Guid> fromDb, List<Model> cached)
{
    //Everything currently cached is unseen until the database mentions it
    List<Model> unseen = new List<Model>(cached);

    foreach (Guid childID in fromDb)
    {
        Model child = cached.Find(m => m.Id == childID);
        if (child == null)
            cached.Add(CreateModel(childID));
        else
            //seen it now
            unseen.Remove(child);
    }

    //Delete missing items
    foreach (Model child in unseen)
    {
        DisposeModel(child);
        cached.Remove(child);
    }
}

private void DisposeModel(Model child)
{
    // Maybe do nothing
}

private Model CreateModel(Guid child)
{
    return new Model(child);
}

This works fine but it isn’t very generic. Let’s make some changes

Creating a class

Currently this isn’t really a class; it could be expressed as three static methods. By including the cached list we can start to provide a reusable class with state.

class CachedCollection
{
    private List<Model> _cached = new List<Model>();

    private void UpdateChildren(IEnumerable<Guid> fromDb)
    {
        List<Model> unseen = new List<Model>(_cached);

        foreach (Guid childID in fromDb)
        ...

Mapping from Guid to Model

Currently we know how to compare a Guid and a Model, since the model has a property that exposes that Guid again. We cannot be certain that this is the case every time, so now we need to go for a more generic pairing. In this case I suggest moving from a List<Model> to a Dictionary<Guid, Model>.

private Dictionary<Guid, Model> _known = new Dictionary<Guid, Model>();

public void UpdateItems(IEnumerable<Guid> iEnumerable)
{
    ...
    foreach (Guid item in iEnumerable)
    {
        if (_known.ContainsKey(item))
        ...

Now our paired relationship is completely external to the classes themselves. This is a common feature of code that has been made reusable: the method of relating things is sub-optimal, in this case with a memory overhead. In return, the dictionary’s key-hashing lookup will provide a performance boost, whether that is required for this application or not.

Applying Generic types

Not much makes code more generic than applying generic types. In this case we can refactor the fixed types, but this is full of issues. Let’s consider converting the Guid to TActual and the Model to TDesired.

private Model CreateModel(Guid child)
{
    return new Model(child);
}

If we look at the CreateModel method we note that this is currently creating a new TDesired from a TActual. We can add a new constraint to the class definition, but I don’t know of a way to specify that a type has a particular constructor that takes a single argument.

//Just gets me a ') expected' error.

class CachedCollection<TActual, TDesired> where TDesired:new(TActual)

I can of course make my class abstract and leave it to the derivation to handle, but that means I need a derived class.

abstract class CachedCollection<TActual, TDesired>
{
    //TDesired has no Id property we can rely on, so pair the two up in a dictionary
    private Dictionary<TActual, TDesired> _known = new Dictionary<TActual, TDesired>();

    private void UpdateChildren(IEnumerable<TActual> fromDb)
    {
        List<TActual> unseen = new List<TActual>(_known.Keys);

        foreach (TActual childID in fromDb)
        {
            if (_known.ContainsKey(childID) == false)
                _known.Add(childID, CreateModel(childID));

            //seen it now
            unseen.Remove(childID);
        }

        //Delete missing items
        foreach (TActual childID in unseen)
        {
            DisposeModel(_known[childID]);
            _known.Remove(childID);
        }
    }

    //abstract members cannot be private, so these are protected
    protected abstract void DisposeModel(TDesired child);

    protected abstract TDesired CreateModel(TActual child);
}

Of course when we consider using abstract and virtual methods, we start to consider functions as pieces of code that can be referred to. Before DotNet 2.0 the only way to do this was to declare your own delegates. Where these functions are optional you may see them exposed as events. These days we have some alternatives.

Generic delegates

There are three great Generic delegate types available since DotNet 2.0

//http://msdn.microsoft.com/en-us/library/kt456a2y.aspx
public delegate TOutput Converter<TInput, TOutput>(TInput input);

//http://msdn.microsoft.com/en-us/library/system.action.aspx
public delegate void Action<T>(T input);

//http://msdn.microsoft.com/en-us/library/bfcke1bz.aspx
public delegate bool Predicate<T>(T input);

These are great building blocks and along with 3.5’s lambda syntax let us provide functions almost as simply as by deriving a class. For example we can convert

protected abstract TDesired CreateModel(TActual child);

public void Update()
{
    ...
    x = CreateModel(...);
    ...
}

Into

private Converter<TActualItem, TAlternateItem> _createAlternateItem
    = new Converter<TActualItem, TAlternateItem>(
        (TActualItem actual) => { throw new NotImplementedException(); }
    );

public void Update()
{
    ...
    x = _createAlternateItem(...);
    ...
}
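As a small, self-contained illustration of the three delegate types working together with lambdas, here is a sketch built on the List<T> methods that accept them (FindAll, ConvertAll and ForEach). DelegateDemo and Describe are names made up for this example and have nothing to do with the cache code.

```csharp
using System;
using System.Collections.Generic;

public static class DelegateDemo
{
    // Returns a label for every even number, in the original order.
    public static List<string> Describe(List<int> numbers)
    {
        // Predicate<T>: decides whether an item qualifies.
        Predicate<int> isEven = n => n % 2 == 0;

        // Converter<TInput, TOutput>: turns one type into another.
        Converter<int, string> toLabel = n => "item " + n;

        // Action<T>: does something with an item and returns nothing.
        List<string> seen = new List<string>();
        Action<string> record = label => seen.Add(label);

        numbers.FindAll(isEven).ConvertAll(toLabel).ForEach(record);
        return seen;
    }
}
```

Calling DelegateDemo.Describe(new List<int> { 1, 2, 3, 4 }) gives a list containing "item 2" and "item 4". None of the three behaviours needed a derived class or an event.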

Conclusion

For me, this is very powerful. It means we can take our original example and produce the following non-abstract class.

class CollectionSync<TActualItem, TAlternateItem>
{
    public CollectionSync(Converter<TActualItem, TAlternateItem> createAlternateItem)
    {
        _createAlternateItem = createAlternateItem;
    }

    public CollectionSync(Converter<TActualItem, TAlternateItem> createAlternateItem,
        IEnumerable<TActualItem> list)
        : this(createAlternateItem)
    {
        foreach (TActualItem actual in list)
            _known.Add(actual, _createAlternateItem(actual));
    }

    private Dictionary<TActualItem, TAlternateItem> _known
        = new Dictionary<TActualItem, TAlternateItem>();

    private Converter<TActualItem, TAlternateItem> _createAlternateItem = null;

    private Action<TAlternateItem> _disposeAlternate
        = new Action<TAlternateItem>((TAlternateItem alternate) => { });

    private Action<KeyValuePair<TActualItem, TAlternateItem>> _updateSingleAlternate
        = ((KeyValuePair<TActualItem, TAlternateItem> kvp) => { });

    public void UpdateItems(IEnumerable<TActualItem> iEnumerable)
    {
        List<TActualItem> unseen = new List<TActualItem>(_known.Keys);

        foreach (TActualItem item in iEnumerable)
        {
            if (_known.ContainsKey(item))
            {
                _updateSingleAlternate(
                    new KeyValuePair<TActualItem, TAlternateItem>
                        (item, _known[item]));
                unseen.Remove(item);
            }
            else
            {
                TAlternateItem newLVI = _createAlternateItem(item);
                _known[item] = newLVI;
            }
        }

        //Delete missing items
        foreach (TActualItem item in unseen)
        {
            _disposeAlternate(_known[item]);
            _known.Remove(item);
        }
    }
}

Streaming ForEach using a yield

I’ve been looking for an article on converting foreachs into a less memory intensive operation and I remember reading http://msdn2.microsoft.com/en-us/vcsharp/bb264519.aspx before, I just couldn’t find it.

Basically it uses an example of all numbers in the New York phone book to develop a means of streaming the loops using custom iterators using yield instead of loading it all in and looping through it all.
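This is not the code from that article, just a minimal sketch of the idea under my own assumptions: PhoneBook, LoadAll, Stream and the "555-" number format are all invented for illustration. LoadAll builds the whole list before the caller sees anything, while Stream uses yield to hand over one number at a time.

```csharp
using System;
using System.Collections.Generic;

public static class PhoneBook
{
    // Eager: every number is held in memory before the first one is used.
    public static List<string> LoadAll(int count)
    {
        List<string> numbers = new List<string>();
        for (int i = 0; i < count; i++)
            numbers.Add(Format(i));
        return numbers;
    }

    // Streaming: each number is produced only when the caller asks for the next one.
    public static IEnumerable<string> Stream(int count)
    {
        for (int i = 0; i < count; i++)
            yield return Format(i);
    }

    // Hypothetical number format, purely for the example.
    private static string Format(int i)
    {
        return "555-" + i.ToString("D4");
    }
}
```

A foreach over PhoneBook.Stream(1000000) that breaks after the third entry only ever formats three numbers, whereas the same loop over LoadAll(1000000) has already built a million strings before the loop starts.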

Porter Stemmer 2: C# implementation

UPDATE This code now available via https://bitbucket.org/alski/englishstemmer

I’ve been busy at work on non-coding things for a couple of weeks. I also have a few processes I want to introduce for the next stage of development with my team. So I took the opportunity to write some code and trial the processes out myself.

First the results

As I mentioned before, there is an updated version of the Porter Stemmer, but there wasn’t a C# implementation of it.

The two files that implement the algorithm are highlighted. Everything else is to enable unit testing. The tests were built up initially to assist in the logic of parsing each method within the code, but also include regression tests using all the examples given on the tartarus website.

This implementation correctly parses the example files from the tartarus website with one exception, ‘fluently’ does now parse to ‘fluent’ instead of ‘fluentli’. This matches all the other -ly words.

I’ve pretty much just followed the description on the snowball stemmers site for the ‘English’ stemmer, and referenced the implementation of the algorithm in snowball where I had ambiguities.

Now, additionally developed processes

This is the first 100% completely Test Driven piece of work I have completed. It’s also the one where I have paid attention to the Coverage. I am quite pleased with the results.

The only parts that aren’t covered are extra code put in to check inputs are within range, and some of the Exception2() cases (shown).

My own tests give about 95% coverage; the remaining 2% comes entirely from Martin’s excellent vocabulary.txt and output.txt (strictly, those files give the full 97% and my tests just replicate 95% of it).

What was interesting was that for the first time in quite a while I have had a unit of work that I knew exactly when it was completed. I managed to do the simplest thing that worked and didn’t just add in another feature.

Download

https://bitbucket.org/alski/englishstemmer

Word Stemming (updated)

It looks like there is an updated and hopefully improved stemmer, the English stemmer.

http://snowball.tartarus.org/algorithms/english/stemmer.html

Functionality v Framework

This is going to sound so simple, so bear with me, but at the moment it’s becoming my personal mantra.

There is a difference between the functionality of a system and the framework it runs in.

Let’s take a simple example: we have a process at work. Basically it gets a list of files from a database, copies the files over and updates a different database to point to the new files. It is currently hosted in a Windows service. Can you separate the functionality from the framework?

Functionality

It gets a list of files from a database copies the files over and updates a different database to point to the new files

Framework

Hosted as a service

Why does this matter?

Well, the problem is that I also need to run the functionality on an ad-hoc basis with some special parameters. So what do we do? We simply split the functionality from the framework, put the functionality in a separate assembly, and call it once from the service and again separately from a console application or a WinForms GUI.

But that’s only the first level

We can consider the same at a method level, where we have a for loop with a code body. It is impossible to test a single iteration of the logic in the for body without going through the framework.

            foreach (string name in _list.Keys)
            {
                progress.DoOne("Adding " + name);

                TableRow tr = resultsTable.AddRow();
                tr.AddCell().Value = name;
                Dictionary<DateTime, TestResult> data = _list[name];
                //now add cells
                foreach (DateTime dt in _dates)
                {
                    TableCell cell = tr.AddCell();
                    if (data.ContainsKey(dt))
                    {
                        switch (data[dt])
                        {
                            case TestResult.Fail:
                                cell.Value = "Fail";
                                cell.BGColor = Color.Red;
                                break;

                            case TestResult.Pass:
                                cell.Value = "Ok";
                                cell.BGColor = Color.Green;
                                break;

                            case TestResult.Untested:
                                cell.Value = "";
                                cell.BGColor = Color.Yellow;
                                break;

                        }
                    }
                }
            }

What is more is that we can ONLY call this functionality with its framework. We can refactor the body of a for loop out to a method that takes parameters or even a Visitor pattern. This lets us call it separately, for example we can now Unit Test the functionality.
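As a sketch of that refactoring, the switch from the loop above can be lifted into a method that knows nothing about tables or progress bars. CellFormat and ResultFormatter are hypothetical names introduced here, and the colour is carried as a plain string only so that the example stands alone without System.Drawing; the real code would presumably return a Color.

```csharp
using System;

public enum TestResult { Fail, Pass, Untested }

// Hypothetical value object standing in for the text and colour set on a TableCell.
public struct CellFormat
{
    public readonly string Value;
    public readonly string ColorName;

    public CellFormat(string value, string colorName)
    {
        Value = value;
        ColorName = colorName;
    }
}

public static class ResultFormatter
{
    // The body of the inner loop, callable without a table, a list of dates or a progress bar.
    public static CellFormat Format(TestResult result)
    {
        switch (result)
        {
            case TestResult.Fail:
                return new CellFormat("Fail", "Red");
            case TestResult.Pass:
                return new CellFormat("Ok", "Green");
            default:
                // TestResult.Untested
                return new CellFormat("", "Yellow");
        }
    }
}
```

The loop then shrinks to asking ResultFormatter.Format(data[dt]) for each cell and copying the two values across, and a unit test can check that a Pass comes back as "Ok" on green without building a table at all.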

At the class level we can also refactor so that we separate the multi-threading code out and leave the business logic. This lets us host the logic with or without the threading. So maybe, once again we can unit test the logic separately, or host in different applications as before.

So what does this mean?

I think it might be time for Al’s First law (see Haacked’s law).

The greatest barrier to reuse and testing is a mix of functionality and framework in the same unit.

I wonder if there will ever be a 2nd?

Honest, I’m not copying off Jeff

After finally pressing Post on my last post, on the Singleton design pattern, I was quite surprised to find Jeff Atwood talking about Rethinking Design Patterns itself. I was however quite pleased to see that he hasn’t contradicted my conclusions.

Jeff proposes that Design Patterns is No Silver Bullet.

But I have two specific issues with the book:

Design patterns are a form of complexity. As with all complexity, I’d rather see developers focus on simpler solutions before going straight to a complex recipe of design patterns.
If you find yourself frequently writing a bunch of boilerplate design pattern code to deal with a “recurring design problem”, that’s not good engineering-it’s a sign that your language is fundamentally broken.

In fact, in the comments he links to a previous post, Head First Design Patterns, where he argues that this other book contradicts itself. The first part of the following quote comes from the book.

First of all, when you design, solve things in the simplest way possible. Your goal should be simplicity, not “how can I apply a pattern to this problem.” Don’t feel like you aren’t a sophisticated developer if you don’t use a pattern to solve a problem. Other developers will appreciate and admire the simplicity of your design. That said, sometimes the best way to keep your design simple and flexible is to use a pattern.

Filling 593 pages with rah-rah pattern talk, and then tacking this critical guidance on at the end of the book is downright irresponsible. This advice should be in 72 point blinking Comic Sans on the very first page.

This is very much what I wanted to express. I think that people need to learn and learn and learn. They go through stages where their knowledge doesn’t have sufficient maturity to let them come to the right conclusion. They have to make mistakes first. One of those mistakes is trying to use patterns to solve every problem.

Where I think Jeff and I differ is that he believes that the books and other sources should come with warnings that patterns need to be used in moderation (strangely enough he also comes to a conclusion regarding moderation in the following day’s post, The Technology Backlash). I believe that we need to make the mistakes so that we learn why they are mistakes.

I suppose the big question is,

Who can afford for us to learn on their time and make mistakes in their code base as we develop design maturity ?


Design maturity of the singleton pattern

The more I am exposed to ‘professional developers’ the more I begin to realise that there are multiple phases in the maturity of their designs. One of the best ways to demonstrate this is to look at how people tend to use the Singleton pattern.

Phases over time

Oblivious
- The developer is unaware of Singleton. They don’t use it at all.
- If they come across it in a piece of code they don’t realise it is a common, recurring piece of code.
Discovery
- The developer comes across an example piece of code, an article or a book that contains singleton and thinks that is really nice.
- They understand how it is used in this context.
Familiarity
- The developer starts to use Singleton for the first time in one piece of work.
- They develop an understanding of the problem it solves.
Abuse
- The developer starts seeing the Singleton problem everywhere.
- They create a lot of classes as Singletons
Maturity
- The developer realises that there are alternatives to the Singleton pattern.
- They start to use Singleton as appropriate.

Now I currently am in a position where I have been through this cycle but I am now starting to see others doing the same thing. I think the important lesson to speed your progress from Abuse to Maturity is the common saying

Do the simplest thing that works.

Isn’t Singleton simple?

Singleton is not the simplest way to ensure that one and only one object is available at all times within the lifetime of a program. That is just a static instance.

static object _instance = new object();

Common usage of Singleton (including the Gang of Four example) combines lifetime availability with lazy instantiation. The issue here is whether lazy instantiation is required. Would it not be simpler to just use a static constructor?

Now I can’t answer that for your project. What I can tell you is that today I wrote my first Singleton in nearly 2 years, because in every previous case where I needed a single long-lived object I just used a static instance. Each time, that object was wrapped inside a class which itself was not static, and only internal members of that class needed access to the long-lived object.
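To show the difference in moving parts, here is a minimal sketch of the two approaches side by side. EagerSettings and LazySettings are hypothetical classes invented for the comparison; the lazy version uses a simple lock, which is only one of several ways to write it.

```csharp
using System;

// Static instance: created once by the type initialiser, no lazy logic to get wrong.
public sealed class EagerSettings
{
    private static readonly EagerSettings _instance = new EagerSettings();

    private EagerSettings() { }

    public static EagerSettings Instance
    {
        get { return _instance; }
    }
}

// Gang of Four style: the same lifetime guarantee plus lazy instantiation.
public sealed class LazySettings
{
    private static LazySettings _instance;
    private static readonly object _lock = new object();

    private LazySettings() { }

    public static LazySettings Instance
    {
        get
        {
            // The lock keeps two threads from both creating an instance.
            lock (_lock)
            {
                if (_instance == null)
                    _instance = new LazySettings();
                return _instance;
            }
        }
    }
}
```

Both hand back the same object on every call. The second only earns its extra code if creating the object is expensive enough that you really need to defer it.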

So why use a Singleton

Today however I find myself attempting to share a cache between two dialogs. Both these dialogs use the cache, and both use keyed indexers (i.e. cache[key]) to get and set the items cached. Unfortunately the definition for an indexer requires an instance variable.

object this[string key]
{ 
    get { ... }
    set { ... }
}

Now I could have instead dumped the indexer and gone with static methods, but my aim here was not to re-write the underlying cache but just to provide a Dictionary<string, Assembly>. So sticking with the Do the simplest thing that works idea, my cache Is-a Dictionary<>, which means that I don’t have to re-write all of the methods that I want to expose. In fact I expose much more functionality than I currently use. This meant that all I had to write was the singleton instance.

/// <remarks>
/// Would just be a Dictionary&lt;String, Assembly&gt; but 
/// I want to keep only a single static copy
/// </remarks>
class ChooserCache : Dictionary<string, Assembly>
{
    private static ChooserCache _cached 
          = new ChooserCache(); 

    private ChooserCache()
    { } 

    public static ChooserCache Instance
    {
        get { return _cached; }
    } 
}

The alternative was to go the Has-a route and implement each and every method as a static which referred to a static instance of the Dictionary. This also means that I have to add new methods as I use more functionality from the Dictionary itself.

Coding Horror: C# and the Compilation Tax

I’ve been watching this with interest

Coding Horror: C# and the Compilation Tax
Dennis Forbes – Pragmatic Software Development : Process, People, Programming
Coding Horror: Background Compilation and Background Spell Checking
Knowing.NET – Death and Taxes: Compilation, Type, and Test

My opinion is quite simple. Give people what they want.

If not they will implement it themselves.

I use a workaround. I have three build commands.

1. I’ve just changed some code, and I am not 100% certain it will compile, so I want a quick answer. I use F6 redefined to Build.BuildSelection. This builds just the current project and its dependencies. It doesn’t require me to press No because it doesn’t ask me if I want to run the last good result. It just lets me see how many errors there are and start fixing them with Ctrl-Shift-F12. It’s the fastest way I know to compile just what I have changed.
2. I’ve got an assembly building. Now I press either
- Repeat Test Run from TestDriven.net (Jamie, can we have a keyboard shortcut please?)
- or F5.
Either of these will now build just the application I need to run/test, and will detect that I have already built some of the dependencies so it doesn’t spend time rebuilding what it already has.
3. Finally it’s Ctrl-Shift-B. This is the check-in build. Build everything, run all your unit tests, and finally check in.

It gets me closer to an ideal. I am completely with Larry O’Brien on what I really want, but I am not going to get it until the next version of Studio, unless somebody comes up with a better IDE.

alski.net