Jul 15

This is part four of my refactoring series, you find the index here

Part one was a short success story, it can be found here.
Part two was a introduction to refactoring, available here.
In part three I talked about the requirements, you can read it here
You can read about the very basic refactoring methods in Part four of my series.

As I said in the last post, I’ll refactor some random open source projects, in this post it’s how updating the database schema is handled in Personal Task Manager.

If you have any suggestions what project to refactor, please leave a comment.

What the update method was before

public static void UpdateDataBase()
{
	bool findNextUpdate = true;
	while (findNextUpdate)
	{
		Configuration oldVersion = ConfigurationHelper.GetConfiguration(ConfigurationKey.DataBaseVersion);
		if (UpdateFromV00ToV09(oldVersion))
			continue;
		if (UpdateFromV09ToV091(oldVersion))
			continue;
		// SNIP (10 more updates)
		if (UpdateFromV_0_9_10_ToV_1_0_0(oldVersion))
			continue;
		if (UpdateFromV_1_0_0ToV_1_0_1(oldVersion))
			continue;
		if (UpdateFromV_1_0_1ToV_1_0_2(oldVersion))
			continue;
		findNextUpdate = false;
		RegisterAddins();
	}
}

Wrong:

As you can see, for every update you create, you have to touch this method, though it’s not complicated, you might simply forget it.

The continues are somewhat useless as applying a update results in the next version.

The call to RegisterAddins has nothing to do with the db updates, it should thereby be removed.

The version is a string, contained in the oldVersion variable as oldVersion.Value.ToString().

What the updates were before

private static bool UpdateFromV_1_0_1ToV_1_0_2(Configuration oldVersion)
{
    if (string.Compare(oldVersion.Value.ToString().Trim(), "1.0.1") == 0)
    {
        try
        {
            DbHelper.AddColumn("ApplicationsLog", "Caption", "VarChar(255)");
            ConfigurationHelper.SaveConfiguration(new Configuration(ConfigurationKey.DataBaseVersion, "1.0.2"));
            return true;
        }
        catch (OleDbException ex)
        {
            Logger.WriteException(ex);
            return false;
        }
    }
    return false;
}

private static bool UpdateFromV_0_9_10_ToV_1_0_0(Configuration oldVersion)
{
    if (string.Compare(oldVersion.Value.ToString().Trim(), "0.9.10") == 0)
    {
        try
        {
            DbHelper.AddColumn("Tasks", "Hidden", "Bit");
            DbHelper.AddColumn("Tasks", "Priority", "Integer");
            DbHelper.AddColumn("Tasks", "Notes", "VarChar(255)");
            ConfigurationHelper.SaveConfiguration(new Configuration(ConfigurationKey.DataBaseVersion, "1.0.0"));
            return true;
        }
        catch (OleDbException ex)
        {
            Logger.WriteException(ex);
            return false;
        }
    }
    return false;
}

Wrong:

The version string, actually object, is always manipulated in the same way

The try / catch block is always the same

Both violates DRY (don’t repeat yourself)

What the updates could look like

public abstract class DbUpdate
{
    public bool CanUpgradeFrom(Version currentVersion)
    {
        return currentVersion == UpgradeableVersion;
    }

    public abstract Version UpgradeableVersion { get; }
    public abstract Version NewVersion { get; }

    public bool Apply()
    {
        try
        {
            ApplyDbUpdate();
            return true;
        }
        catch (Exception ex)
        {
            Logger.WriteException(ex);
            return false;
        }
    }

    protected abstract void ApplyDbUpdate();

    public override string ToString()
    {
        return UpgradeableVersion + " To " + NewVersion;
    }
}

Changes:

The Version strings were compatible with the Version class, therefore the Version class is now used.

The Repeative code is encapsulated in the CanUpgradeFrom and Apply methods. DRY is now followed.

The ToString override is simply for convenience, if we have those DbUpdate instances in a collection and look at them in the debugger, you don’t have to step into them.

Encapsulating the update method into its own class

public class DbUpdater
{
    public void ApplyUpdatesAsNeeded()
    {
        var updates = OrderByVersion(GatherAllUpdates());
        var currentVersion = GetCurrentVersion();

        if (IsLatestVersion(currentVersion, updates))
        {
            return;
        }

        foreach (var update in updates)
        {
            if (!update.CanUpgradeFrom(currentVersion))
            {
                continue;
            }

            if (!update.Apply())
                break;

            currentVersion = update.NewVersion;
            SaveNewDbVersion(currentVersion);
        }
    }

    private bool IsLatestVersion(Version currentVersion, IList<DbUpdate> updates)
    {
        if (updates.Count == 0)
        {
            return true;
        }

        return updates[updates.Count - 1].NewVersion == currentVersion;
    }

    private void SaveNewDbVersion(Version version)
    {
        ConfigurationHelper.SaveConfiguration(new Configuration(ConfigurationKey.DataBaseVersion, version.ToString()));
    }

    private IList<DbUpdate> OrderByVersion(IEnumerable<DbUpdate> updates)
    {
        var result = new List<DbUpdate>(updates);
        result.Sort(delegate(DbUpdate a, DbUpdate b)
        {
            return Comparer<Version>.Default.Compare(a.NewVersion, b.NewVersion);
        });
        return result;
    }

    private Version GetCurrentVersion()
    {
        var versionString = ConfigurationHelper.GetConfiguration(ConfigurationKey.DataBaseVersion)
            .Value.ToString().Trim();

        return new Version(versionString);
    }

    private IList<DbUpdate> GatherAllUpdates()
    {
        var updates = new List<DbUpdate>();

        var types = typeof (DbUpdate).Assembly.GetTypes();
        foreach (var type in types)
        {
            if (!type.IsSubclassOf(typeof(DbUpdate)))
            {
                continue;
            }

            updates.Add((DbUpdate)Activator.CreateInstance(type));
        }

        return updates;
    }
}

Changes:

If the update failed, no other updates will be applied as the version is still the same and it would thereby be useless.

With the IsLatestVersion method the load time was reduced, although the updates are obtained via reflection.

The use of reflection was necessary as no IoC container was used and I didn’t want to introduce one for just this case nor did I want to hardcode each update.

Individual updates in their new form

public class Update101To102 : DbUpdate
{
    public override Version UpgradeableVersion
    {
        get { return new Version(1, 0, 1); }
    }

    public override Version NewVersion
    {
        get { return new Version(1, 0, 2); }
    }

    protected override void ApplyDbUpdate()
    {
        DbHelper.AddColumn("ApplicationsLog", "Caption", "VarChar(255)");
    }
}

public class Update0910To100 : DbUpdate
{
    public override Version UpgradeableVersion
    {
        get { return new Version(0, 9, 10); }
    }

    public override Version NewVersion
    {
        get { return new Version(1, 0, 0); }
    }

    protected override void ApplyDbUpdate()
    {
        DbHelper.AddColumn("Tasks", "Hidden", "Bit");
        DbHelper.AddColumn("Tasks", "Priority", "Integer");
        DbHelper.AddColumn("Tasks", "Notes", "VarChar(255)");
    }
}

The Result:

The db updating part of the application now follows the open closed principle, you don’t have to touch any of the written code, you just add new.

The DbUpdate class follows the single responsibility principle, you only have to change the DbUpdate implementation that represents the current update if you make changes to the db schema.

If you take single responsibility a little broader as in “updating the schema”, the DbUpdater class also follows that principle. (It gathers the updates, orders and applies them, if anything of that changes it has to change too, thereby, if you take the srp strict, it actually has three responsibilities.)

You can’t forget to add the new update as you don’t have to add it anywhere, you just create the new class.

Typos are much less likely as the version is now represented with the Version class.

Updating the db schema is child’s play now.

By the way, the updating method that was used before is a typical example of organically growth, you start by adding the first update method, later a second comes in an you “just add a conditional” but before you realize it, you have an abomination waiting for copy and paste errors. Try to refactor those methods the very moment you want to add this conditional and it’s likely that more will follow. Doing it before might violate YAGNI (You Ain’t Gonna Need It) and / or KISS(Keep It Simple Stupid)

A word to testing: I didn’t create any tests as there was no way to do it without really great effort. Almost everything in that application is static thereby stubbing / mocking alone would have required some serious changes and hard coding a test would just verify that “yes you can upgrade from specific version X to specific version Y”, not that useful compared to the required effort.

Tagged with:
Jul 14

This is part four of my refactoring series, you find the index here

Part one was a short success story, it can be found here.
Part two was a introduction to refactoring, available here.
In part three I talked about the requirements, you can read it here

In this part I’ll talk about very basic refactoring methods, something you can start with right away, without further knowledge and the like.

The following parts will contain “real world” refactoring from random open source projects.

Naming:

It’s very important to name your classes, methods and variables properly. If you have trouble to find a good name for a method or class, chances are high that it is doing too much. In such a case, you should split it up.

I don’t think that I should give you yet another example of bad naming, every developer knows the problems that arise from bad naming.

Conditionals:

Try to avoid unnecessary nesting, use guard clauses instead.

You might also try to consolidate the conditionals but be careful with it as readability might suffer. The favorable way is to encapsulate the whole condition in its own method.

Nested:

if (string.IsNullOrEmpty(input) == false))
{
    if (input.StartsWith("ABC"))
    {
        if (input.EndsWith("123"))
        {
            DoSomething(input);
            DoSomethingMore(input);
        }
    }
}

With guard clauses:

if (string.IsNullOrEmpty(input))
{
    return;
}

if (!input.StartsWith("ABC"))
{
    return;
}

if (!input.EndsWith("123"))
{
    return;
}

DoSomething(input);
DoSomethingMore(input);

Consolidated:

if (string.IsNullOrEmpty(input) || !input.StartsWith("ABC")
    || !input.EndsWith("123"))
{
    return;
}

DoSomething(input);
DoSomethingMore(input);

Encapsulated:

if (InputIsNotWellFormatted(input))
{
    return;
}

DoSomething(input);
DoSomethingMore(input);

Method Extraction:

Long methods should be split up, now we have to define what long means, some say your methods shouldn’t have more than 10 lines, I’d say, it depends on it’s readability. Yet I’d try to stay blow 20 lines.

If you have trouble to name a method properly, it’s actually a no brainer to refactor it because you should know what you want it to do, thereby you can simply extract what doesn’t belong there. It’s a little more difficult if the method is already dealing with just a single problem, yet the method covers multiple screens. In those cases you have to think how to split it up properly. More on that in the next parts.
Another reason for method extraction is pure readability.

Example:
If you want to process a customer registration, you validate it and only if it’s valid, it’s a real customer. The next steps would be to save it and send an activation mail out.

You could do it this way:

public void ProcessCustomerRegistration(CustomerRegistration registration)
{
    if (registration == null)
    {
        throw new ArgumentException();
    }

    if (registration.Adress == null)
    {
        throw new ArgumentException();
    }

    if (registration.ImportantProperty == null)
    {
        throw new ArgumentException();
    }

    if (registration.AnotherImportantProperty == null)
    {
        throw new ArgumentException();
    }

    var customer = new Customer();
    customer.Adress = registration.Adress;
    customer.Username = registration.Username;
    customer.OtherProperty = registration.OtherProperty;

    using (var con = new SqlConnection(_conStr))
    {
        using (var cmd = con.CreateCommand())
        {
            cmd.CommandText = "Insert Command";
            //parameter.....

            con.Open();
            cmd.ExecuteNonQuery();
        }
    }

    using (var mailClient = new MailClient())
    {
        var mail = new Mail()
        //.......
        //.......
        //.......
        //.......
        //.......
        mailClient.Send(mail);
    }
}

Or you could make your method reflect what you want to do:

public void ProcessCustomerRegistration(CustomerRegistration registration)
{
   if(IsInvalid(registration))
       throw new ArgumentException();

    var customer = ConvertToCustomer(registration);
    Persist(customer);
    SendActivationEMailTo(customer);
}

Tagged with:
Jul 13

This is part three of my refactoring series, you find the index here

Part one was a short success story, it can be found here.
Part two was a introduction to refactoring, available here.

In this part I’ll talk about what is required to start refactoring and what tools can help with it.

You have to care about the code:

This is the single most important point. If you don’t care about your code, refactoring is useless as you won’t gain anything from it because after a very short time you will stop refactoring it and thereby allowing it’s quality to degrade again. The good news is, as you are reading this, you obviously care about your code.

Do it continuously:

Although refactoring is still very great if done together with scheduled code reviews, the real power lies in continuous refactoring. Whenever you find something you don’t like, just change it right away. Code is not static, you won’t hurt anyone by changing it, quite the opposite is the case. If you made an awkward piece of code into something everyone understands without looking anything up, you saved those involved time. This is something you’d miss if you’d only refactor while doing scheduled code reviews. Although you’ll save time afterwards, you already lost some because those who worked with it had to look something up, read it twice or anything similar.

Testing:

Unit testing is not a fix requirement for refactoring, yet it gives you the confidence that you didn’t break anything with your changes. Further, unit testing will enable other changes as well, still for the same reason, you know when you broke something. If you are working without tests, you’ll most likely hesitate to change something that already works. Though you should remember, if the code is bad, maintenance hell is one step away. If you have to fix a bug in bad code, it’ll take longer and won’t be any fun at all.
If you have no experience with unit tests, I’d advice you to learn about it. Refactoring and unit testing are a perfect fit. This, this and this would be good starting points.

Don’t fight alone:

Try to win your team, all of you will have it easier in the end. Refactoring in a pair is not only more efficient, as four eyes see more than two, it’s also fun to listen to you coworker, especially if he wrote the code you are refactoring.
If no one besides you wants to do it, you could just start to refactor the parts you are working on and convince you team later by showing your results.
But be warned, refactoring alone is not easy, others might destroy your work by just getting their work done. It’s also important that you don’t refactor too much, you could fall behind with your tasks and thereby get in trouble. Try to meet schedules and refactor only when there is time.
The good news is that usually just a little effort is enough to win a few team members and if they start refactoring too, it’ll be easier for all of you.

Tools

Visual Studio AddIns:
JetBrains ReSharper: I just can’t work without it anymore, really great.
DevExpress CodeRush / Refactor: Although I never used it, many speak very highly about it.

Unit testing and mocking

MbUnit: My unit testing framework of choice
NUnit: Port of the popular JUnit.
Rhino.Mocks: Simply the best mocking framework I know.
Moq: Another great mocking framework.

Tagged with:
Jul 12

This is part two of my refactoring series, you can find the index here.

Part one was a short success story, it can be found here.

In this part I’ll introduce refactoring, what it is, what it can do for you and why you should do it.

What is Refactoring?

Refactoring is changing the appearance and structure of your code but not its behavior. Refactoring is an answer to ever changing requirements and the resulting modifications to your code. It won’t solve all problems but it will make your life way easier by improving the code quality.

What can it do if it is just a cosmetic change?

Have you ever struggled with reviewing someone else’s code or even your own after a month or two? Refactoring will help in these situations and more. If applied correctly, the code will be much easier to work with.

Does it really work?

Yes. Take for example the following snippet:

public IEnumerable<ContentInformation> GetContentFor(IEnumerable<MatrixKey> selection)
{
    var lowX = selection.OrderBy(x => x.X).First().X;
    var lowY = selection.OrderBy(x => x.Y).First().Y;

    foreach (var key in selection)
    {
        var x = key.X-lowX;
        while(x > _upperXBound)
        {
            if(_upperXBound == 0)
            {
                x = 0;
            }
            else
            {
                x -= _upperXBound + 1;
            }
        }
        var y = key.Y-lowY;
        while(y > _upperYBound)
        {
            if(_upperYBound == 0)
            {
                y = 0;
            }
            else
            {
                y -= _upperYBound+1;
            }
        }

        var internalKey = new MatrixKey(x, y);

        if (_content.ContainsKey(internalKey))
        {
            yield return _content[internalKey].CloneTo(key);
        }
        else
        {
            yield return new NullContentInformation(key);
        }
    }
}

And now, compare it to its refactored version:

public IEnumerable<ContentInformation> GetContentFor(IEnumerable<MatrixKey> selection)
{
    var lowX = selection.OrderBy(x => x.X).First().X;
    var lowY = selection.OrderBy(x => x.Y).First().Y;

    foreach (var key in selection)
    {
        var x = Transform(key.X, lowX, _upperXBound);
        var y = Transform(key.Y, lowY, _upperYBound);

        var internalKey = new MatrixKey(x, y);

        if (_content.ContainsKey(internalKey))
        {
            yield return _content[internalKey].CloneTo(key);
        }
        else
        {
            yield return new NullContentInformation(key);
        }
    }
}

private int Transform(int source, int lowerBound, int upperBound)
{
    if(upperBound == 0)
    {
        return 0;
    }

    var transformed = source - lowerBound;

    while(transformed > upperBound)
    {
        transformed -= upperBound + 1;
    }

    return transformed;
}

Why should you do it?

For one, it will make your life easier, you won’t get as much headaches from two screens long methods with nesting up to five levels deep. Simply because they just won’t exist anymore.
It will save money as the time to feature as well as time to fix a bug will reduce. New developers will be able to adopt your code earlier too.

Tagged with:
Jul 11

This is part one of my refactoring series, you can find the index here.

In this part I’ll tell a short story about refactoring and give a conclusion.

The Story

In my company we have a legacy VB6 application, a trainee started it about 6 years ago. As the requirements were very low, it took only a week to finish it as it. The expected lifetime was 3 month as it should just last the time that was required to evaluate proprietary solutions.

As it was just a quick and dirty solution by design, all ui elements still had their default name (command1, command2, label1, label2 etc.). Everything was done in the event handling methods (command1_click).

As time passed, it was decided to extend the application a little and as the trainee writing it in the first place left, it was done by new trainee. This trainee started the habit of using copy and paste. Code was simply copied and pasted whenever it was needed.

Three months later, the application has already seen 4 different trainees as it was the companies way of “testing new trainees”. It had about 5000 lines of code, the main form contained 3000 of those. The control names reached the 100, structure was not existent. But still, management decided, it was better than the ready and proven solutions and so it came that it kept being used. The only good thing was, it was no longer used to terrify new trainees.

From time to time, new features were requested and implemented, again without any structure, it was just kept as dirty as it was, everyone involved just wanted to get his work done. By all means.

One year ago, I started in the company. As I was mostly responsible for C# and our SQL Server, I had very little to do with our legacy applications but 2 month ago, my coworker who’s main task is to maintain the VB6 stuff, approached me and asked for help. It turned out, he was trying to implement a new feature and that since two weeks already.

This was the first time I encountered this mess, until then, I just heard complaints of my coworker and thought he was over exaggerating, it turned out he did not. In fact, it was much worse. By then, it consisted of 20000 lines of code and was just a mess with no structure at all. Almost no useful ui element names (my co worker named the elements he worked with properly, but as it seems, he was the first to do so.). Most event handling methods spanned multiple screens, nesting was up to 10 levels deep and that mostly without any indentation. The only used pattern was copy and paste.

I suggested to implement the new feature in isolation, therefore we created a new module and wrote it there. My coworker asked me now how we could make it work together with the rest of the code, my answer was that we would refactor the affected parts. As he never heard about refactoring, I had to explain it to him. Although he was a little skeptical about it, we began to change the existing methods but instead of just making it work together with the new module, I encouraged him to get all touched methods refactored as well.

After five days, the new feature worked nicely and was easy to maintain and change. We encapsulated about ten more distinct features into their own module. All of this lead to way better code. My coworker was so fond of it that we started to refactor more and more parts of the application and in just 2 month, we managed to reduce the 20000 lines of pure horror to just above 8000 lines of self explaining code. But the reduction was not the only benefit, the time to new feature was reduced drastically as well, from weeks to just days or less.

The Conclusion

You should always consider that you are not the only one who will touch the code. Even if that is not the case, in two month you are “someone else” too, at least in terms of code knowledge. Therefore, code should be written in a self explaining way.

Writing good code in the beginning has also the benefit of protecting it against getting bad in the near future. You can observe it in other areas too, if you have for example a clean room, it is likely to stay clean, dirty rooms on the other hand tend to get dirtier. The reason for that is, we think that someone cares about the clean room and therefore try to keep it clean but as no one seems to care about the dirty room, it is much easier for us to be lazy and just let it get dirtier.

If you didn’t create the mess, you shouldn’t just “deal with it”, you should improve it. That does not mean that you should stop to do anything else, but you should improve those areas you touch while maintaining the mess. After some time, the application as a whole will improve.

Bad code is not only a cosmetic problem, the time to feature and thereby cost of new features increase, bug fixes affected by this too. If the code quality is too bad, some suggest a rewrite. But that is not only expensive, it won’t fix the problem at all, it is just matter of time until the code quality is that bad quality again.

Tagged with:
Jul 11

In the following days, I’ll create a series about refactoring. As this is index, you’ll find new posts right here.

I’ll start the series with a short success story about refactoring followed by an introduction to refactoring.

Then we will get started, I’ll explain and show some of the commonly used methods.

Throughout the series, I’ll show and explain the following:

  • What refactoring is
  • Common ways to refactor
  • How patterns can help
  • Important patterns
  • How to avoid bad code

Here is the index:

  1. Part 1: A Short Story
  2. Part 2: Introduction to Refactoring
  3. Part 3: Requirements
  4. Part 4: Basicis
  5. Part 5: Some Real World Refactoring
Tagged with:
preload preload preload