Automated distribution creation (2)

In my previous post I talked about how I managed to automatically download the release notes from our issue tracker web site. These notes still needed to be added to our NEWS file, which describes the changes between releases.

There are really two scenarios to deal with here: the release notes for the current release either are already in the NEWS file, or they are not. They are already there when you rebuild the distribution for a release, for example when you’ve found something wrong with it and fixed that. For a human, this is pretty simple to detect, but how does an Ant script know?

Enter the Ant filter chain. This construct resembles a Unix pipe in that you can feed the output of one filter as input to the next. Here’s how I retrieve the version that is currently in the NEWS file:

<loadfile property="current.version"
    srcFile="${news.file}">
  <filterchain>
    <headfilter lines="1"/>
    <striplinebreaks/>
    <tokenfilter>
      <replaceregex pattern="[a-zA-Z\s]*([1-9]+\.[0-9]+).*"
          replace="\1"/>
      <replacestring from="." to="\."/>
    </tokenfilter>
  </filterchain>
</loadfile>

The loadfile task loads the srcFile into the current.version property. But not just as-is: there is a filterchain applied first. The first item in the chain is headfilter, which works just like the Unix head command: in this case it yields the first line of the NEWS file. I don’t want a line but a string, so next I remove the line ending with the striplinebreaks filter.

Then it’s time for a good old regular expression to extract the version number from the string. The first line of the NEWS file looks like this: Changes in 1.4.0. So I match the leading text with [a-zA-Z\s]* and then the actual version number with ([1-9]+\.[0-9]+).*.
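
To see what the replaceregex filter does, here is the same expression in plain Java (a sketch for illustration only; the class name is made up):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VersionExtractor {
    // The same expression as in the Ant replaceregex filter:
    // skip the leading text, capture major.minor, ignore the rest.
    private static final Pattern VERSION =
            Pattern.compile("[a-zA-Z\\s]*([1-9]+\\.[0-9]+).*");

    static String extract(String firstLine) {
        Matcher m = VERSION.matcher(firstLine);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("Changes in 1.4.0")); // prints 1.4
    }
}
```

Note that only the major.minor part survives, just like in the Ant filter.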

Note that I use a group to capture only the major and minor version (1.4 in the previous example). The reason for that is that whenever we deliver patch releases, we don’t add a whole new section to the NEWS file, but just expand the current section with the few cases that were fixed by the patch. Since we sort the cases in descending order of reporting, the patch cases will always be at the top.

Following the regular expression there is a replacestring filter that inserts backslashes before the dots. The reason for that becomes clear when we look at how the Ant script actually uses the current.version property:

<condition property="same.release">
  <matches string="${full.version}"
      pattern="${current.version}"/>
</condition>
<antcall target="--remove-current-release-from-news"/>
<antcall target="--add-current-release-to-news"/>
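
The escaping matters because current.version is used as a regular expression pattern in the matches condition, and in a regex an unescaped dot matches any character. A quick Java sketch of the difference (Ant’s matches condition typically delegates to java.util.regex, so the semantics are the same):

```java
public class EscapeDots {
    public static void main(String[] args) {
        // Intended: full.version "1.4.2" matches current.version "1.4".
        System.out.println("1.4.2".matches("1.4.*"));   // true
        // But without escaping, "." matches ANY character:
        System.out.println("184.2".matches("1.4.*"));   // also true!
        // After the replacestring filter ("." -> "\."), only a literal dot matches:
        System.out.println("184.2".matches("1\\.4.*")); // false
        System.out.println("1.4.2".matches("1\\.4.*")); // true
    }
}
```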

The --remove-current-release-from-news target is only executed when the same.release property is set:

<target name="--remove-current-release-from-news"
    if="same.release">
  <property name="previous.version.file"
      value="${news.dir}/previous.version.txt"/>
  <echo message="${previous.version}"
      file="${previous.version.file}"/>
  <loadfile srcFile="${previous.version.file}"
      property="escaped.previous.version">
    <filterchain>
      <tokenfilter>
        <replacestring from="." to="\."/>
      </tokenfilter>
    </filterchain>
  </loadfile>
  <delete file="${previous.version.file}"/>
  <replaceregexp file="${news.file}"
      match=".*(Changes in ${escaped.previous.version}.*)"
      replace="\1" flags="s"/>          
</target>

The bulk of the work is done in the final replaceregexp task, where everything before the text Changes in <x>.<y>.<z> is deleted. The code before that is just a convoluted way to escape the dots in the previous version number. Unfortunately, I’m not aware of any Ant task that can execute a regular expression against a property, so I first put the property into a temporary file and then operate on that file.
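
The effect of that replaceregexp can be sketched in plain Java, where flags="s" corresponds to DOTALL mode, in which "." also matches newlines. The TrimNews class and the sample NEWS content below are invented for illustration:

```java
public class TrimNews {
    // Mirror of the Ant replaceregexp: keep everything from the
    // previous release header onward, dropping the section above it.
    static String trim(String news, String escapedPreviousVersion) {
        // (?s) turns on DOTALL; the greedy .* eats everything before
        // the last occurrence of the previous release header.
        return news.replaceAll(
            "(?s).*(Changes in " + escapedPreviousVersion + ".*)", "$1");
    }

    public static void main(String[] args) {
        String news = "Changes in 1.5.0\n- brand new\n\n"
                    + "Changes in 1.4.2\n- older fix\n";
        System.out.println(trim(news, "1\\.4\\.2"));
    }
}
```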

Finally, all that is left is to add the release notes for the current version to the NEWS file:

<target name="--add-current-release-to-news">
  <property name="new.news.file"
      value="${news.dir}/new.news.txt"/>
  <concat destfile="${new.news.file}">
    <path>
      <pathelement location="${release.news.file}"/>
      <pathelement location="${news.file}"/>
    </path>
  </concat>
  <move file="${new.news.file}"
      tofile="${news.file}"/>
  <delete file="${release.news.file}"/>
</target>

The only tricky part here is that the concat task doesn’t allow one of its input files to also be the output file. So I have to introduce a temporary file. Then when all is done, the file containing the NEWS section for this release, release.news.file, is no longer needed.

Automated distribution creation

So we have this automated build with CruiseControl. It generates code, compiles, deploys, and tests. It’s saved my skin a gazillion times. It’s really great.

But it could be even better. It could also build a complete distribution, making the whole software release process a non-event. That’s one of my goals for the coming weeks. So stay tuned. 😉

Currently, the process to build a distribution of our product requires a couple of manual steps. One of these steps is to update the NEWS file, which describes the changes between releases. Of course, everything that changes between releases is documented in the issue tracking system, in our case FogBugz. (FogBugz is OK to work with most of the time, although I think there are better alternatives, like Jira.)

FogBugz lets you add release notes to each issue (which it calls case), and it provides a standard report to show the release notes for all cases scheduled for a specific release. You can even download this report in XML.

The only problem is that this functionality doesn’t work most of the time. The only time when it is guaranteed to work, is when you try it on the server that hosts FogBugz. Since this machine is in the server room, this is inconvenient to say the least. But even if this functionality worked flawlessly every time, everywhere, it would still be a manual step to collect the XML file.

So I turned to HtmlUnit, a “browser for Java programs. It models HTML documents and provides an API that allows you to invoke pages, fill out forms, click links, etc… just like you do in your normal browser.” We use this great tool a lot to write our acceptance tests.

This time, I used HtmlUnit’s WebClient from within an Ant task to log in to FogBugz, generate the release notes report, extract the cases with release notes (some cases have none, since they are too trivial to bother the end user with), and write them to an XML file. This allows me to transform the XML file to plain text using XSLT, giving a NEWS file section for the current release. The next step is to automagically add this to the existing NEWS file. This should be easy enough using Ant’s concat task. I will let you know how this works out.

Breaking Encapsulation

Last week, we tested the upgrade procedure for the new version of our product. We got a backup from one of our clients that was over 60 GB, so we could put it to good use by testing the performance of the upgrade against it. This sort of testing is crucial for making sure the upgrade won’t disrupt production too much.

One of the steps in the upgrade was the deletion of stale data. It dealt with two entities in a one-to-many relationship. For this discussion, let’s call these entities A and B. For each A, there can be multiple Bs, whereas each B is associated with exactly one A. The upgrade used our product’s API to select the A objects matching the required criteria, and then delete them. The API implementation makes sure that when an A object is deleted, its B objects are also deleted.

This is standard encapsulation practice, nothing fancy. But there was one problem with it: the deletion process was way too slow. We aborted it after more than three and a half hours, which is clearly unacceptable.

So we turned to the code, and found two loops: one iterating over the A objects, and within the delete() of A, one iterating over the B objects. Since there can be many, many B objects to search through, this inner loop really hurts when executed repeatedly. We say that this algorithm is O(n×m), where n is the number of A objects and m the number of B objects. By first deleting all B objects related to A objects that match the criteria, and only then deleting the A objects, we could potentially change the algorithm to O(n+m), which of course is much faster.

That didn’t work out, though, since the delete() method in class A still contained the loop over B objects, even though we now knew for sure that none of the B objects would match (since we deleted them previously). So we broke encapsulation by extracting a doDelete() method that just deletes the A object, nothing more.

We had a similar problem with B’s delete() method. This code sends a notification to its A object and performs other housekeeping. In our situation, this is clearly unnecessary, since that A object is about to be deleted as well. So we again broke encapsulation and extracted a doDelete() method for class B as well.
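
Here is a hedged sketch of the doDelete() extraction and the resulting two-pass deletion. A, B, Upgrade, and the deleted flag are illustrative stand-ins, not our actual product API:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class A {
    final List<B> children = new ArrayList<>();
    boolean deleted;

    // Original API behavior: cascade over all Bs -- the O(m) inner loop.
    public void delete() {
        for (B b : new ArrayList<>(children)) {
            b.delete();
        }
        doDelete();
    }

    // Extracted step: delete only this A, no scan over its Bs.
    protected void doDelete() { deleted = true; }

    void notifyChildDeleted(B b) { children.remove(b); }
}

class B {
    private final A parent;
    boolean deleted;

    B(A parent) { this.parent = parent; parent.children.add(this); }

    A getParent() { return parent; }

    public void delete() {
        parent.notifyChildDeleted(this); // housekeeping we can skip here
        doDelete();
    }

    // Extracted step: delete only this B, no notification to its A.
    protected void doDelete() { deleted = true; }
}

class Upgrade {
    // O(n+m): one pass over all Bs, then one pass over the matching As.
    // A Set gives O(1) membership tests, keeping the total linear.
    static void deleteAll(Set<A> matching, List<B> allBs) {
        for (B b : allBs) {
            if (matching.contains(b.getParent())) {
                b.doDelete();
            }
        }
        for (A a : matching) {
            a.doDelete();
        }
    }

    public static void main(String[] args) {
        A a = new A();
        B b1 = new B(a);
        B b2 = new B(a);
        Set<A> matching = new HashSet<>();
        matching.add(a);
        deleteAll(matching, List.of(b1, b2));
        System.out.println(a.deleted && b1.deleted && b2.deleted); // true
    }
}
```

The protected visibility on doDelete() is what keeps the damage contained: only code in the same package (the upgrade) can bypass the cascading delete().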

Now we had the performance we required: the deletion process was down to two minutes. But we lost encapsulation. Being well-experienced in object oriented techniques, we knew that would open the door to all sorts of trouble. But we also knew that this change was absolutely necessary to get the required performance.

So we went into damage control mode. We made the doDelete() methods protected, and moved the upgrade code to the same package as the API implementation code, to still be able to call the doDelete()s. Still not optimal, but sometimes a man’s got to do what a man’s got to do…

The Law of Demeter

In my previous post I used the Law of Demeter as a motivation for the Hide Delegate refactoring:

The Law of Demeter for functions requires that a method M of an object O may only invoke the methods of the following kinds of objects:

  1. O itself
  2. M’s parameters
  3. any objects created/instantiated within M
  4. O’s direct component objects

Code that violates the Law of Demeter is a candidate for Hide Delegate, e.g. manager = john.getDepartment().getManager() can be refactored to manager = john.getManager(), where the Employee class gets a new getManager() method.
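
In code, the refactored Employee might look like this (a minimal sketch; Manager and Department are the example classes from the text, with invented bodies):

```java
class Manager {}

class Department {
    private final Manager manager = new Manager();
    Manager getManager() { return manager; }
}

class Employee {
    private final Department department = new Department();

    Department getDepartment() { return department; }

    // Hide Delegate: expose the manager directly, so callers
    // no longer have to reach through Department.
    Manager getManager() { return department.getManager(); }
}
```

Client code then becomes manager = john.getManager() instead of manager = john.getDepartment().getManager().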

However, not all such refactorings make as much sense. Consider, for example, someone who’s trying to kiss up to his boss: sendFlowers(john.getManager().getSpouse()). Applying Hide Delegate here would yield a getManagersSpouse() method in Employee. Yuck.

I have a couple of problems with this use of Hide Delegate. First of all, it creates methods that by definition reek of feature envy. Second, methods like getManagersSpouse() clearly violate the single responsibility principle. Finally, the Law of Demeter clashes with the concept of fluent interfaces.

Luckily, you can always back out from an adverse Hide Delegate by applying the opposite refactoring: Remove Middle Man.

Automating refactorings

I’m a big fan of both refactoring and automation. It’s no wonder, then, that the support for automated refactoring in Eclipse makes me very happy. I find that it makes me a lot more productive, and I produce better code. That’s because performing a refactoring is easy and fast enough to actually do it.

I also find that I refactor routinely. Where Martin Fowler, in his classic book Refactoring, gives the advice not to mix refactoring and adding new functionality, I do it almost mindlessly anyway. No need to run unit tests before and after the refactorings, since I know they Just Work™.

Not so with any refactorings that are not supported by the tool, though. For instance, when trying to adhere to the Law of Demeter, one would want to perform the refactoring Hide Delegate. Unfortunately, Eclipse has no support for this refactoring 😦 You can, however, simulate this refactoring using a combination of other refactorings. Let me explain that using a simple example.

We start with the following abstract code that shows the situation before we want to apply Hide Delegate:

public class Client {

  public void example() {
    final Server server = new Server();
    server.getDelegate().method();  
  }

}

public class Server {

  private final Delegate delegate = new Delegate();

  public Delegate getDelegate() {
    return delegate;
  }
  
}

public class Delegate {

  public void method() {
    // Do it...
  }
  
}

First, we perform Extract Method on server.getDelegate().method() (make the method public):

public class Client {

  public void example() {
    final Server server = new Server();
    method(server);  
  }

  public void method(final Server server) {
    server.getDelegate().method();
  }

}

Next, perform Move Method to move method() to Server:

public class Client {

  public void example() {
    final Server server = new Server();
    server.method();  
  }

}

public class Server {

  private final Delegate delegate = new Delegate();

  public Delegate getDelegate() {
    return delegate;
  }

  public void method() {
    getDelegate().method();
  }
  
}

And, voila, we have performed Hide Delegate!

Importing large data sets

For performance testing, it is often necessary to import a large data set to test against. However, importing large data sets presents its own challenges. Below I want to give some tips on how to deal with those.

  1. Begin by making backups. Not just of your current data, but also of the large data set you want to import. You may need to transform the data before importing it, and then it is useful to be able to go back to the original.
  2. Start with a representative subset of the large data set. This will allow you to test the import process without having to wait hours for feedback. Only when you’re convinced that everything works as expected, do you import the whole large data set.
  3. Test the limited data set end-to-end. For instance, the product I’m currently working on consists of a Content Management System (CMS, where people author content) and a Delivery System (DS, where people use the content). Data is imported into the CMS, edited, and finally published to the DS. In this situation, it is not enough to have a successful import into CMS. The publication to DS must also succeed.
  4. Automate the import. When things go wrong, you need to perform the import multiple times. It saves time to be able to run the import with a single command. Even if the import succeeds on the first try (one can dream), you might want to redo the import later, e.g. for performance testing against a new release, or when a new, even larger, data set becomes available.
  5. If you need to transform the data to make the import work, make sure to put the transformation scripts under version control, just like your regular code (you do use a version control system, don’t you?). The build scripts that automate the import should also be put under version control.
  6. If you cannot get your hands on real-world data, you may still be able to do performance testing using generated data. The downside of this approach is that the generated data will probably not contain the exotic border cases that are usually present in real-life data.
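
As a minimal sketch of that last tip, here is a toy generator. The CSV columns and sizes are invented; a real generator should mimic your production schema as closely as possible, including exotic border cases where you can anticipate them:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class DataGenerator {
    // Build the CSV lines in memory; a fixed seed keeps runs reproducible,
    // which matters when you redo the import for a new release.
    static List<String> generate(int rows, long seed) {
        Random random = new Random(seed);
        List<String> lines = new ArrayList<>();
        lines.add("id,name,value");
        for (int i = 0; i < rows; i++) {
            lines.add(i + ",user" + i + "," + random.nextInt(10_000));
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Files.write(Path.of("generated-data.csv"), generate(100_000, 42L));
    }
}
```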

Strange things happen…

On Fridays, I work from home, to prevent wasting time commuting. So today, I started out fresh, ready to rock and roll. But alas, I was off to a slow start.

To access my company’s resources, I use a Virtual Private Network (VPN). In particular, I use the VPN client software from Cisco on Ubuntu GNU/Linux. This piece of software has the tendency to break on every kernel update, however 😦 Yes, you guessed right, this week I received a kernel update to 2.6.24-17.

When I previously upgraded Ubuntu to 8.04, I received kernel 2.6.24-16 and then the VPN client broke as well. I had to apply a patch, which didn’t work: it couldn’t apply all changes. I then manually fixed the code to make sure all the changes in the patch were applied. And then the VPN client finally worked.

So I expected another one of those sessions. But this time, googling turned up nothing. Since I’m close to a deadline, I decided to simply restart my computer and choose the 2.6.24-16 kernel from the GRUB boot menu. Since it used to work with this kernel, I expected it to work now. But no such luck. I still got an error about the Connection Manager being unable to read the connection entry.

Getting a bit desperate, I redid the VPN client installation. Now it worked 😀 Feeling lucky, I rebooted into kernel 2.6.24-17, and it still worked. Sometimes I just don’t understand computers…

Update 2008-08-15: Check out this page with Unofficial Cisco VPN Client Updates for Linux.

Broken windows

In my previous post, I told you that I encountered some non-optimal code. (Yes, really!) You might wonder what I did about that. Well, I fixed it, of course.

Should I have left it alone? After all, I was in the process of implementing this super-critical feature that this super-important client needed super-urgently. Did I waste time on something not-so-important?

Well, yes and no. Yes in the short term, for sure. But I like to think no in the long term. And I’m in for the long haul. I work on a software product, not a project, so I can actually afford to take a long term view.

I’m not alone on this one. The excellent book The Pragmatic Programmer talks about this as well. The authors call it fixing broken windows after a landmark sociology study:

Consider a building with a few broken windows. If the windows are not repaired, the tendency
is for vandals to break a few more windows. Eventually, they may even break into the building,
and if it’s unoccupied, perhaps become squatters or light fires inside.

Or consider a sidewalk. Some litter accumulates. Soon, more litter accumulates. Eventually,
people even start leaving bags of trash from take-out restaurants there or breaking into cars.

But fixing the one broken window prevents other windows from getting broken, claim the authors.

There has been considerable criticism of the broken windows hypothesis by other sociologists. And I know of no scientific evidence for the effects of fixing broken windows in software. But the hypothesis makes intuitive sense to me. So I keep fixing broken windows whenever I encounter them, as much as possible.

By the way, if you haven’t read The Pragmatic Programmer yet, then stop wasting your time reading silly blogs, and go buy this book. You will want to keep it along with other classics such as Design Patterns, Refactoring, and The Mythical Man-Month.

The Power of Convention

I ran into some code the other day that wasn’t obvious to me right away. Why? Because it didn’t follow conventions. Let me explain.

In our system, we have what we call ‘binding objects’. These are objects that are persisted in our XML database and that we can use through Java interfaces. The code needed to access them is mostly boilerplate, so we generate it. Since the system was built on Java 1.4, we couldn’t use annotations for this. So, being an XML company, we naturally decided to describe the binding objects in XML and use XSLT to generate both the interface and the boilerplate implementation. We then add functionality by extending both of them.

Now, an often-used convention in Java is to name the implementation of interface Xxx XxxImpl. We follow this convention often in our code, but for some reason did something else here. This caused me to look around, wondering why I couldn’t find the implementation right away.

At this point, I’m sure someone will ask ‘So what?’. That’s always a good question 😉 For instance, in Eclipse, you would just select the interface and press either F4 to show the Type Hierarchy view, or Ctrl-T for a popup listing the same.

That’s true. But it requires a conscious action. Instead, if we had followed the convention, we would get this information for free. For instance, if both the interface and the implementation were in the same package, Eclipse’s Package Explorer would show them above each other. And even if they were in different packages, searching for the interface using Ctrl-Shift-T would also find the implementation. This free-riding is called a ‘widgetless feature’. It is a Big Thing™ in user interface design.

Granted, it is a minor advantage. But this is just one example. Multiply by all the other places you can use conventions to reduce information overload, and you’ll find you’ll be more productive.

If you’re still not convinced, take advice from someone else. For instance, look at Maven: one of its central tenets is ‘convention over configuration’.