JUnit Gotcha!

October 24, 2008

I’m in the process of getting the newly separated cmlxom library ready for release. As I’m an obsessive upgrader I decided to update the junit library cmlxom uses from 4.3 to 4.5. And a load of tests broke. What?

The reason: <code>Assert.assertEquals(double, double)</code> has been deprecated for a while, replaced by <code>Assert.assertEquals(double expected, double actual, double delta)</code>, and rightly so. Instead of removing <code>Assert.assertEquals(double, double)</code>, they chose to make it fail unconditionally by throwing an AssertionError.
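To show why a delta is the right thing to ask for, here is a minimal sketch that runs without JUnit. The DeltaCompare class and its nearlyEqual helper are hypothetical names made up for this illustration; they only approximate what a delta-based assertion checks.

```java
// Hypothetical helper, not part of JUnit: illustrates a delta-based comparison.
class DeltaCompare {

    // True when expected and actual differ by no more than delta.
    static boolean nearlyEqual(double expected, double actual, double delta) {
        return Math.abs(expected - actual) <= delta;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        // Exact comparison: binary floating-point rounding makes this false.
        System.out.println(sum == 0.3);
        // Comparison with a tolerance: true.
        System.out.println(nearlyEqual(0.3, sum, 1e-9));
    }
}
```

In a JUnit 4 test the equivalent would be Assert.assertEquals(0.3, 0.1 + 0.2, 1e-9); the right delta depends on the calculation being tested.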

Moral of the tale: examine deprecation warnings before upgrading dependencies!

Java Resource Listing

November 30, 2007

I’ve been working on a SWORD client for SPECTRa for the last day or so, and got a little sidetracked into mavenizing the SWORD Java code, and further sidetracked into refactoring some of it as I went. Part of the SWORD code is a jar including a CLI and a Swing GUI. For maximum convenience the code and dependencies are assembled into a single jar (run using “java -jar …”).

The original author of the code, Neil Taylor (Aberystwyth), has been careful throughout the code to access all resources through the ClassLoader.getResource[AsStream] methods, through the InputStream and URL abstractions. So far so froody. There’s a wrinkle, though – the help system launches the user’s browser with the location of the help index file as an argument, and this is a limitation to the “everything in the jar file” approach – the code needs to be executed from the correct pwd (or passed a parameter) for the help to display correctly.

Most (all?) web browsers are unable to understand the “jar:file:” protocol to get hold of the help pages directly from the jar. Well, I thought, that’s not a problem, I’ll copy the resources out of the jar into a tmp directory and point the browser there. Well, this would work fine, but I hit a snag – there’s no way to list, search for or glob resources through the ClassLoader. I’d have to have an explicit list of all the help resources, which would suck. Sam Adams suggested a solution he used for JNI-InChI: pull the jar file location out of the “jar:file:” URL, then use the java.util.jar.JarFile class to find the relevant entries.

It’s verbose and hacky (in a bad way), but it does allow you to have filesystem-like handling of resources and still distribute as a single executable jar, which is a good thing. Here’s the code, in all its filthy glory: –


// Needs java.io.*, java.net.URL, java.util.Enumeration, java.util.jar.*
// and org.apache.commons.io.FileUtils; helpDir is the target directory.
ClassLoader cl = getClass().getClassLoader();
URL help = cl.getResource("help");
if ("file".equals(help.getProtocol())) {
    // Running from an unpacked directory: copy the tree directly
    File from = new File(help.toURI());
    FileUtils.copyDirectory(from, helpDir);
} else if ("jar".equals(help.getProtocol())) {
    // Strip the leading 'jar:file:' and everything from the '!' onwards
    String jarLoc = help.toString().substring(9,
            help.toString().lastIndexOf("!"));
    JarFile jarFile = new JarFile(jarLoc);
    try {
        for (Enumeration<JarEntry> entries = jarFile.entries();
                entries.hasMoreElements();) {
            JarEntry je = entries.nextElement();
            if (je.getName().startsWith("help/")) {
                // Trim the 'help/' off and fix up the file separators.
                // replace() is used rather than replaceAll() because a
                // backslash separator is not a valid regex replacement.
                String filename = je.getName().substring(5).replace(
                        '/', File.separatorChar);
                File destination = new File(helpDir, filename);
                File directory = je.isDirectory() ? destination
                        : destination.getParentFile();
                log.debug("Creating " + directory
                        + " and copying resource to " + destination);
                if (!(directory.exists() || directory.mkdirs())) {
                    throw new IOException(
                            "Problem creating temp help directory, couldn't "
                            + "create: " + directory);
                }
                if (!je.isDirectory()) {
                    FileUtils.copyURLToFile(cl.getResource(je.getName()),
                            destination);
                }
            }
        }
    } finally {
        // Release the file handle on the jar
        jarFile.close();
    }
}
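
The substring(9, …) step is the most fragile part: it leaves any percent-escapes (such as %20 for a space in a directory name) in the path. A slightly more defensive sketch of that one step follows, assuming the URL always has the “jar:file:…!/entry” shape. JarUrls and jarPath are hypothetical names for illustration, not part of the SWORD code.

```java
import java.net.URI;

// Hypothetical helper for illustration; not part of the SWORD code.
class JarUrls {

    // Extracts the decoded path of the jar from a URL string of the form
    // "jar:file:/path/to/app.jar!/help".
    static String jarPath(String jarUrl) {
        int bang = jarUrl.lastIndexOf('!');
        if (!jarUrl.startsWith("jar:file:") || bang < 0) {
            throw new IllegalArgumentException("Not a jar:file: URL: " + jarUrl);
        }
        // Drop the "jar:" prefix and the "!/entry" suffix, which leaves
        // a plain "file:/path/to/app.jar" URI.
        URI fileUri = URI.create(jarUrl.substring("jar:".length(), bang));
        // getPath() returns the path with percent-escapes decoded.
        return fileUri.getPath();
    }

    public static void main(String[] args) {
        System.out.println(jarPath("jar:file:/opt/my%20apps/client.jar!/help"));
    }
}
```

On Windows it is probably safer to hand the file: URI to new File(URI) rather than use the raw path string, since the path comes back with a leading slash before the drive letter.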

Java is the new Bash?

September 7, 2007

Shock! Dilbert programs in Java!

Since that link probably won’t survive the test of time, today’s Dilbert cartoon has Dilbert coding an incompetent co-worker’s job in Java. This is (hopefully) a reference to the original line: “I will replace you with a shell script. A very short shell script.” Amusing how this translates into Java in Dilbert: –

  • Java is the language of choice for replacing incompetent people
  • Even an incompetent person’s competent functions can’t be coded in a short app

Damnation by slight praise, then?

Foo-Oriented Software

September 5, 2007

Erlang is oh-so-hot at the moment (which must be a novel experience for such an old, mature language), but Tim Bray isn’t convinced: –

… I think that the human mind naturally thinks of solving problems along the lines “First you do this, then you do that” and thinks that Variables are naturally, you know, variable, and has grown comfortable with living in a world of classes and objects and methods.

This got me thinking. My reaction to Erlang variables was the same: “they’re not variables, why don’t you call them something else?”. I think the answer is basically that it’s easier to understand the concept by remembering them as variables that aren’t, rather than having to build up a whole new concept. The other thing about variables is that they’re not all that instinctive when you’re learning to program – they don’t work like in maths, so an expression looks like an equation, but doesn’t (in imperative languages) work like one. In Erlang, it does (or it fails ;-)).
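
A small Java sketch of the difference, for illustration only (the class and method names are made up). The reassignment in increment makes no sense as an equation, while the final local in bindOnce is roughly the single-assignment behaviour that Erlang gives every variable.

```java
// Illustration only: class and method names are hypothetical.
class AssignmentVsEquation {

    static int increment(int x) {
        // Read as maths, "x = x + 1" has no solution. In Java it is an
        // instruction: compute x + 1, then overwrite x with the result.
        x = x + 1;
        return x;
    }

    static int bindOnce(int value) {
        // A final local can be bound exactly once. Uncommenting the
        // reassignment below is a compile-time error.
        final int y = value;
        // y = y + 1;
        return y;
    }

    public static void main(String[] args) {
        System.out.println(increment(1));
        System.out.println(bindOnce(5));
    }
}
```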

It is practically impossible to teach good programming style to students that have had prior exposure to Basic; as potential programmers they are mentally mutilated beyond hope of regeneration. — Edsger Dijkstra

Most people start OO code by writing huge long main methods and writing most other code in static methods with some objects only if they need data structures, i.e. the most natural way to solve a problem is to start somewhere and perform a sequence of actions. So why use OO at all? To abstract the solution to solve a whole family of problems, to decompose the problem to make it easier to solve and to make the solution easier to understand etc. The thing I like best about Java is that it’s fairly easy to create code that can be easily understood by others. I suspect that this quality is at the root of Java’s adoption, and that the tools are as important as the language features (or lack thereof…); Javadoc’s contribution shouldn’t be underestimated.

More investigation is required to see how well Erlang does in creating comprehensible, reusable code!

The DSpace@Cambridge service is looking for a developer.

I’ve (very belatedly) deployed a binary of Peter Corbett’s OSCAR3 release alpha 2 to the WWMM maven2 repository (http://wwmm.ch.cam.ac.uk/maven2/). Use groupId:wwmm, artifactId:oscar, version:3a2.

Caveat: The OSCAR jar includes all its dependencies, so this jar might not play nicely if you’re using any of its dependencies, including lucene, cdk and weka. I’m hoping to persuade Peter to let me mavenize OSCAR in the near future, which will sort this problem out.

XML Databases link

August 24, 2007

Elliotte Rusty Harold rounds up the state of the art on XML databases, concluding: –

The XML database space is not nearly as mature as the relational database space. The players are still marching onto the field. The game has not even begun. However it promises to be very exciting when it does.

Depressing, really – why has it taken this long? We’ve all been sitting in the stadium getting cold for 6 years.

When and what to throw

July 23, 2007

Interesting post from Elliotte Harold, proposing a new nomenclature for exceptions, whereby checked exceptions are referred to as “external” and runtime exceptions as “internal”.

The discussion is also interesting. Like Ingo, the guidance I’ve been giving is to only throw a checked exception if it’s conceivable that the caller could do something about it. This is fine, but not so useful for programmers who lack the experience to spot opportunities for good error handling. I’ll give the internal / external approach a bash and see how it goes with people.
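
A rough sketch of how that split might look in practice, assuming a missing file counts as an “external” problem and a null argument as an “internal” one. LineReader and both of its methods are hypothetical names for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical example class; names are for illustration only.
class LineReader {

    static List<String> readLines(Path file) throws IOException {
        if (file == null) {
            // "Internal": a bug in the calling code, so an unchecked
            // exception. The caller should fix the code, not catch this.
            throw new IllegalArgumentException("file must not be null");
        }
        // "External": the file may be missing or unreadable whatever the
        // code does, so the checked IOException is left for the caller.
        return Files.readAllLines(file);
    }

    // A caller that can do something about the external failure.
    static List<String> readLinesOrDefault(Path file, List<String> fallback) {
        try {
            return readLines(file);
        } catch (IOException e) {
            return fallback;
        }
    }
}
```

The checked exception is only worth its weight here because readLinesOrDefault has a sensible recovery; the null check has none, so nothing is forced to catch it.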

C#: Fat or Fit?

July 12, 2007

In C# 3.0 Considered Rubenesque?, David Ing considers (mainly) whether C# needs the functional extensions being proposed for v3.0 and by the by touches on a discussion I’ve been having internally about Java, Python and Erlang. The main points for me revolve around language complexity, the timing of the need for functional programming and when not to abstract.

Elliotte Harold has been vocal against adding complexity to the Java language. The complexity-loving geek in me stamps his foot (“… but I want properties / closures / etc”) but Elliotte’s right – programming languages don’t have to accrete features, and in the real world most Java programmers get dazed and confused if you throw inner classes at them. On the other hand Python has managed to support a functional style of programming (although I’m told that this isn’t totally pythonic) and is still considered easy to learn, so perhaps it can be done simply.

You don’t need to reach for Mother Shipton’s almanac to know that functional programming languages are going to be hugely important, with processors now not getting much faster, but gaining more cores. The languages that enable normal rank and file developers to take advantage of that are going to win in the way memory managed languages did, and in the way dynamic languages are doing. The question is not really “if” but “when”, and the answer probably depends on who you are and what you need to do.

If I were writing a data analysis program to run on a cluster (say 32 machines with dual-dual processors), then I’d be reaching for any language that let me use them as efficiently as possible – give the air con something to think about. Even if I had a dual-dual server dedicated to a single service I’d probably do so too, even though I’d be unlikely to get a 4x speed up due to I/O contention (example from Pragmatic Programmer). Most of our current web facing servers (where most of my code will end up running) are single or dual core, and aren’t due for an upgrade any time soon. So I’m better off sticking to what I know for now.

The final point is probably the most important, since it’s not future gazing: –

There is a school of thought that when you’re working with a relational database rather than a collection of in-memory objects, then you should not lose track of the various nuances and advantages of the stores – abstraction to save typing can come back to bite you?

Replace “working with a relational database” with “machine boundary” or “transactions” and it’s still true. Remember the first law of software complexity – you can’t get rid of complexity, just move it around. The more amazing an abstraction is at hiding complexity from you, the more likely it is to leak.

No JSF here, thank-you

June 28, 2007

I occasionally post about technologies I like, less often about ones I don’t. The inimitable Koranteng has just come across JSF, and he doesn’t like it either.