December 19, 2006

There are two vacancies for repository developers with the JISC repositories support team.

DSpace and Derby?

December 14, 2006

Derby is a Java-native RDBMS that is the open source successor to IBM’s Cloudscape product. Dan Scott writes that he’s hoping to port DSpace to use Derby instead of PostgreSQL, and reckons that Derby would make a better default db for DSpace.

We’ve had a couple of brief encounters with Hypersonic and Derby here and I’ve been a bit underwhelmed.

The lack of scalability is a problem. Both Hypersonic and Derby start to grind badly over 500,000 rows in a table. This would be a problem for DSpace because DSpace places all metadata statements about items in a single table, and that tends to be the largest table in the database. Consider that each item has O(10) metadata statements and you’re looking at only 50,000 items before Derby / Hypersonic starts grinding on the simplest operations.

I also wonder whether you’re actually getting any advantage in usability. I stopped using Hypersonic in one project recently not because the performance per query was poor, but because initializing the database to run in-process took forever, and consequently my choices were to run it in client-server mode or just use Postgres. Maybe Derby doesn’t have such a naive approach to persisting its state to file; I’ll have to look.

The point is that if you can’t run the database in process and have to run it in traditional client-server mode, you’ve got just as many problems with security, configuration and so on as you have with PostgreSQL.

That said – it’s obvious that Derby has legs for the vast majority of web applications that don’t have large tables or more than one web-app accessing a database, and I’ll definitely be giving it a spin on the next DB-backed prototype or play project I have. I’ll be using Hibernate, though, just in case I need to jump ship to Postgres later!

Groovy nearing 1.0

December 12, 2006

The first release candidate of Groovy 1.0 has been announced. I played with an earlier incarnation of Groovy, so I’ll have to find some time to play again over Christmas.

OAI-ORE is a web extension

December 11, 2006

I’m encouraged at reading this ORE briefing presentation (via Richard Jones). Highlights: –

Whatever we do needs to be embedded in the web; we are not creating a parallel universe.

The Canonical Representation Format (CaRF) [is the] Format to express a manifest of all available Representations (and Resources) for a Resource. Fleshing out the CaRF is probably the core effort of OAI-ORE.

It’ll be interesting to see how ORE pans out – the outputs could well solve a number of my current problems.

As a brief aside: that briefing makes no mention of surrogates or FRBR. Have the surrogates gone in the name of being web compatible? I didn’t really get the point of them – they just looked like another representation to me.

Adobe MARS

December 11, 2006

Eliot Kimber has some nice things to say about Adobe MARS: –

After seeing Adobe’s presentation and talking to the guys from Adobe it’s clear that what they’ve done is a sincere and well-thought-out attempt to Do The Right Thing rather than a cynical recasting of proprietary stuff into markup so it’s “open.”

MARS tries to use standards as much as it can and it seems to do so to a remarkable level of completeness. It uses SVG for representing each page, supports the usual standards for media objects (bitmaps, videos, etc.). Uses Zip for packaging, and so on.

A few things strike me about this. Firstly, MARS could throw document preservation a lifeline. The ODP/M$ tangle looks like it’s going to go the full 10 rounds, and possibly not fall out the way we would like for preservation.

Next, this could be the shot in the arm SVG needs. With the benefit of hindsight, it seems crazy that Adobe pulled support for their SVG viewer plugin at such an early stage (as soon as there was nodding support for SVG in browsers). Just imagine what web2.0 would look like if they’d had graphics to mash up (on the client side). We did this to a certain extent at Paribus; thematic maps (or choropleths, to give them their proper name) were done by serving static SVG and dynamic CSS.

Finally – does this mean I might finally get the super-duper SVG viewer I want to zoom around my concept maps in presentations?

DSpace 1.4.1 released

December 8, 2006

DSpace 1.4.1 has been released (announcement).

The 1.4.1 release includes a bagload of bug fixes and minor improvements, many in the area of improved standards compliance (e.g. if-modified-since responds properly, error pages return correct HTTP codes, XHTML compliance etc etc etc).

Congratulations to the release managers Scott Yeadon and Claudia Juergen!

A brief DRM rant

December 8, 2006

The iTunes store allows you to authorize the music you’ve purchased on 5 machines. I have a Powerbook at work and a Mac mini at home. Recently, the Powerbook developed a fault and needed a new logic board. iTunes consequently thinks it’s a new machine. So that’s 3/5 authorized machines down already. Grrrrrrr.

Hands off my fair use, evil Apple types! I’ve already signed up for a service that gives me unjiggered mp3s.

DSpace Collation Tip

December 7, 2006

Collation in database-backed, multilingual web applications is a real PITA. PostgreSQL only supports one locale at a time, so if you want to do a custom sort you have to do it in memory in Java using a Collator. [Hold nose, pull chain].
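For anyone who hasn’t had to do this, here is a minimal sketch of the in-memory approach using java.text.Collator. It is not how DSpace does it; the class name, the method name sortForLocale and the sample titles are all made up for illustration, and it assumes the rows have already been fetched from the database into a list.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class CollatorSortSketch {

    /**
     * Returns a new list sorted by the collation rules of the given locale.
     * Hypothetical helper: the input is assumed to be values already
     * loaded from the database.
     */
    static List<String> sortForLocale(List<String> values, Locale locale) {
        // Collator implements Comparator, so it can be passed straight to sort()
        Collator collator = Collator.getInstance(locale);

        // Copy first so the caller's list is left untouched
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(collator);
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up sample data
        List<String> titles = Arrays.asList("zebra", "Äpfel", "apple", "Zürich");

        // The same data can come out in a different order per locale
        System.out.println(sortForLocale(titles, Locale.GERMAN));
        System.out.println(sortForLocale(titles, new Locale("sv")));
    }
}
```

The obvious cost is that every row to be sorted has to be pulled into memory first, which is why this only feels acceptable for small result sets.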

DSpace has had to face this problem, and there is a sneaky feature to help that Dorothea Salo found. Good tip!

To follow the meme of setting quizzes in blog posts, here’s a real life Java gotcha that came up this morning. Pint to the first correct answer in a comment post (Justin and Joe need not apply, since I’ve told them already).

Justin is writing a Java project that checks CML files for experimental data errors. In this project he has a config file that stores the details of the checks to perform. He was loading it like this:

// NB: the resource name is not given here; confName is a placeholder
URL confUrl = getClass().getClassLoader().getResource(confName);
File confFile = new File(confUrl.toURI());
Document doc = new Builder().build(confFile);

The config file was included as a resource in the jar, and everything worked fine in his test harness.

Then he used his jar in a separate project which is a web application. No dice. The jar was there, the config file was in the jar, but the resource loading failed on the File constructor with an IllegalArgumentException saying “URI is not hierarchical” or somesuch. For once I don’t think the message was particularly helpful; the solution just requires thinking about what’s happening.

Why didn’t the config file loading work in the web application? What did Justin do to solve this?

Bryan Alexander writes on Archiving OpenCourseWare in DSpace.

Comment: The OCW approach to archiving learning materials was a large influence in SPECTRa’s approach to archiving chemistry data, so it’s always interesting to read about the problems they have, since we’re likely to run into them too!