More on repositories and search engines

I referenced a note by Andy Powell on institutional repositories and Google in an earlier post. Herbert Van de Sompel left a comment pointing to a short document he has prepared addressing some of Andy’s concerns based on recent work with OAI-ORE. Here are a couple of the opening paragraphs ….

This write-up is an impromptu response to Andy Powell’s Repository Usability blog entry. It also touches on some issues that Andy raised in another entry, Freedom, Google-juice and institutional mandates. The purpose of the write-up is to try and alleviate some of Andy’s pain regarding the status quo of scholarly repositories: while the current situation may indeed not be perfect, a possible solution may not be too hard to establish. The solution I describe uses the OAI-ORE specifications, and quite some other techniques that have been introduced by several communities over the past years. Hey, a technological mash-up, one could say. Use and reuse what is there before inventing new stuff is the motto. I am afraid that the solution may cause Andy some phantom pain, since it also leverages OAI-PMH. While I agree that there are a few things we didn’t get quite right with OAI-PMH, I don’t think it’s the cause of all evil in the (repository) world, and I actually even think we can leverage the existing deployed PMH repositories for a good cause.

Anyhow, I think the DSpace example of Andy’s blog entry is a nice one to have a close look at, indeed. The four URIs that are a source of frustration for Andy, are an indication to me that OAI-ORE Aggregations can come to the rescue. As a matter of fact, the ORE Primer uses an arXiv example that is quite similar to the DSpace one: lots of URIs flying around that somehow belong together. [Repository Usability – Herbert’s Take]

2 thoughts on “More on repositories and search engines”

  1. I’m sorry, but the handle stuff is just retarded. It creates some revenue and busywork for the people who run the handle system, but it does nothing to make resources more persistent.
    For instance, the resource has to stay online somewhere: the handle system doesn’t store a copy of the file, so it doesn’t absolve anyone from the major cost of making it available.
    If you move a resource, there’s something called a 301 redirect. It makes sure that people who go to the URL end up in the right place: search engines understand it too. You do have to keep some kind of a server up, and keep registering the domain name, but that doesn’t cost more than what the handle system wants.
    Then there’s the whole question: is going to be there in 20 years? It destroys value by confusing people and damaging usability, it doesn’t create value. It contributes to the ghettoization of “digital libraries”.
    But why should I complain? It means that people in the private sector have that much less competition when it comes to making entertaining and informative sites.

  2. Paul, I don’t think my document contained any value assessment regarding the handle system. I looked at the DSpace example, noted they use – among others – handle URIs, and then incorporated those URIs in the proposed solution because those URIs are part of the addressing mechanism for DSpace items. Let’s just say I was working with the tools I was presented with.

    So, I suggest that if you have problems with the concept of the handle system, you’d better argue with Bob Kahn, co-inventor of the fundamental communication protocols of the Internet with Vint Cerf. Alternatively, you could engage with the DSpace community and discuss their motivations for using handles.

    BTW, I find it interesting to see quite some support for PURLs around the Web, and quite a dislike of handles. In essence, I think both technologies fulfill very similar functions. Hence, I genuinely wonder what the reasons are for the different perception.
