Three Semantic Web developments


Two projects and some food for thought, all new this week and all in our sector.

Open Calais and Museums

The Powerhouse Museum is using Reuters’ Open Calais service, launched earlier this year, to generate tags for its online collections. The ReadWrite Web article notes that:

That the museum has so much of its collection online is actually quite impressive in its own right. About 70% of the museum’s electronically documented collection is online in the database which went live in June 2006. Museum objects are searchable, taggable (by humans) and painstakingly described.

A bit of extra background about this museum that’s interesting for libraries (it’s located in my home state in Australia): the Powerhouse is funded by the state government and falls under the Arts portfolio, which also includes the State Library. There’s some healthy competition between them, which is a good thing – libraries are looking to the museum sector for ideas and inspiration. We should look to the museum sector even more, especially as its online education and digital programmes grow. The Powerhouse is also behind a project to create an email archive, complementing the National Library’s Pandora program, which archives Australian websites.

LCSH as Linked Data

Here’s something interesting and fun – take a look at the Library of Congress Subject Headings represented as Linked Data. The site makes use of several Linked Data browsers, each of which provides a different type of interface for browsing the headings. The good thing about this project is that it uses a concept most librarians are already familiar with (subject headings), which may make a new concept (Linked Data) easier to understand.
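For anyone who hasn't met Linked Data before, here is a minimal sketch of the idea in Python. It assumes the headings are modelled with SKOS-style properties (a preferred label plus "broader" links between headings), which is a common way to publish subject vocabularies. The example.org URIs and the three headings are made up for illustration and are not taken from the actual LCSH data, and `label` and `broader_chain` are hypothetical helper names.

```python
# Illustrative only: the URIs and headings below are invented for this sketch.
SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/subjects/"  # hypothetical namespace

# Each statement is a (subject, predicate, object) triple.
triples = [
    (BASE + "semantic-web", SKOS + "prefLabel", "Semantic Web"),
    (BASE + "semantic-web", SKOS + "broader", BASE + "world-wide-web"),
    (BASE + "world-wide-web", SKOS + "prefLabel", "World Wide Web"),
    (BASE + "world-wide-web", SKOS + "broader", BASE + "internet"),
    (BASE + "internet", SKOS + "prefLabel", "Internet"),
]


def label(uri):
    """Return the preferred label for a heading URI, or None if it has none."""
    for s, p, o in triples:
        if s == uri and p == SKOS + "prefLabel":
            return o
    return None


def broader_chain(uri):
    """Follow 'broader' links upwards and return the labels along the way."""
    chain = []
    seen = {uri}  # guards against cycles in the data
    current = uri
    while True:
        parents = [o for s, p, o in triples
                   if s == current and p == SKOS + "broader"]
        if not parents or parents[0] in seen:
            break
        current = parents[0]
        seen.add(current)
        chain.append(label(current))
    return chain


print(broader_chain(BASE + "semantic-web"))
```

Because every heading is identified by a URI and every relationship is a link to another URI, a browser (or a script like this one) can walk from a heading to its broader terms without knowing anything else about the catalogue.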

Semantifying existing content

Richard Akerman at Science Library Pad writes about the Five Laws of Library Science and adds two new laws for the machine. He discusses how and where people might add new information to existing content to make it Semantic, but adds the caveat -

“Now I have no Semantic Web illusions that people are going to nobly go back and markup all their content with semantic information, that vision is a fantasy that lingers with us from the SGML days and it’s never going to happen.”

And didn’t we say the same about metadata, migrating between HTML versions, etc etc?

Akerman goes on to discuss the advantages of microformats and points out that findability is important -

“Even a slight advantage in discovery can be a huge motivator to people.”

I agree. If you can make your content or data or whatever you have on the web easier to find with little effort, people will do it. When blogging first arrived, there was no way of notifying people that you’d updated. Now, you wouldn’t think of having a blog without a feed. Although it’s mostly part and parcel for new bloggers these days, a few years back it required some effort to add this functionality (I used to hand-code mine before I started using Movable Type). RSS added great benefits to blogs for relatively little effort, and that’s how the Semantic Web has to be too.
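To give a feel for how little effort is involved, here is a rough sketch of the microformat approach. The HTML snippet uses the hCard class names `vcard`, `fn` and `org`; the person and organisation are invented, and the `HCardParser` class is a hypothetical, deliberately simplified reader written for this post, not a full microformats parser.

```python
from html.parser import HTMLParser

# Ordinary HTML with a few extra class names; the content is invented.
HTML = """
<div class="vcard">
  <span class="fn">Jane Example</span>,
  <span class="org">Example City Library</span>
</div>
"""


class HCardParser(HTMLParser):
    """Collect the text of elements whose class is 'fn' or 'org'.

    Simplification: it does not check that the element sits inside a
    'vcard' container, which a real hCard parser would do.
    """

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # property name we are waiting to read text for

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in ("fn", "org"):
            if name in classes:
                self._current = name

    def handle_data(self, data):
        if self._current and data.strip():
            self.fields[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


parser = HCardParser()
parser.feed(HTML)
print(parser.fields)
```

The page still looks exactly the same to a human reader; the only extra work for the author is a couple of class attributes, which is roughly the level of effort that adding a feed to a blog used to take.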

4 Comments

  • Fiona:

    I’ll go out on a bit of a limb here and say that I believe development #1 somewhat negates #3, Akerman’s statement about people not going back and marking up their information for semantic consumption. While it is certainly not the whole solution, tools like Calais can get you 80% of the way there in an automated fashion. The last 20% – like everywhere else – will require hard work.

    Tom

  • Thanks for your thoughts, Tom!

    I am interested to see how many take on similar projects to the Powerhouse and mark up their information. I suspect many cultural institutions, libraries, researchers and corporations will investigate it as they have the resources and a more definable benefit to their datasets but the average person running a small to medium size website or dataset will not.

  • Fiona:

    Absolutely true – the value comes when you have either a large corpus of information (a museum collection for example) or a high throughput (filtering 1,000 news feeds by concept in real time) of unstructured information.

    We’re not surprised to see collection managers / librarians being among the first to take advantage of Calais. Though their work often isn’t as flashy as is fashionable – it has always been about making large, complex and often unstructured information assets more accessible to the public. We love it.

    We’re looking into augmenting our technology to better address the needs of collection managers by incorporating several lexicons suggested by Powerhouse. Maybe in a month or two.

    Regards,

  • OpenCalais is great, I use it in a new website I created recently: http://www.klezio.com
    News items are automatically classified and their metadata extracted; contextual information is fetched from apps such as Wikipedia, Flickr, Twitter or Delicious.
    Hope it’s useful.