Archive for the “data” Category

Data seems to be the hot topic right now. It’s all about how we store it, share it, and make it play nice with other data. There is an enthusiasm for openness and a move towards standardisation of data and the ways we share it, but there’s also a worrying trend - competing standards and protocols.

Ross Singer at Panlibus discusses a draft recommendation from the Digital Library Federation ILS and Discovery System Task Force and notes that, while it’s certainly a welcome move, it is not without problems -

The problem here is that they generally give multiple options for achieving the goal of any given method. So this means that any ILS vendor can choose from a variety of protocols for implementing the spec and that a different vendor can choose alternate standards for the exact same functionality.

Singer goes on to describe scenarios in which this causes all sorts of problems - for example, vendors choose differing open standards and systems still can’t communicate.

Something similar looks to be happening in data exchange, with Google, Facebook and MySpace all announcing last week that they have their own ways of sharing profile data. There are two key concepts in play - data portability and data availability. In the first, the goal is to give users control and options over how their data is used. In the second, companies enter agreements with each other, and I don’t see this giving users the level of control many really want. It’s not a huge leap from allowing, say, Facebook to access your Gmail contacts. You still have no way to export that data for yourself - it is handled company-to-company. Data portability is definitely my preference.

As we look to the future of the ILS, which may include data sharing and embedding on other services (with formats like RDF) and other semantic developments, it’s interesting to see how we face many similar issues in different domains at the same time. On why Data Portability has taken off this year, Daniela Barbosa, who has been involved with the project from its inception, says -

Call it timing, call it good marketing, call it luck- call it what you wish- i like to say it has to do with a need…a need by users, vendors and technologists to have one forum to discuss and act on the various issues and opportunities around user data and the usage of that data (the ‘Graph’).

I will be interested to see if the wider social networking world and libraries will turn to other forms of networking and identity down the line. Laura J. Smart wrote about Thomson’s ResearcherID platform, which for want of a better term you could describe as an identity service for researchers. You can post a profile of yourself, link to your papers, and in theory meet other people working in the same field as you. Other companies have similar services, like CSA’s Scholar Universe. It would be really great if these services, like Facebook and MySpace, were a part of the data/identity portability movement.

So it seems that we’re all moving in the same direction at the moment, and though there may never be just one protocol or standard to rule all of our identities, hopefully they will at least talk to each other.


I’ve written previously about the importance of the mobile web and the role of mobiles in social change. Now the United Nations Foundation (with Vodafone) has released a report on the use of mobiles by relief, advocacy, and development organisations (via Read/Write Web, Report: ‘Mobile Activism’ on the Rise).

One of the most interesting case studies looks at the use of mobile devices in collecting and using health data [PDF]. Forms were created that health workers can use on a PDA to collect and update health statistics. The system is much more time-efficient and reliable than the paper-based methods used previously.

Mobile Learning

Other mobile web initiatives include mobile learning. Dr Steve Yuen describes a project he is developing, the Cell Phone Learning Support System (CPLSS) (Tech Learning blog) -

“My current (CPLSS) project is attempted to deliver instructional content and learning materials in way that fits into students’ cell phones - their digital lives. The CPLSS is designed to work with many cell phones, smartphones, or PDA phones and will have four major modules: Java book, Web book, audio book, and video book.”

While the concepts themselves aren’t new, as we’ve previously had learning systems developed for the web and for mp3 players, what I like about this project is simply the idea of shifting technology to new platforms.

Designing for all

Something we have to keep in mind as we create and modify information for new devices is the breadth of people who will use them. A post that linked to my earlier post about the mobile web asks if we are taking into account the needs of older mobile users. In Mobility issues or digital natives as seniors? from C3 Library -

“…this IS the main communication link up for so many but what will it morph into for the aging digital native? Not something we have to solve but an interesting issue given that the current devices aren’t really usable by the majority of the senior population. As we size everything down to be lighter and portable we also exclude and narrow the user group.”

I agree that it is essential to design for all. Not just those who are of a certain socioeconomic group or age, but also those with disabilities. If your site is built to mobile and web-accessibility standards, that takes care of the content, but what about the devices that you use the content on? There are some mobile screen reader programs available, such as Mobile Speak and Magnifier, and TALKS&ZOOMS, which magnify, highlight and read text much like PC software does now. Some phones can also use voice commands instead of buttons. But on the whole, it is true that everything is getting smaller, with only a few exceptions: phones designed for older users, which seem to appeal because they are basic and exclude most features - like web browsing and email.

Learn More - on your phone!

If you want to learn more about the Mobile Web, the W3C is running a free online training course (keep your eye on the next one, registration for May 2008 just closed), An Introduction to W3C’s Mobile Web Best Practices.

More information about the work covered in the UN Foundation report can be found at Mobile Active, billed as “A resource for activists using mobile technology worldwide”.


Last week there was a flurry of comments around a post by Bret Taylor, We need a Wikipedia for data. Taylor describes a model for a wiki that would aggregate common data in one database that could be cross-searched. Great idea.

One interesting thing about the types of datasets he mentions is that they are all copyrighted - stations own TV schedules, exchanges own market data (the free stuff is usually 20 minutes delayed) and a variety of companies own publishing rights over telephone numbers. This is the data that could be really useful if it were truly free, but given the amount of updating required, I wonder who would do so without a business or legislative imperative.

But that issue is perhaps beside the point. There are many, many incredible datasets out there, everything from Census data to older market information to astronomy. Reading the comments and suggestions on Taylor’s post and Read/Write Web’s post about the topic revealed dozens of sites to find these resources.

Looking through the list, I did feel that libraries may have missed an opportunity. We have been recommending and linking to various datasets on our websites for years, but there is huge potential to go beyond this, build something collaboratively and use it as an input for different libraries. Many libraries now take Open Access journal records into their catalogues and search engines via DOAJ, and there is no reason not to do something similar for Open Data.

Certainly, it is an issue that few of these datasets can talk to each other - but perhaps the move towards a more standards-based Semantic Web will encourage standardisation and interoperability, at least within, for example, individual government departments so that Census records can be analysed against education records.
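To make that last point concrete, here is a minimal sketch of what analysing one dataset against another can look like once both use the same identifier. Everything in it is an assumption for illustration: the region codes, the field names and the figures are invented, and real Census or education data would need far more cleaning than this.

```python
# Hypothetical records; the region codes and figures are invented for illustration.
census = [
    {"region": "R01", "population": 12000},
    {"region": "R02", "population": 8500},
    {"region": "R03", "population": 4300},
]
education = [
    {"region": "R01", "school_enrolments": 2400},
    {"region": "R02", "school_enrolments": 1275},
]


def join_on_region(left, right):
    """Combine records from two datasets that share the same region code."""
    right_by_region = {row["region"]: row for row in right}
    joined = []
    for row in left:
        match = right_by_region.get(row["region"])
        if match is not None:
            # Only regions present in both datasets can be compared.
            joined.append({**row, **match})
    return joined


combined = join_on_region(census, education)
for row in combined:
    share = row["school_enrolments"] / row["population"]
    print(f'{row["region"]}: {share:.1%} of residents enrolled in school')
```

The join only works because both lists call the region by the same code. Without a shared standard, that matching step is where most of the effort goes.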

One of the sites recommended by Read/Write Web is CKAN, which is backed by the Open Knowledge Foundation, which counts someone who has worked in the library sector among its leadership. Are these the types of groups more of us should be involved in to have a role in information access on a larger scale?


Web sites and applications burst onto the scene out of nowhere, attract massive usage and undergo continual improvement. We wonder how we ever got along without them, until they get bought out, put up access barriers or paywalls, or just disappear.

Libraries have long been concerned with preserving information for the future, and increasingly that includes digital information and websites (for example, Pandora at the National Library of Australia, which archives everything from blogs to the 2000 Games site).

So where do they intersect? And how can we take a more proactive approach to design for sustainability rather than saving retrospectively? The Semantic Web is all about linking, openness, and relationships between data. In some ways the Semantic Web is, in my view, how we will move towards a more Sustainable Web.

What might the Sustainable Web be?

Adapting the Triple Bottom Line approach to sustainability, web developers and those who create data could take a lifecycle approach to how they create, manage and produce sites and information. When planning a new website, dataset or service, in addition to deciding on purpose, standards and features, you could also include a statement about how you would -

  • Distribute the data if you were no longer maintaining the site (using a LOCKSS principle, perhaps?)
  • Migrate to future standards
  • Ensure that your site is indexed in the Internet Archive (all pages and data, not just the index)
  • Give people ownership of their data (if you’re running an online service where people store or save information) so they can get it out when they want, or own it if the site closes or the terms of service change significantly (e.g., in the instance of a buyout).
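The last point in the list can be sketched in a few lines. This is only an illustration under stated assumptions: the `UserAccount` shape and the `export_account` function are hypothetical names, not any real service’s API, and JSON is used simply because it is an open, widely readable format.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class UserAccount:
    # Hypothetical shape of what a service stores about one person.
    username: str
    display_name: str
    contacts: list = field(default_factory=list)
    saved_items: list = field(default_factory=list)


def export_account(account: UserAccount) -> str:
    """Return everything held about the user as plain JSON they can take elsewhere."""
    return json.dumps(asdict(account), indent=2, sort_keys=True)


account = UserAccount(
    username="reader01",
    display_name="A. Reader",
    contacts=["colleague@example.org"],
    saved_items=["http://example.org/article/1"],
)
exported = export_account(account)
print(exported)
```

The design choice that matters is that the export goes to the person, in a format they can open without the original service, rather than from one company to another.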

Depending on what type of site it is, there may be governance and political impacts now or in the future. If you’re running a scientific research portal, how might changes in government policy affect the site? What obligations might be imposed on sharing or accessing the data you provide?

Using open standards as the backbone

A starting point is to use open standards. In addition to W3C standards most of us already know (like HTML and CSS), we can extend this to Semantic Web standards like OWL and RDF. Adherence to standards allows information to be interpreted correctly, exchanged, and migrated to newer standards in the future. Standards may also make it easier to hand datasets over to someone else or distribute copies to keep them accessible. It’s a key part of understanding the potential of the Semantic Web according to this summary of a talk by Nova Spivack at last week’s The Next Web -

“The semantic web is not so much about “semantics” as it is set of open standards defined at W3C. The semantic web approach builds on open standard meta data which is in line with previous presentations that supported the open data approach. The idea is that everyone profits from everyone’s metadata. The semantic web is a compromise in making the data smarter and the software smarter. It is the best of both worlds.”
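As a rough illustration of what open standard metadata can look like, the sketch below writes two statements about a dataset in N-Triples, one of the plain-text serialisations of RDF. The dataset identifier, title and publisher are invented for the example, the helper functions are my own rather than part of any library, and only plain string values are handled. The property names come from the Dublin Core terms vocabulary.

```python
def escape_literal(text):
    """Escape the characters N-Triples does not allow raw inside a quoted literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def to_ntriples(triples):
    """Serialise (subject URI, predicate URI, literal value) tuples as N-Triples lines."""
    lines = []
    for subject, predicate, value in triples:
        lines.append(f'<{subject}> <{predicate}> "{escape_literal(value)}" .')
    return "\n".join(lines)


DCTERMS = "http://purl.org/dc/terms/"
dataset = "http://example.org/dataset/census-summary"  # hypothetical identifier

triples = [
    (dataset, DCTERMS + "title", "Census summary tables"),
    (dataset, DCTERMS + "publisher", "Example Statistics Office"),
]
ntriples = to_ntriples(triples)
print(ntriples)
```

Because each line is a complete statement, a file like this can be handed to someone else, merged with their statements, or migrated later without needing the software that produced it.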

Keeping data usable

Over the past two years, libraries, museums, companies and other organisations have set up pages in Facebook, MySpace and other social networking sites. In some libraries, this is the work of an emerging technologies specialist; in others it’s an added role for an individual that may or may not be sustained if that person leaves or changes job focus.

Whatever the situation, it’s not the best use of time to have to create a new profile and build networks in every service. This is where a move towards data standards and portability is a plus. Being able to move data between, into and out of these services saves time and sustains online networks and communities. Data Portability is one of the major projects looking at these issues. According to Chris Saad from the project, “The new innovation platform is data” and this is certainly true when looking at things from a Semantic Web point of view.

Libraries and the sustainable web

A recent article in Interactions stresses the importance of designing for sustainability of content on the web - the authors note that libraries and other cultural institutions will be at the heart of these efforts -

“Digital technology makes it possible to extend the walls of the archive beyond a single space or person, as well as ensure preservation and access in locations around the world [...] Libraries, museums, and archives will need to collaborate with business interests to build lasting social structures that are sustainable over time.” (Churchill E, Ubois J, 2008)

Libraries have played a significant role in participating in a variety of digital and web preservation projects over the years, but what’s the next step? How do we get more involved in conversations that take place in business?

———–
Churchill, E., Ubois, J. 2008. Designing for Digital Archives. Interactions. March/April 2008. Retrieved from: http://interactions.acm.org/content/?p=1089 (full text via ACM Portal)
