Archive for the “search” Category

ReadWrite Web has a recent article on how semantic search works, Semantic Search: Myth and Reality. The article points out that semantic search will not solve all our search problems; that is simply impossible. For those working with datasets, there will always be a great deal of cross-tabulation, manipulation (to reformat and present data) and manual work needed to bring together answers to queries. Yes, semantic search might save time in the initial stages of a query by giving more meaning to the terms we query with, but search is still, at its heart, a human construct and will always be open to interpretation and error. The article makes this important point:

These are computationally challenging problems that really have nothing to do with understanding semantics. The misconception has been perpetuated since early days of the Semantic Web that somehow, because we will annotate the web, we will be able to solve these super complex problems. This is simply not true. There are fundamental limits to what we can compute, and a class of problems that have an exponential number of possible solutions is not going to be magically solved because we represent data as RDF.

Right now we search based on word occurrence (ignoring for the moment Google’s rankings, inbound link rankings, and the other search criteria and patterns used in high-end databases). In the future we will search using semantics: words + concepts + inferred trust. We will probably never be able to search purely by opinion or emotion. And that’s fine, because we can make that judgement for ourselves. We’re human after all.
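To make "search based on word occurrence" concrete, here is a minimal sketch of an inverted index in Python. It is only an illustration under simple assumptions (whitespace tokenising, made-up two-sentence documents, hypothetical function names `build_index` and `search`), not how any real search engine is built.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Made-up sample documents, for illustration only.
docs = {
    1: "the jaguar is a large cat",
    2: "the jaguar is a british car",
}
index = build_index(docs)
```

A query for "jaguar" matches both documents, and a query for "automobile" matches neither, even though document 2 is about a car. Matching on word occurrence alone knows nothing about meaning, which is the gap semantic search is trying to close.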


If there was any doubt that libraries belong in semantic web development, check out these two examples of the semantic web applications now emerging. The majority of applications available right now (or in preview/beta) focus on information management and search. Sounds like our business to me. With 2008 predicted by some to be the year of the semantic web, the time for libraries to take a closer look is now.

Freebase

Pulls in data from datasets such as Wikipedia and creates relationships and meaning between different data points. Their example of usage:

For example, if you ask Freebase for Jennifer Connelly films with actors who have appeared in a Steven Spielberg movie, you’ll get a tidy list of eight movies.

Also includes an API to add Freebase results to your own web projects. Possibly useful at the refdesk.
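The Jennifer Connelly example is, underneath, a query that follows relationships between records: films to cast, cast to other films, films to directors. The sketch below shows that shape in Python. Everything in it is an assumption for illustration: the film, actor and director names are placeholders rather than real credits, the function names are hypothetical, and it does not use Freebase's actual API or data model.

```python
# Placeholder data, NOT real credits: film -> cast and film -> director.
cast = {
    "Film A": {"Actor 1", "Actor 2"},
    "Film B": {"Actor 1", "Actor 3"},
    "Film C": {"Actor 2", "Actor 4"},
    "Film D": {"Actor 5"},
}
directors = {
    "Film A": "Director Y",
    "Film B": "Director Y",
    "Film C": "Director X",
    "Film D": "Director X",
}


def films_with(actor):
    """All films whose cast includes the given actor."""
    return {film for film, actors in cast.items() if actor in actors}


def actors_directed_by(director):
    """All actors who appear in any film by the given director."""
    found = set()
    for film, film_director in directors.items():
        if film_director == director:
            found |= cast[film]
    return found


def films_sharing_cast(star, director):
    """Films featuring `star` alongside someone who has worked with `director`."""
    pool = actors_directed_by(director) - {star}
    return sorted(film for film in films_with(star) if cast[film] & pool)
```

With this toy data, `films_sharing_cast("Actor 1", "Director X")` picks out "Film A", because Actor 2 appears both there and in a Director X film. The value of a service like Freebase is that the relationships are already recorded, so a question like this becomes a lookup rather than an afternoon of manual cross-referencing.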

If you’re a statistics person, this concept will be very familiar to you. Essentially, it’s a giant database of cross-tabulated data. But compared with the kinds of data you might be used to cross-tabulating, such as government statistics or census data (as I used to do), Freebase has one big difference, inherent in the datasets chosen to populate it:

Because Freebase lets anyone edit the data, there’s always a chance that somebody has, intentionally or unintentionally, introduced a mistake.

Hakia

Aims to retrieve more relevant, reliable search results. As librarians used to teaching all kinds of complicated methods for retrieving better results, not only from search engines but also from scholarly databases, we might wonder whether such a goal is possible! This and other semantic search projects will be interesting to watch over the next year.
