How to Defrost the Digital Library


There is a large number of online libraries. Their articles are identified by URIs and DOIs.

DOIs are a specific type of URI, similar to International Standard Book Numbers (ISBNs). They allow persistent and unique identification of a publication (or indeed part of a publication), independently of its location. DOIs can be used to retrieve metadata for a given publication using a DOI resolver such as CrossRef.
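As a rough illustration of the resolver idea above, the sketch below builds (but does not send) an HTTP request asking a DOI resolver for BibTeX metadata. It assumes the public resolver at https://doi.org honours content negotiation through the `Accept` header; the function name `metadata_request` is made up for this post, and the DOI is that of the review discussed here.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: the public DOI resolver, queried with content negotiation.
RESOLVER = "https://doi.org/"


def metadata_request(doi, mime="application/x-bibtex"):
    """Build (but do not send) a request for the metadata behind a DOI."""
    # Keep '/' unescaped so the DOI prefix and suffix stay intact in the URL.
    url = RESOLVER + quote(doi.strip(), safe="/")
    return Request(url, headers={"Accept": mime})


req = metadata_request("10.1371/journal.pcbi.1000204")
print(req.full_url)
print(req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` should then fetch the record, network and resolver permitting; the sketch stops short of that so it runs offline.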

The biggest problem with these online libraries is that personalization and socialization are rarely implemented together. Another important problem is that the libraries differ considerably in the nature and power of their indexing, which determines how users can search them by topic or by metadata. This last difference makes it difficult to exchange personal libraries, or to switch between applications for collecting personal libraries to share with others.

These Web-based libraries provide similar basic facilities for searching and browsing publications. They differ in size and coverage, and in their subscription, personalization, and citation-tracking features.

The review focuses largely on searching and organizing literature data together with their metadata, and on defrosting the library by using software to personalize and socialize scientists' PDF collections.

Workflow when using online libraries:

[Figure: workflow of a library search]

There is no universal method to retrieve a given paper, because there is no single way of identifying publications across all digital libraries on the Web. Publication metadata often gets “divorced” from the data it is about, and this forces users to manage each independently, a cumbersome and error-prone process. There is no single way of representing metadata either, and without adherence to common standards (which largely already exist, but in a plurality) there never will be.
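To make the identification problem concrete, here is a minimal sketch of how the same publication, exported from different libraries, might be matched on its DOI. It assumes each record carries a DOI at all (many do not), and the library names, record layout, and function names are hypothetical. Lowercasing relies on DOIs being case-insensitive.

```python
import re
from collections import defaultdict

# Prefixes under which DOIs commonly appear (not an exhaustive list).
_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def normalize_doi(raw):
    """Reduce a DOI written in various styles to a bare lowercase form."""
    return _PREFIX.sub("", raw.strip()).lower()


def group_by_doi(records):
    """Map each normalized DOI to the sources that hold a record for it."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalize_doi(rec["doi"])].append(rec["source"])
    return dict(groups)


# Hypothetical exports of the same paper from three libraries.
records = [
    {"source": "library_a", "doi": "doi:10.1371/journal.pcbi.1000204"},
    {"source": "library_b", "doi": "http://dx.doi.org/10.1371/JOURNAL.PCBI.1000204"},
    {"source": "library_c", "doi": "10.1371/journal.pcbi.1000204"},
]
print(group_by_doi(records))
```

All three records end up under one key. Records that only carry a PubMed ID or a publisher URL would need a separate lookup step, which is exactly the gap described above.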

Tools for defrosting libraries, which means personalizing and socializing libraries:

  • Zotero is an extension for the Firefox browser that enables users to manage references directly from the Web browser. Zotero can recognise and extract data and metadata from a range of different digital libraries. Users can bookmark publications, and then add their own personal tags and notes. Zotero doesn’t let the user share their tags. Zotero bookmarks cannot be identified using URIs, so it is not possible to link in from external sources to these personal collections. Mendeley is a similar application that helps to manage and share research papers. It has a Web-based browser version and a desktop client that automatically extracts metadata from PDF files, but it can only do this where metadata is available in an amenable format.
  • MyNCBI allows users to save PubMed searches and to customize search results. Saved searches can be followed as RSS feeds, or their results can be updated and e-mailed automatically. It has no way to share searches or results.
  • Mekentosj Papers is an application for managing electronic publications. It can be closely integrated with several services on the Web, like Google Scholar, PubMed, ISI Web of Knowledge, and Scopus, which are covered in the Digital Libraries section of the review. It can manage a large collection of PDF files on your hard drive, and it looks and behaves much like Apple’s iTunes, an application for managing music files. Files can be retrieved by tag. It is only available for Apple Macintosh users, and there is no version for Windows, which limits its uptake by scientists. It offers personalization but no socialization.
  • CiteULike is a free online service to organize academic publications. It was the first Web-based social bookmarking tool designed specifically for the needs of scientists and scholars. It allows users to bookmark or “tag” URIs with personal metadata using a Web browser; these bookmarks can then be shared using simple links. CiteULike provides metadata for all publications in RIS (EndNote) and BibTeX. It also records how often a publication is tagged and how often it is read by other users.
  • Connotea is run by Nature Publishing Group and provides a similar set of features to CiteULike, with some differences. Metadata are available from Connotea in a wider variety of formats than from CiteULike, including RIS, BibTeX, MODS, Word 2007 bibliography, and RDF, but these can only be downloaded in bulk, rather than individually per publication URI. The source code for Connotea is available, and there is an API that allows software engineers to build extra functionality around Connotea.
  • HubMed is a “rewired” version of PubMed, and provides an alternative interface with extra features, such as standard metadata and Web feeds which can be subscribed to using a feed reader. This allows users to subscribe to a particular journal and receive updates when new content (e.g., a new issue) becomes available. HubMed provides metadata in RIS (for EndNote), BibTeX, RDF, and MODS style XML. Users can also log in to HubMed to use various personalized features such as tagging.
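Several of the tools above export the same record in more than one format. As a minimal sketch of what that means in practice, the code below serializes one publication (the review this post is based on) to BibTeX and to RIS. The dictionary layout and the function names `to_bibtex` and `to_ris` are my own assumptions for illustration, not an API of any tool listed, and only a handful of common fields are covered.

```python
def to_bibtex(rec, key):
    """Render a record as a BibTeX @article entry."""
    fields = [
        ("author", " and ".join(rec["authors"])),
        ("title", rec["title"]),
        ("journal", rec["journal"]),
        ("year", rec["year"]),
        ("volume", rec["volume"]),
        ("number", rec["issue"]),
        ("doi", rec["doi"]),
    ]
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@article{{{key},\n{body}\n}}"


def to_ris(rec):
    """Render a record as a RIS journal-article entry."""
    lines = ["TY  - JOUR"]
    lines += [f"AU  - {author}" for author in rec["authors"]]  # one AU line per author
    pairs = [("TI", "title"), ("JO", "journal"), ("PY", "year"),
             ("VL", "volume"), ("IS", "issue"), ("DO", "doi")]
    lines += [f"{tag}  - {rec[field]}" for tag, field in pairs]
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines)


paper = {
    "authors": ["Hull, Duncan", "Pettifer, Steve R.", "Kell, Douglas B."],
    "title": "Defrosting the Digital Library: Bibliographic Tools for the Next Generation Web",
    "journal": "PLoS Computational Biology",
    "year": "2008",
    "volume": "4",
    "issue": "10",
    "doi": "10.1371/journal.pcbi.1000204",
}
print(to_bibtex(paper, "hull2008"))
print(to_ris(paper))
```

The point of the sketch is that both outputs come from one record: as long as a tool keeps the underlying metadata intact, moving between formats is mechanical.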

The first three in this list can be used for personalization and the last three for socialization, but what we really want is one solution for both, without losing the important metadata.

Warmer digital libraries cannot be achieved by software tools alone. The digital libraries themselves can take simple steps to make data and metadata more amenable to human and automated use, making their content more useful and usable. Only with proper and better access to linked data and metadata can the tools that computational biologists require be built. The review makes a number of recommendations to achieve this goal.

In particular, the lack of a basic ability to identify publications and their authors uniquely is currently a huge barrier to making digital libraries more personal, sociable, and integrated. The identity of people is a twofold problem, because applications need to identify people both as users of a system and as authors of publications. Many users also distrust publishers, commercial ones especially: what will they do with users' content and metadata?

Other recommendations can be found in this excellent review, Defrosting the Digital Library: Bibliographic Tools for the Next Generation Web, in PLoS Computational Biology. This post is just a short excerpt focused on the software and online implications. Are these issues important to scientists? What do you think?

Duncan Hull, Steve R. Pettifer, Douglas B. Kell (2008). Defrosting the Digital Library: Bibliographic Tools for the Next Generation Web. PLoS Computational Biology, 4 (10). DOI: 10.1371/journal.pcbi.1000204