Discoverability and Its Challenges

Sarah Kirkley

In “Survival of the Fittest,” Alexis Wichowski discusses folksonomies and tagging as a way of managing outlying information (information outside the mainstream of the information environment). She describes folksonomies as tags, applied by everyday or “ordinary” users, that have created a new information organization system to help manage the volume of information online. A tag can simply remind a user to return to a particular site, or tags can organize sites by content or other descriptors for ease of retrieval and use. Wichowski discusses how folksonomies have evolved over time and how they differ from the traditional metadata schemas and controlled vocabularies used in the information science world; one flaw she points out is their lack of context. Wichowski opens the article by stating that information must be useful and findable, and natural language can contribute to the discoverability of information simply because the content has been described in language familiar to those seeking it.

Mary W. Elings and Günter Waibel’s “Metadata for All” describes the challenges institutions face because of inconsistent metadata. They describe the various standards used within libraries, archives, and museums and cover the history of those standards, including how they have evolved or been replaced over time. They argue for basing standards on material type rather than on institution or community type (library, archive, museum, etc.). For instance, cultural heritage materials at museums, libraries, and archives would all be described using the same standards, while books at those same institutions would all be described using their own set of standards, separate from those for cultural materials. The current lack of interoperability ultimately degrades the user experience of accessing resources, which is detrimental to all of the communities mentioned.

I think online tagging can facilitate access to information, but it can also make access more difficult in some situations. Users who are used to getting relevant search results from a colloquial term in Google may be surprised when that same search yields no results in a library catalog or database. The Library of Congress Subject Headings are often outdated, as shown by the much-discussed change of the ‘Illegal Aliens’ subject heading to ‘Noncitizens’ last year. The lack of natural language in the subject headings makes some research more difficult than necessary, and it omits certain groups of people and perspectives. The lack of diversity among those in the field is reflected in the language used, and the natural language used in online tagging and folksonomies offers a chance to include some of those underrepresented populations and ultimately to make the information realm more inclusive in its descriptions.

While I know that interoperability among libraries, archives, and museums would have a great impact on the discoverability of resources, I wonder how feasible this move is. MARC has been rumored to be on its way out for decades but is still in use. Even within the library in which I work, multiple standards are in use (MARC in the catalog, Dublin Core in the institutional repository and digitized special collections, EAD for archival finding aids, etc.). Those standards are content-based, but having multiple systems of access can lead to confusion and lack of use. Cross-populating content management systems or adding a discovery layer can help, but it still leads to confusion, and there are often just too many clicks to get to the actual content. Lastly, the institutions discussed in the readings are often constrained by budgets and may only be able to afford one platform with one metadata standard for all content.


  1. lmilone2016 says:

    Identifying, Organizing and Retrieving Data in the Wild World of Web 2.0

    When the American Historical Association and the Foundation Center planned for and solicited participation in the development of new taxonomies (in the first I was an online observer, in the second a participant), the idea of folksonomies, as set out by Alexis Wichowski, struck me as an invitation to anarchy. Still, these organizations’ attempts to bring order to the process of creating an accepted, standard taxonomy were nothing short of misery too, so perhaps the process is chaos whether it is open or closed. In both cases, the taxonomies grew by the hour as each new participant added a “refinement” to element after element of the starting list. Had each organization not set a timeline for the process, it would have been endless and the lists infinite. Therefore, I was intrigued by Wichowski’s statement, “Further, when folksonomies were combined with the directories with controlled vocabularies, precision and recall results were higher than in searches using the controlled vocabularies alone.”

    Given my ignorance of metadata considerations, and having read “Using the Dublin Core” by Diane Hillmann, it seemed useful to find out how the technology world was looking at integrating folksonomies and the Dublin Core. I found only one article that seemed to be directly on point (although I found it published several times from 2008 to 2011). “Relating Folksonomies to the Dublin Core” came out of the 2008 DCMI International Conference on the Dublin Core and Metadata Applications. Written by Maria Elisabete Catarino and Ana Alice Baptista, it is a report on their study of 5,098 tags, created by 15,381 users. The study consisted of five stages, with the first two requiring a pilot study in which 25 percent of the tags could not be assigned any Dublin Core properties. In explaining this study’s relation to the “Kinds of Tags” project (KoT), the authors indicated they considered this article a continuation of KoT. The Catarino/Baptista study recommended four properties for social tagging in addition to the four that KoT had already recommended. Our reading on the Dublin Core had a link to the 2011 revision of the user guide on the Dublin Core wiki. Clearly, like everything in the technological world, organization and retrieval are works in progress that will continue to change as our understanding of what is being published on the web, how to determine its value, and for which disciplines, becomes more informed and sophisticated.

    It was encouraging to see the work being done to simplify retrieval of information through standardization of types of images/data across institutional lines for libraries, archives, and museums. As a person who has benefited from understandable online findings and digitized archival information, I am grateful for this work. Being able to review findings well in advance of visiting an archive, and having the material you need to review waiting for you upon your arrival, makes for efficient use of the researcher’s time and a much more cost-effective experience. The added benefit of digitized archival material cannot be overstated. Having the information from the Library of Congress, the archives of hundreds of universities and libraries, and the National Archives available on our computers, to access wherever and whenever we want, is a constant wonder to someone who started an academic career when travel was the only way to get information from such sources.

    However, it appears the fundamental questions raised by this week’s explorations are: 1) Is standardization of identifying properties the direction we want to go as hundreds of millions of publications are made available on Web 2.0? 2) Or do we think that folksonomies are the better method, with each of us using our own identifiers for our own retrieval, without regard to how others will retrieve that information? 3) If we go the second route, how much information that we might want to access would be lost, or will people naturally arrive at a fairly universal set of identifying properties for the most important information, whatever that might be? 4) If there is no such universal set of properties, can the properties that do exist be identified by electronic search engines without human intervention?

    I come down on the side of an established set of properties that people and publishers of information can choose to use, in the hope that most published information from “reliable” sources (whatever those may be), at least for the arts and sciences, will use those properties. If any of that is possible, then that information will hopefully be readily retrievable. But then, I never did find the “wild west” all that enticing, whether it was in a fantasized Texas or on the web.


