Subject Index

Organizational Theory
Topics include classification theory, classification systems, controlled vocabularies, indexing, index design, thesaurus design, and library and information science.

For related resources, see Content Management.

The following resources are our top picks in this category.

Glossary of Terminology in Abstracting, Classification, Indexing, and Thesaurus Construction. Hans H. Wellisch. 2nd ed. (2000)
This book defines terms used in texts on abstracting, indexing, classification and thesaurus construction, as well as terms for the most common types of documents and their parts. The definitions are derived from such authoritative sources as ISO, ANSI/NISO and BSI.

Indexing Books. Nancy C. Mulvany. (1994)
Expanding on the discussions in the standard style guides, this book explains to authors and professional indexers aspects of analysis and judgment such as what to include and exclude from the index, the structure, how indexing fits into the publishing industry, whether to do it yourself or hire it out, deciphering publishers' guidelines, and choosing appropriate software.

Indexing From A to Z. Hans H. Wellisch. 2nd ed. (1996)
For authors, students, and beginners as well as experienced indexers, the author covers not only back-of-the-book indexing but also the indexing of periodicals and non-print material, with practical examples of correct and incorrect indexing.

The Intellectual Foundation of Information Organization. Elaine Svenonius. (2000)
The effectiveness of a system for accessing information is a direct function of the intelligence put into organizing it. Integrating the disparate disciplines of descriptive cataloging, subject cataloging, indexing, and classification, this book adopts a conceptual framework that views the process of organizing information as the use of a special language of description called a bibliographic language.

Organizing Knowledge: An Introduction to Managing Access to Information. Jennifer Rowley and John Farrow. 3rd ed. (2000)
For the third edition, this standard text on knowledge organization and retrieval has been extensively revised and restructured to accommodate the increased significance of electronic information resources.

Thesaurus Construction and Use: A Practical Manual. Jean Aitchison, Alan Gilchrist, and David Bawden. (1997)
A practical, concise guide to the construction of thesauri for use in information retrieval, written by leading experts in the field. This new edition takes account of advances in information retrieval and software capabilities, and now also includes the uses of thesauri.

The following are also excellent resources.

A - Z of Thesauri.
A listing of thesauri available on the web.

The Amazing Internet Challenge: How Leading Projects Use Library Skills to Organize the Web. Amy Tracy Wells, Susan Calcari, and Travis Koplow. (1999)
The authors selected eight U.S. and four British projects for organizing information on the web as representative ways of successfully taming the often-chaotic web. For each site, similar information is furnished, including those responsible for the site, the site's mission statement, funding sources, classification system, selection criteria, and more.

Automatic Indexing of Documents. Valery I. Frants. A chapter in: Automated Information Retrieval: Theory and Methods. Valery I. Frants, Jacob Shapiro, and Vladimir G. Voiskunskii. (1997)
This chapter is devoted to the consideration of various approaches to developing automatic document indexing algorithms. It does not consider empirical (manual) methods or recommendations to indexers.

Beyond Bookmarks: Schemes for Organizing the Web. Gerry McKiernan.
A clearinghouse of web sites that have applied or adopted standard classification schemes or controlled vocabularies to organize or provide enhanced access to Internet resources.

Cataloging and Classification: An Introduction. Lois Mai Chan. 2nd ed. (1994)
This book covers general principles of bibliography, cataloging, and indexing, and provides exercises to reinforce the concepts.

Classification and Indexing in the Humanities. Derek Wilton Langridge.
The author places the humanities in the context of the whole of knowledge and compares their nature and problems with those of science, technology and the social sciences. The philosophical basis of the classification of knowledge is discussed and modern theory of bibliographic classification is outlined.
Note: Currently out of print.

Classification Schemes and Thesauri On-line. Anne Betz.
This guide is a collection of thesauri based on the results of the interdisciplinary seminar "Terminology Documentation and Multilingual Thesauri" held in summer 1998.

The Concept of "Aboutness" in Subject Indexing. W.J. Hutchins. From: Aslib Proceedings. 30, 172-81 (1978)
The author points out that there is very little research (as of 1978) into how indexers and classifiers determine the subject of a document. The general assumption is that indexers are able to state what a document is about by summarizing its contents; however, the author proposes an alternative concept of "aboutness" based on a linguistic analysis of the text.

Controlled Vocabularies Resource Guide. Michael Middleton.
This guide provides links to examples of thesauri and to classification schemes that may be used for controlling database or web page subject content. It also provides links to descriptive and critical material about such meta-information.

Data Mountain: Using Spatial Memory for Document Management. George Robertson et al. From: Proceedings of the 11th Annual ACM Symposium on User Interface Software and Technology. (November 1-4, 1998)
The authors describe a new technique for document management called the "Data Mountain," which allows users to place documents at arbitrary positions on an inclined plane in a 3D desktop virtual environment using a simple 2D interaction technique.
Note: Also available through ACM. Registration is required.

Depth vs Breadth in the Arrangement of Web Links. Panayiotis Zaphiris and Lianaeli Mtei. (1997)
The purpose of this study was to examine the effect of depth and breadth of web site structure on user response time. The results indicated that response time increased as the depth of the web site structure increased.

The Digital Library Tool Kit. Peter Noerr. (March 2000)
This document is designed to help those who are contemplating setting up a digital library. Whether this is a first time computerization effort or an extension of an existing library's services, there are questions to be answered, decisions to be made, and work to be done.

Guidelines for Indexes and Related Information Retrieval Devices. James D. Anderson. (1997)
This report provides expert guidance on designing indexes for every kind of document, including automatic indexing and indexing based on intellectual analysis and the use of controlled vocabularies. A comprehensive glossary of indexing terms is provided, and recommended introductory text is given for print and back-of-the-book indexes, database indexes, computer-produced indexes, and electronic search indexes.
Note: Also available through TechStreet.

Guidelines for the Construction, Format, and Management of Monolingual Thesauri. (1993)
This standard is the authoritative guide for constructing single-language thesauri, one of the most powerful tools for information retrieval. It shows how to formulate descriptors, establish relationships among terms, and present the information in print and on a screen, and includes thesaurus maintenance procedures and recommended features for thesaurus management systems.
Note: Also available through TechStreet.

Indeterminacy in the Subject Access to Documents. David C. Blair. From: Information Processing & Management. 22:2, 229-41 (1986)
Subject access to documents is influenced by two kinds of indeterminacy: the indeterminacy of the indexer's selection of indexing descriptors and the indeterminacy of the inquirer's selection of search terms. How these indeterminacies interact is discussed, and ways of reducing the effect of one of them are suggested.

Indexing and Abstracting in Theory and Practice. F.W. Lancaster. 2nd ed. (1998)
This is a textbook for a course in either an academic or a professional education program for librarians. It reviews the principles, practice, consistency, and quality of indexing; the types and functions of abstracts; natural language in information retrieval; and the future of indexing and abstracting services.

Indexing and Access for Digital Libraries and the Internet: Human, Database, and Domain Factors. Marcia J. Bates. From: Journal of the American Society for Information Science (JASIS). 49:13, 1185-205 (1998)
Factors and issues regarding content indexing and access to digital resources are reviewed and implications drawn for information system design.

Indexing by Latent Semantic Analysis. Scott Deerwester et al. From: Journal of the American Society for Information Science (JASIS). 41:6, 391-407 (1990)
A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents in order to improve the detection of relevant documents on the basis of terms found in queries.

Inductive Learning Algorithms and Representations for Text Categorization. Susan Dumais et al. From: Proceedings of the Seventh International Conference on Information and Knowledge Management. (November 3-7, 1998)
Text categorization -- the assignment of natural language texts to one or more predefined categories based on their content -- is an important component in many information organization and management tasks. The effectiveness of five different automatic learning algorithms for text categorization in terms of learning speed, real-time classification speed, and classification accuracy is compared.
Note: Also available through ACM. Registration is required.

Less Is More: Eliminating Index Terms From Subordinate Clauses. Simon H. Corston-Oliver and William B. Dolan. From: Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics. (June 20-26, 1999)
The authors perform a linguistic analysis of documents during indexing for information retrieval. By eliminating index terms that occur only in subordinate clauses, index size is reduced by approximately 30% without adversely affecting precision or recall.

Library of Congress Thesauri Home Page.
This resource provides access to the Thesaurus for the Global Legal Information Network (GLIN), Legislative Indexing Vocabulary (LIV), Thesaurus for Graphic Materials I: Subject Terms (TGM I), and Thesaurus for Graphic Materials II: Genre and Physical Characteristic Terms (TGM II). These tools allow for better navigation by including broader, narrower, and related terms.

The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information. G. Miller. From: Psychological Review. 63:2, 81-97 (1956)
The span of absolute judgment and the span of immediate memory impose severe limitations on the amount of information that we are able to receive, process, and remember. The process of recoding is a very important one in human psychology and deserves much more explicit attention than it has received.
Note: See Larson, Kevin and Mary Czerwinski. Web Page Design: Implications of Memory, Structure and Scent for Information Retrieval.

A Method for Evaluating the Organization of Content of a Web Site. Sharon L. Smith, Karen Angelli, and Dennis Wixom. From: Common Ground. 8:2 (May 1998)
This article describes an objective, structured method for evaluating how well different category schemes help people find information. It presents a case study of the applications of the method to an existing web site.

Multitrees: Enriching and Reusing Hierarchical Structure. George W. Furnas and Jeff Zacks. From: Proceedings of the CHI 1994 Conference on Human Factors in Computing Systems. (April 24-28, 1994)
This paper introduces multitrees, a new type of structure for representing information. Multitrees are directed acyclic graphs whose large, easily identifiable substructures are trees, which allows hierarchical structure to be represented graphically and reused.
Note: Registration is required.

Natural Language Versus Controlled Vocabulary in Information Retrieval: A Case Study in Soil Mechanics. Manikya Rao Muddamalle. From: Journal of the American Society for Information Science (JASIS). 49:10, 881-87 (1998)
The effectiveness of two tools, thesaurus and natural language, in an article record retrieval system has been studied. Since both thesaurus and natural language have shown identical performance in information retrieval, a combination of these two has been suggested for making searches and providing relevant information.

Now, It's The "Vectories" That Are Coming! Danny Sullivan. From: The Search Engine Report. 45 (August 2, 2000)
An article about a new group of services emerging, those that offer to automatically categorize web pages into Yahoo-like directories. Includes a look at some of the "vector" technology players.

Practical Taxonomies. Sarah L. Roberts-Witt. From: Knowledge Management. 47-54 (January 1999)
This article provides advice for building a knowledge classification system that categorizes all the information the organization chooses to track in a logical manner so that it can be reliably accessed by anyone in the organization.

The Revenge of the Library Scientist. Bob Ainsbury and Michelle Futornick. From: Online. 24:6 (November 2000)
This article discusses how the skills of library scientists make them perfectly suited to the online world.

Sorting Things Out: Classification and Its Consequences. Geoffrey C. Bowker and Susan Leigh Star. (2000)
This book is an attempt to sort out exactly how and why we classify and categorize the things and concepts we encounter day to day. With precise academic language, the authors pick apart our information systems and language structures that lie deeper than the everyday categories we use.

Specification for Resource Description Methods. Part 3: The Role of Classification Schemes in Internet Resource Description and Discovery. Traugott Koch and Michael Day.
This study discusses the role of classification schemes in resource description and discovery. It recommends automatic classification processes if large robot-generated services are to offer a good browsing structure for their documents or advanced filtering techniques as well as proper query expansion tools to improve the search process.
Note: Deliverable D3.2 for the DESIRE (Development of a European Service for Information on Research and Education) project.

Syntax for the Digital Object Identifier. (2000)
This standard defines the order and components of the Digital Object Identifier (DOI), the first identification system for intellectual property in the digital environment. Introduced in 1997, the DOI provides a unique way to identify content in all media and links users to rights holders to facilitate seamless e-commerce.
Note: Also available through TechStreet.

Thesauri on the Web: Current Developments and Trends. Ali Asghar Shiri and Crawford Revie. From: Online Information Review. 24:4, 273-80 (2000)
This article describes some recent thesaurus projects undertaken to facilitate resource description and discovery and access to wide-ranging information resources on the Internet. Types of thesauri available on the web, thesauri integrated in databases and information retrieval systems, and multiple-thesaurus systems for cross-database searching are also discussed.

Thesaurus Construction. Tim Craven. (July 3, 1998)
A tutorial on the basics of constructing an information retrieval thesaurus. It includes a glossary of thesaurus terms.

User Interaction in Machine Aided Text Categorization: Design Considerations for an Indexing Assistant. Catherine Baudin and Scott Waterman. (February 1998)
This study investigates requirements for effectively using automatic categorization technology to support human decision making. The researchers present the Indexing Assistant, a prototype tool that uses technical term extraction and text categorization to help humans categorize documents in technical domains.

Web Page Design: Implications of Memory, Structure and Scent for Information Retrieval. Kevin Larson and Mary Czerwinski. From: Proceedings of the CHI 1998 Conference on Human Factors in Computing Systems. (April 21-23, 1998)
The authors describe an experiment to see whether greater breadth and decreased depth are preferable, both subjectively and in performance data, while attempting to design for optimal scent throughout different structures of a web site. This work tests the theories of Miller in his classic "The Magical Number Seven, Plus or Minus Two."
Note: Also available through ACM. Registration is required.

Web Thesaurus Compendium. Barbara Lutes.
The thesauri and classification schemes in this collection are all available on the web with various search and browse facilities and various degrees of hypertext linking. The term "thesaurus" is used loosely here to refer to any structured collection of interrelated terms, often, but not necessarily, in a certain domain.

Who Needs Controlled Vocabulary? Raya Fidel. From: Special Libraries. 1-9 (Winter 1992)
Observation of 281 real-life searches shows that although some searchers preferred descriptors and others textwords, the decision about which type to use depended on each specific situation. Searchers' reasons for search-term selection revealed a set of rules that guided their selection.

Women, Fire, and Dangerous Things: What Categories Reveal About the Mind. George Lakoff. (1990)
What do categories of language and thought reveal about the human mind? The book has repercussions in a variety of disciplines, ranging from anthropology and psychology to epistemology and the philosophy of science.