Peter Morville's bi-weekly column on the evolving definition of information architecture

Software for Information Architects

Information professionals have a love-hate relationship with technology.

We love IT because it has made our jobs necessary by enabling the creation and connection of tremendous volumes of content, applications and processes.

We hate IT because it constantly threatens to replace the need for us.

Anyone who's seen the 1957 film, Desk Set, in which the librarians fear the "electronic brain" will steal their jobs, understands the enduring nature of this struggle.

Love it or hate it, we are participants in a co-evolutionary journey with technology that is defined by unremitting rapid change.

We have a real opportunity (if not an ethical obligation) to positively influence outcomes by injecting our understanding and healthy skepticism into the information technology acquisition and integration process.

The Next Five Years

We're living in the stone age when it comes to software for information architects. The products are crude and so is our understanding of what we really need.

When people get together to discuss experiences with enterprise-wide applications to support web sites and intranets, pain and suffering are dominant themes. Many organizations become so distracted and discouraged by their first web application, they fail to explore the products in related categories.

This will change. Within the next 5 years, all large web sites and intranets will leverage software applications from a wide variety of categories. We will not choose between automated classification software and a collaborative filtering engine. We will need both, and more.

And information architects will play an integral role, working closely with business managers, content managers, and software engineers to select, acquire, integrate, and leverage this sophisticated suite of applications. None of these people can do this work well alone.

Categories in Chaos

It's rather ironic that one of the toughest challenges in understanding software for information architects involves trying to define meaningful categories for the darned stuff.

There are huge overlaps between products. These overlaps are further exaggerated by overzealous marketing efforts that claim the software can create taxonomies, manage content, fix dinner, and tie your shoes.

And of course, the vendors and their products are multiplying, merging, and mutating at a terrific pace.

Given this fluid, ambiguous context, here is an early attempt to define just a few of the product categories that information architects will need to work with in the coming years.

Automated Classification

Definition: Software that leverages human-defined rules or pattern matching algorithms to automatically assign index terms to documents.

Synonyms: automated categorization, automated indexing, automated tagging

Examples: Interwoven Metatagger, Autonomy Categorizer, Inktomi Search CCE, InXight Categorizer, Mohomine mohoClassifier

Comments: We see great promise to integrate human expertise in designing taxonomies with software that populates those taxonomies quickly, consistently, and inexpensively. Note that this software:

  • works best on full-text document collections
  • can't index images, applications, or other multimedia
  • does not adjust for user needs or business goals
  • does not understand meaning

To learn more, read Kat Hagedorn's upcoming White Paper (available early March) and Kathy Adams' Article.
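
To make the "human-defined rules" half of that definition concrete, here is a minimal Python sketch. It is not how any of the products above work; the rule table, the function name assign_index_terms, and the min_hits threshold are all assumptions made up for illustration. Each index term is triggered when enough of its editor-chosen words appear in a document.

```python
import re

# Hypothetical rules: each index term maps to trigger words a human editor chose.
RULES = {
    "search engines": {"search", "query", "indexing"},
    "content management": {"workflow", "authoring", "publishing"},
}


def assign_index_terms(text, rules=RULES, min_hits=1):
    """Return the sorted index terms whose trigger words appear in the text."""
    # Reduce the document to a set of lowercase words; no stemming, no meaning.
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(
        term
        for term, triggers in rules.items()
        if len(words & triggers) >= min_hits
    )


if __name__ == "__main__":
    print(assign_index_terms("The workflow covers authoring and publishing."))
    print(assign_index_terms("A photo of a cat"))
```

Even this toy shows the limits listed above: it only sees words in full text, and it matches strings without understanding what they mean.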

Automated Category Generation

Definition: Software that leverages pattern-matching algorithms to automatically generate categories or taxonomies.

Examples: Semio Taxonomy, Autonomy Portal-in-a-Box

Comments: Proceed with great caution! The demos we've seen produce truly confusing category schemes with tremendous redundancy and mixed granularity. These could be useful tools for information architects performing content analysis, but not at current prices. To learn more, read Little Blue Folders.

Search Engines

Definition: Software that provides full text indexing and searching capabilities.

Examples: Inktomi Search, Verity, Google SiteSearch, Oingo DirectSearch

Comments: Abundant silliness. As content volume grows, search will become the heart of most web sites and intranets, yet few vendors admit they're selling a search engine; they all have "portal solutions." Meanwhile, the true challenge involves getting the IT people, who currently own the search engines within most corporations, to share their toys with people who understand how and why to connect users and content. The current difficulties in this category are not due to technology. It's a people problem! To learn more, visit Search Tools.

Thesaurus Management

Definition: Tools that provide support for the development and maintenance of controlled vocabularies and thesauri.

Examples: MultiTes, Lexico, Oracle interMedia, Verity

Comments: The bleeding edge! We heard one success story from someone who integrated MultiTes and Verity. However, most of the early adopters have had to do a lot of custom development and integration in this area. The hard part is supporting controlled vocabulary management in today's decentralized environments. See ACIA Seminar Resources to learn more.
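
For readers who have not worked with a controlled vocabulary, here is a minimal Python sketch of the data these tools manage. It does not reflect the data model of MultiTes or any other product named above; the Term class, build_lookup, and normalize are hypothetical names invented for this example. Each preferred term carries its variant ("use for") labels, and a lookup table maps whatever the user typed to the preferred term.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    """One entry in a hypothetical controlled vocabulary."""
    preferred: str
    variants: set = field(default_factory=set)  # "use for" labels
    broader: Optional[str] = None               # parent term, if any


def build_lookup(terms):
    """Map every label (lowercased) to its preferred term."""
    lookup = {}
    for term in terms:
        for label in {term.preferred, *term.variants}:
            key = label.lower()
            # Two preferred terms claiming the same label is an editorial conflict.
            if key in lookup and lookup[key] != term.preferred:
                raise ValueError(f"label {label!r} is claimed by two terms")
            lookup[key] = term.preferred
    return lookup


def normalize(query, lookup):
    """Return the preferred term for a user's wording, or None if unknown."""
    return lookup.get(query.strip().lower())


if __name__ == "__main__":
    vocabulary = [
        Term("intranets", {"intranet", "internal web sites"}),
        Term("search engines", {"search tools"}),
    ]
    lookup = build_lookup(vocabulary)
    print(normalize(" Intranet ", lookup))
    print(normalize("portal", lookup))
```

The conflict check hints at why decentralized environments are the hard part: once several groups edit the vocabulary, two of them will eventually claim the same label for different terms.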

Collaborative Filtering

Definition: Tools that leverage user preferences, patterns, and purchasing behavior to customize organization and navigation systems.

Examples: Macromedia LikeMinds, beFree's BSELECT

Comments: We've all experienced the power and the pitfalls of Amazon's "Customers who bought this book also bought these books." Within 5 years, collaborative filtering will fill a niche role on most web sites, enabling a rich set of associative links (see also, see related) at the document or application or product level. This is a low-cost, bottom-up, adaptive approach with real value. On the other hand, it doesn't replace the need for associative links that are defined by subject matter experts or by business rules. And, of course, as Samantha Bailey always says, "Beware the Tyranny of Popularity." Learn more from this ZDNet Article.
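
The "also bought" idea can be sketched in a few lines of Python. This is only a co-occurrence count, a deliberately simple stand-in, not Amazon's method or that of any product listed above; the function name also_bought and the sample orders are assumptions for illustration.

```python
from collections import Counter


def also_bought(orders, item, top_n=3):
    """Rank other items by how many orders contain them together with `item`."""
    counts = Counter()
    for basket in orders:
        basket = set(basket)
        if item in basket:
            counts.update(basket - {item})
    # Sort by count (descending), then by name, so ties come out the same every time.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical purchase history: each inner list is one customer's order.
ORDERS = [
    ["A", "B"],
    ["A", "B", "C"],
    ["A", "D"],
    ["B", "D"],
]

if __name__ == "__main__":
    print(also_bought(ORDERS, "A"))
```

Because the ranking is raw popularity among co-purchases, a best seller will surface next to almost everything, which is one way to see the "Tyranny of Popularity" at work.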

Portal Solutions

Definition: Tools that say they provide "completely integrated enterprise portal solutions."

Examples: Plumtree, Sagemaker, MS SharePoint

Comments: The vision of seamless, intuitive access to all enterprise and 3rd party content independent of geography, ownership, and format is compelling and completely unrealized. These tools claim to do everything. Make sure you know what they do well.

Content Management

Definition: Software that manages workflow from content authoring to editing to publishing.

Examples: Interwoven TeamSite, Vignette, Broadvision, Open Market Content Server, NCompass, Documentum

Comments: Forrester Research calls these product offerings "immature." The problems stem from the fact that content management is very complex and very context-sensitive. Inevitably, you'll need to buy and then customize extensively. This is a headache few large organizations will be able to avoid. Read this article to learn more.


Definition: Software that analyzes online and offline sources of customer behavior data to enable improved customer interactions, at call centers, in marketing campaigns, and on web sites.

Synonyms: e-Marketing, e-Business intelligence, eCRM, data mining, web mining

Examples: Personify, Accrue, NetGenesis, digiMine

Comments: I have little experience with these tools. The fact that they stretch beyond the Web, into that other world of phone calls and in-store purchases, is admirable. My guess is they'll do a good job of telling you what's not working, but they won't help you understand how to fix it. To learn much more about this topic, read Karl Fast's upcoming White Paper (available early March).

Database Management

Definition: Tools for managing and providing access to structured data such as facts and figures.

Examples: Oracle, Microsoft SQL Server

Comments: We do not want to experience the problems of the "hidden Web" within our own web sites and intranets. To prevent valuable data from being buried alive in isolated databases, information architects need to collaborate with developers and system integrators to provide users with intellectual access to information and data, independent of format.

Information Architecture Productivity Software

Definition: Software that information architects use to do their jobs.

Examples: Visio, Adobe Illustrator, Adobe InDesign, QuarkXPress, Inspiration, Macromedia FreeHand, Storyspace, DENIM & SILK, BPwin

Comments: OK, this is a slightly ridiculous category, but what the heck. Key IA work products include blueprints, wireframes, and controlled vocabularies. Microsoft Word, Excel, Access, Visio, and PowerPoint are the basic tools of the trade, but MS spokespeople claim not to have a monopoly on the category.

Questions to Ask

Whatever the category, when you're involved in selecting complex, expensive software, there are a number of important questions to ask.

You'll need to determine whether it's best to build it yourself, buy a product, or contract with an ASP. You'll want to know about the total cost of ownership, from purchase to integration to customization to maintenance to upgrade. You'll want to know about the long-term outlook for the vendor; in other words, will they be there to answer the phone in 6 months?

Most importantly, you need to find an engineer in the vendor's firm who will answer these questions. One of the many truisms from the world of Dilbert is that engineers are like Vulcans: they cannot tell a lie. They will happily contradict their company's marketing hype, usually without even the slightest provocation, telling you:

  • What their product does well
  • What their product does poorly
  • What they wish their product could do

So, even though engineers are the ones who are actually working hard to automate us out of a job, we should still like them, because they're helpful and honest, and because they will only need us more in the coming years, to make productive use of the fascinating new tools they are building.

End Notes

Have I missed your favorite software category or tool?
Let me know and I'll add it to this article.

Please send your rants and raves to Peter Morville.
