Friday, April 11, 2014

Taxonomy Software Directories

It's difficult to find a list of taxonomy management software that is both comprehensive and up to date, yet not overwhelmed with related products and services. I define taxonomy management software as a tool to manually build and edit taxonomies, controlled vocabularies, and thesauri in accordance with industry standards. It should be the primary tool used by those who work as taxonomists. Lists of “taxonomy software,” however, may include more than just tools for taxonomy management, such as auto-classification/auto-categorization/auto-indexing software, search software that utilizes taxonomies, and mind-mapping or other graphical categorization tools.

Taxonomy maintenance, unfortunately, is just too small a niche for the major evaluators of software, whether consultancies, industry research firms, or trade publications, to find it worth their time to study. Companies that research the information technology market, such as Forrester Research, Gartner, International Data Corporation (IDC), and Real Story Group, would not get enough commercial payoff from preparing studies of the taxonomy management software industry and its products.

At the time I wrote my book, the most comprehensive directory of taxonomy software I found, and the one I referred my readers to, was that of the British consultant Leonard Will, on the website of his consulting business Willpower Information, which listed 38 software packages, both commercial and freeware. Leonard Will had contacted each vendor and thus provided descriptive and contact information for each tool. The fact that this was a directory of “thesaurus” software and not “taxonomy” software is not an issue, and it was probably a good thing to include only software that meets thesaurus expectations. This directory was very comprehensive, including lesser-known free and open source software, which over time tended to become unsupported or even unavailable. With an interest in posterity, Leonard Will kept the unavailable software listed in his directory, with merely a note to that effect. This may have been interesting for anyone thinking of developing their own thesaurus software, as they may have been able to track down these other developers. For someone looking for a good commercial solution, however, there were far too many outdated products to weed through.

After Leonard Will retired, he decided he did not want to spend the time maintaining his directory, which he last updated in 2007. In 2011 he offered the content of his directory to someone else, specifically contacting both Margie Hlava of Access Innovations and me. Then Margie and I had to figure out which one of us would take it, fully aware that the rich content would help our own respective business websites, yet would also take quite a bit of time and effort to set up and maintain. After a year of hoping to find the time, I finally conceded that I would not and told Margie she could take it. The successor to the Willpower Thesaurus software directory, maintained by Margie’s employee Eric Ziecker, now resides at http://www.taxobank.org/content/thesauri-and-vocabulary-control-thesaurus-software

The core of TaxoBank's directory “Software for building and editing thesauri” at present is still essentially the same as the Willpower site, maintaining the original tabular content, style, colors, etc. of that site, so visitors to the TaxoBank site may recognize it from Willpower. Posterity still seems to be valued, as all but one of the same 38 software packages are still there, although in two cases there is a note saying “The particular software referenced above is no longer available.” The notes section for many packages has been updated with additional content extracted from the vendor websites. More updating is still pending, though, as the operating systems listed are dated, such as “Windows 95/98/NT/2000/XP.”

The main difference from the original Willpower site is the addition of 63 other products in a new section, separated by the note “Additional indexing, taxonomy, controlled vocabulary, thesaurus, classification, mapping and ontology software and services not referenced in Leonard Will's original listing follows below.” These additions include many products not specific to “building and editing thesauri,” such as Apache Lucene, EMC Documentum, Oracle Endeca, Google site search, HP Autonomy, IBM Infosphere, and Microsoft SharePoint, along with one taxonomy consulting service. In my opinion, it might be better to have the related products and services on a separate web page to avoid possible confusion and to keep the list to a manageable length, as the total web page is currently 145 printed pages long. Despite these issues, I praise Margie and Eric for making the effort to maintain this valuable resource.

As for a shorter list focused on current commercial software dedicated to supporting the manual creation and editing of thesauri and taxonomies, that may have to wait until the next edition (not yet started) of my book. For now, there are the products, as of early 2010, listed under Chapter 5 on the links page of The Accidental Taxonomist book website. To this list, I would now add at least PoolParty and TopBraid Enterprise Vocabulary Net, both introduced since the book went to press. Meanwhile, taxonomy consultants remain a valuable source of advice on taxonomy/thesaurus management software.