Accumo Classifier refines results

Apr 18, 2005

Java-based data-clustering engine shows promise and signs of youth

Some call search engines the next killer application, though I’d argue that delivering a torrent of unstructured results is hardly useful. It doesn’t have to be this way, as evidenced by Accumo Classifier — a Java-based classification engine with a simple-to-use API that automatically organizes text information into intuitive clusters.

After testing Classifier for a few weeks, I see likely benefits for organizations, but I also see work Accumo should consider to broaden the product’s appeal. In the plus column, Classifier runs on common Java application servers; I selected the open source Apache Tomcat. To use Accumo Classifier, you create a search servlet or JSP that retrieves results from your search engine and then calls Classifier to obtain the clusters.
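The retrieve-then-cluster flow can be pictured with a minimal sketch. Everything here is an assumption made for illustration: `ClusterEngine`, `KeywordClusterEngine`, and `SearchPage` are hypothetical names, not Accumo’s actual API, and the toy keyword matching only stands in for whatever analysis the real engine performs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a clustering engine; NOT Accumo's actual API.
interface ClusterEngine {
    Map<String, List<String>> cluster(List<String> snippets);
}

// Toy engine: files each snippet under the first topic keyword it contains.
class KeywordClusterEngine implements ClusterEngine {
    private final List<String> topics;

    KeywordClusterEngine(List<String> topics) {
        this.topics = topics;
    }

    @Override
    public Map<String, List<String>> cluster(List<String> snippets) {
        // LinkedHashMap keeps clusters in the order they first appear.
        Map<String, List<String>> clusters = new LinkedHashMap<>();
        for (String snippet : snippets) {
            String label = "Other"; // fallback when no topic keyword matches
            for (String topic : topics) {
                if (snippet.toLowerCase().contains(topic.toLowerCase())) {
                    label = topic;
                    break;
                }
            }
            clusters.computeIfAbsent(label, k -> new ArrayList<>()).add(snippet);
        }
        return clusters;
    }
}

// Mirrors the servlet/JSP role: take the search engine's results,
// then hand them to the clustering engine and return the clusters.
class SearchPage {
    static Map<String, List<String>> search(List<String> engineResults,
                                            ClusterEngine engine) {
        return engine.cluster(engineResults);
    }
}
```

In a real deployment, the servlet or JSP would obtain `engineResults` from the search engine and render the returned map as grouped links; here the results are simply passed in as a list.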

Accumo provides two starter JSP applications — one uses the Google API and another “scours” HTML pages. Out of the box, Google worked well; Classifier’s combination of artificial intelligence, natural language processing, and statistical analysis clustered results into the neatly presented topics I would expect. With a little tweaking, the HTML ScreenScraper let me cluster results from a Verity Ultraseek intranet search. In both tests, results were clustered in about one second.

Beyond that, Accumo Classifier shows signs of an early release. Administrators have limited control of the system’s behavior. For example, you can enter synonyms so that related content appears in the same cluster, but there’s no easy way to format results to match an existing site’s design. Moreover, to be truly useful, this package needs more standards support, such as the ability to read XML-formatted content. And because enterprises keep information in disparate sources, they want a way to federate results from all of them in a single query.
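To show why synonyms pull related content into one cluster, here is a minimal sketch of the general idea: map each term to a canonical form before clustering. `SynonymTable` is a hypothetical helper written for this illustration and says nothing about how Accumo implements its synonym feature.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; illustrates the idea only, not Accumo's synonym feature.
class SynonymTable {
    private final Map<String, String> canonical = new HashMap<>();

    // Registers a term and the canonical term it should be treated as.
    void add(String term, String canonicalTerm) {
        canonical.put(term.toLowerCase(Locale.ROOT), canonicalTerm);
    }

    // Rewrites each whitespace-separated word to its canonical form, if any.
    String normalize(String text) {
        StringBuilder out = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(canonical.getOrDefault(word.toLowerCase(Locale.ROOT), word));
        }
        return out.toString();
    }
}
```

With "notebook" registered as a synonym of "laptop", a snippet about notebook batteries and one about laptop batteries normalize to the same vocabulary, so a clustering step would see them as the same topic.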

Still, developers will find this version a good tool for enhancing intranet search or categorizing documents.

Accumo Classifier 1.12

Accumo

Cost: Pricing available upon request; free 30-day trial

Available: Now