6 1990 – 1999 Innovation in retrieval technology
Before looking at the enterprise search business itself, it is worth noting the important developments in the understanding of how people searched, and the novel technical advances in search. Marcia Bates started to make us think about search behaviour in her 1989 paper on berry picking as a metaphor for the process of discovery. Peter Pirolli’s work on information foraging was published in 1999. Although this is right at the very end of the decade being covered, it is indicative of the research that was being undertaken into information systems from a user behaviour perspective, with Jakob Nielsen (the founder, with Don Norman, of the Nielsen Norman Group) waiting in the wings at Sun Microsystems from 1994 to 1998. From an enterprise search perspective, the work undertaken at the University of Huddersfield by Stephen Pollitt on faceted navigation was ground-breaking. The concept was taken up and developed further by Marti Hearst with her Flamenco project.
From a technical perspective, the challenges of indexing and searching the World Wide Web were now starting to be addressed, taking search in some very different directions. AltaVista was not the first WWW search engine, but the team working on it gained an immense amount of knowledge about web crawling and indexing at scale. Two members of the team founded Exalead in 2000. Google followed in 1998, and of course the arrival of enterprise web applications such as intranets opened up a potentially very large market for enterprise-level search. Sadly, the IBM HITS algorithm (later integrated into the IBM Clever project) didn’t have a chance against the Google PR machine. From the late 1980s into the 1990s, advances in natural language processing were rapid, as machine learning approaches and developments in machine translation opened up new opportunities for search. Latent Semantic Analysis first emerged in 1988 and Probabilistic Latent Semantic Analysis in 1999, the latter forming the basis of the Recommind e-Discovery application, now owned by OpenText. Lucene, written by Doug Cutting, also appeared in 1999. This was (and remains) a free open-source search engine software library and is now widely used in conjunction with Solr (developed by Yonik Seeley), Elasticsearch and Lucidworks, amongst many others.
The stage was set for the emergence of a significant number of search vendors. Verity was gaining momentum but finding it difficult to achieve profitability. In 1993 RetrievalWare emerged and started a trend for search software companies to have multiple owners. How it ended up in FAST Search and Transfer via Excalibur is, to say the least, complicated.
The Infoseek/Ultraseek/Inktomi/Verity/Autonomy saga, which started in 1993, was yet another complicated journey. Interestingly, Ultraseek was branded as Ultraseek Enterprise Search and, by the time it was acquired by Autonomy, had around 15,000 customers. Verity completed an IPO in 1995, raising $40m, double the amount anticipated. This probably encouraged (at least indirectly) the arrival of Autonomy (1996), FAST Search and Transfer (1997) and Endeca (1999).
The development of the enterprise search business in the early 1990s is not well documented. Many of the entrepreneurs who had a vision for search have been interviewed by Stephen Arnold in his invaluable Wizards Index column. In the paragraph above, most of the links are to Wikipedia entries, which inevitably vary in quality and depth but will hopefully serve as at least a starting point for research. The distinguished journalist and philanthropist Esther Dyson tracked the development of internet companies during this period.