Between the first enterprise search applications, which ran on minicomputers in the 1980s, and around 2020, the major technology advance was arguably the gradual replacement of TF.IDF by the BM25 ranking model from around 2010. There are many variants of BM25, but the model became the default for ranking in most enterprise search applications.
Search can fail in many ways, as outlined in a schematic from Clearbox Consulting. From a user perspective, it is often difficult to understand why a search has returned a poor set of results with low relevance to the query, if indeed it returns any results at all. Self-diagnosis by the user is impossible, which is one of the reasons that successful enterprise search applications invariably have a strong search support team that is proactive in ensuring that search is satisfactory.
Surveys over the last decade have all indicated that perhaps only 20% of organisations have a search application that delivers a high level of search satisfaction. In the course of writing this book, the author took part in the Intranet Italia Day conference in Milan in May 2022. When the audience of over 150 intranet managers was asked to raise their hands if they knew that employees were satisfied with the search performance of their intranet, only five delegates did so.
The use of BM25 and related models for ranking does make it possible to reverse engineer a query and its results in order to identify the possible causes of poor performance. Search applications have dashboards that can then be used to boost particular words or phrases, and it is also possible to check manually that entity extraction and name similarity routines are working effectively.
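The transparency of a BM25-style model can be illustrated with a minimal sketch that splits a document's score into the contribution made by each query term. It is not the implementation of any particular search product. The function names (tokenize, explain_bm25), the whitespace tokeniser, the parameter values k1 = 1.2 and b = 0.75, and the smoothed form of the inverse document frequency are all assumptions chosen for illustration.

```python
import math
from collections import Counter

K1 = 1.2   # term-frequency saturation (assumed value, a common default)
B = 0.75   # document-length normalisation (assumed value, a common default)


def tokenize(text):
    """Hypothetical tokeniser: lower-case and split on whitespace."""
    return text.lower().split()


def explain_bm25(query, docs):
    """For each document, return a dict of query term -> BM25 contribution.

    The document's total score is the sum of the values in its dict.
    Assumes at least one document contains at least one token.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: the number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    explanations = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Longer-than-average documents are penalised, shorter ones favoured.
        length_norm = 1 - B + B * len(tokens) / avg_len
        contributions = {}
        for term in tokenize(query):
            n = df[term]
            # Smoothed IDF (an assumed variant): rare terms weigh more.
            idf = math.log(1 + (n_docs - n + 0.5) / (n + 0.5))
            f = tf[term]
            contributions[term] = idf * f * (K1 + 1) / (f + K1 * length_norm)
        explanations.append(contributions)
    return explanations


if __name__ == "__main__":
    docs = [
        "travel expenses policy for staff",
        "staff holiday policy",
        "canteen menu",
    ]
    for doc, parts in zip(docs, explain_bm25("expenses policy", docs)):
        print(f"{sum(parts.values()):.3f}  {doc}  {parts}")
```

Because every term's contribution is visible, a support team can see, for example, that a document ranks poorly because a rare query term is missing from it or because the document is unusually long, and can then decide whether boosting a word or phrase is the appropriate remedy.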
With the arrival of machine learning, dense vectors, neural networks and very large pre-trained language models, the transparency of the search process disappears. A core requirement of enterprise search is that employees trust it, because search failure of any degree could put the organisation, and their own careers, at risk through a flawed decision made without finding relevant enterprise-created information.
The aim of AI-based search is the Holy Grail of understanding the intent of the query in order to deliver the most relevant set of results. No research has ever been undertaken to categorise enterprise intents. Research into the intents behind web search queries suggests that the range of intents, and the difficulty of categorising them, are quite considerable.
At present AI-based search is in the hype stage of development, which experience shows is followed by a period of disillusionment with the initial promise of the technology. Out of this disillusionment comes a reality check and a gradual period of widespread adoption with significant benefits to the organisation and to the individual employee. Even with the benefit of 60 years of search development it is not possible to put a timescale on this evolution or to forecast when it might be of value to write a second edition of this history.