3 1960-1969 The pioneers
Condensing the immense amount of progress made in the 1960s is not easy and so this is a very selective perspective. As far as algorithm developments were concerned, Bourne and Ford published a paper on stemming in 1961 (Bourne and Ford 1961), Damerau (Damerau 1964) reported on approaches to correcting mis-spellings, and Rocchio and Salton considered how best to optimise the performance of retrieval systems (Rocchio and Salton 1965). This was one of the first outcomes of the SMART project, initially at Harvard and then at Cornell, that will figure significantly in the history of the 1970s. Many of the developments of the period were reported in a new Information Retrieval section of Communications of the ACM from March 1964. A year earlier Information Storage and Retrieval was launched as a peer-reviewed journal, changing its name to Information Processing and Management in 1975.
Another initiative that started in the 1960s and lasted into the 1970s was ground-breaking work by Cyril Cleverdon, the librarian of the Cranfield Institute of Technology, UK, on the comparative efficiency of indexing systems. It was funded by the US National Science Foundation. I had the good fortune to meet Cyril early in my career and his encouragement of my choice was along the lines of “You will never be out of a job”. How right he was!
In the 1960s advances in computer technology resulted in significant technical progress in search development, in terms of both research and the availability of commercial services. IBM released the 7090 range in late 1959 and the much more powerful 360 range in 1965. In parallel the technology to provide remote shared access to large computer centres was developed, with J.C.R. Licklider as the early innovator, leading directly to the Internet. At this point in the history of search a strictly chronological approach is not of value, and instead it is important to be aware of a number of major projects, several of which led to commercial online services becoming available from 1965 onwards.
Arguably the first ever enterprise/internal search service was set up in 1965 at the Cox Coronary Heart Institute in Kettering, Ohio by G. Douglas Talbott. I would cite this as enterprise search because the application indexed content that the Institute was publishing in a quarterly internal publication.
In terms of the impact on the underlying algorithms of search, the work at System Development Corporation in the early part of the decade is of particular importance. Synthex was led by Robert Simmons with the objective of developing a system that could read and understand text, answer questions and compose an answer in readable English. The name was chosen as a tribute to the Memex concept of Vannevar Bush from 1945. There was a related ProtoSynthex project. One outcome of these projects was TEXTIR, an online search system developed for the Los Angeles Police Department in 1964 that could accept questions in natural language. Further development enabled it to incorporate synonyms into a search formulation and offer search term weighting. In parallel Hal Borko (Borko 1964) was developing BOLD with a focus on the automatic classification of the text in documents. Yet another project was COLEX, the aim of which was to advance the development of time-sharing services to provide online access to bibliographic databases.
These projects gave SDC the ability to launch the ORBIT online search service in 1967, a commercial service for information professionals and researchers which enabled them to search through large databases of abstracts of research literature. The project was led by Carlos Cuadra. Just a few months earlier the Information Sciences Group at the Lockheed Palo Alto Research Laboratories, led by Roger Summit, had launched the DIALOG online search service. The focus of this group was more towards scaling up online services and user interface development, and one of its innovations was the display of set numbers at each stage of a query, a forerunner of facet hit numbers in current search applications. However, probably the first public demonstration of computer-based information retrieval was at the 1964 World's Fair with the LIBRARY/USA demonstration.
Other major centres of information retrieval science and application development in the 1960s included the work at Harvard and then Cornell University led by Gerard Salton, though this did not come to fruition until the early 1970s. Probably the most innovative was the work of Donald Hillman at Lehigh University on searching the full text of documents (the LEADER project), but mention should also be made of the SPIRES project at Stanford University (which remains one of the pre-eminent centres of information retrieval to this day) and TIP at MIT’s Lincoln Laboratories. IBM was also very much involved in retrieval research on a global basis, and research into the use of computer applications for law research had been initiated. These and many other projects are described in detail by Bourne and Hahn in A History of Online Information Services, 1963-1976 (Bourne and Hahn 2003), and in addition there is an excellent paper by Hahn (Hahn 1996) based on the research for their book.
The importance of these online services to enterprise search is that they addressed the issues of scaling up the concepts developed in the 1950s and started to pay attention to user satisfaction, the user interface and user support. Probably the first user assessment of an online service was carried out in 1969 by Timbie and Coombs (Timbie and Coombs 1969). It was not until the early 1970s that these services were available in Europe and indeed globally, a problem primarily of low network capacity and very high network access costs. The launch of these services also set a standard for the search experience for a generation of information professionals and researchers that was not challenged until the arrival of Alta Vista and then Google 30 years later. These online services showed that research services could be delivered on demand at the desktop. The next decade was primarily about improving search result relevance and performance.
Borko, H. (1964). Research in automatic generation of classification systems. AFIPS ’64 (Spring): Proceedings of the April 21-23, 1964, spring joint computer conference.
Bourne, C.P. & Ford, D.R. (1961). A study of methods for systematically abbreviating English words and names. Journal of the ACM, 8(4), 538-552. https://doi.org/10.1145/321088.321094
Bourne, C.P. & Hahn, T.B. (2003). A History of Online Information Services, 1963-1976. MIT Press.
Damerau, F.J. (1964). A technique for computer detection and correction of spelling errors. Communications of the ACM, 7(3), 171-176. https://doi.org/10.1145/363958.363994
Hahn, T.B. (1996). Pioneers of the online age. Information Processing & Management, 32(1), 33-48. https://www.sciencedirect.com/science/article/abs/pii/030645739500048L?via%3Dihub
Rocchio, J.J. & Salton, G. (1965). Information search optimization and interactive retrieval techniques. AFIPS ’65 (Fall, part I): Proceedings of the November 30-December 1, 1965, fall joint computer conference, part I. November, 293-305. https://dl.acm.org/doi/10.1145/1463891.1463926
Timbie, M. & Coombs, D. (1969). An interactive information retrieval system – case studies on the use of DIALOG to search the ERIC document file. ERIC Clearinghouse on Educational Media and Technology at the Institute for Communication Research, Stanford University.