Introduction

The construction of a computer system which understands natural language is an important enterprise for two reasons:

(a) such a system would enable people to converse easily with a computer system and extract useful work from it without the restrictions imposed by the use of an artificial programming language, and

(b) the techniques involved would provide valuable insights into the mechanism of the human mind.

Both of these claims are contentious. A solution to the problem of natural language processing would not only make it easier for every user to converse with a computer system, but would make it easier for the unscrupulous person (or organisation or government) to invade the privacy of others by automating the task of mass surveillance, which at present is prohibitively labour intensive. It is appropriate, therefore, that we pursue our goal in the public domain so that the social implications can be debated openly and in an informed way. The proper way to prevent abuses of technology is through enlightened political policies and not through an inhibition of technological development.

Many people would object to the suggestion, implicit in our statement of reason (b) above, that a computer system which appears to understand natural language will of necessity do this in a way which is similar to the way humans understand natural language. There are, after all, numerous examples of 'clever' computer systems which carry out their task in a way which has very little similarity to the way humans do the same task. Automatic landing systems for aeroplanes make use of radar beams, and robots which recognise machine parts often rely on sensors placed under the delivery tray to sense the weight of the objects. These use mechanisms which have little similarity to human skills.

We can, however, specify the criteria for the successful landing of an aeroplane in a way that does not involve human judgement. There is a clear goal in relation to which the success or otherwise of an automated landing system can be judged, and any system which achieves the specified goal can be described as successful no matter how it achieves it. We would argue, however, that the task of understanding natural language is somewhat different. Natural language is a product of the human mind as well as its servant. We cannot specify the successful processing of language in a way which does not involve what the human mind does to that language, and so there can be no definition of 'meaning' or of 'understanding' which does not involve the subjective judgement of humans. When humans judge a natural language processing system as 'successful' we submit that what they are saying is 'The observable behaviour of the system is such that it is reasonable to assume that it is doing internally the same things that we do when we understand language. It has formed internally the same or nearly the same functional/logical structures which we form. It is able to make the same deductions and will note the same implications.' That seems to be the assumption which we make about a fellow human being when we say that another person 'understands' something we have said. We have no direct evidence that the minds of other people function as our own do, but alternative explanations for the behaviour of other people seem very far fetched indeed.

Therefore while it may be possible to build a computer system which processes natural language, and responds to it in some interesting way, what it does cannot be described as 'understanding' unless what it does is in some way logically equivalent to what humans do. We speak only of the functional or logical mechanism, however. The actual physical mechanism can be quite different. When we use a data structure to represent some element of meaning in a computer we do not wish to imply that humans will necessarily construct an actual data structure of the same kind, with pointers in the same places and so on. What we do think is that the human mind may have its logical equivalent. That is, it will associate elements of information in the same way using some storage technology of its own and use it for the same purpose.

But the goal of developing a fully functional natural language processing system is an extremely elusive one. Some progress has been made. One can purchase a natural language front-end to a database system, or conduct a helpful conversation with a computerised 'medical consultant'. Computer systems can scan press reports and produce standardised summaries, and crude systems are available which translate from one natural language into another. Full natural language processing which is comparable to human performance has not been attained, however, and is unlikely to be attained in the near future.

We should not be too disheartened or surprised by this. The only surprising thing is that anyone should be surprised by the failure. A typical child learns to speak and understand language gradually over many years, and has at his or her disposal, 24 hours a day, the brain: a computer immensely more complex and powerful than any artificial computer built to date. No one knows how many millions of years the 'program' which executes this process took to be developed. The human mind also has at its disposal perceptual apparatus which provides it with 'experiences' in terms of which its representations can be couched. In a computer system we will need to construct these for ourselves.

In this book we shall try to explain the nature of the difficulties rather than provide a solution to them, but we also hope to provide limited solutions to specific application-oriented problems. We shall begin with the simplest of systems and add complications gradually, so that we can see the limits for particular techniques and why it is necessary to abandon them for new and better techniques. The 'solutions' provided in the early parts of the book will be seen to be inadequate later in the book.

A few years ago most of the literature on natural language processing by computer was contained in technical journals, monographs and research papers which were not always accessible to people who were not members of staff in one of the academic centres for such research. Fortunately this position is changing and a wealth of new books on the topic are now appearing. They still tend, however, to be oriented towards research reports, and reprint these without much analysis or comparison. In this book we have tried to describe different approaches as a coherent progression, characterised by increasing complexity and richness of semantic description, and have not, for the most part, identified particular techniques with particular researchers or systems. To maintain the continuity we have occasionally ignored certain aspects of a particular published system. Since the book is intended for undergraduates, the bibliography is confined to the most accessible sources.

The book is in three parts.

In Part 1 we begin with a very simple system, but one which has a very practical function - a user front-end to a microcomputer-based graphics facility. With this as our basic example we illustrate certain general principles - the need for a formally defined grammar, the need for a formal internal representation, the need for pattern matching, the notion of 'focus', the use of transition networks, and the need for the creation of 'side-effects' and for recursive mechanisms to deal with failure and backtracking. Finally, we try to generalise this work by providing a general strategy for writing an ATN parser and a grammar of English.

In Part 2 we study various approaches which have been tried for dealing with the semantics rather than the syntax of language. We have organised these approaches in sequence of increasing semantic content. We begin with case grammar and end with Schank's Conceptual Dependencies and the use of scripts which, at least in some implementations, almost completely abandon any kind of formal grammatical analysis.

In Part 3 we discuss some of the philosophical issues which we avoided earlier in the book. To prevent this discussion becoming entirely esoteric and unrelated to practical matters, we have also provided a formalism for the representation of meaning. It is incomplete and cannot be considered a solution to the NLP problem, but it allows us to discuss some of the issues in a concrete way within a common representational framework. Our concern here is to provide the reader with an exposition of the problems to be faced and to help him or her avoid the overconfident naivety which has afflicted research workers in this field in the past.

Research in this area is usually justified, in public, in terms of practical applications, and these are real enough. But the true motivation is that this is a fascinating problem, the solution to which seems tantalisingly near and yet just beyond our grasp.

Note

POP11 was chosen as the main programming language for illustrating various algorithms because its syntax is similar to that of Pascal and C, which is not true of either LISP or Prolog, the other two languages which have all of the features required for this type of programming. A brief introduction to POP11 is provided in an appendix.

[NOTE (added in 2004). I have omitted the appendix on POP11. The complete language system for POPLOG (that's pop11 and prolog in one package) is available FREE from www.poplog.org/resources/freepoplog.html]