CHAPTER 2
Ambiguity, Pronominal Reference and Concepts
2.1 The Ambiguous 'A'
If the user of our natural language system does not avoid the use of the symbol 'a' as a label for a
point, we may have to deal with the two sentences:
(a) a point fred has coordinates 259, 563.
(b) a has coordinates 259, 563
The system cannot 'know' which of the two sentence forms it is processing after it has read just
the one word 'a'. It could therefore take the wrong arc. If the user has input sentence (b) and the
system has taken the arc S1 -> S2, the error will be discovered when it fails to find the word 'point',
which would enable it to take the arc S2 -> S3. What is required is a mechanism that allows the
system to backtrack to S1 and reinstate the input text to its original condition. Our original ATN for
this was shown in Figure 1.2. The ATN would advance to S2 and S3 if the 'det' arc and the 'point'
arc failed. Unfortunately the 'det' arc will not fail, because 'a' will be accepted as a determiner and
the system will advance to S2. At this point the system will advance to S3. Instead of advancing
to S3 it should fail and restore 'intext'. We do have a general solution for this problem. Note first
that every function produces the results shown below:
In the failure condition the intext is restored. One method of solution is therefore to make each of
the alternative parsings a separate function or sub-function of ptname. Let these be called
ptname1 and ptname2. Ptname1 will assume that 'a' is a determiner and will then fail if the word
'point' is not found. It will exit and restore intext, which will allow ptname2 to try by assuming that 'a'
is a label for a point.
The program in POP11 for 'ptname' utilising these two sub-functions would be:
2.2 The Prolog Equivalent
Earlier, in program 1.4, we suggested that 'isname' could be defined:
isname(W) :- ismemb(W, [p, q, r, tom, dick, harry]).
This needs to be extended so that the list of items for which 'isname' yields 'true' includes 'a'.
The definition of ptname shown in program 2.2 will assume it should test for
the pattern
'a point ... '
failing which it will test for the pattern
'point ... '
and only if this fails will it test for the pattern
' ... '
Automatic backtracking is one of the most powerful features of Prolog and allows such
programs to be written very succinctly.
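A minimal sketch of how the three patterns might be tried in order is given here. It is not program 2.2 itself: the predicate ptname/3, its argument order, the definition of ismemb/2 and the inclusion of 'fred' among the names are all assumptions made for illustration.

```prolog
% Sketch only: ptname/3 and its argument order are assumptions,
% not the book's program 2.2.

% ismemb(X, List): X is a member of List (assumed definition).
ismemb(X, [X|_]).
ismemb(X, [_|T]) :- ismemb(X, T).

% isname/1 extended to accept 'a'. 'fred' is added only so that
% sentence (a) of section 2.1 can be tried.
isname(W) :- ismemb(W, [a, p, q, r, tom, dick, harry, fred]).

% ptname(Input, Name, Rest): Input begins with a reference to the
% point Name; Rest is the input that follows that reference.
ptname([a, point, Name | Rest], Name, Rest) :- isname(Name).  % 'a point ...'
ptname([point, Name | Rest], Name, Rest) :- isname(Name).     % 'point ...'
ptname([Name | Rest], Name, Rest) :- isname(Name).            % '...'
```

With this sketch the first answer to ptname([a, point, fred, has, coordinates, 259, 563], Name, Rest) binds Name to fred. For ptname([a, has, coordinates, 259, 563], Name, Rest) the first two clauses fail to match and the third binds Name to a. Because the input list is never altered, nothing has to be restored when a clause fails; Prolog simply tries the next clause against the same list.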
2.3 Pronominal Reference and the Use of 'it'
The use of the word 'it' to refer to some entity previously mentioned is a common and useful
device in natural language. This device does introduce complications, however. Clearly the
meaning of 'it' varies dynamically with the context of the discourse and so the use of 'it' can never
be allowed in context-free languages. It would be natural, however, for the users of our graphics
system to write a statement such as:
x has the coordinates 123, 456 and y is the point with coordinates 567, 789. Join it to x.
Here there is not much doubt that 'it' refers to the point y, that being the last mentioned entity.
Such a straightforward case can be dealt with by introducing a global variable called 'focus', to
which the corresponding output text is assigned:
[y] -> focus;
This assignment is made as that piece of internal text is created. The value of
focus will therefore change dynamically as processing continues, and at any time focus will
contain the definition of the most recently mentioned point. During processing 'it' should be
replaced by the value of focus. The input text would then read:
let x have coordinates 123, 456.
y is the point with coordinates 567, 789. join y to x.
This simple strategy works well for some examples but goes wrong in many interesting ways.
Suppose our user had written:
let x have coordinates 123, 456.
y is the point with coordinates 567, 789.
join y to it.
Here the word 'it' refers to 'x', because 'y' has already been referenced explicitly and we all know
that it would be inappropriate to ask for y to be joined to itself. Nevertheless 'y' is the last point
mentioned in the previous sentences and so the simple strategy which we outlined above would
interpret 'it' as 'y'.
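The simple strategy, and the way it goes wrong on the second example, can be sketched in Prolog. The predicate name substitute/3 and the representation of a sentence as a list of words are assumptions made for this illustration; the focus is passed as an argument rather than held in a global variable.

```prolog
% substitute(Focus, Words, Resolved): Resolved is Words with every
% occurrence of the word 'it' replaced by Focus.
substitute(_, [], []).
substitute(Focus, [it | Ws], [Focus | Rs]) :- !, substitute(Focus, Ws, Rs).
substitute(Focus, [W | Ws], [W | Rs]) :- substitute(Focus, Ws, Rs).
```

With y as the focus, substitute(y, [join, it, to, x], R) gives R = [join, y, to, x], which is the intended reading. The same focus applied to the second example, substitute(y, [join, y, to, it], R), gives R = [join, y, to, y], which joins y to itself.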
The difference between the two examples appears to be small but is in fact profound. In the
first example the referent for 'it' can be found by the application of a simple rule. In the second
example the true referent for 'it' is determined not by a simple rule but by appealing to common
sense. 'We all know that it is inappropriate to ask for 'y' to be joined to itself'. Common sense of
this kind could be represented by a formal rule:
if [line p q] is a statement in the output text, then p and q must be distinct.
It would not be difficult to write a POP11 function which tested each internal text statement and
returned false if this rule was violated. Next we must allow focus to have more than one possible
meaning, and so instead of assigning the last mentioned point to focus we add that point to the
top of a push-down stack. The interpretation of 'it' will therefore be found on that stack, with the
most likely candidate (the last mentioned) on the top. If substituting that value for 'it' produces a
false condition as tested by the rule above, the system must backtrack and try the next possible
value on the stack.
This argument illustrates clearly a general idea which will crop up many times. Often there is no
clear-cut rule which can be applied to determine the interpretation of a statement in natural
language. Instead we have rules which represent common sense about the world, and various
interpretations must then be tried until one is found that is compatible with these common sense
rules.
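One way of combining the stack of candidates with the common-sense rule is sketched below in Prolog, where backtracking supplies the 'try the next value' step. The predicate names (sensible/1, referent/3, interpret/3), the list form [line, P, Q] for an internal text statement, and the assumption that 'join A to B' translates to a single line statement are all choices made for this illustration.

```prolog
% ismemb(X, List): X is a member of List (assumed definition).
ismemb(X, [X|_]).
ismemb(X, [_|T]) :- ismemb(X, T).

% sensible(Statement): the common-sense rule that the two ends of
% a line must be distinct points.
sensible([line, P, Q]) :- P \== Q.

% referent(Stack, Word, Point): 'it' may stand for any point on the
% stack, tried from the top down; any other word stands for itself.
referent(Stack, it, Point) :- ismemb(Point, Stack).
referent(_, Word, Word) :- Word \== it.

% interpret(Stack, Sentence, Statement): translate 'join A to B',
% rejecting any choice of referent that breaks the rule.
interpret(Stack, [join, A, to, B], [line, P, Q]) :-
    referent(Stack, A, P),
    referent(Stack, B, Q),
    sensible([line, P, Q]).
```

With the stack [y, x] (y mentioned last), interpret([y, x], [join, y, to, it], S) first tries y for 'it', fails the test, backtracks, and yields S = [line, y, x]. For [join, it, to, x] the top of the stack is accepted at once and the result is again [line, y, x].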
Our way of dealing with pronominal reference will undergo further changes as we progress
and find that even our modified way of handling the focus is inadequate. The basic idea of a focus
of attention is, however, fundamental.
2.4 Triangles and Concepts
Let us extend the vocabulary and grammar of the input text so that we allow the user to type
'Draw the triangle p q r'. The word 'triangle' has not been used before, and so we must provide a
definition of it, so that the system 'understands' what we are talking about. That is, the system
must be able to translate the word 'triangle' into the corresponding set of elements of the internal
text. Somewhere we must store the correspondence:
triangle p q r =
line p q
line q r
line r p
We might achieve this by defining a function called triangle, e.g.:
The parameters X, Y, Z are formal parameters which can be replaced by specific values (say p,
q, r).
When the function triangle is called, with the parameters X, Y and Z given values, it will leave
behind as its result three line statements which are valid statements in the internal text. This
function can be used to generate the internal text for any triangle (which could be drawn by the
system) and for that reason we call it a 'generic structure'.
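In Prolog a comparable generic structure might be written as a single fact. This is a sketch under assumptions: the predicate triangle/4 and the list form of the line statements are not taken from the original program.

```prolog
% triangle(X, Y, Z, Lines): Lines is the internal text for the
% triangle whose vertices are X, Y and Z.
triangle(X, Y, Z, [[line, X, Y], [line, Y, Z], [line, Z, X]]).
```

The query triangle(p, q, r, Lines) replaces the formal parameters by p, q and r and gives Lines = [[line, p, q], [line, q, r], [line, r, p]].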
There are two ways of looking at such a structure. The function definition can be regarded as a
representation of the concept 'triangle', or as a way of representing the set of all triangles. These
are not the same thing, and for reasons which will become apparent later we prefer to regard
function definitions with formal parameters as representations of the concepts.
A concept, then, is a generalised idea which acts as a skeleton (generic structure) which can
be fleshed out with specific information to form the representation of a specific entity.
The vocabulary of our NL-system could be greatly enriched by the addition of many such words
which act as the labels for similar concepts (e.g. square, rectangle, ...). These could be added to
the list of objects defined in section 1.11.
2.5 Introducing Colours
The repertoire of the graphics language could also be extended to permit colours to be used. Let
the internal text contain such statements as:
line n x y
where n specifies the colour as follows:
n       0      1    2      3     4
colour  white  red  green  blue  black
Now our user could write 'draw the green line p q' and we could handle this because our output
text would have a specific parameter which would be modified to correspond to the colour
specified.
An appropriate set of ATNs would be:
The conclusion to be drawn from this is that extensions to the vocabulary of the input text must
be matched by extensions to the range of things which can be represented in the internal text.
Understanding language means being able to identify the appropriate representation.
Furthermore, the representation must be such that it can be acted upon. In this example the
microcomputer is able to respond to the representation by drawing lines of the desired colour.
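The colour extension of this section can be sketched in Prolog as a table of facts plus a translation rule. The predicate names colour/2 and translate/2 are assumptions, as is the choice of colour 0 when no colour word is given; that default is inferred from the internal text shown in section 2.6.

```prolog
% colour(Name, N): N is the colour number used in the internal text.
colour(white, 0).
colour(red,   1).
colour(green, 2).
colour(blue,  3).
colour(black, 4).

% translate(Words, Statement): Statement is the internal text for
% the input sentence Words.
translate([draw, the, Colour, line, P, Q], [line, N, P, Q]) :-
    colour(Colour, N).
% No colour word: colour 0 is assumed as the default.
translate([draw, the, line, P, Q], [line, 0, P, Q]).
```

Here translate([draw, the, green, line, p, q], S) gives S = [line, 2, p, q], while a sentence containing an unknown colour word fails to translate.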
2.6 Green Triangles
If we now put together the two complications described in sections 2.4 and 2.5 we get an
interesting problem. Let the user write
draw the line p q.
draw the line q r.
draw the line r p.
make the triangle green.
The last sentence does not refer to any object mentioned in the preceding input text. Instead it
refers to an object the existence of which is only implied by the preceding text. The object
'triangle' does not exist in the internal text prior to the occurrence of the last sentence, and the
mechanism for updating the 'focus' which we described in section 2.3 will only have a stack with
the individual points and lines. If the word triangle had appeared in the input text it would have
been translated by our system into the representation of three linked lines.
In order to interpret the last sentence the system must be capable of identifying the object to
which it refers. This is the converse of the ability to represent a triangle as three linked lines. On
finding the representation of three linked lines the system must be capable of recognising a
triangle by matching this pattern to the known representation of a triangle.
We already have the referent for 'triangle' in terms of the concept definition:
triangle(N, X, Y, Z) -> line N X Y. line N Y Z. line N Z X
and we have a chunk of the internal text:
line 0 p q. line 0 q r. line 0 r p.
We need to be able to match the interpretation of the concept representation to the appropriate
internal text statements and, as we do so, make the instantiations (see Chapter 8) X=p, Y=q, Z=r,
N=0. Note however that the object referred to may be distributed over several parts of the output
text. That is, the three line representations may not be together and may not even be in the right
sequence. We will discuss mechanisms for carrying out such a matching operation in Chapter 8.
In the meantime we note the nature of the problem.
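To make the nature of the problem concrete, here is one possible sketch in Prolog of matching the triangle concept against scattered line statements. It is not the mechanism of Chapter 8; the predicates pick/3, pick_all/2 and find_triangle/5 and the list form of the internal text are assumptions made for this illustration.

```prolog
% triangle(N, X, Y, Z, Lines): the concept definition, with colour N.
triangle(N, X, Y, Z, [[line, N, X, Y], [line, N, Y, Z], [line, N, Z, X]]).

% pick(X, List, Rest): remove one occurrence of X from List.
pick(X, [X|T], T).
pick(X, [H|T], [H|R]) :- pick(X, T, R).

% pick_all(Statements, Text): every statement occurs somewhere in
% Text, in any order, each using a different element of Text.
pick_all([], _).
pick_all([S|Ss], Text) :- pick(S, Text, Rest), pick_all(Ss, Rest).

% find_triangle(Text, N, X, Y, Z): Text contains the three lines of
% a triangle X Y Z of colour N.
find_triangle(Text, N, X, Y, Z) :-
    triangle(N, X, Y, Z, Lines),
    pick_all(Lines, Text).
```

Given the text [[line, 0, q, r], [line, 0, s, t], [line, 0, r, p], [line, 0, p, q]], the goal find_triangle(Text, 0, p, q, r) succeeds even though the three lines are out of sequence and separated by an unrelated statement. With the vertices left unbound, the first answer may be a rotation of the same triangle (for example X=q, Y=r, Z=p). The sketch does not treat line p q and line q p as the same line.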