CHAPTER 8
Pattern Matching
8.1 Pattern Matching in General
Pattern matching is a process with applications in many fields including natural language processing. One particular need is to identify the patterns of word types in order to recognise the grammatical structure of a sentence. Another is concerned with the matching of structures to identify the target of some reference. These needs have emerged from the discussion in Chapters I to 7 in this book, and it is now time to take stock and establish some techniques which can be used to satisfy these needs.
The need for pattern matching is important, and is so general in artificial intelligence work that considerable effort has been expended to create special facilities for the process. Indeed, Prolog might be described as a language specifically designed for pattern matching. Some languages, such as POPll, have powerful pattern matching facilities. What we intend to do here is to examine some primitive facilities which illustrate the basic principles, and to develop a suitable set of algorithms for processing and matching patterns in a perspicuous way. It is not intended that these facilities should supplant the much more efficient facilities already available.
We shall first discuss the problem of testing items for simple equality, and then for a more general and arbitrary 'correspondence'. The reader will be invited to code these functions and to use them as the basis for the development of a family of functions able to test items, or 'terms', and check if they have particular properties, or if two such terms 'agree' with respect to particular properties.
Having established such a family of functions we can then consider how we might re-implement the algorithm for an ATN parser in a way that will not offend structured programming enthusiasts and yet will retain a close affinity with the original ATN diagrams. The ATN algorithm will depend upon our family of property testing functions, and will in addition be able to introduce 'agrr_ement testing'. That is, we shall be able to introduce tests which ensure that adjective and noun agree with respect to number and countability, that subject and verb agree with respect to number and person, and so on.
We shall then consider the mechanisms underlying the use of 'formal parameters' as a pattern matching facility. This will allow us to assert a temporary (or trial) correspondence between items during an attempt to match two arbitrary data structures, a process often called 'unification'. It is important that such trial correspondences can be undone automatically when a match fails, so that alternative trial correspondences can be examined. The provision of such facilities is one of the main attractions of the language Prolog. Such facilities will allow us to examine the internal representation and find there examples of particular structures by matching them with generic structures which are defined by means of formal parameters.
8.2 Equality, Correspondence and Agreement
Equality
The basic function used in a pattern matching procedure is that which determines equality between two entities. We can say that two variables are 'equal' because they have the same value assigned to them. We could (alternatively) say that they are 'equal' because they are of the same type, or because they have the same relationship to some other variable (e.g. members of the same list) and anyone of a host of other possibilities. The point is that we must define what we mean by 'equality', and it is open to us to define it in any way we wish. It is always possible to write a function equal(X,Y) which will return the result 'true' under any conditions we choose.
At the simplest level we can have a POP11 function defined thus:

but this is appropriate only for simple numerical variables or (if x and y are pointer variables) they will be found to be equal if and only if they 'point' to the same address. This is a useful but limited test. We could improve it by writing a function which would follow all address pointers until it reached the final reference addresses for both x and y, and then compared these. This process is known as 'de-referencing', but such a function, although useful, does no more
than test for a basic kind of equality.
Correspondence
A more powerful test is one which tests the correspondence of two entities in an arbitrary list of pairs. Consider a list with the structure:
[ [a b] [c d] [e f] ... [y z] ] -> corresp..list;
We can then define a POP11 function which will test any two items, say 'p' and 'q', to see if the pair [p q] appears anywhere in corresp..list. Let us call this function 'corresp' to distinguish it from the simpler equality testing function above. The reader is invited to write and test this function.
The function is particularly powerful because we can then establish a correspondence between any two entities in an arbitrary way to suit our own purposes. It is also possible to establish that (for example) 'x' corresponds to both 'y' and 'z' without 'y' and 'z' corresponding to each other. It is also the case that if 'x' corresponds to 'y' and 'y' corresponds to 'z' we cannot assume that 'x' will correspond to 'z'. Furthermore 'x' does not necessarily correspond to 'x', and if 'x' corresponds to 'y' we cannot assume that 'y' corresponds to 'x'. That is, 'corresp' defined this way does not automatically have the properties of reflexivity, transitivity, associativity and symmetry unless we take the trouble to build these properties into the definition.
In Prolog we can at any time assert that two entities are equal, or that they correspond, and later retrieve that information as a fact. Again the properties of transitivity etc. must be defined explicitly if they are desired.
Has...Properties
An even more general function makes use of a list structure which we shall call a 'dictionary'. The dictionary has the structure.:

Here 'term' is used to indicate a 'word' or 'lexical item', and 'property' is used to indicate a short list structure with the form:
[propertY-Ilame property_value]
For example one entry in our dictionary might be for the term "man":

The list of properties associated with it tells us that it has the grammatical property 'noun', that it is of ntype (noun type) 'countable', that it has the number 'singular', that it is a 'physobj' (physical object) entity, and that it is 'animate'.
Now we can define a function has_prop(t,v,p) which tests to see if the term 't' has the property_value ''v'' associated with the property-name 'p'. For example, we might enquire:
has..prop("man" ,"noun" ,"gram ")=>
which asks 'is the word "man" classified grammatically as a noun?' Again the reader is invited to write and test such a function.
Such a function has great power, and can be used as the basis for a whole family of functions designed to test various aspects of the properties associated with various terms. For example we can now define a grammatical classification testing function:
has_prop(%"gram"%) -> is_gram ;
The function is then identical to the function haLprop with its third argument 'frozen' at the value 'gram'. It can be used in the form:
is_gram("man","noun");
which is identical to: has-prop("man" ,"noun", "gram"); We can also freeze the last two arguments thus:
has_prop(% "noun" ,"gram"% )->is_noun;
which is then used in the form is_noun("man").
Agreement
We can also use the function has-prop as the basis for testing two terms for agreement. The reader is invited to write and test a function agree-.prop(tl,t2,p) which will yield 'true' if and only if the terms 'tl' and 't2' have the same property_value with respect to the property_name 'p'. For example
agree_prop("man" ,"dog" ,"gram");
meaning 'do the words "man" and "dog" have the same grammatical classification'? Such a function can then be used to create a family of functions which test for grammatical agreement, or number agreement, or ntype agreement (e.g. both countable or both mass). In this way we can ensure that expressions such as 'few air' or :less cows' do not pass unnoticed.
Multiple Property Agreements
A still more powerful version of this function, agree all(tl,t2,pl); will test two terms 'tl' and 't2' for agreement with respect to a whole list of property names (PI). For example:
agree_aII("man " ,"dogs" ,[gram num anim));
would yield the result 'false', because although the words 'man' and 'dogs' are both nouns and are both animate, 'man' is singular while 'dogs' is plural. That is, they agree with respect to two of the properties on the list but not with respect to all of the properties. It is arguable that that this is too harsh a test, and that the terms should be given the benefit of the doubt if one or both of the terms has (or have) no value for a given property name. This relaxation of the strict rule would allow us to omit a property from a property list if that particular term could have any of the values normally associated with a given property. For example, some adjectives are suitable for use with both countable and mass nouns. Another variation would compare values for the property ,'person' where these values were given as an octal digit, as described in section 6.5.
Best Fit Agreement
An even greater relaxation of the agreement rule might allow some nonagreement between words. In these cases it would be best for the function to yield not an absolute 'true' or 'false' but a metric which measures the degree of agreement. A commonly used metric is given by:

where A is the number of properties for which there is an agreement and B is the number of properties in the list of property names to be tested.
With such a function we could think of testing several different interpretations of an ambiguous word, and accepting the interpretation which produces the best agreement figure. We shall not pursue this idea any further.

8.3 An improved ATN-based Parser
Earlier we provided examples of programs which represented an implementation of the ATN diagrams. The programming style was simple and direct but because it was based on the use of labels (to represent each node) and GOTO statements (to represent each arc) it was not structured, and therefore not suitable for the development of large and more complex ATNs. Here we will develop a better structure for these programs and at the same time we will try to preserve the close correspondence with the ATN diagrams which was the merit of the previous version of the programs.
The basic unit of any ATN-based algorithm is the 'node-test-and-branch'.

This was represented by an 'IF-then-GOTO' construct. One would like to replace this by a construct such as:

Unfortunately nodes 1,2 and 3 do not represent unique and separate processes, since the pathways onwards from these nodes are not necessarily distinct. If we left the algorithm as shown above we would need to write the as a complete procedure in its own right which was capable of processing all possible continuations from that point onwards. In many cases, however, the continuation from two different nodes will come together and become identical. We will have lost the advantage of sharing code for such cases and have a combinatorial explosion on our hands. If they do combine, however, and there is a failure at some later node, the procedure would not know which one of several possible starting points to backtrack to. We shall call this 'the continuation problem'.
One solution is to write the algorithm as a single function in which the node identification is a parameter. We can then call the function itself (recursively) with the continuation node as an argument. The function below has an argument 'node-id' which is a numerical flag indicating the node concerned. The different values which it may have are represented symbolically by , , etc.

Computer scientists will recognise this as a version of the Ashcroft-Manna technique for the conversion of an unstructured program to a structured one.
Dealing with Back-tracking
The trouble with this algorithm is that if node_id=(node_0) and the (condition-1) test is successful then control will pass to the
continuation associated with its arc irrevocably, and will not back-track and try other possibilities - say (condition_2) - if it fails
at a later point. An additional variation in the algorithm cures this fault and provides for automatic backtracking in the event of
subsequent failure, thus:

Notice that what had been the continuation, the 'parse(tl(intext),node_id)' expression,
has now been included (by conjunction) in the condition test, and must therefore be defined so that it yields a truth value.
This construct ensures that at any node a test for a successful continuation is included in the condition test before an arc is selected.
If the continuation fails at some later point, there is a back-track from that arc and the intext is restored to its original status.
In effect the traversal of the arc, and of all subsequent arcs from then onwards, is tested before the algorithm commits itself to that arc.
Dealing with Side-effects
We now turn to the problem of generating side-effects, which is crucial to the construction of the internal text (or representation). There are two problems about handling side-effects properly.
The first is that whatever side-effects are created must be deleted if the parse fails (i.e. the back-tracking problem again). The second is that the generation of a side-effect often requires information from more than one level of the parse process. For example, when we are dealing with a single word such as 'dog' we
will be able to discover from the dictionary that it is a physical object, that it is animate, is singular, is a four-legged animal, and so on. What we will not know at that level of the analysis is whether it is the active agent or the passive recipient of some action. We do not know whether it is the subject or the object of the sentence. That kind of information is only available at a higher level, when we are processing the pattern of the sentence as a whole. Without such information we cannot ensure that the subject of a sentence agrees with the verb with respect to number and person. The test for agreement is therefore impossible at the level of tracing arcs associated with particular word types, and yet it is at this level that the information is available about the number and person of the nouns and verbs.
The basic trick to overcome these difficulties is to let each level in the parsing process generate its own 'local' side-effect (which we shall call a 'register'). The exact form of this register is not important at present. These local registers should then be passed, as the results of each sub-process, back to the parent process which calls them, and which is dealing with the higher level of the analysis. It can then accept the information provided by its subordinate processes and slot these results into its own register. If a subordinate process fails, however, the higher process will ignore the local registers supplied by these failed sub-processes. The mechanism requires each process at each level to create its own register rather than simply updating an existing global register (or 'blackboard').
The function 'parse' should leave three results:


The sub-processes pass back their registers and these are combined to produce an overall register.
Fig. 8.1. The traversal of an ATN with side-effects.
Figure 8.1 illustrates the traversal of an ATN which we shall call ATN-l. ATNI is a high level ATN in that its arcs are labelled by the names of lower level ATNs. To traverse the arc (AI)-(A2) successfully it must call the function ATN2. It passes down to this function the text of the sentence to be parsed.
To complete the traversal of ATN-2 successfully the two arcs (BI)-(B2) and (B2)-(B3) must be traced.
In order to traverse the first of these, (B1)-(B2), a test for the word 'he' must be applied,
and if successful a local register (reg-B12) is updated with the information '3rd person, singular'.
This is passed back to the parent function ATN-2 together with the result 'true' and the remainder of the
sentence with 'he' bitten off. ATN-2 then tries the second arc (B2)-(B3). To do this a test for the verb 'goes'
is applied, and if successful the local register reg-B23 containing the information '3rd person, singular'
is passed back to ATN-2 along with 'true' and the remainder of the sentence with 'goes' bitten off.
ATN-2 can then test the two registers reg-B12 and reg-B23 for agreement, and if it finds agreement it passes back to ATN-I its register reg-A12 containing the information '3rd person singular' together with 'true' and the remainder of the text with both 'he' and 'goes' bitten off. If ATN-2 fails in any of these tests it passes back the results 'fail', 'nil' and the text in its original form. All the registers reg-AI2, reg-Bl2 and reg-B23, being local to ATN-2 and its subordinate functions, are wiped out as though they never existed.
Note that the results left by these functions are the same three results which we used in the previous version of the ATN algorithm except that we have substituted 'register' for 'internal text'. The register, being a data structure of some kind, could contain the internal text as one of its elements.
Another factor to be watched is that the function places the results on the stack in the correct sequence so that the truth value lies
under the other two. This means that we can remove the other two and assign these to some local variable for subsequent processing,
and still leave the truth value on the stack for the condition test. We can then write:
if (parse(intext,node - id)- >remainder - >register) then ....
and go on to process remainder and register as we wish.
This makes it possible to write:

Note that the call of 'parse' is processing the remainder (rem1) left by the 'arc1_test' function.
As an illustration (without agreement testing) let us define an ATN for the simple grammar:

where s = sentence, np = noun phrase, det = determiner, adj = adjective, vg = verb group, vp = verb phrase,
pp = prepositional phrase, aux = auxiliary verb, ? = optional, * = possibly repeated.
Let the register be a simple list. Let the update consist simply of appending a sublist with structure
[(property) (data)] to the list, where (property) is a name such as 'agent', 'object' or 'instrument'
and 'data' is a word from the input text. More complicated registers could be constructed by using
information from the dictionary structure discussed in section 8.2.
The ATN for the parsing of a noun phrase (np) is:


Pattern Matching
The ATN for a verb group (vg) is:


The ATN for a prepositional phrase is:

In this case the node identifier is redundant and we include it simply to
maintain a constant function format.

The ATN for a verb phrase is:


The ATN for a sentence is:


8.4 The Structure of a Dictionary
The following structure illustrates the possible contents of a usable dictionary.

We have given it the name 'gvar_dict' for 'global variable dictionary'.
8.5 Unification
Consider the two patterns '(a+b)-c' and '(x+y)-z'. We can all see that they correspond as patterns,
even although the individual elements are not identical.
We can make a temporary correspondence or substitution (a to x), (b to y) and (c to z) and
declare the patterns to be equivalent. Afterwards a, b, c, x, y, z resume their anonymity.
They have no identity in these patterns except as place holders.
In some patterns an anonymous identifier might occur in more than one place.
In such cases the substitution must be uniform. If we substitute an 'a' for an 'x' at one point
in the pattern we must not substitute another identifier for 'x' anywhere else in the pattern.
When two identifiers have been made to correspond in this way we shall say that
they have been 'unified'.
Consider now the patterns:
(a+3)-c and (x+4)-z
An attempt to match these patterns would succeed only if 3=4.
We might deem this inappropriate, and we will normally operate with the rule that fixed values cannot be unified.
If, however, we had the patterns:
(a+3) - c and (x+y)-z
we might quite reasonably decide that y=3, in which case we would say that 'y' was 'instantiated to 3'.
Instantiation can be regarded as the temporary assignment of a value to a variable.
In Prolog, identifiers which are variable names are distinguished from others by being
written in upper case characters, or at least having the first character of their name in upper case.
We shall adopt the same idea here, and note in passing that the 'destword' function in POPll which was
discussed in section 5.2 provides us with a mechanism for recognising such variable names so long as
they are POPll 'words' (see Appendix). Any variable which is capable of being instantiated will be
called a 'formal parameter' or just 'a formal'. Other identifiers are assumed to be fixed values just like numbers.
Unification and instantiation are easy to implement, but there is a complication. After a matching procedure has been
completed, or if it is abandoned at some intermediate point because of some evident failure to match, all unifications
and instantiations which brought about that matching procedure must be deleted to restore the formals to their unadulterated state,
and make them available for any alternative unification or instantiation which might be required of them. Note that it is not all
unifications and instantiations which must be deleted, just those established within the context of the process just completed or abandoned.
This is the problem of back-track yet again.
For the purposes of illustration we shall now invent a simple unification/instantiation mechanism.
In later sections we shall use it to develop some interesting matching algorithms.
The basis of our approach is a record structure which we shall call 'x_formal'. This is defined in POPll:
recordclass x_formal f_label f_val f_ref;
That is, it has a name 'f_formal' and three subelements.
'f_label' is the first element, and it has as its value a word which is the identifier
(upper case) for the formal in question.
'f_val' is the element which stores the value of the formal, i.e. to which it is
instantiated (or 'undef' if it is not instantiated)
'f_ref' is a pointer field to other formals with which it has been unified. This
should also be initialised to 'undef'.
Next we define a function 'f_unify' which unifies a pair of formals (fl and f2)

This function assumes that each of the two formals is not already unified. The function should contain a test to check that
each has the value 'undef' assigned to its pointer field. If f-unify is able to unify fl and f2 it should return the result
'true', and if it fails it should return the result 'false'. We leave these modifications to the reader. Next we define a
function which instantiates a formal (f) to a fixed value (x).
define f_instant(x,f); x->r.f_val; true; enddefine;
Again this should test the arguments to make sure that they are of the correct
type and only return the result 'true' if instantiation is achieved.
Next we define a function 'L.match(x,y)' which examines and compares two arbitrary arguments.
If they are both fixed values the result is the result of the test (x=y).
If one is an uninstantiated formal and the other is a fixed value, it performs the 'f_instant' function on them.
If both are non-unified formals it performs the 'f_unify' function on them. If both are unified formals
it returns 'true' if they are unified to each other. If both are instantiated it returns 'true'
if they are instantiated to the same value. The result in every case is a truth value.
We leave the reader to implement this function.
Lastly we must implement a function which will de_instantiate and de_unify any formal:
define f_release(f): "under' ->f.f_ref; "under' ->r.f_val; enddefine;
8.6 Matching Structures with Unification
We can now apply the functions we have defmed in section 8.5 to the problem of matching lists of elements.
Let the two lists be listl and list2 such that:
listl = [line X Y] list2 = [line 123 456]
X and Y are formals so that list1 is part of a defmition of something and list2 is part of an actual bit of internal text.
We want to see if the internal text 'fits' the definition. What we would like to write is:

But this would fail to re]ease the formals which had been bound during an unsuccessful attempt to match.
It also assumes that the two lists are of equal length.
Therefore the function must, at the very beginning, test the lengths of the lists and fail if they are of unequal lengths.
It must also construct a list of a1l formals which it binds as it goes along.
If the function fails, it should release all formals which it has bound (by applist(bound_list, releas)) and return nil; false.
If it succeeds, it should return its list of bound formals and 'true'. It goes something like this:

Let us now take the matching process a step further by trying to match the definition list [line X Y] with one of a list of lists.
This time the function should return three results: the list of bound formals, the address of the matching sublist in 'struct' and
the truth value indicating success or failure. If it fails, the value of
the second result is 'undef' and the list of bound formals is nil.

We leave to the reader the exercise of writing the function which completes the set.
It will compare a structure to a structure. Note that it must retain the bound list of
formals from each successful match found and release them all if a failure occurs at any time.
8.7 General Unification
In section 8.5 we assumed that fixed values would never be unified.
We also assumed that formal parameters would be unified only two at a time.

These restrictions are not necessary, although they are appropriate in many circumstances.
A more general approach to unification would not have these restrictions and it is of interest to see how it might be implemented.
Previously the formal parameters each had a single pointer field which was either 'undef' or pointed at the single formal
with which it had been unified. An alternative would be to introduce a third structure to link the two (or more) elements.
The arrangement is shown diagrammatically in figure 8.7.
Here X and Y are both elements which have been unified and the 'unifying structure' is the third structure introduced
when the unification is established. X and Y both carry pointers to this link record, and the link record has a pointer
to each of the unified elements. These pointers can be held in a list so that more than two elements can be unified by a single link.
Multiple unification is very hard to handle correctly and it is usually inappropriate anyway. The mechanism suggested,
however, has one important advantage over the simple mechanism suggested in 8.5. The link record can be used to store
properties of the link itself. In section 23.3, for example, we will suggest that a 'possible' flag be appended to an
instantiation, and in section 25.5 we will have need of a similar arrangement.