Creating ATNs using ANT (a small ATN interpreter implemented in Common LISP)

DEFINING THE LEXICON

Words are added to the known vocabulary using the dw function, as in

    (dw a det)
    (dw the det)
    (dw I pronoun)
    (dw small adj)
    (dw women n (plural) woman)
    (dw list (n v) (transitive intransitive))
    (dw listed v (transitive intransitive pastpart) list)

For each word entry, the format is (dw <word> <category> <features> <root>). The <category> part of the entry may be a single category or a list of categories. The <features> and <root> specifications are optional.

Alternatively, it is possible to define a list of words in a more compact form:

    (setf dictionary '((chase v (transitive))
                       (chased v (transitive pastpart) chase)
                       (ate v (intransitive transitive) eat)
                       ... ))
    (loaddict dictionary)

In this latter form, the variable dictionary is simply set to the list of words and their definitions, and the call to loaddict causes those definitions to be processed by the ATN interpreter.

DEFINING THE NETWORKS

Networks are defined by the ds (define state) function, as, for example,

    (ds statename arc arc arc)
    (ds statename arc arc arc)
    (ds statename arc arc arc)

Alternatively, a variable or variables can be assigned a list of states and their arcs, and the function definenetwork can be called to instantiate those definitions within the ATN interpreter:

    (setq network '( (statename arc arc arc)
                     (statename arc arc arc)
                     (statename arc arc arc) ))

Each state may have any number of arcs. The final state often has only one - a pop "arc", which will be described shortly.

The interpreter provides six varieties of arc. Three arcs, push, pop, and jump, modify the flow of control. Two others, cat and wrd, test the current word; they are taken only when the current word meets certain conditions. The sixth arc, vir, will not be necessary for your assignments.

An arc has the form (<arctype> <arg> <test> <action> ... ). Ordinarily, the last "action" is a specification of the destination - the next state of the network to which control should pass.
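
To make the lexicon format described above more concrete, here is a minimal sketch in plain Common Lisp of one way such entries could be stored and queried. It is not the ANT source: the names *lexicon*, define-word, load-words, word-has-category-p and word-root are hypothetical, and the hash-table representation is an assumption made only for illustration.

```lisp
;; Hypothetical sketch, NOT the ANT implementation.
;; All names below are assumptions introduced for illustration.

(defvar *lexicon* (make-hash-table :test #'eq)
  "Maps a word (a symbol) to a property list: :categories, :features, :root.")

(defun define-word (word category &optional features root)
  "Store WORD. CATEGORY may be one category or a list of categories;
FEATURES and ROOT are optional, mirroring the dw entry format."
  (setf (gethash word *lexicon*)
        (list :categories (if (listp category) category (list category))
              :features features
              ;; With no explicit root, the word serves as its own root.
              :root (or root word))))

(defun load-words (entries)
  "Process a list of (word category [features [root]]) entries,
in the spirit of the compact dictionary form."
  (dolist (entry entries)
    (apply #'define-word entry)))

(defun word-has-category-p (word category)
  "True if WORD is in the lexicon and marked with CATEGORY."
  (member category (getf (gethash word *lexicon*) :categories)))

(defun word-root (word)
  "Return the stored root of WORD, or WORD itself if it is unknown."
  (getf (gethash word *lexicon*) :root word))

;; Usage with entries taken from the examples above.
(load-words '((the det)
              (women n (plural) woman)
              (list (n v) (transitive intransitive))))
(word-root 'women)   ; => WOMAN
(word-root 'the)     ; => THE
```

Normalising a single category into a one-element list keeps the category test uniform for words such as list that belong to more than one category.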
ARCS WHICH TEST THE CURRENT WORD

    (wrd (the a) t (setr det **)(to np1))
    (wrd please t (setr polite t) (to s))

First <test> is evaluated. If it is true, the wrd arc checks whether the current word matches <word>, which may be either a single word or a list. If <word> is a list, then the check is whether the current word is a member of the list. If the check is successful, then the actions are evaluated from left to right.

    (cat adj t (addr adj *)(to np/subj))
    (cat (n pron) t (setr noun *) (to np/head))

A cat arc is taken if the current word is marked in the dictionary with the category <category> - or, if <category> is a list, if the current word is marked with any of those categories. While control is inside a cat arc, the variable * is bound to the root form of the current word. (If you need the actual current word rather than its root form, it is available as the system variable lex.)

ARCS WHICH MODIFY FLOW OF CONTROL

    (push np t (sendr n (getr subj)) (setr subj *) (to s/subj))

The push arc is used to invoke a sub-computation. If the <test> is true, then the ATN interpreter is called recursively, starting in state <state>. All registers at the current level are saved and made invisible to the new level. Before the new level is actually invoked, the list of actions is scanned for any sendr's, which are typically used to set register values in the new level. If the sub-computation fails - that is, does not reach a state with a pop arc - then the push fails. If a pop from the invoked level is taken, then control returns to the push arc, and the variable * is set to the value popped by the returning network. At that time, if there were any liftr actions in the returning network, the values of the affected registers are set, and finally the actions of the arc - other than the sendr's, which have already been executed - are evaluated.

    (pop (buildnp) t (liftr head-noun (getr noun)))
    (pop (buildq (pp + +) prep np) t)

The pop arc returns from a sub-computation.
If <test> is true, any <actions> are evaluated from left to right, and then the value of <form> is computed and returned to the most recent push arc. The <actions> are optional. Under normal conditions a pop arc will not be taken if it would return to the top level while words remain to be parsed in the input.

    (jump s/end t)

If <test> is true, the actions, if any, are executed and the ATN interpreter advances to the state specified by <state>, without advancing the input.

ACTIONS TO MANIPULATE REGISTERS

    (setr verb 'be)

setr assigns a value to a register at the current level. The form is (setr <register> value).

    (addr modifiers *)

addr adds a value to the current value (if any) of the named register. The difference between addr and setr is that a second setr to a register causes the first value to be lost, whereas a second addr leaves both values in the register.

    (sendr noun (getr subj))

sendr is used to set the contents of a register at a lower level, as part of a push arc. In the example shown, subj is a register at the current level, and noun is a register belonging to the subnetwork which is about to get control as a result of the push.

    (liftr subj (getr noun))

liftr is used to set the contents of a register at a higher level. The liftr code can occur in any arc of the network. However, the value of the register at the higher level actually changes only at the time of a pop, so the current network must succeed before a liftr takes effect.

    (getr noun)

getr is the only way to access the contents of a register. Note, for example, that if a register named subject has been filled, then

    (setr object subject)

will not put the current value of subject into register object. The proper code to perform the assignment is

    (setr object (getr subject))

OTHER ACTIONS/FUNCTIONS

    (to* np/det)

Normally this would be found only as the final action of an arc, specifying the destination of the arc. The input sentence is advanced (unless the arc is a successful push arc) and control is passed to the named state.
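
The difference between setr and addr described under ACTIONS TO MANIPULATE REGISTERS can be sketched in plain Common Lisp as follows. This is an illustration under stated assumptions rather than the ANT code: the names *registers*, get-register, set-register and add-register are hypothetical, registers are modelled as a simple association list for a single level, and the choice to add new values at the end of the register is a guess, since the text says only that both values are kept.

```lisp
;; Hypothetical sketch of register behaviour at one level; not the ANT source.

(defvar *registers* '()
  "Association list of (name . value) pairs for the current level.")

(defun get-register (name)
  "Like getr: return the contents of register NAME, or NIL if unset."
  (cdr (assoc name *registers*)))

(defun set-register (name value)
  "Like setr: replace whatever the register held before."
  (let ((cell (assoc name *registers*)))
    (if cell
        (setf (cdr cell) value)
        (push (cons name value) *registers*))
    value))

(defun add-register (name value)
  "Like addr: keep the earlier contents and add VALUE.
Assumption: the register becomes a list with VALUE at the end."
  (let ((old (get-register name)))
    (set-register name
                  (append (if (listp old) old (list old))
                          (list value)))))

;; Two set-register calls: the first value is lost.
(set-register 'adj 'big)
(set-register 'adj 'red)
(get-register 'adj)        ; => RED

;; Two add-register calls: both values are kept.
(add-register 'mods 'big)
(add-register 'mods 'red)
(get-register 'mods)       ; => (BIG RED)
```

The sketch covers only one level. Modelling sendr and liftr would require a stack of such register sets, one per push, which is omitted here.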
    (getf pastpart)
    (getf transitive (getr verbreg))

getf ordinarily occurs only as a test. It expects either one or two arguments. The first argument must be a feature. If the second argument is present, it is evaluated and should yield a word; if that word is present in the dictionary and marked with the feature, then getf returns true. If there is only one argument, then getf checks whether * (the current word) has the feature.

    (cat prep)
    (cat (adv subconj))

cat can also occur as a test. It returns true if the current word belongs to the category or list of categories specified.

    (root (getr verb))

Most of the time the system provides the root form of a word automatically, but if the actual lexeme is, for some reason, the only thing available, this function returns the root of its argument if the dictionary specifies one, and returns the argument itself if no root has been specified in the dictionary.

    (buildq (sentence (agent +) (verb +) (object +)) subj verb obj)

buildq is a function designed to make it easier to create embedded lists to return as values. The first argument is a "form", which normally is a list with arbitrary levels of nesting and with embedded +'s and @'s in various places. Everything in the form is taken literally, except that each time a + is encountered, the content of a register following the form is substituted, and whenever a @ is the first element of a list, the elements of the list (including substitutions for +'s) are appended. Of course, when a @ is used, the elements that are appended must be lists, or the results will look quite strange.

    (tracestate t)
    (tracearc nil)
    (tracereg nil)
    (traceall t)

These functions turn various kinds of tracing on and off. traceall turns all three kinds of tracing on or off.
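
The substitution rules given for buildq can be illustrated with a small stand-alone sketch in Common Lisp. It is not the ANT implementation, and the names fill-form and build-form are hypothetical. Two assumptions are made: values are passed directly instead of being looked up from register names, and the +'s are filled from those values in left-to-right order.

```lisp
;; Hypothetical sketch of buildq-style form filling; not the ANT source.

(defun fill-form (form vals)
  "Return two values: FORM with each + replaced by the next element of VALS,
and the list of values not yet used. A sublist beginning with @ is replaced
by the result of appending its (already substituted) elements."
  (cond ((eq form '+)
         (values (car vals) (cdr vals)))
        ((atom form)
         (values form vals))
        (t
         (let ((splice (eq (car form) '@))
               (result '()))
           ;; Walk the elements, skipping the @ marker itself if present.
           (dolist (item (if splice (cdr form) form))
             (multiple-value-bind (filled remaining) (fill-form item vals)
               (setf vals remaining)
               (push filled result)))
           (setf result (nreverse result))
           (values (if splice (apply #'append result) result)
                   vals)))))

(defun build-form (form &rest vals)
  "Fill FORM from VALS in order and return only the completed structure."
  (values (fill-form form vals)))

(build-form '(sentence (agent +) (verb +) (object +)) 'john 'eat 'apple)
;; => (SENTENCE (AGENT JOHN) (VERB EAT) (OBJECT APPLE))

(build-form '(np (@ + +) +) '(big) '(red shiny) 'ball)
;; => (NP (BIG RED SHINY) BALL)
```

In the second call the two values substituted inside the @ list are themselves lists, so appending them yields a single flat list. Passing atoms there would signal an error in this sketch, which matches the warning above that appended elements must be lists.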