
As described here (Khayrallah, Trott, & Feldman 2015), the Specializer receives a SemSpec as input and produces an n-tuple as output. An n-tuple contains task-specific semantic information, is focused around action specifications (e.g., move, push, etc.) and their parameters, and functions as a shared communication language between all agents in our Natural Language Understanding system. In terms of implementation, n-tuples are JSON structures mapping shared keys to values.

Below is an n-tuple for the sentence "John saw the box."; note the similarities to the SemSpec above.

(N-tuple for the sentence "John saw the box.")
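For illustration, such an n-tuple might look roughly like the following Python dictionary. This is a sketch only: the actual keys and values are determined by the mood and event templates, so names like predicate_type and perceive below are assumptions rather than the system's literal output.

```python
# Illustrative sketch of an n-tuple for "John saw the box."; the key names
# (predicate_type, eventDescriptor, ...) and values are assumed, not the
# literal output of the Core Specializer.
ntuple = {
    "predicate_type": "declarative",
    "eventDescriptor": {
        "eventProcess": {
            "actionary": "perceive",    # assumed actionary for "saw"
            "protagonist": {"objectDescriptor": {"type": "person",
                                                 "referent": "John",
                                                 "givenness": "uniquelyIdentifiable"}},
            "content": {"objectDescriptor": {"type": "box",
                                             "givenness": "uniquelyIdentifiable",
                                             "number": "singular"}},
        }
    },
}
```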

The Core Specializer uses n-tuple templates to determine which aspects of the SemSpec to extract. More information about the actual design of n-tuple templates can be found on the page describing the Core Communication Modules. This section is dedicated to describing the process by which the Core Specializer produces an n-tuple, and the methods used to do this.

The Core Specializer file can be found here, and contains additional documentation on the methods.

Below is a description of the most important methods in the CoreSpecializer, as well as a walkthrough of how an n-tuple is produced for the sentence "John saw the box."

## Relevant methods

### specialize(self, fs)

This is a method on the CoreSpecializer class that takes a SemSpec, or "Feature Structure" (fs), as input and outputs an n-tuple. Of course, a considerable amount of processing goes on between the call to specialize and the output of an n-tuple.

First, the Core Specializer checks whether the SemSpec is an utterance with discourse information; if it's not (e.g., a sentence fragment like "the red box"), the Specializer calls specialize_fragment (see below), and produces a fragmented n-tuple.

Otherwise, the Specializer identifies the "mood" of the utterance using Discourse information (e.g., "Declarative", "Imperative", etc.), identifies the corresponding mood template, then routes the "content" of the utterance to the specialize_event method.
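A minimal sketch of this dispatch, with the SemSpec accessors (has_discourse_info, fs.m.mood, fs.m.content) treated as assumptions about the feature-structure API:

```python
def specialize(self, fs):
    """Sketch: convert a SemSpec (feature structure) into an n-tuple."""
    if not self.has_discourse_info(fs):              # assumed helper
        # Sentence fragments like "the red box" get a fragment n-tuple.
        return self.specialize_fragment(fs)
    mood = fs.m.mood.type()                          # e.g. "Declarative", "Imperative"
    ntuple = dict(self.mood_templates[mood])         # copy the mood template
    # Route the utterance's content to specialize_event.
    ntuple["eventDescriptor"] = self.specialize_event(fs.m.content)
    return ntuple
```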

### specialize_event(self, content)

This takes in an EventDescriptor as input, and produces an n-tuple describing that event. Again, this consists of multiple component steps, but at the highest level, the method identifies the corresponding template for the type of EventDescriptor using the event templates. In most cases, this is a normal EventDescriptor, but in the case of conditional statements, it is a "ConditionalED".

Then, for each key/value pairing in the event template, the Core Specializer calls fill_value to fill in the template.
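A minimal sketch of that loop, assuming content.type() names the EventDescriptor subtype and event_templates maps those names to template dictionaries:

```python
def specialize_event(self, content):
    """Sketch: build the event portion of the n-tuple from an EventDescriptor."""
    # Pick the event template by EventDescriptor subtype, e.g. a plain
    # "EventDescriptor" vs. a "ConditionalED" for conditional statements.
    template = dict(self.event_templates[content.type()])   # assumed accessor
    for key, value in template.items():
        # Each template entry tells fill_value how to extract that key.
        template[key] = self.fill_value(key, value, content)
    return template
```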

### fill_value(self, key, value, input_schema)

This is one of the most important methods in the Core Specializer, since it defines the procedures by which the declarative templates guide the Specializer's actions. The method takes as input a key name, the template value, and the schema from which to extract the information. A series of conditions is then evaluated: the template value determines how the final output for this key is represented in the n-tuple, while the key corresponds to the same-named role in the schema.

For example, if the key is "eventProcess", and the value is the dictionary...

{'parameters': 'eventProcess'}

...the CoreSpecializer knows to call the fill_parameters method (see below) on the contents of the eventProcess role.

If the key is "protagonist", and the value is the dictionary...

{'descriptor': 'objectDescriptor'}

...the CoreSpecializer knows to call the get_objectDescriptor method (see below) on the contents of the protagonist role.
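A minimal sketch of this dispatch, assuming the input schema exposes its roles as attributes; only the two template-value shapes shown above are handled:

```python
def fill_value(self, key, value, input_schema):
    """Sketch of the template-driven dispatch in fill_value."""
    role = getattr(input_schema, key, None)    # role with the same name as KEY
    if role is None:
        return None
    if isinstance(value, dict):
        if "parameters" in value:
            # e.g. {'parameters': 'eventProcess'} -> fill the process parameters
            return self.fill_parameters(role)
        if "descriptor" in value:
            # e.g. {'descriptor': 'objectDescriptor'} -> build an object descriptor
            return {"objectDescriptor": self.get_objectDescriptor(role)}
    # Simple values are copied through (simplified).
    return role.type() if hasattr(role, "type") else role
```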

### fill_parameters(self, eventProcess)

This method identifies the corresponding parameter template for the input eventProcess. Typically, these describe subtypes of the "Process" schema, but parameter templates can also be used for other schema families. If no corresponding template is found, a parent schema/template (such as "Process" for "MotionPath") is used.

The method then accomplishes two primary tasks:

  1. Fills in template: For each item in the template, the method calls fill_value (above).
  2. Inverts pointers: the method also searches for modifiers of the eventProcess, such as adverbs or prepositional phrases (as in "he ran for 2 hours"), inverts these pointers, and incorporates that information into the resulting n-tuple (see the sketch below).
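A minimal sketch of both steps; parameter_templates, get_modifiers, and the .type() accessor are assumptions about the surrounding code:

```python
def fill_parameters(self, eventProcess):
    """Sketch: fill the parameter template for a process and fold in modifiers."""
    # (1) Look up the parameter template for this process type, falling back
    #     to a parent schema such as "Process" when there is no exact match.
    name = eventProcess.type()                                # assumed accessor
    template = dict(self.parameter_templates.get(
        name, self.parameter_templates["Process"]))
    for key, value in template.items():
        template[key] = self.fill_value(key, value, eventProcess)
    # (2) Invert pointers: gather modifiers (adverbs, PPs like "for 2 hours")
    #     that point at this process and merge them into the result.
    for modifier in self.get_modifiers(eventProcess):         # assumed helper
        template.update(modifier)
    return template
```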

### get_objectDescriptor(self, item, resolving=False)

This method identifies the corresponding descriptor template (in this case, the objectDescriptor template). In our system, objectDescriptors are general descriptions of referents in the SemSpec (an "RD", or "Referent Descriptor"). The Core Specializer has no world model, so it can't actually determine a real-world referent, but it can package the information in a simple, accessible way, so that the Problem Solver can determine the real-world referent (or, in some cases, request clarification).

Besides simply filling in the values from the RD (ontological-category, givenness, gender, etc.), this method performs two key functions:

  1. Pointer inversion: it crawls the SemSpec and finds modifiers, such as adjectives or prepositional-phrases, that point to a given RD, and then incorporates this information into an objectDescriptor.
  2. Referent resolution: in the case of a pronoun or one-anaphora, the Core Specializer searches through its stack of previous referents, and attempts to unify the current objectDescriptor with a previous referent.

For example, "the red box" might output an objectDescriptor that resembles the following:

{color: red,
 type: box,
 givenness: uniquelyIdentifiable,
 number: singular}
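A minimal sketch of how such a descriptor might be assembled; the RD role names and the invert_pointers helper are assumptions:

```python
def get_objectDescriptor(self, item, resolving=False):
    """Sketch: package an RD into an objectDescriptor."""
    descriptor = {
        "type": item.ontological_category.type(),    # assumed role names
        "givenness": item.givenness.type(),
        "number": item.number.type(),
    }
    # Pointer inversion: pull in adjectives / prepositional phrases
    # that point at this RD.
    for modifier in self.invert_pointers(item):      # assumed helper
        descriptor.update(modifier)
    # Referent resolution: pronouns and one-anaphora are handed off so they
    # can be unified with a previously mentioned referent.
    if item.referent.type() == "antecedent" and not resolving:
        descriptor = self.resolve_referents(descriptor)
    return descriptor
```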

### specialize_fragment(self, fs)

This specializes the SemSpec for a sentence fragment, such as "the red one", or another non-discourse utterance. The Core Specializer has procedures built in for the majority of the potential meanings in the core grammar. However, system integrators might want to subclass the CoreSpecializer and extend this method to cover domain-specific meanings as well.
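As a sketch of that kind of extension, a hypothetical robotics subclass might look like the following; RobotSpecializer, the import path, and the "Landmark" schema are illustrative, not part of the core system:

```python
from nluas.language.core_specializer import CoreSpecializer   # assumed import path

class RobotSpecializer(CoreSpecializer):
    """Hypothetical domain-specific subclass."""
    def specialize_fragment(self, fs):
        if fs.m.type() == "Landmark":                # assumed domain-specific schema
            return {"fragment": "landmark",
                    "descriptor": self.get_objectDescriptor(fs.m)}
        # Fall back to the core behavior for everything else.
        return super(RobotSpecializer, self).specialize_fragment(fs)
```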

## Other Core Features

The Core Specializer is also fitted with several other important features that aid in n-tuple building and generalize across domains.

### Coreference Resolution

As mentioned above, the Core Specializer is able to handle basic coreference resolution, both within and across utterances. The latter is particularly important for dialog systems with an autonomous agent, in which the human user shouldn't have to repeat the full description of an object with every reference.

#### Key methods

resolve_referents(self, item, antecedents=None, actionary=None, pred=None)

Takes in an objectDescriptor (ITEM) with a referent of "antecedent". If no other arguments are passed, the Core Specializer defaults to its _stacked attribute, a stack of previously mentioned objectDescriptors. Alternatively, the caller can pass in a list, as is the case when the referent is "addressee" (e.g., "you"); a separate stack of addressees is also maintained for simple discourse analysis.

Recovering the referent consists of several steps:

  1. Pop the most recent referent from the _stacked list.
  2. Check if this new referent is compatible with the pronoun using the compatible_referents(self, pronoun, ref) method.
  3. (Optional) Check if this new referent is compatible with the ACTIONARY passed in.
  4. If the referent is compatible, clean the object descriptor and return it.
  5. Else, repeat steps 1-4 until a compatible referent is found. If none is found, return the descriptor for the original pronoun; the Problem Solver may still be able to find a referent down the line.
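A minimal sketch of this loop (it walks the stack from most recent to oldest rather than destructively popping, and compatible_with_actionary and clean_referent are assumed helpers):

```python
def resolve_referents(self, item, antecedents=None, actionary=None, pred=None):
    """Sketch of the referent-recovery loop described above."""
    stack = antecedents if antecedents is not None else self._stacked
    for ref in reversed(stack):                       # most recent referent first
        if not self.compatible_referents(item, ref):
            continue
        if actionary and not self.compatible_with_actionary(ref, actionary):
            continue
        return self.clean_referent(ref)               # assumed cleanup helper
    # No compatible antecedent: return the pronoun's own descriptor and let
    # the Problem Solver attempt resolution later.
    return item
```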

compatible_referents(self, pronoun, ref)

This returns True if PRONOUN and REF are compatible, and False if not. Two object-descriptors are compatible if all of their key/value pairs are compatible in the language ontology (excepting the "referent" value).
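A minimal sketch of that check, with ontology_compatible standing in for whatever lookup the language ontology actually provides:

```python
def compatible_referents(self, pronoun, ref):
    """Sketch: two objectDescriptors are compatible if their shared keys agree."""
    for key, value in pronoun.items():
        if key == "referent" or key not in ref:
            continue
        # ontology_compatible is an assumed hook into the language ontology
        # (e.g. "box" should be compatible with "container").
        if not self.ontology_compatible(value, ref[key]):
            return False
    return True
```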

### Resolving Semantic Incompatibilities

The Core Specializer also resolves certain semantic incompatibilities that the ECG Analyzer is unable to eliminate. Hypothetically, an ECG Grammar could be devised to eliminate these issues, but previous attempts have resulted in overly complex grammars that lose much of the compositionality that makes ECG so powerful.

#### Properties and Predication

Thus far, these incompatibilities have concerned the relations between subjects and their predication. Consider the following sentences:

The box weighs 2 pounds.
The box is 2 pounds.
The weight of the box is 2 pounds.

Though these sentences are all grammatically distinct, they convey similar semantics, and ultimately the n-tuples are quite similar (if not exactly identical). The third sentence, however, is difficult to encode properly in the grammar except by using what we call “Modifier-PP” constructions: a class of constructions that express an object’s property with the following syntax:

PROPERTY-NOUN [OF] NP

One of the Core Specializer’s additional capabilities is integrating this information in a structured way into the n-tuple. As mentioned above, the Core Specializer must invert all of the pointers to a particular RD, including subsets of the “Modification” schema. Property information that takes this format is encoded the following way:

{type: box, property: {objectDescriptor: {type: color}}}

This in and of itself is an important feature of the Core Specializer. However, more important is the aforementioned resolution of semantic incompatibilities. Because this information is not represented completely in the grammar, the Analyzer cannot rule out certain semantically incorrect assertions, such as:

The weight of the box is blue. *
The color of the box is big. *

The check_compatibility(self, predication) method checks whether the property described in the protagonist is compatible with the property implied by the predication. Thus, color and color are compatible, but weight and color are not. If an incompatibility is detected, the Core Specializer raises an Exception, analogous to a sentence failing to parse.
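A minimal sketch of that check; property_of and ontology_compatible are assumed helpers, not the actual implementation:

```python
def check_compatibility(self, predication):
    """Sketch: reject property predications whose domains clash."""
    stated = self.property_of(predication.protagonist)   # e.g. "weight" in "the weight of the box"
    implied = self.property_of(predication.predicate)    # e.g. "color" implied by "blue"
    if stated and implied and not self.ontology_compatible(stated, implied):
        # Mirrors a failed parse: the utterance is rejected before an n-tuple is built.
        raise Exception("Incompatible properties: %s vs. %s" % (stated, implied))
```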

Note that as always, meaning is context-specific and dependent on the application. For example, one application might have a meaning of red that suggests a certain weight (for example, a scale ranging from blue to red, for “danger zone”). In this case, the system integrator can add a token red that means weight, in addition to a token red that means color, and then a sentence like “the weight of the box is red” will produce the desired n-tuple. The important thing is that the mechanism exists in the Core Specializer to filter out semantically incorrect commands – it is up to the system integrator to define the right tokens for a given application.