ISOGRAM: Inference with structured objects representing graded meaning
This material is based upon work supported by the National Science Foundation under Grant No. 0845925.
What is the best way of characterizing the meaning of a word in context, for example the meaning of show in "They showed me the way to the park"? And in "They showed remarkable restraint"? The typical computational model assumes that there is a list of senses given by a dictionary, and that each occurrence of show can be characterized by pointing out the one sense that matches it. Strangely enough, people are not very good at this task: if you give them a dictionary and a text, and ask them to match each word in the text to its sense in the dictionary, there will be a sizeable portion of cases where they disagree. This task has also turned out to be really hard for automatic word sense disambiguation systems. So, if this task is so hard for people, does that mean that we mostly don't understand what other people are saying? Another (more reassuring) explanation is that there is something wrong with the model. It has been argued that word meanings lie on a continuum between, on the one hand, clear-cut cases of ambiguity and, on the other hand, vagueness where clear-cut boundaries do not hold. This fits in well with the prevalent models of human concept representation in psychology: they suggest that concepts have "fuzzy boundaries", and that there will be typical cases as well as borderline cases. (This seems to be the case even for the concept "potato chip".) We have collected annotation data with graded judgments on how much different senses apply.
So if we move away from the "pick the one matching sense" accounts of word meaning, what would alternative models look like? One possibility is to view each word sense of show as a point in a high-dimensional semantic space. Then each occurrence of show can be characterized by how near it is to each of the senses.
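To make the idea concrete, here is a minimal sketch in Python. It assumes cosine similarity as the measure of nearness and uses made-up three-dimensional vectors; the sense labels, the vectors, and the function names are all hypothetical illustrations, not the project's actual senses or model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical sense points for "show" (invented for illustration only).
SENSE_VECTORS = {
    "guide": [0.9, 0.1, 0.2],
    "exhibit": [0.2, 0.9, 0.1],
    "indicate": [0.1, 0.3, 0.9],
}

def graded_sense_profile(occurrence, senses=SENSE_VECTORS):
    """Return a graded score for every sense instead of picking a single one."""
    return {label: cosine(occurrence, vec) for label, vec in senses.items()}

# Hypothetical vector for one occurrence of "show" in context.
occurrence = [0.8, 0.3, 0.2]
profile = graded_sense_profile(occurrence)
for label, score in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {score:.2f}")
```

The point of the sketch is that the result is a profile over all senses rather than a single label, so an occurrence can be close to one sense while still being somewhat near another.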
One additional problem, though: What senses, exactly, should we use? And how many senses exactly does a word have in the first place? Different dictionaries don't necessarily agree on this. So, how could one characterize the meaning of a word in context without using any dictionary senses? In fact, the same semantic space models still apply. We can still describe a single occurrence of show as a point in semantic space. We don't have distinguished "word sense points" anymore that could serve as landmarks, but we can measure how near this one occurrence of show is to other occurrences of show, or to occurrences of words like demonstrate or render. We have collected annotation data for this kind of "usage similarity" as well.
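A sense-free comparison of this kind could be sketched as follows. This is an illustration under strong assumptions: each occurrence is represented by a plain bag-of-words count of the other words in its sentence, which is a crude stand-in for a real semantic space, and the function names and the second example sentence are hypothetical.

```python
import math
from collections import Counter

def context_vector(sentence, target):
    """Bag-of-words counts of the words around the target (a crude stand-in)."""
    tokens = sentence.lower().replace(".", "").split()
    return Counter(t for t in tokens if t != target)

def usage_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * vec_b[word] for word, count in vec_a.items())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Three occurrences of "showed"; no dictionary senses are involved.
a = context_vector("They showed me the way to the park", "showed")
b = context_vector("They showed us the way to the station", "showed")
c = context_vector("They showed remarkable restraint", "showed")

sim_ab = usage_similarity(a, b)  # two "point out the way" usages
sim_ac = usage_similarity(a, c)  # a "point out the way" usage vs. a "display" usage
print(sim_ab, sim_ac)
```

With these toy inputs the two "way to the ..." occurrences come out closer to each other than to the "remarkable restraint" occurrence, which is the kind of graded usage similarity described above. The same function could compare an occurrence of show with an occurrence of demonstrate or render.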
We are trying to characterize word meanings not just because it is fun to do so; we also want the representations to be useful for something. In particular, we want to be able to draw conclusions. From the sentence "In November in Austin, the thermometer showed 75 degrees Fahrenheit" we can conclude that the temperature was 75 degrees Fahrenheit (and that it is a good idea to visit Austin in November). From the sentence "They showed me the way to the park" we cannot conclude anything like "the temperature was the way to the park". How can we derive this from a semantic space-based model of word meaning in context? Here is our first stab at a semantic space model of word meaning in context, and its application to predicting paraphrase applicability, and here is another.
Word meanings are an important part of language meaning, but they are not all of it. To describe the meaning of a natural language sentence appropriately, we need to describe how the words relate to one another in a sentence, we need to get a grasp on small but important words like "not", and many more things. After all, as we said above, we want to be able to draw conclusions from sentences. For example, we would want to infer from "I regret having called him a thief" that "I called him a thief". In semantic space models, it is not clear how to do this. In contrast, such an inference is completely straightforward in logical form representations of sentence semantics. Conversely, logical form representations have no inherent way of talking about degrees of similarity between word meanings, something that is straightforward for semantic space models. The approach that we follow in this project is to link semantic-space representations for words with logical-form representations for sentences.