
LIN350 Computational semantics

Spring 2020 | Instructor: Katrin Erk | MWF 12-1 | GDC 2.210

How can we describe the meaning of words and sentences in such a way that we can process them automatically? That seems a huge task. There are so many words, all with their individual nuances of meaning -- do we have to define them all by hand? And there are so many things we want to do with sentences: Translate them. Answer questions. Extract important pieces of information. Figure out people's opinions. Can we even use one single meaning description to do all these tasks?

In this course, we discuss methods for automatically learning what words mean (at least to some extent) from huge amounts of text -- for example, from all the text that people have made available on the web. And we discuss ways of representing the meaning of words and sentences in such a way that we can use them in language technology tasks.

We will look at two very different kinds of meaning representations, with quite different strengths and applications. The first kind is distributional representations, also called embeddings. Embeddings characterize the meaning of a word or passage as an object in a "meaning space" that is learned automatically from data, often in such a way that words that are similar in meaning will be close together in space. Embeddings can be obtained by simply counting words, or by using neural models. We will discuss both methods. The second kind of representation is logic-based semantics. Here the idea is to translate sentences into a format in which we can automatically reason with them and draw conclusions.
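To give a rough feel for the count-based approach, here is a minimal sketch in Python. It is not taken from the course materials: the toy corpus, the window size of 2, and the function names `cooccurrence_vectors` and `cosine` are all assumptions made for illustration. Each word is represented by counts of the words that occur near it, and closeness in the "meaning space" is measured with cosine similarity.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, target in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[word] for word, count in u.items() if word in v)
    norm_u = math.sqrt(sum(count * count for count in u.values()))
    norm_v = math.sqrt(sum(count * count for count in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus; real distributional models use far more text.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the cat",
    "we read the book",
]

vectors = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_book = cosine(vectors["cat"], vectors["book"])
print(f"cat/dog:  {sim_cat_dog:.3f}")
print(f"cat/book: {sim_cat_book:.3f}")
```

In this toy corpus, "cat" and "dog" share more contexts ("drinks", "chases") than "cat" and "book" do, so the first similarity comes out higher. With realistic amounts of text, raw counts are usually reweighted and reduced in dimensionality, which this sketch leaves out.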

Prerequisites: Upper-division standing.

Readings will be made available for download from the course website.

Flags: Quantitative, Independent Inquiry