Working with Corpora

LIN 392: Working with Corpora

Fall 2014 | Instructor: Katrin Erk | Tues/Thurs 12:30-2 | CLA 4.422 | Canvas

Corpus linguistics is the use of text corpora to explore, document, and model linguistic phenomena. This course provides a practical introduction to working with corpora.

The purpose of this course is to give students a basic toolbox for working with corpora. Students will get to know current best practices in the construction and annotation of corpora, learn to use search tools for locating occurrences of relevant phenomena in a corpus, and learn to use Python, a high-level programming language, to process text corpora. We will discuss examples of corpus-creation projects and formats for annotating corpora.

This course is designed for students with no prior experience in programming. Its aim is to enable students to perform their own corpus-based studies.

Graduate students from departments other than Linguistics are welcome to take this class.