Text is everywhere, in huge amounts: books, emails, web pages, scientific papers… And there are many opportunities for technology that helps us manage, sort, and make sense of all this information: automatically translating texts from one language to another; building better search engines that can handle complex questions instead of just keywords; determining automatically whether blogs are saying good or bad things about a particular product; extracting useful facts from repositories of scientific papers about medicine.
The field of computational linguistics is about building such language technology applications, but also about the science behind them. It is a highly interdisciplinary field, drawing mainly on computer science (particularly artificial intelligence) and linguistics, but also on other fields such as cognitive science and philosophy of language. This course provides an introduction to the key methods used in computational linguistics and discusses some of the main applications.
The course is aimed at students with some background in linguistics but no prior experience in programming or computer science. The main focus will be on hands-on experience with language processing techniques, with a gentle and thorough introduction to programming and to the relevant theoretical concepts. We will use the Python programming language and the Natural Language Toolkit (http://www.nltk.org/), a Python package made for exploring and processing language data.
Prerequisites: Upper-division standing.
Textbook: Jurafsky, D. and J. H. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (2nd edition). Prentice Hall, 2008.
Additional required readings will be made available for download from the course website.
Recommended additional text: Mark Lutz and David Ascher, Learning Python, O'Reilly.