Schedule: LIN313 Language and computers
Notes:
Schedule is subject to change.
Slides will be posted throughout the semester.
Assignments are due by class time (2pm) on their due date.
DBM refers to Dickinson, Brew and Meurers.
Readings are given in short format, e.g. DBM 1:1-12 means “DBM Chapter 1, pages 1-12”, to be read by class on the date under which it appears.
Week 1
August 30: Course introduction.
No reading.
September 1: Text encoding
Reading: DBM 1-14
If you would like to play with the International Phonetic Alphabet, here it is.
Week 2
September 4: Labor Day.
September 6: Text encoding
September 8: We finish up text encoding.
Readings: DBM 17-26 (without "under the hood 1" but with "under the hood 2")
A demo of word n-grams (if you know Python)
Week 3
September 11: Forensic linguistics
There are no book readings, as the book does not cover forensic linguistics. But if you would like to prepare, read the above slides 1-15.
Background readings on forensic linguistics: Not required for class! This is just in case you are curious and want to read more:
Jan Svartvik: The Evans statements: a case for forensic linguistics
Language Log: Ben Zimmer on the Unabomber's use of "eat your cake and have it too"
Other uses of forensic linguistics:
September 13: Guest lecture: Jessy Li: Forensic linguistics.
Homework 1 due
Readings (optional): Slides pages 16-45
September 15: Guest lecture: John Beavers: Grammars.
Week 4
September 18: Document classification: What is machine learning? Then we start on sentiment analysis.
Readings: DBM pp. 127-133
Background reading (optional) on learning:
An example of unsupervised learning (learning to detect regularities in data):
How do infants learn to recognize word boundaries in their native language?
September 20: Document classification: More on sentiment analysis. Then: The Naive Bayes classifier.
Readings: DBM pp. 140-145
Background reading (optional):
Lingjia Deng and Janyce Wiebe (2015). Joint Prediction for Entity/Event-Level Sentiment Analysis using Probabilistic Soft Logic Models. EMNLP 2015. This is the paper that reasoned about different people's interacting opinions.
September 22: Document classification: The Naive Bayes classifier.
Readings: Same as the previous session, DBM pp. 140-145
Week 5
September 25: Document classification: Evaluating a supervised learning system
September 27: Web search: Structured and unstructured data, and how to search through those
Readings: DBM pp. 91-107 (minus Under the Hood 6)
Homework 2 due
September 29: Web search: PageRank. HTML and XML
Slides as above
Readings: DBM pp. 103-104: Under the Hood 6
Week 6
October 2: Web search: Regular expressions
Slides as above
Readings: DBM pp. 107-115 (minus Under the Hood 7)
October 4: Web search: Regular expressions, continued.
Then we start on Spelling correction.
Slides as above
We try out regular expressions on regexr
Homework 3 due
October 6: Midterm exam review
Week 7
October 9: Midterm exam
October 11: Spelling correction
Readings: DBM pp. 33-38
October 13: Spelling correction
Readings: DBM pp. 38-49
Week 8
October 16: We finish up Spelling correction, then start on cryptography.
The slides on cryptography that we will use are available on Canvas under "Files".
Background reading: Slides on cryptography (Jason Baldridge)
No readings from the book, as it does not cover cryptography.
October 18: Cryptography
Homework 4 due
October 20: Cryptography
Week 9
October 23: Finishing up Cryptography, then Machine translation
Readings: DBM ch. 7 pages 181-188
October 25: Machine translation: Linguistic phenomena
Readings: DBM ch. 7 pages 188-200
October 27: Machine translation: statistical machine translation
Slides: Baldridge (pages 4-28, 35-37, 49-50), Erk (pages 9-12)
Week 10
October 30: Machine translation: statistical machine translation continued, then a glimpse at neural machine translation
Readings: DBM pages 200-203
Slidesets: We are using Baldridge pp. 52-59 and this illustration of word alignment (pages 28-31)
November 1: Machine translation: neural machine translation
Neural machine translation: We use selected slides from Luong/Cho/Manning, pages 7, 10-14, 23-27
Optional background reading: neural networks and deep learning: the chapter with the cheese festival
Homework 5 due
November 3: Neural machine translation continued
Optional background reading: a blog post on neural machine translation and whether the model produces an interlingua
And here is a link to the neural conclusion-drawing model I demonstrated, in case you want to play with it some more: Click on "TE model" in the top row.
Week 11
November 6: Grammars
Readings: DBM chapter 2, pages 49-53
November 8: Grammars
Readings: DBM chapter 2, pages 54-55
November 10: Dialogue systems
Readings: DBM chapter 6, pages 153-159
Dialogues with award-winning systems: Jabberwacky, and 2017 Loebner prize finalists
For comparison, here is what HAL could do in the movie
Week 12
November 13: Dialogue systems
Readings: DBM chapter 6, pages 159-177 including "under the hood 10"
We are using Markus Dickinson's slides
November 15: Dialogue systems
Continuing on Markus Dickinson's slides
Eliza and Parry examples, and yet another discussion between Eliza and Parry, this time with some actual grammar mistakes by Eliza
Homework 6 due
November 17: Dialogue systems
Degrees of understanding: We are using Jason Baldridge's slides, pages 3-7, 10-17, 28-32
Some interesting recent dialogue systems: These are original research papers; we will look mainly at the example output.
A neural sequence to sequence model, chatbots learning to say "I don't know", and chatbots learning not to mutually tell each other "see you later" for hours
Week 13
November 20: Dialogue systems
What does it mean to understand? We are using Jason Baldridge's slides, pages 37 and following
November 22: Thanksgiving break
November 24: Thanksgiving break
Week 14
November 27: Impact of language and technology:
Overview
Language technology and jobs
Dual use
Readings (after class):
Fairness and Accountability in Machine Learning: principles for accountability
Hal Daume on a code of ethics for language technology and machine learning
DBM chapter 8, pages 215-220
Dual use: Hovy and Spruit section 4
November 29: Impact of language and technology
Anthropomorphization
Readings (before class): Weizenbaum, excerpt from "Computer Power and Human Reason: From Judgment to Calculation"
Also refresh your memory of Eliza
Questions to discuss in class: Language technology and society: ethics of artificial intelligence
Homework 7 due
December 1: Impact of language and technology
Privacy and big data
Readings (before class):
Week 15
December 4: Impact of language and technology
Finishing up the privacy discussion
Then: bias and machine learning
Readings (before class):
Hovy and Spruit on exclusion, overgeneralization, and exposure: sections 1-3
Gender bias amplification (you only need to look at the introduction)
December 6: Impact of language and technology
Finishing up the discussion on bias and machine learning
December 8: Final exam review
Review sheet is on Canvas
Week 16
December 11: Finishing up: final exam review
Essay due: Social context of language and computers
Final exam: Saturday, Dec. 16 from 9am-12pm in JES A307A.