# LIN 392 Working with Corpora: Schedule

Assignments are due by class time (12pm) on their due date. The project proposal, progress report, and final report are due at midnight (12am) at the end of their due date.

## Week 1

Aug 25: Introduction to working with corpora and to programming in Python

• Worksheet: Intro to Python and Unix: course archive, intro_python.pdf

## Week 2

Sep 1: Python functions, conditionals, loops, and lists

• We will finish the previous week's worksheet by discussing Python functions, then move on to the new Worksheet: Python conditionals, loops, and lists: course archive, python_lists_conditionals.pdf
• Readings: HTTLCS chapters 3, 4, 5, 6, 9
• If you finish the worksheet early: the HTTLCS chapters listed above end with exercises, which are highly recommended!

## Week 3

Sep 8: Making corpora

## Week 4

Sep 15: Annotating corpora

## Week 5

Sep 22: Working with Python strings and files

## Week 6

Sep 29: Python data structures

## Week 7

Oct 6: Sample Python programs

• Assignment 2 due
• Worksheet: sorting lists in Python. course archive, python_sorting.ppt
• Two sample Python programs. course archive, python_sample_programs.zip

## Week 8

Oct 13: An introduction to the Natural Language Toolkit

## Week 9

Oct 20: Regular expressions

• Worksheet: Regular expressions. course archive, regexp.pdf
• Project proposal due

## Week 10

Oct 27: Searching annotated corpora

## Week 11

Nov 3: Automatic analysis of text with the Natural Language Toolkit

## Week 12

Nov 10: Finishing up NLTK: using and writing language processing tools. Then: Statistical analysis of linguistic data: a very short introduction.

• In discussing probability theory, we used a program for estimating conditional probabilities of words given preceding words: course archive, conditionalprobs.zip
• Project progress report due
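
For reference, the idea of estimating the conditional probability of a word given the preceding word can be sketched as below. This is a minimal illustration written for this page, not the contents of conditionalprobs.zip; the function name `conditional_probs` and the toy sentence are assumptions. It uses plain relative frequencies: P(word | previous) = count(previous, word) / count(previous followed by any word).

```python
from collections import Counter, defaultdict


def conditional_probs(tokens):
    """Estimate P(word | previous word) by relative frequency.

    Hypothetical helper for illustration only; it is not the
    program from the course archive.
    """
    # For each preceding word, count the words that follow it.
    follower_counts = defaultdict(Counter)
    for prev, word in zip(tokens, tokens[1:]):
        follower_counts[prev][word] += 1

    # Turn the counts into probabilities by dividing by the row total.
    probs = {}
    for prev, followers in follower_counts.items():
        total = sum(followers.values())
        probs[prev] = {w: c / total for w, c in followers.items()}
    return probs


# Toy example (made-up sentence, whitespace tokenization).
tokens = "the cat sat on the mat and the cat slept".split()
probs = conditional_probs(tokens)
print(probs["the"])
```

In the toy sentence, "the" is followed by "cat" twice and by "mat" once, so the estimate for "cat" given "the" is 2/3. Unseen word pairs simply get no entry here; a real analysis would likely need smoothing and a much larger corpus.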

## Week 13

Nov 17: Finishing up the short introduction to statistical analyses of linguistic data. Then: Analyzing linguistic data: a short glimpse at the R statistics package

• Worksheet: a short introduction to R for people who know Python. course archive, r_and_python.pdf
• Assignment 4 due

## Week 14

Nov 24:

• Doing some statistics with R: Worksheet, course archive, r_stats.pdf
• Processing XML with Python: Slides and sample file. course archive, python_xml.pdf, crocodile.zip

## Week 15

Dec 1: Project presentations

• 12:00 Ding
• 12:12 Parry
• 12:24 Kim
• 12:36 Ganeshan
• 12:48 Schultz
• 13:00 Lee
• 13:12 Blanco
• 13:24 break
• 13:36 Wendorf
• 13:48 Zaheed
• 14:00 Cope
• 14:12 Richter
• 14:24 Bohmann
• Project final reports are due on Dec 6, 2010