Analyzing linguistic data: links
List of software we will use in the class
Python and Python packages:
We strongly recommend installing Anaconda, which includes Python along with almost all of the Python packages we need.
If you install Anaconda, you will still have to add gensim. Here is a tutorial on how to add a package to Anaconda: https://docs.anaconda.com/anaconda/navigator/tutorials/manage-packages/#installing-a-package
Please follow it to add gensim.
Alternatively, you can individually install:
Python: https://www.python.org/downloads/ Any version >= 3.4 should be fine.
Jupyter notebooks: https://jupyter.org/
pandas: https://pandas.pydata.org/
numpy: https://numpy.org/
matplotlib: https://matplotlib.org/
statsmodels: https://www.statsmodels.org/stable/index.html
NLTK: Install NLTK itself (http://www.nltk.org/install.html), then download the NLTK data (http://www.nltk.org/data.html).
To test your Python installation, use this Jupyter notebook.
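If you want a quick check before opening the notebook, something like the following sketch may help. It is not the class notebook, only a rough stand-in: it checks the Python version and whether each package can be located, without testing that the packages actually work. The helper name check_packages is made up for this example, and using "jupyter" as an import name is an assumption.

```python
import importlib.util
import sys

# Import names for the packages listed above. "jupyter" as an import name
# is an assumption; the other names are the packages' usual import names.
REQUIRED = ["jupyter", "pandas", "numpy", "matplotlib",
            "statsmodels", "nltk", "gensim"]


def check_packages(names):
    """Map each package name to True if Python can locate it, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# The class asks for Python 3.4 or later.
python_ok = sys.version_info >= (3, 4)
print("Python", sys.version.split()[0], "OK" if python_ok else "too old")

# Report each package; anything marked MISSING still needs to be installed.
for name, found in check_packages(REQUIRED).items():
    print(name, "found" if found else "MISSING")
```

A package reported as found is installed somewhere Python can see it. The NLTK data is a separate download and is not covered by this check.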
Slack:
We are using Slack for in-class discussions. Please see Canvas for the link to the class Slack space.
Tips and tricks:
Learning Python:
How To Think Like A Computer Scientist is a very good and accessible online Python textbook.
(Caution: It uses Python 2 rather than Python 3. One main difference you will notice is that the book writes print without parentheses, whereas Python 3 requires print().)
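To make the print difference concrete, here is a small illustration. The Python 2 form appears only in a comment, since it is a syntax error in Python 3. The variable name and the example text are made up for this sketch.

```python
# Python 2 style, as in the textbook (a SyntaxError in Python 3):
#     print "Hello, linguists"

# Python 3 style, as used in this class: print is a function,
# so its arguments go inside parentheses.
message = "Hello, linguists"
print(message)

# Several arguments are separated by commas and printed with spaces between them.
print("Number of tokens:", 3)
```

When you copy an example from the book, adding the parentheses after print is usually the first change to make.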
Jupyter notebooks:
Fun with statistics
Language Log: a language and linguistics blog written by Mark Liberman and others
Bad science: Ben Goldacre's blog with lots of illustrations of what not to do in statistics
xkcd: A webcomic of romance, sarcasm, math, and language.
And then there's the RXKCD package.