LIN 389C: Research in Computational Linguistics
LIN 389C is a research course for students working in computational linguistics. It is aimed at students with advanced knowledge of natural language processing and machine learning who are actively doing research in the area. In the course, we discuss current research by course participants, review foundational knowledge relevant to participants' research, and talk about big-picture issues and current research in the field.
Adapting the class format to deal with the ongoing pandemic
Here is the plan as of August 23, 2021:
In the first week, we will talk about topics to cover in this semester's class. Please bring suggestions for topics that are relevant to your research. A collection of topics suggested at the end of the previous semester is listed under Topics below.
Introductory material on deep learning
Michael Nielsen's book Neural Networks and Deep Learning gives a very nice and accessible introduction to deep learning.
I also like the relevant chapters from the upcoming 3rd edition of the Jurafsky and Martin book, Speech and Language Processing.
Jay Alammar has some very nice illustrations of key ideas in neural models.
BERT and Hopfield networks
What kinds of patterns can be learned by neural architectures?
RNNs can generate bounded hierarchical languages with optimal memory: https://aclanthology.org/2020.emnlp-main.156.pdf
Theoretical limitations of self-attention: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00306/43545/Theoretical-Limitations-of-Self-Attention-in
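To make the first topic above concrete, here is a minimal sketch of what a bounded hierarchical language is: a Dyck-(k, m) language of well-nested strings over k bracket types with nesting depth at most m, the object of study in the "RNNs can generate bounded hierarchical languages" paper. The function name and default brackets are illustrative choices, not anything from the paper.

```python
def is_dyck_km(s, pairs={'(': ')', '[': ']'}, max_depth=3):
    """Return True iff s is well-nested over the given bracket pairs
    and its nesting depth never exceeds max_depth (Dyck-(k, m))."""
    closers = set(pairs.values())
    stack = []
    for ch in s:
        if ch in pairs:                  # opening bracket: push expected closer
            stack.append(pairs[ch])
            if len(stack) > max_depth:   # depth bound exceeded
                return False
        elif ch in closers:              # closing bracket: must match top of stack
            if not stack or stack.pop() != ch:
                return False
        else:
            return False                 # symbol outside the alphabet
    return not stack                     # accept only if all brackets are closed

# "([])" is in Dyck-(2, 3); "([)]" is ill-nested; "(((())))" exceeds depth 3.
```

The paper's result concerns RNNs that generate exactly such languages with memory that is optimal in m and k; the stack-based checker above is just the reference definition of the language family.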
Possibly add: Preissner and Herbelot, the Fruit Fly algorithm
Autoencoders: Frank Ferraro
Machine learning, language and cognition:
Use of neural models in neuroscience
What does language modeling tell us about language?
Marco Baroni's recent overview article about learning syntax with deep networks: https://www.annualreviews.org/doi/abs/10.1146/annurev-linguistics-032020-051035
Probing into BERT:
The theoretical side of distributional semantics:
Marco Marelli and the argument that distributional models are cognitively plausible https://journals.sagepub.com/doi/abs/10.1177/1745691619861372
Guy Emerson's big-picture paper https://aclanthology.org/2020.acl-main.663/
Controllable text generation:
R. Zellers, A. Holtzman, H. Rashkin, Y. Bisk, A. Farhadi, F. Roesner, and Y. Choi. Defending against neural fake news. In Advances in Neural Information Processing Systems, pages 9054–9065, 2019.
S. Dathathri, A. Madotto, J. Lan, J. Hung, E. Frank, P. Molino, J. Yosinski, and R. Liu. Plug and play language models: A simple approach to controlled text generation. In International Conference on Learning Representations, 2020.
N. S. Keskar, B. McCann, L. R. Varshney, C. Xiong, and R. Socher. CTRL: A conditional transformer language model for controllable generation. arXiv preprint arXiv:1909.05858, 2019.
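A toy sketch of the control-code idea behind CTRL may help orient discussion: a code prepended to the context selects which next-token distribution the model samples from. The real model conditions a large transformer on the code; the codes and counts below are invented purely for illustration.

```python
import random

# Hypothetical next-token counts conditioned on a control code.
# (Invented numbers; CTRL learns these distributions from corpora
# associated with each code, e.g. "Reviews" or "Wikipedia".)
COUNTS = {
    "Reviews": {"great": 5, "terrible": 4, "plot": 1},
    "Wikipedia": {"born": 4, "century": 4, "great": 2},
}

def sample_next(control_code, rng=random.Random(0)):
    """Sample one next token from the distribution selected by the code."""
    dist = COUNTS[control_code]
    words, weights = zip(*dist.items())
    return rng.choices(words, weights=weights, k=1)[0]
```

The point of the sketch is only that the same sampling machinery yields different outputs depending on the code, which is the sense in which generation is "controllable".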
Explicit content planning:
A. Fan, M. Lewis, and Y. Dauphin. Hierarchical neural story generation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 889–898, 2018.
L. Yao, N. Peng, R. Weischedel, K. Knight, D. Zhao, and R. Yan. Plan-and-write: Towards better automatic storytelling. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 7378–7385, 2019.
B. Tan, Z. Yang, M. Al-Shedivat, E. P. Xing, and Z. Hu. Progressive generation of long text. arXiv preprint arXiv:2006.15720, 2020.
L. Martin, P. Ammanabrolu, X. Wang, W. Hancock, S. Singh, B. Harrison, and M. Riedl. Event representations for automated story generation with deep neural nets. In Proceedings of the AAAI Conference on Artificial Intelligence, 2018.
A. Fan, M. Lewis, and Y. Dauphin. Strategies for structuring story generation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2650–2660, 2019.
S. Goldfarb-Tarrant, T. Chakrabarty, R. Weischedel, and N. Peng. Content planning for neural story generation with Aristotelian rescoring. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4319–4338, 2020.
E. Orbach and Y. Goldberg. Facts2Story: Controlling text generation by key facts. In Proceedings of the 28th International Conference on Computational Linguistics, pages 2329–2345, 2020.
Tasks for narrative reasoning that go beyond narrative cloze
Nate Chambers' critique of narrative cloze
Social computational linguistics:
characterizing language variation as a function of social groups: the World Well-Being Project, http://wwbp.org/
personas: Robin Cooper.
Natural language generation and personas, NLG and style
ACL workshop on metaphor identification
Discourse and pragmatics
papers in cognition about discourse particles
discourse and narrative
papers by Maria De-Arteaga
pattern discrimination and ethics
questions from the Social Justice Committee: examine research methods and citation practices in the field from the point of view of discrimination and bias