I am in the Linguistics Department at the University of Texas at Austin, and I am affiliated with UT's Division of Statistics and Scientific Computation.
My main area is computational linguistics, with a focus on semantics. I am interested in graded, flexible representations of word meaning that do not require a dictionary. In particular, I have been studying distributional models, in which words (and phrases) are represented through the textual contexts in which they appear. The key property of these representations is that words (and phrases) with similar meanings tend to occur in similar contexts -- which means that we can automatically estimate similarity in meaning by measuring similarity of contexts.

I am also interested in combining distributional representations for words (and phrases) with logic-based representations at the sentence level. Distributional representations are good at capturing nuances of meaning, but not at representing the structure of sentences in detail, let alone in a way that would let us recover the sentence from the representation. Logic-based representations are excellent at representing sentence structure in detail, but they have usually been used to reason only with "true" and "false", or "same" and "different". How should the two be combined? I believe we should use probabilistic inference, so that degrees and gradedness are an integral part of the conclusions we draw. (The alternative would be to binarize the distributional information, so that it is just "true" and "false", "same" and "different" after all. That, admittedly, makes inference easier -- but it is only half the fun, in my opinion.)
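To make the distributional idea concrete, here is a minimal toy sketch (not code from my research, and the corpus and window size are invented for illustration): each word gets a vector of counts of the words that appear near it, and similarity in meaning is estimated as the cosine between those count vectors.

```python
from collections import Counter
from math import sqrt

# Toy corpus: four tokenized sentences (purely illustrative).
corpus = [
    "the cat chased the mouse".split(),
    "the dog chased the cat".split(),
    "the cat ate the fish".split(),
    "the dog ate the bone".split(),
]

def context_vector(word, corpus, window=2):
    """Count the words occurring within `window` tokens of `word`."""
    counts = Counter()
    for sent in corpus:
        for i, w in enumerate(sent):
            if w == word:
                lo, hi = max(0, i - window), min(len(sent), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        counts[sent[j]] += 1
    return counts

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

cat, dog, fish = (context_vector(w, corpus) for w in ("cat", "dog", "fish"))
# "cat" and "dog" occur with the same verbs here, so their context
# vectors are more similar than those of "cat" and "fish".
print(cosine(cat, dog))
print(cosine(cat, fish))
```

Note that the similarity score is graded, not binary -- which is exactly the property that makes combining such representations with two-valued logic an interesting problem.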
For more information, see the research page and my CV.
To learn more about computational linguistics at UT, go to the UT computational linguistics lab.