See the UT Compling Lab project page for a list of some projects I am currently involved in. Here's a brief list:

  • TeXIT: (Texas) X-lingual Interpretation of Texts. (New York Community Trust, 2008-Present)
  • EARL: Efficient Annotation of Resources by Learning. (NSF, 2007-Present)

Past projects:

  • TEXTIME: Temporal Expressions and Time Processing (in Texas). (New York Community Trust, 2006-2008)
  • DISCOR: Discourse Structure and Coreference Resolution. (NSF, 2006-2008)
  • OpenCCG Front End: DotCCG specification language and VisCCG GUI editor for OpenCCG grammars. (LAITS, 2006-2007)
  • UT Austin NLP Suite: suite of open source NLP packages with tutorials. (~FastTex, 2006-2007)

Computational linguistics usually requires writing a fair amount of code, but there is a lot of existing software that can be used directly or built on for natural language processing tasks. Open source software is particularly appealing because you can modify the source code if you need to. Check out the OpenNLP project for a fairly comprehensive list of open source software for natural language processing. Here are some of the open source packages that I am involved with:

  • MSTParser: a Java dependency parser.
  • TADM (Toolkit for Advanced Discriminative Modeling): a C++ package for training maximum entropy and perceptron models.
  • The OpenNLP toolkit: a suite of Java tools for various NLP tasks, including sentence splitting, part-of-speech tagging, and parsing.
  • OpenCCG: a Java parsing/realization system for Combinatory Categorial Grammar.
  • OpenNLP Maxent: a Java implementation of Generalized Iterative Scaling for training and using maximum entropy models.
 
people/jason_baldridge/projects.txt · Last modified: 2009/05/28 by jbaldrid · UTCL Wiki