LKB Grammars

For several years I've been involved with the LinGO project at CSLI, whose main focus is the development of efficient, precise, wide-coverage typed unification-based grammars for use in a variety of natural language processing systems. Listed below are grammars that I developed myself or on which I was the primary developer, with code and documentation linked as available. To run any of these grammars you will need to download and install Ann Copestake et al.'s LKB Grammar Development System, which includes extensive documentation and installation instructions. For more detailed documentation and a general introduction to typed feature-structure grammars, see Copestake (2002), Implementing Typed Feature Structure Grammars, CSLI Publications, Stanford University.

Typed-inheritance Combinatory Categorial Grammar (TCCG)

TCCG is an implementation of a CCG grammar of English which I developed in the summer of 2002 at the School of Informatics at the University of Edinburgh, with the support of a grant from the Stanford-Edinburgh Link and the LinGO project at CSLI. The grammar was developed largely from the ground up, drawing in part on two earlier efforts: Aline Villavicencio's work on CCG grammars in the LKB, described in Villavicencio, A. (2001), The Acquisition of a Unification-based Generalised Categorial Grammar, thesis published as Technical Report UCAM-CL-TR-533, Computer Laboratory, Cambridge University, and Jason Baldridge's non-LKB CCG parser grok.

TCCG is based on the HPSG-style grammar of Sag, I. and T. Wasow (1999), Syntactic Theory: A Formal Introduction, CSLI Publications, Stanford University, and borrows heavily from its well-developed lexical typed-inheritance hierarchies and lexical rule system. It provides parallel treatments of such phenomena as control (raising and equi), the English auxiliary system, and predicatives, as well as a greatly expanded treatment of coordination (including right node raising and argument cluster coordination), unbounded dependencies (including topicalization, relative clauses, parasitic gaps, *that-t effects), and a developed hierarchy of modifier types.

Furthermore, I implement a CCG Normal Form algorithm based on Eisner, Jason (1996), Efficient normal-form parsing for Combinatory Categorial Grammar, Proceedings of the 34th Annual Meeting of the ACL, Santa Cruz, June, to address the infamous "spurious ambiguity" problem of CCG. This significantly improves the efficiency of the grammar without losing any coverage.
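To give a rough idea of what the normal-form restriction does, here is a minimal Python sketch. It is not the TCCG code (which is written for the LKB), and it makes simplifying assumptions: categories are plain atoms or (result, slash, argument) tuples, only application and first-order harmonic composition are modelled, and there is no type raising. The names fwd, bwd, combine, nf_allows and count_derivations are made up for this illustration. The constraint it encodes is the one from Eisner (1996): the output of forward composition may not be the left (primary) functor of a forward rule, and the output of backward composition may not be the right (primary) functor of a backward rule.

```python
from collections import defaultdict

# Categories: an atom is a str; a functor is a tuple (result, slash, argument).
def fwd(result, arg):
    return (result, "/", arg)

def bwd(result, arg):
    return (result, "\\", arg)

def is_functor(cat, slash):
    return isinstance(cat, tuple) and cat[1] == slash

def combine(left, right):
    """Yield (rule, category) for every binary rule that applies."""
    if is_functor(left, "/"):
        if left[2] == right:
            yield "fa", left[0]                     # X/Y  Y    => X
        if is_functor(right, "/") and left[2] == right[0]:
            yield "fc", (left[0], "/", right[2])    # X/Y  Y/Z  => X/Z
    if is_functor(right, "\\"):
        if right[2] == left:
            yield "ba", right[0]                    # Y    X\Y  => X
        if is_functor(left, "\\") and right[2] == left[0]:
            yield "bc", (right[0], "\\", left[2])   # Y\Z  X\Y  => X\Z

def nf_allows(rule, left_tag, right_tag):
    """Eisner-style check on the rule that built each daughter."""
    if rule in ("fa", "fc") and left_tag == "fc":
        return False   # forward-composed functor used as left functor
    if rule in ("ba", "bc") and right_tag == "bc":
        return False   # backward-composed functor used as right functor
    return True

def count_derivations(cats, goal, normal_form):
    """CKY chart that counts derivations of `goal` over the whole input."""
    n = len(cats)
    # chart[i, k] maps (category, rule that built it) -> number of derivations
    chart = defaultdict(lambda: defaultdict(int))
    for i, cat in enumerate(cats):
        chart[i, i + 1][cat, "lex"] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for (lc, lt), ln in list(chart[i, j].items()):
                    for (rc, rt), rn in list(chart[j, k].items()):
                        for rule, cat in combine(lc, rc):
                            if normal_form and not nf_allows(rule, lt, rt):
                                continue
                            chart[i, k][cat, rule] += ln * rn
    return sum(c for (cat, _), c in chart[0, n].items() if cat == goal)

if __name__ == "__main__":
    cats = [fwd("A", "B"), fwd("B", "C"), fwd("C", "D"), "D"]
    print(count_derivations(cats, "A", normal_form=False))  # 5
    print(count_derivations(cats, "A", normal_form=True))   # 1
```

On the toy input A/B B/C C/D D, every bracketing yields the same result, so the unrestricted chart finds five derivations of A. With the restriction switched on only the right-branching one survives, which is the sense in which spurious ambiguity is removed without losing any analyses.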

The LKB code for this grammar is temporarily unavailable while I sort out IP issues. I have also written lengthy documentation, although it is intended largely as a reference. A shorter, more theoretically oriented document is forthcoming.

Implementation of Ginzburg and Sag (2000)

This grammar is an implementation of the HPSG grammar in Ginzburg, J. and I. Sag (2000), Interrogative Investigations, CSLI Publications, Stanford University. Development of this grammar was begun in 1999 by Chris Callison-Burch, and completed in 2001 by Chris, Ivan Sag, and myself. The implementation covers a substantial portion of the Ginzburg and Sag grammar, including their underlying HPSG grammar, their treatment of message types, various types of interrogative clauses (including reprise, in-situ, yes/no questions, and inverted clauses), and related syntactic phenomena such as pied-piping. It also covers their semantically based treatment of a wide variety of clausal complement types and the extensive treatment of the English auxiliary system outlined in Sag, I. (2000), Rules and Exceptions in the English Auxiliary System, Manuscript: Stanford University.

The code for this grammar is not yet available but will be shortly.