LKB Grammars
For several years I've been involved with the LinGO project at CSLI, whose main focus is
development of efficient, precise, wide-coverage typed
unification-based grammars for use in a variety of natural language
processing systems. Listed below are grammars I've developed myself
or have been the primary developer on, with code and documentation
linked as available. To run any of these grammars you will need to
download and install Ann
Copestake et al.'s LKB Grammar
Development System, which includes extensive documentation and
installation instructions. For greatly expanded documentation and a
general introduction to typed feature-structure grammars, see
Copestake (2002), Implementing Typed Feature Structure
Grammars, CSLI Publications, Stanford University.
Typed-inheritance Combinatory Categorial Grammar (TCCG)
TCCG is an implementation of a CCG grammar of English which I
developed in the summer of 2002 at the School of Informatics at the University of Edinburgh with the
support of a grant from the Stanford-Edinburgh Link and the LinGO
project at CSLI. The grammar was developed largely from the ground
up, based partially on previous work on CCG grammars in the LKB by Aline Villavicencio, as
described in Villavicencio, A. (2001), The
Acquisition of a Unification-based Generalised Categorial
Grammar, thesis published as Technical Report UCAM-CL-TR-533,
Computer Laboratory, Cambridge University, and on Jason Baldridge's non-LKB
CCG parser grok. TCCG is based on
the HPSG-style grammar of Sag, I. and T. Wasow (1999), Syntactic
Theory: A Formal Introduction, CSLI Publications, Stanford
University, borrowing heavily from its well-developed lexical
typed-inheritance hierarchies and lexical rule system, providing
parallel treatments of such phenomena as control (raising and equi),
the English auxiliary system, and predicatives, as well as a greatly
expanded treatment of coordination (including right node raising and
argument cluster coordination), unbounded dependencies (including
topicalization, relative clauses, parasitic gaps, and *that-t
effects), and a developed hierarchy of modifier types. Furthermore, I
implement a CCG Normal Form algorithm based on Eisner, J.
(1996), Efficient
normal-form parsing for Combinatory Categorial Grammar,
Proceedings of the 34th Annual Meeting of the ACL, Santa Cruz, June,
to address the infamous "spurious ambiguity" problem of CCG,
significantly improving the efficiency of the grammar without losing
any coverage.
The LKB code for this grammar is temporarily unavailable while I sort out
IP issues. I have also written lengthy documentation, although it is
intended largely as a reference. A shorter, more theoretically oriented
document is forthcoming.
Implementation of Ginzburg and Sag (2000)
This grammar is an implementation of the HPSG grammar in Ginzburg,
J. and I. Sag (2000), Interrogative Investigations, CSLI
Publications, Stanford University. Development of this
grammar was begun in 1999 by Chris Callison-Burch and
completed in 2001 by Chris, Ivan Sag, and me. The
implementation provides coverage of a substantial portion of the
Ginzburg and Sag grammar, including their underlying HPSG grammar as
well as their treatment of message types, various types of
interrogative clauses (including reprise, in-situ, yes/no questions,
and inverted clauses), related syntactic phenomena such as
pied-piping, as well as their semantically based treatment of a wide
variety of clausal complement types and the extensive treatment of the
English auxiliary system as outlined in Sag, I. (2000), Rules and
Exceptions in the English Auxiliary System, Manuscript: Stanford
University.
The code for this grammar is not yet available but will be shortly.