Tagalog CCG Grammar

This page describes the implementation of the Tagalog analysis that is given in my dissertation (Chapter 7).

Status

The grammar handles all of the sentences (grammatical and ungrammatical) that are discussed in my dissertation. This includes intransitive, transitive, ditransitive, and sentential complement sentences, and it correctly models the local and long-distance extraction asymmetries. Also handled are linkers (for both coordination and relativization), ay-inversion, and the recent past and comitative constructions.

You can try out the grammar: Tagalog grammar. You'll need to remove the .txt ending (which I had to add to satisfy the wiki software).

Brief description of the data and analysis

Note: see chapter 7 of my dissertation for a full description of the data and analysis.

Tagalog is a verb-initial language with a robust system of voice-marking affixes. Here are some examples of verbs in the actor voice:

  • tumakbo ang lalaki
    run-AV  ANG man
    "the man ran"
    
  • bumili ang titser ng libro
    buy-AV ANG teacher NG book
    "the teacher bought a book"
    
  • nagbigay ang lalaki ng libro sa babae
    give-AV  ANG man    NG book  SA woman
    "the man gave a book to the woman"
    

Post-verbal local scrambling is allowed, so bumili ng libro ang titser is a grammatical alternative to bumili ang titser ng libro. There are five other grammatical orders for the ditransitive sentence.
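
The count follows from the three post-verbal arguments: 3! = 6 orders, i.e., the one shown plus five others. As a minimal Python sketch (not part of the grammar itself), the orders can be enumerated as follows, assuming as stated above that every post-verbal order is grammatical; the function name scrambled_orders is purely illustrative.

```python
from itertools import permutations

VERB = "nagbigay"
# Post-verbal arguments of the ditransitive example above.
ARGUMENTS = ["ang lalaki", "ng libro", "sa babae"]


def scrambled_orders(verb, arguments):
    """Return every verb-initial ordering of the given arguments."""
    return [" ".join([verb, *order]) for order in permutations(arguments)]


orders = scrambled_orders(VERB, ARGUMENTS)
for sentence in orders:
    print(sentence)
```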

Changing the voice affix changes how ang and ng are used to mark arguments. For example, the verb bumili in the transitive sentence above is the stem bili combined with the actor voice infix -um-. (Note: nag- is an actor voice prefix, seen on nagbigay with stem bigay 'give'.) When bili is instead combined with the object voice infix -in-, the following pattern emerges:

  • binili ng titser  ang libro
    buy-OV NG teacher ANG book
    "the teacher bought a book"
    
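
The affixation seen in these forms can be sketched in a few lines of Python. This is only an illustration that covers the verbs on this page: it assumes the infix lands immediately before the first vowel of the stem, which is a simplification of Tagalog morphophonology, and the helper names infix and prefix are hypothetical.

```python
VOWELS = "aeiou"


def infix(stem, affix):
    """Insert the affix before the first vowel of the stem (simplifying assumption)."""
    for i, ch in enumerate(stem):
        if ch in VOWELS:
            return stem[:i] + affix + stem[i:]
    # No vowel found: fall back to plain prefixation.
    return affix + stem


def prefix(stem, affix):
    """Attach the affix to the front of the stem."""
    return affix + stem


print(infix("bili", "um"))     # actor voice of 'buy'
print(infix("bili", "in"))     # object voice of 'buy'
print(infix("takbo", "um"))    # actor voice of 'run'
print(prefix("bigay", "nag"))  # actor voice of 'give'
```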

One particularly interesting property of these voice alternations is that they radically affect what can be extracted. For example, to ask “who bought the book?”, we must use the actor voice form of the verb:

  • sino ang bumili ng libro
    who  ANG buy-AV NG book
    "who bought the book?"
    

To ask “what did the teacher buy?”, however, we must use the object voice form:

  • ano  ang binili ng titser
    what ANG buy-OV NG teacher
    "what did the teacher buy?"
    

It is ungrammatical to ask about the actor using the object voice or about the object using the actor voice:

  • *sino ang binili ang libro
     who  ANG buy-OV ANG book
    (For: "who bought the book?")
    
  • *ano  ang bumili ang titser
     what ANG buy-AV ANG teacher
    (For: "what did the teacher buy?")
    

This is the syntactic extraction asymmetry for which Tagalog is famous (within syntax circles).
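
One way to summarize the four question examples above is that only the argument the voice would mark with ang can be questioned. The Python sketch below encodes just that summary; the table ANG_ROLE and the function can_extract are illustrative names, and the role labels "actor" and "object" are informal stand-ins rather than the grammar's own features.

```python
# Which argument is ang-marked under each voice, read off the examples above.
ANG_ROLE = {"AV": "actor", "OV": "object"}


def can_extract(voice, role):
    """True if the argument with this role can be questioned under this voice."""
    return ANG_ROLE[voice] == role


# (voice, role, example from the text)
EXAMPLES = [
    ("AV", "actor", "sino ang bumili ng libro"),
    ("OV", "object", "ano ang binili ng titser"),
    ("OV", "actor", "*sino ang binili ang libro"),
    ("AV", "object", "*ano ang bumili ang titser"),
]

for voice, role, sentence in EXAMPLES:
    status = "ok" if can_extract(voice, role) else "blocked"
    print(f"{status:8}{sentence}")
```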

These questions are formed in a surprising way: the NP-marker ang is placed in front of a verb phrase that is missing an element, e.g., bumili ng libro. Normally, we would expect the category for something like ang to be np/n and for titser and other nouns to be n. But consider the following data points:

  • titser  ang bumili ng libro
    teacher ANG buy-AV NG book
    "It was the teacher who bought the book."
    
  • libro ang binili ng titser
    book  ANG buy-OV NG teacher
    "It was a book that the teacher bought."
    

Basically, ang seems to be an element that takes a one-place predicate and referentializes it. That is, words like libro and lalaki and phrases like bumili ng libro and binili ang libro are of type s/np, and ang is of type np/(s/np). We can assume the same for ng and sa, but with different case values:

  • ang :- np[case=ang]/(s/np)
  • ng :- np[case=ng]/(s/np)
  • sa :- np[case=sa]/(s/np)
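
To make the effect of these categories concrete, here is a minimal Python sketch of forward application (X/Y applied to Y yields X) over the three marker entries. It is not the OpenCCG implementation: the classes Atom and Slash, the helper names, and the use of exact equality in place of feature unification are all simplifying assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    name: str
    case: Optional[str] = None


@dataclass(frozen=True)
class Slash:
    result: "Category"
    arg: "Category"


Category = Union[Atom, Slash]

PRED = Slash(Atom("s"), Atom("np"))  # s/np: a one-place predicate


def marker(case):
    """np[case=...]/(s/np): takes a predicate and referentializes it."""
    return Slash(Atom("np", case), PRED)


LEXICON = {
    "ang": marker("ang"),
    "ng": marker("ng"),
    "sa": marker("sa"),
    "titser": PRED,
    "libro": PRED,
}


def forward_apply(functor, argument):
    """X/Y applied to Y yields X; returns None if the categories do not fit."""
    if isinstance(functor, Slash) and functor.arg == argument:
        return functor.result
    return None


def show(cat):
    """Render a category in the notation used on this page."""
    if isinstance(cat, Atom):
        return cat.name if cat.case is None else f"{cat.name}[case={cat.case}]"
    arg = show(cat.arg)
    if isinstance(cat.arg, Slash):
        arg = f"({arg})"
    return f"{show(cat.result)}/{arg}"


print(show(LEXICON["ang"]))                                     # np[case=ang]/(s/np)
print(show(forward_apply(LEXICON["ang"], LEXICON["libro"])))    # np[case=ang]
```

Because phrases such as bumili ng libro are also of type s/np, the same application step would let ang combine with them, which is the point of the analysis.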

The actual analysis of these markers in terms of case is not straightforward at all, and there is still debate as to whether Tagalog is nominative/accusative or ergative/absolutive or something else. Here, I just label them with the names of the markers, which is all that matters for capturing the asymmetries in the analysis.

Implementation

Here, I'll describe how I use expansions to:

  • provide concise definitions that make it easy to declare new lexical items and their morphological variants and corresponding predicates (expressed as their English translations)
  • encode lexical redundancies by defining categories in terms of others (e.g., transitive is the intransitive plus one argument)
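
The two ideas in this list can be illustrated with a rough Python analogy. This is not the syntax of the actual grammar file: the names VerbFrame, add_argument, and verb_entries are hypothetical, and the frames only record the case labels seen in the actor voice examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbFrame:
    args: tuple  # case labels of the arguments the verb combines with


def add_argument(frame, case):
    """Define a larger frame in terms of a smaller one plus one argument."""
    return VerbFrame(frame.args + (case,))


# Lexical redundancy: each frame is the previous one plus one argument.
INTRANSITIVE = VerbFrame(("ang",))
TRANSITIVE = add_argument(INTRANSITIVE, "ng")
DITRANSITIVE = add_argument(TRANSITIVE, "sa")


def verb_entries(stem, pred, frame, forms):
    """Expand one concise declaration into one entry per morphological variant."""
    return [
        {"form": form, "stem": stem, "pred": pred, "voice": voice, "args": frame.args}
        for voice, form in forms.items()
    ]


# One declaration yields both voice forms of 'buy', sharing the predicate.
entries = verb_entries("bili", "buy", TRANSITIVE, {"AV": "bumili", "OV": "binili"})
for entry in entries:
    print(entry)
```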
 
openccg/grammars/tagalog.txt · Last modified: 2008/05/04 10:26 (external edit)
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported