Stanford CoreNLP Context Free Grammars

This library provides rules that adapt the output of Stanford's handy CoreNLP for use with dfh.grammar. This lets you experiment with, say, X-bar syntax on top of CoreNLP tokenization, or with a unification-based theory such as HPSG (you will have to use conditions to implement unification), or really any syntactic theory expressible at least in part as context-free grammar rules. Here's a sample:

import java.util.HashMap;
import java.util.Map;

import dfh.grammar.Grammar;
import dfh.grammar.Match;
import dfh.grammar.Rule;
import dfh.grammar.stanfordnlp.CnlpToken;
import dfh.grammar.stanfordnlp.CnlpTokenSequenceFactory;
import dfh.grammar.stanfordnlp.rules.Adjective;
import dfh.grammar.stanfordnlp.rules.Determiner;
import dfh.grammar.stanfordnlp.rules.Noun;
import dfh.grammar.tokens.TokenSequence;

/**
 * A silly NP chunker that only knows about nouns, adjectives, and determiners.
 */
public class NPChunker {

    public static void main(String[] args) {

        // a simple NP grammar
        Map<String, Rule> map = new HashMap<String, Rule>();
        map.put("A", new Adjective());
        map.put("N", new Noun());
        map.put("D", new Determiner());
        Grammar g = new Grammar("NP := <D>? [ <A> . ]* <N>", map);

        // make a tokenizer
        CnlpTokenSequenceFactory sequencer = new CnlpTokenSequenceFactory();

        // some text
        String sentence = "The fat cat sat on the mat.";

        // tokenize and tag it
        TokenSequence<CnlpToken<?>> seq = sequencer.sequence(sentence);

        // find some NPs
        for (Match m : g.find(seq).all())
            System.out.println(m.group());
    }

}

This produces, ignoring CoreNLP's output to STDERR,

The fat cat
the mat
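
To make explicit what the NP pattern accepts (an optional determiner, any number of adjectives, then a single noun), here is a minimal sketch that applies the same pattern by hand to tokens that are already tagged. It uses neither dfh.grammar nor CoreNLP: the class NPChunkerSketch, the Tagged helper, and the hand-supplied Penn Treebank tags (DT, JJ, NN) are all assumptions made for illustration, and the real Determiner, Adjective, and Noun rules may test tags differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Illustrative only: a hand-rolled equivalent of the pattern D? A* N over
 * pre-tagged tokens. This is not how dfh.grammar performs matching.
 */
class NPChunkerSketch {

    /** Hypothetical helper: a word paired with a Penn Treebank POS tag. */
    static final class Tagged {
        final String word;
        final String tag;

        Tagged(String word, String tag) {
            this.word = word;
            this.tag = tag;
        }
    }

    // Assumed tag tests; the library's own rules may differ.
    static boolean isDeterminer(Tagged t) { return t.tag.equals("DT"); }
    static boolean isAdjective(Tagged t)  { return t.tag.startsWith("JJ"); }
    static boolean isNoun(Tagged t)       { return t.tag.startsWith("NN"); }

    /** Scans left to right and returns non-overlapping spans matching D? A* N. */
    static List<String> chunk(List<Tagged> tokens) {
        List<String> chunks = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int j = i;
            // optional determiner
            if (isDeterminer(tokens.get(j))) j++;
            // zero or more adjectives
            while (j < tokens.size() && isAdjective(tokens.get(j))) j++;
            // exactly one noun closes the chunk
            if (j < tokens.size() && isNoun(tokens.get(j))) {
                StringBuilder sb = new StringBuilder();
                for (int k = i; k <= j; k++) {
                    if (k > i) sb.append(' ');
                    sb.append(tokens.get(k).word);
                }
                chunks.add(sb.toString());
                i = j + 1; // resume after the matched span
            } else {
                i++;       // no match starting here; try the next token
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Tags written by hand here; in the sample above CoreNLP supplies them.
        List<Tagged> sentence = Arrays.asList(
                new Tagged("The", "DT"), new Tagged("fat", "JJ"),
                new Tagged("cat", "NN"), new Tagged("sat", "VBD"),
                new Tagged("on", "IN"), new Tagged("the", "DT"),
                new Tagged("mat", "NN"), new Tagged(".", "."));
        for (String np : chunk(sentence))
            System.out.println(np);
    }
}
```

With the tags given in main, this prints the same two chunks as above. Because the three tag classes are disjoint, the greedy scan needs no backtracking; a grammar with overlapping categories would.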

For now, this is all the documentation of the project: this page, the downloadable tarball (including source, unit tests, and example code), and the javadocs. The Stanford CoreNLP library must be downloaded separately and is under a separate license (free for research). These classes serve, as much as anything, as an example of how one might adapt an NLP library for use with dfh.grammar.