GATE Components
Multi-Lingual Noun Phrase Extractor (MuNPEx) v1.0 for GATE released
The noun phrase chunker MuNPEx (Multi-Lingual Noun Phrase Extractor) is now available in the new and improved release v1.0. MuNPEx is a base NP chunker for the GATE framework, implemented in JAPE. It is fast, robust, customizable, well-tested, and currently supports English, German, and French (with Spanish in beta).
Major changes in this release:
- Limited the number of pre- and post-head modifiers to make MuNPEx more robust on certain kinds of input (such as long lists of tags or menu entries when processing web pages)
- New optional grammars to add a HEAD_LEMMA slot to an NP annotation, with the lemma extracted from the GATE morphological analyser (for English), the Durm Lemmatizer (for German), or the TreeTagger (for German, Spanish, French)
- DET/MOD/HEAD/MOD2 slots are now stored as strings (rather than Content objects) to make them easier to export and compatible with the new Predicate-Argument Extractor (PAX) component
- Other code cleanup and improvements
- No longer labeled as "beta" -- five years of testing ought to be enough, we're not Google ;-)
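To illustrate the first and third changes, here is a minimal Python sketch of a base NP chunker with a capped number of pre-head modifiers and slots stored as plain strings. It is not the MuNPEx JAPE grammar: the function name `chunk_base_nps`, the cap `MAX_PRE_MODIFIERS`, the Penn Treebank style tags, and the rule that the first noun is the head are all assumptions made for the example, and post-head modifiers (MOD2) are left out.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, POS tag); Penn Treebank style tags are assumed

MAX_PRE_MODIFIERS = 3  # hypothetical cap, not the value used by MuNPEx


def chunk_base_nps(tokens: List[Token],
                   max_mods: int = MAX_PRE_MODIFIERS) -> List[Dict[str, str]]:
    """Match DET? MOD{0,max_mods} HEAD and return the slots as plain strings."""
    chunks: List[Dict[str, str]] = []
    i, n = 0, len(tokens)
    while i < n:
        j = i
        det = ""
        # Optional determiner
        if tokens[j][1] == "DT":
            det = tokens[j][0]
            j += 1
        # At most max_mods adjectival modifiers
        mods: List[str] = []
        while j < n and tokens[j][1].startswith("JJ") and len(mods) < max_mods:
            mods.append(tokens[j][0])
            j += 1
        # A noun must follow as head; otherwise retry one token further right
        if j < n and tokens[j][1].startswith("NN"):
            chunks.append({"DET": det, "MOD": " ".join(mods), "HEAD": tokens[j][0]})
            i = j + 1
        else:
            i += 1
    return chunks


sentence = [("the", "DT"), ("quick", "JJ"), ("brown", "JJ"),
            ("fox", "NN"), ("jumps", "VBZ")]
print(chunk_base_nps(sentence))
print(chunk_base_nps(sentence, max_mods=1))
```

With the default cap, the whole phrase is matched. With `max_mods=1`, the match can only begin at the last modifier before the head, so a long run of modifier-like tokens can no longer produce one oversized chunk.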
For more details and the download, please visit the MuNPEx page.
Minding the Source: Automatic Tagging of Reported Speech in Newspaper Articles
Submitted by rene on Sat, 2010-07-31 14:01
New GATE PR: The Predicate-Argument Extractor (PAX)
At the LREC workshop "New Challenges for NLP Frameworks", we released a new component for GATE: the Predicate-Argument Extractor (PAX).
Flexible Ontology Population from Text: The OwlExporter
Submitted by ninus on Sun, 2010-05-16 15:12
Predicate-Argument EXtractor (PAX)
Submitted by ralf on Wed, 2010-05-12 09:12
The GATE Predicate-Argument EXtractor Component (PAX)
PAX is a GATE component for extracting predicate-argument structures (PAS) from the output of different parsers.
First Release of the Reported Speech Tagger
Coinciding with the presentation of our paper "Minding the Source: Automatic Tagging of Reported Speech in Newspaper Articles" at LREC 2008, we are happy to announce the first public release of our free/open source Reported Speech Tagging Components.
The Durm German Lemmatizer
The Durm German Lemmatization System consists of a number of GATE components and resources that perform morphological analysis and lemmatization for German nouns.


