Semantic Software Lab
Concordia University
Montréal, Canada

GATE Components

Multi-Lingual Noun Phrase Extractor (MuNPEx) v1.0 for GATE released

The noun phrase chunker MuNPEx (Multi-Lingual Noun Phrase Extractor) is now available in the new and improved release v1.0. MuNPEx is a base NP chunker for the GATE framework, implemented in JAPE. It is fast, robust, customizable, well-tested, and currently supports English, German, and French (with Spanish in beta).

Major changes in this release:

  • Limited the number of pre- and post-head modifiers to make MuNPEx more robust on certain kinds of input (such as long lists of tags or menu entries when processing web pages)
  • New optional grammars to add a HEAD_LEMMA slot to an NP annotation, with the lemma extracted from the GATE morphological analyser (for English), the Durm Lemmatizer (for German), or the TreeTagger (for German, Spanish, French)
  • DET/MOD/HEAD/MOD2 slots are now stored as strings (rather than Content objects) to make them easier to export and compatible with the new Predicate-Argument Extractor (PAX) component
  • Other code cleanup and improvements
  • No longer labeled as "beta" -- five years of testing ought to be enough, we're not Google ;-)

For more details and the download, please visit the MuNPEx page.

Minding the Source: Automatic Tagging of Reported Speech in Newspaper Articles

Krestel, R., S. Bergler, and R. Witte, "Minding the Source: Automatic Tagging of Reported Speech in Newspaper Articles", Proceedings of the Sixth International Language Resources and Evaluation Conference (LREC 2008), Marrakech, Morocco, European Language Resources Association (ELRA), May 28–30, 2008.

Flexible Ontology Population from Text: The OwlExporter

Witte, R., N. Khamis, and J. Rilling, "Flexible Ontology Population from Text: The OwlExporter", International Conference on Language Resources and Evaluation (LREC), Valletta, Malta, ELRA, pp. 3845–3850, May 19–21, 2010.

Predicate-Argument EXtractor (PAX)

Krestel, R., R. Witte, and S. Bergler, "Predicate-Argument EXtractor (PAX)", New Challenges for NLP Frameworks, Valletta, Malta, ELRA, pp. 51–54, May 22, 2010.

The GATE Predicate-Argument EXtractor Component (PAX)

PAX is a GATE component for extracting predicate-argument structures (PAS) from the output of different parsers.

First Release of the Reported Speech Tagger

Coinciding with the presentation of our paper "Minding the Source: Automatic Tagging of Reported Speech in Newspaper Articles" at LREC 2008, we are happy to announce the first public release of our free/open source Reported Speech Tagging Components.

The Durm German Lemmatizer

The Durm German Lemmatization System consists of a number of GATE components and resources that perform morphological analysis and lemmatization for German nouns.
