Semantic Software Lab
Concordia University
Montréal, Canada

The GATE LODtagger component

1. Introduction

The LODtagger is a GATE component that links entities in a document to their corresponding resources on the Linked Open Data (LOD) cloud. LODtagger relies on external tools, such as DBpedia Spotlight, to perform the actual content tagging and hides the complexity of communicating with these tools from pipeline developers.

2. Features

The LODtagger provides an easy-to-use interface for running external LOD taggers within the GATE environment. The current version of the LODtagger supports DBpedia Spotlight; we plan to integrate additional existing LOD tagging tools in the future. DBpedia Spotlight is an open-source Named Entity Recognition (NER) tool that matches the surface forms of words in a text to their entries in the DBpedia knowledge base. Each annotated entity is linked to the LOD cloud through a URI, where additional, machine-readable information describing the entity is available.

The DBpediaTagger component allows the user to define the type of the annotations generated from the results through a run-time parameter. Along with the URI, Spotlight returns other metadata to the client, such as its confidence in having linked the word to its correct sense. For each result received from Spotlight, an annotation with several features is added to the document, as shown in the following example:

[Figure: DBpediaTagger Annotation Example shown in GATE Developer]
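
To make the shape of such an annotation more concrete, the following Python sketch converts a Spotlight-style JSON response into annotation-like records. It is an illustration only, not the DBpediaTagger implementation: the function name to_annotations, the default annotation type "DBpediaLink", the feature names, and the hand-written sample response are all assumptions, and the JSON keys ("Resources", "@URI", "@offset", "@surfaceForm", "@similarityScore", "@types") follow the format Spotlight's web service is generally known to return.

```python
import json

# Hand-written sample in the shape of a Spotlight JSON response.
# Illustrative only: the values are not real tool output.
SAMPLE_RESPONSE = """
{
  "@text": "Berlin is the capital of Germany.",
  "Resources": [
    {
      "@URI": "http://dbpedia.org/resource/Berlin",
      "@support": "1000",
      "@types": "DBpedia:Place,DBpedia:City",
      "@surfaceForm": "Berlin",
      "@offset": "0",
      "@similarityScore": "0.99"
    }
  ]
}
"""


def to_annotations(response_text, annotation_type="DBpediaLink"):
    """Turn each Spotlight result into one annotation-like dict.

    annotation_type plays the role of the run-time parameter that sets
    the type of the generated annotations (the default is hypothetical).
    """
    data = json.loads(response_text)
    annotations = []
    for resource in data.get("Resources", []):
        # Spotlight reports numbers as strings, so convert them explicitly.
        start = int(resource["@offset"])
        end = start + len(resource["@surfaceForm"])
        annotations.append({
            "type": annotation_type,
            "start": start,
            "end": end,
            "features": {
                "URI": resource["@URI"],
                "similarityScore": float(resource["@similarityScore"]),
                # Comma-separated type list, split into individual entries.
                "types": [t for t in resource.get("@types", "").split(",") if t],
            },
        })
    return annotations
```

Calling to_annotations(SAMPLE_RESPONSE, annotation_type="Mention") yields one record spanning the surface form "Berlin", with the URI, score, and types stored as features. In GATE itself, the corresponding step would add an annotation with a feature map to the document.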

3. Quick Start Guide

Because the LODtagger is distributed as a GATE component, you will need GATE itself, version 8.1 or later. You can install the component directly from within GATE using the CREOLE Plugin Manager by selecting our Semantic Software Lab repository. After installation, you can load the demo pipeline as shown below:
[Figure: DBpediaTagger Demo Pipeline in GATE Developer]

Note that you should set the endpoint run-time parameter of the DBpediaTagger PR to the URL of a server offering the RESTful annotation service (by default, a public Spotlight endpoint is used).
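
To check that an endpoint URL responds before using it in the endpoint parameter, a request along the following lines can be sent from outside GATE. This is a minimal sketch, not the code used by the DBpediaTagger PR: the helper names are hypothetical, LOCAL_ENDPOINT is an assumed address of a locally hosted Spotlight service, and the "text" and "confidence" query parameters together with the "Accept: application/json" header follow Spotlight's commonly documented annotate interface.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a locally hosted Spotlight service; replace with your own URL.
LOCAL_ENDPOINT = "http://localhost:2222/rest/annotate"


def build_annotate_request(endpoint, text, confidence=0.5):
    """Build (but do not send) a GET request for the annotation service."""
    query = urllib.parse.urlencode({"text": text, "confidence": confidence})
    return urllib.request.Request(
        f"{endpoint}?{query}",
        headers={"Accept": "application/json"},  # ask for JSON output
    )


def annotate(endpoint, text, confidence=0.5):
    """Send the request and return the decoded JSON (needs network access)."""
    request = build_annotate_request(endpoint, text, confidence)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

For example, annotate(LOCAL_ENDPOINT, "Berlin is the capital of Germany.") should return a JSON object if the service is reachable at that address. If it does, the same URL is a reasonable candidate for the endpoint parameter.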

4. Download

The latest version is v1.1 from 07.10.2016 (see the version history below). The recommended way to install it is directly from within GATE, using the CREOLE Plugin Manager in the GATE Developer GUI. Alternatively, you can download the install package manually. The download includes the DBpediaTagger PR, documentation, and an example pipeline.

You can also look at the documentation (this is the same file as included in the distribution in the doc/ folder).

You can check out the latest development version of LODtagger from our public GitHub repository. A continuous integration build of this repository is available on our Jenkins server.

5. License

LODtagger is distributed as free/open source software under the GNU Lesser General Public License Version 3 (LGPL3).

6. Version history

New features in v1.1 (07.10.2016)

  • Added "types" feature from Spotlight to GATE annotation.

First release was v1.0 (24.07.2015)

7. Feedback

For questions, comments, etc., please use the Forum.