Semantic Software Lab
Concordia University
Montréal, Canada

IntelliGenWiki: Intelligent Semantic Wikis for Life Sciences

1. Overview

Researchers need to extract and manage critical knowledge from the massive amount of literature available in multiple, ever-growing repositories. The sheer volume of information makes exhaustive analysis of the literature labor-intensive and time-consuming, and significant knowledge can easily be missed along the way. We present IntelliGenWiki, a service-oriented solution that combines state-of-the-art techniques from the Natural Language Processing (NLP) and Semantic Web domains to support the knowledge discovery workflow in omics sciences. For a brief description of IntelliGenWiki, please see our paper [1].

Figure: IntelliGenWiki integration applied to MediaWiki

2. Features

IntelliGenWiki is a novel approach, based entirely on open source software and open standards, that augments a wiki-based literature curation environment with text mining "assistants". These assistants apply new human-AI collaboration patterns: they work alongside curators on the literature, help them extract knowledge from text, and produce machine-readable metadata that can be used in an Open Data context.

Rather than a new wiki system, IntelliGenWiki provides a cohesive, generic architecture that can be applied to various wiki engines. The resulting integration gives curators access to NLP capabilities that are deployed in the GATE architecture and brokered as web services through the Semantic Assistants framework. The complexity of executing NLP pipelines is thus hidden from the curators. Instead, the semantic assistants available in the wiki work with curators on wiki pages and automatically extract entities of interest, such as organisms, genes or enzymes, from the pages' textual content.

Figure: Automatic entity extraction from the literature in IntelliGenWiki

For example, IntelliGenWiki has been applied to a MediaWiki instance in the context of the Genozymes project, where the wiki is used as a distributed, collaborative literature curation platform for lignocellulose research [2]. As shown in the figure above, the NLP-enabled wiki system features the native MediaWiki interface as well as the Wiki-NLP integration user interface generated by our Semantic Assistants plug-in installed on the wiki. The figure shows the results of our mycoMINE pipeline, which automatically extracts various entities from the given paper abstract and retrieves related information about them from external resources, e.g., their scientific name or Enzyme Commission (EC) number from open databases. Curators typically perform this task manually.
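
To make the idea of typed entity extraction concrete, the following is a minimal sketch in Python. It is not the mycoMINE pipeline, which runs in GATE and is not shown on this page; it swaps in a plain dictionary lookup as a stand-in. The lexicon, the `Annotation` class and the `extract_entities` function are all hypothetical names introduced only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-lexicon (assumption): a real pipeline would rely on far
# richer resources than four hard-coded terms.
LEXICON = {
    "Aspergillus niger": "Organism",
    "Trichoderma reesei": "Organism",
    "xylanase": "Enzyme",
    "cellulase": "Enzyme",
}


@dataclass(frozen=True)
class Annotation:
    text: str         # surface form as it appears in the page
    entity_type: str  # e.g. "Organism" or "Enzyme"
    start: int        # character offset where the match begins
    end: int          # character offset just past the match


def extract_entities(page_text: str) -> list[Annotation]:
    """Return all lexicon matches in document order (case-insensitive)."""
    found = []
    for term, entity_type in LEXICON.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, page_text, flags=re.IGNORECASE):
            found.append(Annotation(m.group(0), entity_type, m.start(), m.end()))
    return sorted(found, key=lambda a: a.start)


abstract = "A xylanase from Aspergillus niger was purified and characterized."
for ann in extract_entities(abstract):
    print(ann.entity_type, ann.text, ann.start, ann.end)
```

Keeping character offsets with each annotation is what would let a wiki interface highlight the entity in place; the lookup of additional details in external databases is deliberately left out of this sketch.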

In addition to automatic information extraction from wiki content, IntelliGenWiki implicitly produces semantic metadata that can be exploited in various ways, e.g., for export to external repositories or to provide semantic entity retrieval capabilities in the wiki, where applicable. The figure below illustrates how curators using IntelliGenWiki can find wiki pages that contain entities of interest, based on the entity type.


Figure: Semantic Query in IntelliGenWiki (top) and retrieved entities (bottom)
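
Type-based retrieval of this kind can be pictured as an index from entity type to the pages that contain such entities. The Python sketch below illustrates only that idea; it does not reflect how IntelliGenWiki stores or queries its metadata, and the page titles, the `PAGE_ENTITIES` data and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical metadata (assumption): page title -> (entity text, entity type).
PAGE_ENTITIES = {
    "Paper 101": [("Aspergillus niger", "Organism"), ("xylanase", "Enzyme")],
    "Paper 102": [("Trichoderma reesei", "Organism")],
    "Paper 103": [("cellulase", "Enzyme"), ("xylanase", "Enzyme")],
}


def build_type_index(page_entities):
    """Map each entity type to {page title: sorted list of entity texts}."""
    index = defaultdict(dict)
    for page, entities in page_entities.items():
        for text, entity_type in entities:
            index[entity_type].setdefault(page, []).append(text)
    for pages in index.values():
        for texts in pages.values():
            texts.sort()
    return index


def pages_with_type(index, entity_type):
    """Return the titles of pages holding at least one entity of this type."""
    return sorted(index.get(entity_type, {}))


index = build_type_index(PAGE_ENTITIES)
print(pages_with_type(index, "Enzyme"))
print(index["Enzyme"]["Paper 103"])
```

Building the index once and querying it by type mirrors the two halves of the figure: the query names a type, and the result lists the matching pages together with the entities found on them.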

3. Download & Installation

To use IntelliGenWiki, you need to set up MediaWiki version 1.16 or later and install our Semantic Assistants plug-in [3]. You will also need to import the Semantic Assistants templates into your wiki. These templates, together with the plug-in, will be published soon under an open source license.

If you also want to install your own Semantic Assistants server (offering NLP services), you should obtain the complete Semantic Assistants Architecture.
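
Before installing the plug-in, it may be useful to confirm that an existing wiki meets the MediaWiki 1.16 minimum stated above. The Python sketch below does so by reading the `generator` field that the standard MediaWiki `siteinfo` API query returns (a string such as "MediaWiki 1.19.2"). The helper names and the wiki address in the usage comment are assumptions, not part of the IntelliGenWiki distribution.

```python
import json
import re
import urllib.parse
import urllib.request

MINIMUM = (1, 16)  # minimum MediaWiki version stated in this section


def parse_generator(generator: str) -> tuple[int, int]:
    """Extract (major, minor) from a string such as 'MediaWiki 1.19.2'."""
    m = re.search(r"(\d+)\.(\d+)", generator)
    if m is None:
        raise ValueError(f"no version number in {generator!r}")
    return int(m.group(1)), int(m.group(2))


def meets_minimum(generator: str, minimum: tuple[int, int] = MINIMUM) -> bool:
    """Compare numerically, so that 1.9 correctly counts as older than 1.16."""
    return parse_generator(generator) >= minimum


def fetch_generator(api_url: str) -> str:
    """Ask a wiki's api.php for its general site information."""
    query = urllib.parse.urlencode(
        {"action": "query", "meta": "siteinfo", "siprop": "general", "format": "json"}
    )
    with urllib.request.urlopen(f"{api_url}?{query}", timeout=10) as resp:
        data = json.load(resp)
    return data["query"]["general"]["generator"]


# Offline demonstration; no network access is needed for these two lines.
print(meets_minimum("MediaWiki 1.19.2"))
print(meets_minimum("MediaWiki 1.9.3"))

# Against a live wiki (the address below is a placeholder, not a real endpoint):
# print(meets_minimum(fetch_generator("https://wiki.example.org/w/api.php")))
```

The version is parsed into integers rather than compared as text because a plain string comparison would rank "1.9" above "1.16".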

4. IntelliGenWiki Video Introduction

IntelliGenWiki was presented at the NETTAB 2012 Workshop in Como, Italy, in November 2012. The video of the presentation is available at http://www.nettab.org/2012/videos/BaharSateli.html.

5. Acknowledgments

Funding for the development of IntelliGenWiki has been generously provided by NSERC.


References

  1. Sateli, B., M.-J. Meurs, G. Butler, J. Powlowski, A. Tsang, and R. Witte, "IntelliGenWiki: An Intelligent Semantic Wiki for Life Sciences", NETTAB 2012, vol. 18 (Supplement B), Como, Italy: EMBnet.journal, pp. 50–52, 11/2012.
  2. Sateli, B., C. Murphy, R. Witte, M.-J. Meurs, and A. Tsang, "Text Mining Assistants in Wikis for Biocuration", 5th International Biocuration Conference, Washington DC, USA: International Society for Biocuration, p. 126, 04/2012.
  3. Sateli, B., and R. Witte, "Supporting Wiki Users with Natural Language Processing", The 8th International Symposium on Wikis and Open Collaboration (WikiSym 2012), Linz, Austria: ACM, 08/2012.