This page provides supplementary material for our submission to the SAVE-SD 2015 workshop on Semantics, Analytics, Visualisation: Enhancing Scholarly Data. We have published the populated knowledge base from the experiments described in the paper. To reproduce the results in the "Application" section, you can execute the queries by following the link to the full page below.
The overabundance of literature available in online repositories is an ongoing challenge for scientists, who must efficiently manage and analyze content for their information needs. Most existing literature management systems merely support storing bibliographical metadata, tagging, and simple annotation. We go beyond these approaches by demonstrating how an innovative combination of semantic web technologies and natural language processing can mitigate information overload by helping to curate and organize scientific literature. Zeeva is our research prototype for demonstrating how existing papers can be turned into a queryable knowledge base.
Wikis have become powerful knowledge management platforms, offering high customizability while remaining relatively easy to deploy and use. With the majority of their content in natural language, wikis can greatly benefit from automated text analysis techniques. Natural Language Processing (NLP) is a branch of computer science that employs various Artificial Intelligence (AI) techniques to process content written in natural language. NLP-enhanced wikis can support users in finding, developing and organizing the knowledge contained in the wiki repository. Rather than relying on external NLP applications, we developed an approach that brings NLP to wiki systems as an integrated feature, thereby creating new human/AI collaboration patterns in which human users work together with automated "intelligent assistants" on developing, structuring and improving wiki content. This is achieved with our open source Wiki-NLP integration, a Semantic Assistants add-on that allows NLP services to be incorporated into the MediaWiki environment, thereby enabling wiki users to benefit from modern text mining techniques.
This tutorial has two main parts. In the first part, we present an introduction to NLP and text mining, as well as related frameworks, in particular the General Architecture for Text Engineering (GATE) and the Semantic Assistants framework. Building on these foundations, the second part looks into the Wiki-NLP integration and shows how you can add arbitrary text processing services to your (Semantic) MediaWiki instance with minimal effort. Throughout the tutorial, we illustrate the use of NLP in wikis with a number of application examples from various domains that we developed in our research over the last decade, such as cultural heritage data management, collaborative software requirements engineering, and biomedical knowledge management. These showcases of the Wiki-NLP integration highlight a number of integration patterns that will help you adopt this technology for your own domain.