Semantic Software Lab
Concordia University
Montréal, Canada

Text Mining

Semantic Publishing Challenge 2015: Supplementary Material

This page provides supplementary material for our submission to the Semantic Publishing Challenge 2015 co-located with the Extended Semantic Web Conference (ESWC 2015).

We present an automatic workflow that performs text segmentation and entity extraction from scientific literature, primarily to address Task 2 of the Semantic Publishing Challenge 2015. The proposed solution is composed of two subsystems: (i) a text mining pipeline, based on the GATE framework, which extracts structural and semantic entities, such as authors' information and citations, from text and produces semantic (typed) annotations; and (ii) a flexible exporting module that translates the document annotations into RDF triples according to a custom mapping file.
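The export step can be illustrated with a minimal sketch. It is not the challenge submission's code: the `Annotation` shape, the `MAPPING` dictionary, the subject IRI scheme, and the `http://example.org/` addresses are all assumptions made for illustration, and the actual mapping file format is not shown on this page. The sketch only shows the general idea of turning typed annotations into N-Triples lines driven by a type-to-vocabulary mapping.

```python
from typing import Dict, List, NamedTuple, Tuple


class Annotation(NamedTuple):
    """A typed annotation over a span of document text (hypothetical shape)."""
    ann_type: str  # annotation type, e.g. "Author"
    text: str      # text covered by the annotation
    start: int     # start offset in the document
    end: int       # end offset in the document


# Hypothetical mapping: annotation type -> (RDF class IRI, property IRI for the text).
MAPPING: Dict[str, Tuple[str, str]] = {
    "Author": ("http://xmlns.com/foaf/0.1/Person",
               "http://xmlns.com/foaf/0.1/name"),
    "Citation": ("http://example.org/vocab#Citation",
                 "http://www.w3.org/2000/01/rdf-schema#label"),
}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def annotations_to_ntriples(doc_iri: str,
                            annotations: List[Annotation],
                            mapping: Dict[str, Tuple[str, str]]) -> List[str]:
    """Emit two triples per mapped annotation: its RDF type and its covered text."""
    triples: List[str] = []
    for ann in annotations:
        if ann.ann_type not in mapping:
            continue  # annotation types without a mapping entry are not exported
        rdf_class, text_prop = mapping[ann.ann_type]
        # Assumed subject scheme: document IRI plus type and offsets as fragment.
        subject = f"{doc_iri}#{ann.ann_type}_{ann.start}_{ann.end}"
        triples.append(f"<{subject}> <{RDF_TYPE}> <{rdf_class}> .")
        triples.append(f'<{subject}> <{text_prop}> "{escape_literal(ann.text)}" .')
    return triples
```

Keeping the vocabulary in a separate mapping, rather than hard-coding it in the exporter, means the same annotations could be exported against a different vocabulary by editing the mapping alone.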

SAVE-SD 2015 Publication: Supplementary Material

This page provides supplementary material for our publication in the SAVE-SD 2015 workshop on Semantics, Analytics, Visualisation: Enhancing Scholarly Data. We have published our populated knowledge base from the experiments described in the paper. In order to reproduce the results in the "Application" section, you can execute the queries by clicking on the link to the full page below.

Zeeva: A Collaborative Semantic Literature Management System

The overabundance of literature available in online repositories is an ongoing challenge for scientists who have to efficiently manage and analyze content for their information needs. Most existing literature management systems merely provide support for storing bibliographical metadata, tagging, and simple annotation capabilities. We go beyond these approaches by demonstrating how an innovative combination of semantic web technologies with natural language processing can mitigate information overload by helping to curate and organize scientific literature. Zeeva is our research prototype for demonstrating how existing papers can be turned into a queryable knowledge base.

Tutorial: Adding Natural Language Processing Support to your (Semantic) MediaWiki

Wikis have become powerful knowledge management platforms, offering high customizability while remaining relatively easy to deploy and use. With a majority of content in natural language, wikis can greatly benefit from automated text analysis techniques. Natural Language Processing (NLP) is a branch of computer science that employs various Artificial Intelligence (AI) techniques to process content written in natural language. NLP-enhanced wikis can support users in finding, developing and organizing knowledge contained inside the wiki repository. Rather than relying on external NLP applications, we developed an approach that brings NLP as an integrated feature to wiki systems, thereby creating new human/AI collaboration patterns, where human users work together with automated "intelligent assistants" on developing, structuring and improving wiki content. This is achieved with our open source Wiki-NLP integration, a Semantic Assistants add-on that makes it possible to incorporate NLP services into the MediaWiki environment, thereby enabling wiki users to benefit from modern text mining techniques.

This tutorial has two main parts. In the first part, we will present an introduction to NLP and text mining, as well as related frameworks, in particular the General Architecture for Text Engineering (GATE) and the Semantic Assistants framework. Building on the foundations covered in the first part, we will then look into the Wiki-NLP integration and show how you can add arbitrary text processing services to your (Semantic) MediaWiki instance with minimal effort. Throughout the tutorial, we illustrate the use of NLP in wikis with a number of application examples from various domains that we developed in our research over the last decade, such as cultural heritage data management, collaborative software requirements engineering, and biomedical knowledge management. These showcases of the Wiki-NLP integration highlight a number of integration patterns that will help you adopt this technology for your own domain.

Semantic Management of Scholarly Literature: A Wiki-based Approach

Sateli, B., "Semantic Management of Scholarly Literature: A Wiki-based Approach", in Sokolova, M., and P. van Beek (Eds.), The 27th Canadian Conference on Artificial Intelligence (Canadian AI 2014), LNCS vol. 8436, Montréal, Canada: Springer, pp. 387–392, 04/2014.