Semantic representation of scientific literature: bringing claims, contributions and named entities onto the Linked Open Data cloud
Submitted by bahar on Thu, 2016-01-07 17:09
Rhetector is a GATE plugin for the automatic detection of Rhetorical Entities (REs) in scientific literature. Rhetorical Entities are spans of text (sentences, passages, sections, etc.) in a document where authors convey their findings, such as Claims or Arguments, to readers. We designed a lightweight pipeline to automatically detect rhetorical entities in scientific literature, currently limited to Claims and Contributions. The motivation for and application of Rhetector are described in our publication, "Semantic representation of scientific literature: bringing claims, contributions and named entities onto the Linked Open Data cloud", PeerJ Computer Science, vol. 1, no. e37 PeerJ, 12/2015.
We present an automatic workflow that performs text segmentation and entity extraction from scientific literature, primarily to address Task 2 of the Semantic Publishing Challenge 2015. The proposed solution is composed of two subsystems: (i) a text mining pipeline, built on the GATE framework, which extracts structural and semantic entities, such as authors' information and citations, from text and produces semantic (typed) annotations; and (ii) a flexible exporting module that translates the document annotations into RDF triples according to a custom mapping file.
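The second subsystem, which turns typed annotations into RDF triples under the control of a mapping, can be illustrated with a minimal sketch. This is not the authors' exporter: the annotation layout, the `MAPPING` structure, the `annotations_to_ntriples` function, and every `example.org` URI are assumptions made up for illustration. Only the `rdf:type` URI is a real W3C identifier.

```python
from typing import Dict, List

# Hypothetical mapping: annotation type -> RDF class, feature name -> predicate.
# All example.org URIs are placeholders, not the vocabulary used by the authors.
MAPPING: Dict[str, dict] = {
    "Author": {
        "rdf_type": "http://example.org/vocab#Author",
        "features": {"name": "http://example.org/vocab#name"},
    },
    "Citation": {
        "rdf_type": "http://example.org/vocab#Citation",
        "features": {"title": "http://example.org/vocab#title"},
    },
}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/doc/"  # assumed base URI for generated subjects


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def annotations_to_ntriples(
    doc_id: str, annotations: List[dict], mapping: Dict[str, dict] = MAPPING
) -> List[str]:
    """Translate typed annotations into N-Triples lines according to a mapping.

    Annotations whose type has no mapping rule are skipped.
    """
    lines: List[str] = []
    for index, ann in enumerate(annotations):
        rule = mapping.get(ann["type"])
        if rule is None:
            continue
        # One subject URI per annotation, derived from document id and position.
        subject = f"<{BASE}{doc_id}#{ann['type']}_{index}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{rule['rdf_type']}> .")
        for feature, predicate in rule["features"].items():
            if feature in ann.get("features", {}):
                literal = escape_literal(str(ann["features"][feature]))
                lines.append(f'{subject} <{predicate}> "{literal}" .')
    return lines


if __name__ == "__main__":
    sample = [
        {"type": "Author", "features": {"name": "Jane Doe"}},
        {"type": "Footnote", "features": {}},  # no rule in MAPPING, so skipped
    ]
    for line in annotations_to_ntriples("paper1", sample):
        print(line)
```

Keeping the mapping as data rather than code mirrors the idea of a custom mapping file: the export vocabulary can change without touching the extraction pipeline.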
This page provides supplementary material for our submission to the SAVE-SD 2015 workshop on Semantics, Analytics, Visualisation: Enhancing Scholarly Data. We have published the knowledge base populated in the experiments described in the paper. To reproduce the results in the "Application" section, you can execute the queries on the full page linked below.
The overabundance of literature available in online repositories is an ongoing challenge for scientists who have to efficiently manage and analyze content for their information needs. Most existing literature management systems merely support storing bibliographical metadata, tagging, and simple annotation. We go beyond these approaches by demonstrating how a combination of semantic web technologies with natural language processing can mitigate information overload by helping to curate and organize scientific literature. Zeeva is our research prototype for demonstrating how existing papers can be turned into a queryable knowledge base.