Rhetector: Automatic Detection of Rhetorical Entities in Scientific Literature
Our Rhetector component is a GATE plugin for the automatic detection of Rhetorical Entities (REs) in scientific literature. For background information on the design and application of REs, please read our paper: "Semantic representation of scientific literature: bringing claims, contributions and named entities onto the Linked Open Data cloud", PeerJ Computer Science, vol. 1, no. e37, PeerJ, 12/2015.
1. What are Rhetorical Entities?
In the context of scientific literature, Rhetorical Entities (REs) are spans of text (sentences, passages, sections, etc.) in a document in which authors convey their findings, such as Claims or Arguments, to the readers. REs are usually situated in certain parts of a document, depending on their role. For example, the authors' Claims are typically stated in the Abstract, Introduction or Conclusion section of a paper, and seldom in the Background. This matches how researchers habitually read and write scientific articles. Extracting REs verbatim from the text helps readers direct their attention efficiently when reading a paper, and improves retrieval by letting users find documents based on their REs (e.g., “Give me all papers with implementation details”).
For each detected RE, an annotation “RhetoricalEntity” is added to the document. Based on the grammatical structure of the RE, it is classified and mapped onto existing concepts on the Linked Open Data (LOD) cloud. The fully-qualified URI of the RE type is stored as the value of the “URI” feature of each annotation.
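The annotation layout described above can be illustrated with a minimal sketch. It does not use the GATE API: the `Annotation` class is a simplified stand-in, the two URIs are hypothetical placeholders rather than the concepts Rhetector actually maps to, and `group_by_uri` is an illustrative helper. Only the annotation type “RhetoricalEntity” and the “URI” feature name are taken from the description.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical placeholder URIs (assumption); the real values are whatever
# Rhetector stores in the "URI" feature of each annotation.
CLAIM_URI = "http://example.org/rhetoric#Claim"
CONTRIBUTION_URI = "http://example.org/rhetoric#Contribution"


@dataclass
class Annotation:
    """Simplified stand-in for an annotation over a text span (not the GATE API)."""
    type: str
    start: int
    end: int
    features: dict = field(default_factory=dict)


def group_by_uri(annotations, text):
    """Map each RE type URI to the verbatim text spans annotated with it."""
    grouped = defaultdict(list)
    for ann in annotations:
        # Only "RhetoricalEntity" annotations carry the RE type URI.
        if ann.type != "RhetoricalEntity":
            continue
        uri = ann.features.get("URI")
        if uri is not None:
            grouped[uri].append(text[ann.start:ann.end])
    return dict(grouped)


text = "We present a new parser. It outperforms prior work."
annotations = [
    Annotation("RhetoricalEntity", 0, 24, {"URI": CONTRIBUTION_URI}),
    Annotation("RhetoricalEntity", 25, 51, {"URI": CLAIM_URI}),
    Annotation("Sentence", 0, 24),  # unrelated annotation type, ignored
]
result = group_by_uri(annotations, text)
```

Grouping by the “URI” feature is what makes type-based retrieval possible: a query such as "all Claims in this document" reduces to looking up one key in `result`.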
Rhetector is available for direct installation through our GATE update site. If you are not familiar with GATE's Plugin Manager, please follow the installation steps in the documentation (the same file is included in the doc/ folder of the distribution) to install the plugin. The download package includes the component's source code, the user guide and a demo pipeline. You can also download the install package manually, but the recommended way to install the plugin is through the Plugin Manager in the GATE Developer GUI.
4. More information and citation
For more information on the automatic extraction of REs, please refer to our latest publication. If you use our component, we would appreciate a citation of our paper.
The Rhetector component and resources are published under the GNU Lesser General Public License v3 (LGPL3).
6. Version history
First release was v1.0 (10.08.2015)
For questions, comments, etc., please use the Forum.
- "Semantic representation of scientific literature: bringing claims, contributions and named entities onto the Linked Open Data cloud", PeerJ Computer Science, vol. 1, no. e37, PeerJ, 12/2015.
- "What's in this paper? Combining Rhetorical Entities with Linked Open Data for Semantic Literature Querying", Semantics, Analytics, Visualisation: Enhancing Scholarly Data (SAVE-SD 2015), Florence, Italy: ACM, pp. 1023–1028, 05/2015.