Semantic Software Lab
Concordia University
Montréal, Canada

Semantic User Profiling of Scholars (PeerJ CompSci 2016): Supplementary Material


Supplementary material for our PeerJ CompSci submission on Semantic User Profiling. Note: the files are provided here for review purposes and will be moved to GitHub over the next few weeks.

1. Data

1.1. Knowledge Base

The KB files are provided as supplements in N-Quads format. The files were generated with Jena's tdbdump command, but should also load fine into other, non-Jena triplestores. To load them into a new KB, create an empty directory (e.g., /tmp/tdb) and issue:

  1. tdbloader --loc=/tmp/tdb triples.nq

For more information on tdbloader, please refer to Apache Jena's TDB Command-line Utilities page.
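As a quick sanity check after loading (a sketch, assuming the /tmp/tdb location and the triples.nq file from the example above), you can dump the store back out with tdbdump and count the statements:

```shell
# Load the supplementary N-Quads file into a fresh TDB store
# (empty directory /tmp/tdb, as in the example above).
tdbloader --loc=/tmp/tdb triples.nq

# Sanity check: dump the store back to N-Quads and count the
# statements; the count should match the input file.
tdbdump --loc=/tmp/tdb | wc -l
```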

2. Software

All software described in the paper is available under open source licenses.

2.1. Required Third-Party Software

  • You need a JDK (Java 7 or newer), as well as an installation of Apache Ant.
  • You also need Apache Jena, version 2.13 or newer.
  • If you want to run the text mining pipeline, you must have GATE, version 8.1 or newer, installed. See the GATE homepage for instructions on how to install and run GATE.
  • Also for the text mining pipeline, you need access to a DBpedia Spotlight installation (RESTful interface). Since the output depends heavily on the model used for Spotlight, you will need to use the exact same model as in our paper, en_2+2, to reproduce our results.
  • For the automatic evaluation tool, you additionally need Apache Ivy.

2.2. Text Mining Pipeline

The text mining pipeline described in the paper is provided as a ZIP file. Unzip the downloaded package on your workstation and follow the instructions below to reproduce our experiments (note: the pipeline has been tested on Linux and Mac OS X):

  1. You must have created a TDB-based triplestore and loaded our mapping rules as described above.
  2. Start GATE (v8.1 or newer). Choose File → Restore Application from File, then browse to where you unzipped the pipeline and choose the provided .xgapp file.
  3. Once the pipeline is loaded, you can open any document.
  4. Create a new corpus and add the documents.
  5. Double-click on the Semantic_Profiling pipeline under Applications. Choose the corpus you created above from the drop-down and click the Run this Application button.
  6. Once the pipeline has finished, you can open the documents and examine their annotations.
  7. To examine the generated triples, first close the GATE application. Then you can either inspect the triples using Jena's tdbdump command or publish them through a Fuseki server.

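For step 7, a minimal sketch, assuming the /tmp/tdb store created in Section 1.1 (the dataset name /ds is an arbitrary choice, not prescribed by the pipeline):

```shell
# Option 1: dump the generated triples to stdout for inspection.
tdbdump --loc=/tmp/tdb

# Option 2: publish the store through a Fuseki server; the SPARQL
# endpoint is then served at http://localhost:3030/ds by default.
fuseki-server --loc=/tmp/tdb /ds
```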
2.3. Automatic Evaluation Tool

The evaluation in the paper is based on the provided responses from the user study participants, exported from LimeSurvey. Our evaluation tool then takes these results and computes the various metrics provided in the paper. You can download the source code as a zip archive.

The tool can be started from the command line in the Analysis folder with the Ant task ant run. The tool processes all files in the data folder (original exported LimeSurvey files in XLSX format) and creates an output folder result containing the analyzed files, named in the format original_file_name_results_metrics.xlsx.
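Put differently (folder names as described above; the LimeSurvey exports are assumed to already be in the data folder):

```shell
# Run the evaluation tool from inside the unzipped Analysis folder.
# It reads every XLSX export in data/ and writes the analyzed
# *_results_metrics.xlsx files into the result/ folder.
cd Analysis
ant run
```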

The currently computed metrics are MAP, Precision@rank, and nDCG, as described in the paper.

Attachments:
  • ScholarLens.zip (5.6 MB)
  • Analysis.zip (458.99 KB)