=== OVERVIEW ===

This file provides the index of documents used in our intrinsic evaluation corpus. We created a gold-standard corpus from three different corpora:

- 10 randomly chosen documents from the proceedings of the Semantic Publishing workshop [1] from 2011–2014.

- 10 randomly chosen open-access papers from the PeerJ Computer Science journal [2].

- 10 randomly chosen conference articles in computational linguistics from the AZ corpus [3], originally curated by Teufel [4].

The document IDs in the evaluation corpus and their corresponding URLs are listed below:

== SEPUBLICA ==

[Document ID], [URL]

sepublica2011_paper-03, http://ceur-ws.org/Vol-721/paper-03.pdf
sepublica2011_paper-06, http://ceur-ws.org/Vol-721/paper-06.pdf
sepublica2012_paper-01, http://ceur-ws.org/Vol-903/paper-01.pdf
sepublica2012_paper-03, http://ceur-ws.org/Vol-903/paper-03.pdf
sepublica2012_paper-07, http://ceur-ws.org/Vol-903/paper-07.pdf
sepublica2013_paper-04, http://ceur-ws.org/Vol-994/paper-04.pdf
sepublica2013_paper-05, http://ceur-ws.org/Vol-994/paper-05.pdf
sepublica2013_paper-07, http://ceur-ws.org/Vol-994/paper-07.pdf
sepublica2014_paper-05, http://ceur-ws.org/Vol-1155/paper-05.pdf
sepublica2014_paper-07, http://ceur-ws.org/Vol-1155/paper-07.pdf

== PEERJ-COMPSCI ==

[Document ID], [URL]

peerj_paper04.xml, https://peerj.com/articles/cs-4.pdf
peerj_paper05.xml, https://peerj.com/articles/cs-5.pdf
peerj_paper10.xml, https://peerj.com/articles/cs-10.pdf
peerj_paper11.xml, https://peerj.com/articles/cs-11.pdf
peerj_paper12.xml, https://peerj.com/articles/cs-12.pdf
peerj_paper15.xml, https://peerj.com/articles/cs-15.pdf
peerj_paper17.xml, https://peerj.com/articles/cs-17.pdf
peerj_paper19.xml, https://peerj.com/articles/cs-19.pdf
peerj_paper20.xml, https://peerj.com/articles/cs-20.pdf
peerj_paper26.xml, https://peerj.com/articles/cs-26.pdf

== AZ ==

[Document ID], [URL]

9407011.az-scixml.xml, http://arxiv.org/pdf/cmp-lg/9407011.pdf
9408003.az-scixml.xml, http://arxiv.org/pdf/cmp-lg/9408003.pdf
9410005.az-scixml.xml, http://arxiv.org/pdf/cmp-lg/9410005.pdf
9411021.az-scixml.xml, http://arxiv.org/pdf/cmp-lg/9411021.pdf
9502022.az-scixml.xml, http://arxiv.org/pdf/cmp-lg/9502022.pdf
9503009.az-scixml.xml, http://arxiv.org/pdf/cmp-lg/9503009.pdf
9503023.az-scixml.xml, http://arxiv.org/pdf/cmp-lg/9503023.pdf
9504002.az-scixml.xml, http://arxiv.org/pdf/cmp-lg/9504002.pdf
9505001.az-scixml.xml, http://arxiv.org/pdf/cmp-lg/9505001.pdf
9511001.az-scixml.xml, http://arxiv.org/pdf/cmp-lg/9511001.pdf
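
The listings above follow a simple layout: a "== NAME ==" heading per corpus, followed by one "Document ID, URL" pair per line. As a minimal sketch of how this index could be read programmatically, assuming that layout holds for the whole file, the following Python snippet groups the pairs by corpus. The function name parse_index and the regular expression are hypothetical and are not part of the original evaluation tooling.

```python
import re

# Assumed layout: corpus headings look like "== NAME ==" (exactly two '=' signs
# on each side); top-level headings such as "=== OVERVIEW ===" use three or more.
SECTION_RE = re.compile(r"^==\s*([^=]+?)\s*==$")


def parse_index(text):
    """Return {corpus name: [(document ID, URL), ...]} from the index text.

    Hypothetical helper for illustration only.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("==="):
            # Top-level heading (overview, references): leave any corpus section.
            current = None
            continue
        match = SECTION_RE.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
            continue
        # Skip text outside a corpus section, blank lines, and the
        # "[Document ID], [URL]" column header.
        if current is None or "," not in line or line.startswith("["):
            continue
        doc_id, url = (part.strip() for part in line.split(",", 1))
        if url.startswith(("http://", "https://")):
            sections[current].append((doc_id, url))
    return sections


# Small excerpt of the listing above, used as sample input.
sample = """\
== SEPUBLICA ==

[Document ID], [URL]

sepublica2011_paper-03, http://ceur-ws.org/Vol-721/paper-03.pdf

== AZ ==

9407011.az-scixml.xml, http://arxiv.org/pdf/cmp-lg/9407011.pdf
"""
index = parse_index(sample)
```

Splitting on the first comma only keeps a URL intact even if it happens to contain a comma; lines that do not end in an http(s) URL are ignored rather than treated as errors.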

=== REFERENCES ===

[1] Semantic Publishing Workshop (SePublica), http://sepublica.mywikipaper.org/drupal/
[2] PeerJ Computer Science Journal, https://peerj.com/computer-science/
[3] Argumentation Zoning (AZ) Corpus, http://www.cl.cam.ac.uk/~sht25/AZ_corpus.html
[4] Teufel, S. (2010). The Structure of Scientific Articles: Applications to Citation Indexing and Summarization. Center for the Study of Language and Information.
