Semantic Software Lab
Concordia University
Montréal, Canada

Semantic Web and Information Extraction (SWAIE 2013) Workshop, Hissar, Bulgaria


SWAIE 2013: Semantic Web and Information Extraction

Full-day workshop in conjunction with RANLP 2013

September 12/13, 2013, Hissar, Bulgaria


There is a vast wealth of information available in textual format that the Semantic Web cannot yet tap into: 80% of data on the Web and on internal corporate intranets is unstructured, hence analysing and structuring the data (social analytics and next-generation analytics) is a large and growing endeavour. The goal of the 2nd workshop on Semantic Web and Information Extraction is to bring researchers from the fields of Information Extraction and the Semantic Web together to foster inter-domain collaboration. To make sense of the large amounts of textual data now available, we need help from both the Information Extraction and Semantic Web communities. The Information Extraction community specialises in mining nuggets of information from text; such techniques could, however, be enhanced by annotated data or domain-specific resources. The Semantic Web community has already taken great strides in making these resources available through the Linked Open Data cloud, and they are now ready for uptake by the Information Extraction community. The workshop invites contributions around three particular topics: 1) Semantic Web-driven Information Extraction, 2) Information Extraction for the Semantic Web, and 3) applications and architectures at the intersection of Semantic Web and Information Extraction.


The Semantic Web aims to add a machine-tractable, repurposable layer to complement the existing web of natural language hypertext. In order to realise this vision, the creation of semantic annotation, the linking of Web pages to ontologies and the creation, evolution and interrelation of ontologies must become automatic or semi-automatic processes. Information Extraction, a form of natural language analysis, is becoming a central technology to link Semantic Web models with documents. On the other hand, traditional Information Extraction can be enhanced by the addition of semantic information, enabling disambiguation of concepts, reasoning and inference to take place over the documents. The primary goal of this workshop is to advance the understanding of the relationship between Information Extraction and the Semantic Web. With the adoption of the Web 2.0 paradigm, these technologies further face new challenges because of their inherent multi-source nature, while the rapidly increasing use of social media also brings a new set of problems in dealing with degraded forms of text, with incorrect grammar, spelling and so on. Information Extraction now has to deal not just with isolated texts or single narratives but with large-scale repositories or sources -- in one or many languages -- containing a multiplicity of views, opinions, or commentaries on particular topics, entities or events, in very diverse styles and formats. New methods and tools thus need to be developed to deal with the changing face of data and the changing needs of society. Furthermore, traditional platforms and architectures for Information Extraction are not necessarily capable of smoothly handling the transition to more semantic forms of annotation. While language analysis tools may not require sophisticated ontology handling mechanisms, the ensuing lack of interoperability can be problematic when embedding such tools and platforms in Semantic Web architectures.


Participants will come from various areas of research that are represented in the Semantic Web and Information Extraction communities such as: artificial intelligence, ontology population, data mining, machine learning, knowledge representation, and web information systems. Some participants will probably be especially interested in particular application areas, such as the biomedical domain, government, cultural heritage, or entertainment.