
Durm

New Book Chapter on Semantic Wikis and Natural Language Processing for Cultural Heritage Data


Springer just published a new book, Language Technology for Cultural Heritage, to which we contributed a chapter: "Integrating Wiki Systems, Natural Language Processing, and Semantic Technologies for Cultural Heritage Data Management". The book collects selected, extended papers from several years of the LaTeCH workshop series, where we presented our work on the Durm Project back in 2008.

In this project, which ran from 2004 to 2006, we analysed the historic Encyclopedia of Architecture, written in German between 1880 and 1943. It was one of the largest projects aiming to conserve all architectural knowledge available at the time. Today, its vast amount of content is mostly lost: few complete sets survive, and its complex structure does not lend itself easily to contemporary use. We were able to track down one of the rare complete sets in the library of the University of Karlsruhe, where it fills several meters of shelves in the archives. The goal, then, was to apply "modern" (as of 2005) semantic technologies to make these heritage documents accessible again by transforming them into a semantic knowledge base (due to funding limitations, we only worked with one book in this project, but the system was designed to eventually cover the complete set). Using techniques from Natural Language Processing and Semantic Computing, we automatically populated an ontology that can be used for various application scenarios: building historians can use it to navigate and query the encyclopedia, while architects can integrate it directly into contemporary construction tools. Additionally, we made all content accessible through a user-friendly Wiki interface, which combines the original text with NLP-derived metadata and adds annotation capabilities for collaborative use (note that not all features are enabled in the public demo version).
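To give a rough idea of what such an ontology population step looks like in practice, here is a minimal sketch in Python using rdflib. This is not the actual Durm pipeline (which is described in the publications below); the namespace, class, and property names are purely hypothetical.

from rdflib import RDF, RDFS, Graph, Literal, Namespace, URIRef

# Hypothetical vocabulary for illustration only; the real Durm ontology
# defines its own classes and properties.
ARCH = Namespace("http://example.org/architecture#")

g = Graph()
g.bind("arch", ARCH)

# Suppose the NLP pipeline extracted the entity "Ziegelmauer" (brick wall)
# from a page of the encyclopedia and classified it as a wall type.
entity = URIRef(ARCH["Ziegelmauer"])
g.add((entity, RDF.type, ARCH.Wall))
g.add((entity, RDFS.label, Literal("Ziegelmauer", lang="de")))
g.add((entity, ARCH.mentionedOnPage, Literal(42)))

# Serialize the populated graph so it can be loaded into a triple store
# or browsed through the Wiki front-end.
print(g.serialize(format="turtle"))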

All data created in the project (scanned book images, generated corpora, etc.) is publicly available under open content licenses. We continue to maintain a number of open source tools that were originally developed for this project, such as the Durm German Lemmatizer. A new version of our Wiki/NLP integration, which will allow everyone to easily set up a similar system, is currently under development and will be available in early 2012.

An Integration Architecture for User-Centric Document Creation, Retrieval, and Analysis

Witte, R., "{An Integration Architecture for User-Centric Document Creation, Retrieval, and Analysis}", Proceedings of the VLDB Workshop on Information Integration on the Web (IIWeb'04), Toronto, Canada, pp. 141–144, August 30, 2004.

Engineering a Semantic Desktop for Building Historians and Architects

Witte, R., P. Gerlach, M. Joachim, T. Kappler, R. Krestel, and P. Perera, "Engineering a Semantic Desktop for Building Historians and Architects", 1st Workshop on The Semantic Desktop - Next Generation Personal Information Management and Collaboration Infrastructure, vol. 175, Galway, Ireland, pp. 138–152, November 6, 2005.

A Semantic Wiki Approach to Cultural Heritage Data Management

Witte, R., T. Gitzinger, T. Kappler, and R. Krestel, "A Semantic Wiki Approach to Cultural Heritage Data Management", Language Technology for Cultural Heritage Data (LaTeCH 2008), Marrakech, Morocco, June 1, 2008.

Durm XML Markup

The formal DTD used within the Durm Corpus is available for download. Here, we briefly describe the meaning of the various elements.
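As a quick illustration of how the DTD can be used, the following sketch validates a corpus file against it with Python and lxml; the file names are placeholders and need to be replaced with the downloaded DTD and an actual corpus document.

from lxml import etree

# Placeholder file names: substitute the downloaded Durm DTD and a corpus file.
with open("durm.dtd") as dtd_file:
    dtd = etree.DTD(dtd_file)

tree = etree.parse("durm_page.xml")

# Report whether the document conforms to the DTD, and show errors if not.
if dtd.validate(tree):
    print("Document is valid against the Durm DTD.")
else:
    print(dtd.error_log.filter_from_errors())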

Durm TUSTEP Markup

TUSTEP in general is documented at http://www.zdv.uni-tuebingen.de/tustep/tustep_eng.html. Here, we only provide an informal overview for users of the TUSTEP version of our Durm Corpus.

The Durm Corpus

As part of the Durm project, we digitized a single volume from the historical German Handbuch der Architektur (Handbook on Architecture), namely:

Scanned page fragment from the Handbuch der Architektur
E. Marx: Wände und Wandöffnungen (Walls and Wall Openings). In "Handbuch der Architektur", Part III, Volume 2, Number I, Second edition, Stuttgart, Germany, 1900.
Contains 506 pages with 956 figures.

The corpus developed in this project is made available under a free document license in several formats: scanned page images, TUSTEP format, and XML format. An online version and tools for transforming between the various formats are also available.

The Durm Project

The Durm project, carried out from 2004 to 2006 at the Institute for Program Structures and Data Organization (IPD) at the University of Karlsruhe, Germany, investigated the use of advanced semantic technologies for cultural heritage data management. The goal was to support end users, in particular users from building history and architecture, with tools that go beyond classical information retrieval techniques. Experiments were carried out on the historical Handbuch der Architektur (Handbook on Architecture).
