Semantic Software Lab
Concordia University
Montréal, Canada

Software Engineering

Fuzzy Extensions for Reverse Engineering Repository Models

Kölsch, U., and R. Witte, "Fuzzy Extensions for Reverse Engineering Repository Models", Proceedings of the 10th Working Conference on Reverse Engineering (WCRE 2003), Victoria, BC, Canada : IEEE, pp. 113–122, November 13–16, 2003.

Agents and Databases: Friends or Foes?

Lockemann, P. C., and R. Witte, "Agents and Databases: Friends or Foes?", Ninth International Database Engineering and Applications Symposium (IDEAS 2005), Montréal, Québec, Canada, pp. 137–147, July 25–27, 2005.

Ontological Text Mining of Software Documents

Witte, R., Q. Li, Y. Zhang, and J. Rilling, "Ontological Text Mining of Software Documents", NLDB, vol. 4592, CNAM, Paris, France : Springer, pp. 168–180, June 27–29, 2007.

Semantic Technologies in System Maintenance

Rilling, J., R. Witte, D. Gasevic, and J. Z. Pan, "Semantic Technologies in System Maintenance", The 16th IEEE International Conference on Program Comprehension (ICPC 2008), Amsterdam, The Netherlands : IEEE, pp. 279–282, June, 2008.

SE-Advisor

Rilling, J., P. Schuegerl, P. Charland, and R. Witte, "SE-Advisor", CASCON 2008 Technical Showcase, Richmond Hill, Ontario, Canada, October 27–30, 2008.

New Javadoc Doclet for NLP Analysis on Java Source Code

For those interested in performing NLP on source code, in particular on Javadoc comments: we released a doclet at the NLP Frameworks workshop last week.

Its main feature is that it creates an XML corpus from Java source code that is optimised for processing in an NLP framework (GATE in our case, but it should work with any framework that takes XML as input).

The Javadoc NLP Corpus Generation Doclet

This page describes the process of generating a corpus from source code and source code comments using Javadoc. The SSLDoclet is a custom doclet that is passed as a parameter to Javadoc in order to create an Abstract Syntax Tree (AST) that can be used as a corpus within NLP frameworks such as GATE.
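As a rough illustration of what "passed as a parameter to Javadoc" could look like, the sketch below assembles a javadoc command line using the standard `-doclet`, `-docletpath`, `-sourcepath` and `-subpackages` options. The fully qualified class name `example.SSLDoclet`, the jar name `ssldoclet.jar`, the source directory and the package name are placeholders assumed for this example; the actual values depend on the released doclet and on your project.

```java
import java.util.ArrayList;
import java.util.List;

class DocletCommand {
    // Assemble the javadoc arguments for a run that uses a custom doclet.
    // All four parameter values are supplied by the caller; none are
    // taken from the SSLDoclet distribution itself.
    static List<String> buildArgs(String docletClass, String docletPath,
                                  String sourcePath, String rootPackage) {
        List<String> args = new ArrayList<>();
        args.add("javadoc");
        args.add("-doclet");       // class that replaces the standard doclet
        args.add(docletClass);
        args.add("-docletpath");   // where javadoc finds the doclet classes
        args.add(docletPath);
        args.add("-sourcepath");   // root of the Java sources to analyse
        args.add(sourcePath);
        args.add("-subpackages");  // process this package recursively
        args.add(rootPackage);
        return args;
    }

    public static void main(String[] unused) {
        // Hypothetical values, for illustration only.
        List<String> args = buildArgs("example.SSLDoclet", "ssldoclet.jar",
                                      "src", "com.example");
        System.out.println(String.join(" ", args));
    }
}
```

The printed line can be run from a shell, or the list can be handed to `ProcessBuilder` to launch javadoc from a build script.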
