Preliminary Feature Freeze of the Lucene Backend

Today we reached a preliminary feature freeze of the Lucene-based backend of KorAP. We now provide a good portion of the functionality of Cosmas II, are close to fully supporting the Poliqarp Query Language, and have introduced some quite nice novel features for corpus querying.
Most of the features still missing here are already provided by the second backend (which uses a Neo4j graph database). Implementing them in the Lucene backend as well is scheduled for early next year and marks the final milestone before the alpha testing phase of the KorAP project begins.
Over the coming weeks we will prepare the frontend to support all of the new backend functionality and start working on the distribution capabilities of the index.

Monitoring KorAP

As part of KorAP’s monitoring functionality, I have implemented a framework for logging and tracing user and service activities. That alone fulfils a design requirement imposed by the licence agreements covering the data KorAP contains (text and annotations alike), but the framework can also record users’ query activity. KorAP’s Auditing framework thus makes it possible to retrieve query information such as the most frequently formulated queries, or the data and annotation levels that are queried most often. This not only lets KorAP produce usage statistics about the underlying data set, but also enables the developers to improve KorAP’s usability in future releases based on the data retrieved.

For example, tracking null queries (queries that return no results) yields data that can be used to extend and improve the documentation, or even to uncover pitfalls in users’ understanding and application of query language constructs. All recorded information is subject to data protection law, and if it is published, the user data will be anonymized. In production mode, KorAP will inform users about the extent of record keeping and about how the data is used.
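To make the kind of statistics described above more concrete, here is a minimal sketch of how recorded queries could be aggregated. It is not KorAP's actual code: the class QueryAuditLog and its methods are hypothetical names chosen for illustration. In line with the anonymization mentioned above, it stores no user identity, only the query string and the number of results.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a query audit log; not part of KorAP. */
class QueryAuditLog {

    /** One audited query: the query string and its result count. */
    static final class Entry {
        final String query;
        final long totalResults;

        Entry(String query, long totalResults) {
            this.query = query;
            this.totalResults = totalResults;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Records one query together with the number of results it returned. */
    void record(String query, long totalResults) {
        entries.add(new Entry(query, totalResults));
    }

    /** Returns up to n queries, most frequent first; ties are ordered alphabetically. */
    List<String> mostFrequent(int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (Entry e : entries) {
            counts.merge(e.query, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < sorted.size() && i < n; i++) {
            top.add(sorted.get(i).getKey());
        }
        return top;
    }

    /** Returns the distinct null queries (zero results) in the order first seen. */
    List<String> nullQueries() {
        Set<String> seen = new LinkedHashSet<>();
        for (Entry e : entries) {
            if (e.totalResults == 0) {
                seen.add(e.query);
            }
        }
        return new ArrayList<>(seen);
    }
}
```

A list such as the one returned by nullQueries() would be the starting point for deciding which query language constructs need better documentation.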

The API was designed with the following concepts in mind:

  • easy extensibility, for the convenience of future developers
  • separation of duties (Auditing takes place outside of the system logic via Spring proxy calls)
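
The separation of duties can be illustrated with a small sketch. Note that it uses a plain JDK dynamic proxy (java.lang.reflect.Proxy) as a stand-in for the Spring proxy mechanism, so that the example is self-contained. The names SearchService, SimpleSearchService and AuditHandler are hypothetical and do not come from KorAP. The point is only that the service implementation contains no auditing code at all, while every call made through the proxy is recorded.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of proxy-based auditing; not KorAP's implementation. */
class AuditingProxyDemo {

    /** Assumed service interface, invented for this example. */
    interface SearchService {
        int search(String query);
    }

    /** System logic: knows nothing about auditing. */
    static class SimpleSearchService implements SearchService {
        @Override
        public int search(String query) {
            // Placeholder "hit count" so that the example needs no index.
            return query.length();
        }
    }

    /** Auditing logic: records each call, then delegates to the real service. */
    static class AuditHandler implements InvocationHandler {
        private final Object target;
        private final List<String> auditTrail;

        AuditHandler(Object target, List<String> auditTrail) {
            this.target = target;
            this.auditTrail = auditTrail;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            String call = method.getName() + Arrays.toString(args);
            try {
                Object result = method.invoke(target, args);
                auditTrail.add(call + " -> " + result);
                return result;
            } catch (InvocationTargetException e) {
                auditTrail.add(call + " failed");
                throw e.getCause(); // rethrow the service's own exception
            }
        }
    }

    /** Wraps a service in a proxy that writes to the given audit trail. */
    static SearchService audited(SearchService target, List<String> auditTrail) {
        return (SearchService) Proxy.newProxyInstance(
                SearchService.class.getClassLoader(),
                new Class<?>[] { SearchService.class },
                new AuditHandler(target, auditTrail));
    }
}
```

Callers only ever see a SearchService, so auditing can be switched on or off by choosing whether to hand out the wrapped or the unwrapped instance, without touching the service itself.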