We are happy to announce the open source release of Krill, the Lucene-based search backend for KorAP! Krill is the reference implementation for KoralQuery, covering most of the protocol's features, including …
- Fulltext search
- Token-based annotation search
- Span-based annotation search
- Distance search
- Positional search
- Nested queries
… and many more!
You can download Krill on GitHub – feedback and contributions are very welcome!
Shortly after the release of Lucene version 4.1, KorAP has been updated. The update was smooth, as none of the API changes were relevant for the KorAP applications.
Today, the Lucene team announced the release of Lucene version 4.0. We have been working on migrating our Lucene-based code to Lucene 4.0 since the alpha was released in July this year. Many thanks to all the Lucene developers for another great piece of open source software!
The Proceedings of the Konvens 2012 conference (The 11th Conference on Natural Language Processing) are now online, including the paper “Using information retrieval technology for a corpus analysis platform”, which was written within the KorAP project.
We are happy to report that yesterday we submitted a paper titled “Using Information Retrieval Technology for a Corpus Analysis Platform” to Konvens 2012 (The 11th Conference on Natural Language Processing)!
Lucene is probably THE state-of-the-art tool for indexing text. It creates inverted indexes in which, in short, tokens are the keys and their positions are the values, so that any term can be looked up rapidly.
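To make the idea of an inverted index more concrete, here is a minimal sketch in Python. It is not Lucene's actual data structure or API. The function name `build_index`, the whitespace tokenization, and the sample documents are assumptions made purely for illustration.

```python
from collections import defaultdict

def build_index(docs):
    """Map each token to the list of (document id, token position) pairs
    at which it occurs. Hypothetical helper, not part of Lucene."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        # Naive tokenization: lowercase and split on whitespace.
        for pos, token in enumerate(text.lower().split()):
            index[token].append((doc_id, pos))
    return index

# Two toy documents, invented for this example.
docs = ["the cat sat", "the dog sat on the cat"]
index = build_index(docs)

# Looking up a term is a single dictionary access.
print(index["cat"])
```

Because the tokens are the keys, finding every occurrence of a term does not require scanning the documents again. A real index would add proper tokenization, compression, and on-disk storage on top of this idea.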
The Lucene project released version 3.6 today. The release mainly brings improvements in text processing, features from which KorAP does not profit very much. In addition, several bugs have been fixed, full Java 7 support has been introduced, and the finite state transducers used for certain queries have been improved.