Krill: Corpus-data Retrieval Index using Lucene for Look-ups
Krill is a Lucene-based search engine for large annotated corpora, developed at the Institute for German Language (IDS) in Mannheim, Germany.
Krill is the reference implementation for the KoralQuery protocol, covering most of its query features, including ...
"Find all occurrences of the phrase 'sea monster'!"
"Find all case-insensitive words matching the regular expression /krak.*/"
"Find all plural nouns in accusative!"
"Find all nominal phrases!"
...
...
...
"Find all words marked as a noun by TreeTagger and marked as an adjective by CoreNLP (https://github.com/stanfordnlp/CoreNLP)!"
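To make the regular expression example above concrete, here is a minimal sketch of what "case-insensitive words matching /krak.*/" means for a tokenized text. It uses only the standard `java.util.regex` package. It is not Krill's API and not how Krill evaluates queries (Krill runs them against a Lucene index); the class name `RegexQuerySketch`, the method `matchWords`, and the sample sentence are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only: plain Java regex matching, not Krill code.
public class RegexQuerySketch {

    // Return every whitespace-separated token that matches the whole
    // regular expression, ignoring case.
    static List<String> matchWords(String text, String regex) {
        Pattern pattern = Pattern.compile(
            regex, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        List<String> hits = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            // matches() requires the full token to match, so "krak.*"
            // selects tokens that start with "krak" in any casing.
            if (pattern.matcher(token).matches()) {
                hits.add(token);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "The Kraken met a KRAKE near Krakow , not a cracker";
        // prints [Kraken, KRAKE, Krakow]
        System.out.println(matchWords(text, "krak.*"));
    }
}
```

The sketch treats each token independently, which mirrors the word-level nature of the query; "cracker" is not returned because it does not begin with "krak".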
Virtual collections; partial highlighting; support for overlapping spans; relational queries; hierarchical queries ...
...
$ git clone https://github.com/KorAP/Krill
$ cd Krill
To run the test suite, type in ...
$ mvn test
To start the server, type in ...
$ mvn compile exec:java
To compile and run the indexer, type ...
$ mvn compile assembly:single
$ java -jar target/KorAP-krill-X.XX.jar src/main/resources/korap.conf src/test/resources/examples/
Authors: Nils Diewald, Eliza Margaretha
Copyright 2013-2015, IDS Mannheim, Germany
Krill is developed as part of the KorAP Corpus Analysis Platform at the Institute for German Language (IDS).
For recent changes and compatibility issues, please consult the Changes file.
Krill is published under the BSD-2 License.
To cite this work, please ...
Named entities in the test data were annotated by CoreNLP using models based on:
Manaal Faruqui and Sebastian Padó (2010): Training and Evaluating a German Named Entity Recognizer with Semantic Generalization, Proceedings of KONVENS 2010, Saarbrücken, Germany