Lucene is probably THE state-of-the-art tool for indexing text. In short, it creates inverted indexes in which tokens are the keys and their positions are the values, so that any term can be looked up rapidly.
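To make the idea of an inverted index concrete, here is a minimal sketch in Python. It is not Lucene's API or its on-disk format; the function names `build_inverted_index` and `lookup`, the whitespace tokenizer, and the sample documents are all assumptions made for illustration only.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each token to {doc_id: [token positions]}.

    Tokenization is a naive lowercase whitespace split (an assumption
    for this sketch; real analyzers do far more).
    """
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for position, token in enumerate(text.lower().split()):
            index[token][doc_id].append(position)
    return index


def lookup(index, term):
    """Return the postings for a term as a plain dict, or {} if absent."""
    postings = index.get(term.lower(), {})
    return {doc_id: list(positions) for doc_id, positions in postings.items()}


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "the quick brown fox",
    2: "the lazy dog and the fox",
}
index = build_inverted_index(docs)
print(lookup(index, "fox"))  # {1: [3], 2: [5]}
```

Because the token is the key, finding every occurrence of a term is a single dictionary lookup rather than a scan over all documents, and the stored positions are what make phrase or proximity matching possible.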
The Lucene project has released version 3.6 today. Apart from some bug fixes, it mainly provides improvements in text processing, features from which KorAP does not profit very much. In addition, full Java 7 support has been introduced, and the Finite State Transducers applied for certain queries have been improved.