KorAP: Corpus

Data Basis

The basis of this KorAP instance is the corpus DeReKo-KorAP-2026-I, a virtual subcorpus of the corresponding version of the German Reference Corpus DeReKo, maintained by the Corpus Development Project at IDS.

In addition to the W-Gesamt archive from COSMAS II, DeReKo-KorAP-2026-I contains corpora that improve coverage of regions or text types and/or are required by the Council for German Orthography.

A detailed overview of the composition of DeReKo-KorAP-2026-I can be found on the Composition by Source page.

Virtual Subcorpora

DeReKo and KorAP invite their users to define their own virtual subcorpus that is as representative as possible with regard to their current research question and intended language domain.

Frequently Used Subcorpora
Some frequently used collections are available as predefined subcorpora. You can adapt all of their definitions to your needs.
Persistent Virtual Corpora
These are persistently defined virtual subcorpora, most of which are also accessible via COSMAS II.
Your Suggestions
If you would like to make your own virtual subcorpus persistent or suggest additions, simply contact us by email.

How to Cite?

Information on citing the data basis and the KorAP analysis platform can be found on the Citation Help page.