Tutorial

Annotations

KorAP provides access to multiple levels of annotation originating from multiple resources, so-called foundries.

Base Foundry

The base foundry is available for all corpora and acts as common ground for document structure annotation in the layer s. It supports three types of spans: <base/s=s> for sentences, <base/s=p> for paragraphs, and <base/s=t> for the span covering the whole text.

<base/s=s>
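
The three span types above differ only in the value after `s=`, so the queries can be generated mechanically. The following is a minimal sketch, not part of KorAP: the function name `base_span_query` and the keys `sentence`, `paragraph`, and `text` are assumptions made for illustration.

```python
# Hypothetical helper (not a KorAP API): maps a readable name to the
# span value used in the base foundry's structure layer "s".
BASE_SPANS = {"sentence": "s", "paragraph": "p", "text": "t"}


def base_span_query(kind: str) -> str:
    """Return the span query string for one structural unit."""
    try:
        value = BASE_SPANS[kind]
    except KeyError:
        raise ValueError(f"unknown span kind: {kind!r}") from None
    return f"<base/s={value}>"


if __name__ == "__main__":
    for kind in BASE_SPANS:
        print(kind, "->", base_span_query(kind))
```

The resulting strings are only the query text; how they are submitted to KorAP is outside the scope of this sketch.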

Connexor (cnx)

Connexor annotations provide the following layers under the cnx prefix:

l
All lemmas are written in lower case. Compounds are split, e.g. the token "Leitfähigkeit" is matched by the lemmas "leit" and "fähigkeit", but not by the lemma "leitfähigkeit".
p
Part-of-speech information is written in capital letters and is based on STTS
syn
Includes token-based information like @PREMOD, @NH, @MAIN ...
m
Includes information about tense (PRES ...), mood (IND), number (PL ...) etc.
c
Only nominal phrases are available; they are written in lower case (np).
[cnx/p=CC]

CoreNLP (corenlp)

p
Part-of-speech information is written in capital letters and is based on STTS
c
Constituency information follows the annotations of the negr@ corpus.
ne
Contains named entities like I-PER, I-ORG etc.
ne_hgc_175m_600
See ne above.
ne_dewac_175m_600
See ne above.
[corenlp/ne_dewac_175m_600=I-ORG]

TreeTagger (tt)

l
All non-noun lemmas are written in lower case; noun lemmas are capitalized. Compounds stay intact (e.g. Normalbedingung).
p
All part-of-speech information is written in capital letters and is based on STTS
[tt/p=ADV]

Mate (mate)

l
All lemmas are written in lower case. Compounds stay intact (e.g. buchstabenbezeichnung).
p
All part-of-speech information is written in capital letters and is based on STTS
m
Includes information about tense (tense:pres ...), mood (mood:ind), number (number:pl ...), gender (gender:masc ...) etc.
[mate/m=gender:fem]
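
The lemma layers of Connexor, TreeTagger, and Mate use different casing conventions, which matters when the same word is searched in several foundries. The sketch below only covers casing as described in the sections above. It does not split compounds the way Connexor does, and the function name `lemma_form` and the `is_noun` flag are assumptions made for illustration.

```python
# Hypothetical helper (not a KorAP API): adjusts the casing of a lemma
# to the convention described for each foundry's "l" layer.
LOWERCASE_FOUNDRIES = {"cnx", "mate"}


def lemma_form(foundry: str, lemma: str, is_noun: bool = False) -> str:
    """Return the lemma in the casing the given foundry is described to use."""
    if foundry in LOWERCASE_FOUNDRIES:
        # Connexor and Mate: all lemmas in lower case.
        return lemma.lower()
    if foundry == "tt":
        # TreeTagger: nouns capitalized, everything else in lower case.
        if is_noun:
            return lemma[:1].upper() + lemma[1:].lower()
        return lemma.lower()
    raise ValueError(f"no lemma convention known for foundry {foundry!r}")


if __name__ == "__main__":
    print(lemma_form("tt", "normalbedingung", is_noun=True))
    print(lemma_form("mate", "Buchstabenbezeichnung"))
    print(lemma_form("cnx", "Fähigkeit"))
```

The caller has to supply `is_noun`, because the sketch has no way of deciding the part of speech on its own.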

OpenNLP (opennlp)

p
All part-of-speech information is written in capital letters and is based on STTS
[opennlp/p=PDAT]
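
All example queries in the foundry sections above follow the same pattern, `[foundry/layer=value]`. As a minimal sketch of that pattern, the following builds the query strings from their three parts. The function name `token_query` is an assumption made for illustration, and the sketch does no validation of foundries, layers, or tag values.

```python
# Hypothetical helper (not a KorAP API): assembles a token query from
# the foundry prefix, the layer name, and the annotation value.
def token_query(foundry: str, layer: str, value: str) -> str:
    """Return a query string of the form [foundry/layer=value]."""
    return f"[{foundry}/{layer}={value}]"


if __name__ == "__main__":
    # The same examples that appear in the foundry sections.
    examples = [
        ("cnx", "p", "CC"),
        ("corenlp", "ne_dewac_175m_600", "I-ORG"),
        ("tt", "p", "ADV"),
        ("mate", "m", "gender:fem"),
        ("opennlp", "p", "PDAT"),
    ]
    for foundry, layer, value in examples:
        print(token_query(foundry, layer, value))
```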

Default Foundries

For queries on specific layers that do not name a foundry, KorAP provides default foundries. The default foundries apply to the following layers:

In the Lucene backend, the orth layer can only be bound to a specific foundry, as only one tokenization is supported.
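
One way to picture default foundries is as a lookup from layer to foundry that is consulted only when a query names no foundry. The sketch below models that idea. It is not KorAP's actual resolution mechanism, the function name `resolve_token_query` is an assumption, and the mapping used in the example is a placeholder invented for illustration, not KorAP's real default configuration.

```python
from typing import Mapping, Optional


# Hypothetical helper (not a KorAP API): fills in a foundry from a
# caller-supplied defaults table when the query does not name one.
def resolve_token_query(
    layer: str,
    value: str,
    foundry: Optional[str] = None,
    defaults: Optional[Mapping[str, str]] = None,
) -> str:
    """Return [foundry/layer=value], or [layer=value] if no foundry is known."""
    if foundry is None and defaults is not None:
        foundry = defaults.get(layer)
    prefix = f"{foundry}/" if foundry else ""
    return f"[{prefix}{layer}={value}]"


if __name__ == "__main__":
    # Placeholder mapping, invented for this example only.
    placeholder_defaults = {"p": "tt"}

    print(resolve_token_query("p", "ADV"))
    print(resolve_token_query("p", "ADV", defaults=placeholder_defaults))
    print(resolve_token_query("p", "ADV", foundry="opennlp",
                              defaults=placeholder_defaults))
```

An explicitly named foundry always wins over the defaults table in this sketch, which mirrors the idea that defaults only apply to queries without a given foundry.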