KorAP: Annotations

KorAP provides access to multiple levels of annotations originating from multiple resources, so-called foundries.

Base Foundry

The base foundry is available for all corpora and acts as a common ground for document structure annotation in the layer s.

s
Document structure supporting the spans: <base/s=s> for sentences, <base/s=p> for paragraphs, and <base/s=t> for the text span.
<base/s=s>

DeReKo (dereko)

DeReKo annotations provide the following layer for the dereko prefix:

s
Document structure as encoded in the I5 text document.
startsWith(<dereko/s=s>, Fragestunde)

CoreNLP (corenlp)

CoreNLP annotations provide the following layers for the corenlp prefix:

p
Part-of-speech information is written in capital letters and is based on STTS.
c
Constituency information follows the annotations of the negr@ corpus.
ne
Contains named entities such as I-PER and I-ORG.
ne_hgc_175m_600
Named entities, as described for ne above.
ne_dewac_175m_600
Named entities, as described for ne above.
[corenlp/ne_dewac_175m_600=I-ORG]

TreeTagger (tt)

TreeTagger annotations provide the following layers for the tt prefix:

l
All non-noun lemmas are written in lower case; noun lemmas are capitalized. Compounds stay intact (e.g. Normalbedingung).
p
All part-of-speech information is written in capital letters and is based on STTS.
[tt/p=ADV]

Malt (malt)

Malt annotations provide the following layer for the malt prefix:

d
Dependency information
tt/p="PPOSAT" ->malt/d[func="DET"] node

OpenNLP (opennlp)

OpenNLP annotations provide the following layer for the opennlp prefix:

p
All part-of-speech information is written in capital letters and is based on STTS.
[opennlp/p=PDAT]

Marmot (marmot)

Marmot annotations provide the following layers for the marmot prefix:

p
Part-of-speech information is written in capital letters and is based on STTS.
m
Includes information about case (acc ...), degree (pos), gender (fem ...) etc.
[marmot/m=degree:sup & marmot/p=ADJA]
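The bracketed token queries and angle-bracketed span queries shown for the foundries above all follow the same foundry/layer=value pattern. As a minimal sketch of that pattern, the following Python helpers assemble such query strings. The function names term, token, and span are hypothetical and not part of KorAP; the helpers only format strings and do not validate foundries, layers, or values.

```python
def term(foundry: str, layer: str, value: str) -> str:
    """Return a foundry-qualified term such as 'tt/p=ADV'."""
    return f"{foundry}/{layer}={value}"


def token(*terms: str) -> str:
    """Wrap one or more terms in a token query, joined with '&'."""
    if not terms:
        raise ValueError("at least one term is required")
    return "[" + " & ".join(terms) + "]"


def span(foundry: str, layer: str, value: str) -> str:
    """Return a span query such as '<base/s=s>'."""
    return "<" + term(foundry, layer, value) + ">"


if __name__ == "__main__":
    # Reproduces query strings that appear in the sections above.
    print(token(term("tt", "p", "ADV")))
    print(token(term("marmot", "m", "degree:sup"),
                term("marmot", "p", "ADJA")))
    print(span("base", "s", "s"))
```

Keeping term separate from token makes it possible to combine several constraints on one token, as in the Marmot example.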

Default Foundries

For queries on specific layers without given foundries, KorAP provides default foundries. The default foundries apply to the following layers:

In the Lucene backend, the orth layer can only be bound to a specific foundry, as only one tokenization is supported.
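The fallback described in this section can be sketched as a simple lookup: if a query names a layer but no foundry, a default foundry is filled in. The Python sketch below is illustrative only. The function qualify and the mapping HYPOTHETICAL_DEFAULTS are assumptions made for this example and do not reflect KorAP's actual default configuration.

```python
from typing import Mapping, Optional

# Hypothetical layer-to-foundry mapping, for illustration only;
# it is NOT KorAP's actual default configuration.
HYPOTHETICAL_DEFAULTS = {"p": "tt", "l": "tt"}


def qualify(layer: str, value: str,
            foundry: Optional[str] = None,
            defaults: Optional[Mapping[str, str]] = None) -> str:
    """Return 'foundry/layer=value', using a default foundry if none is given."""
    if foundry is None:
        if defaults is None or layer not in defaults:
            raise KeyError(f"no default foundry for layer {layer!r}")
        foundry = defaults[layer]
    return f"{foundry}/{layer}={value}"


if __name__ == "__main__":
    # No foundry given: the (hypothetical) default for layer 'p' is used.
    print(qualify("p", "ADV", defaults=HYPOTHETICAL_DEFAULTS))
    # An explicit foundry always takes precedence over the defaults.
    print(qualify("p", "PDAT", foundry="opennlp",
                  defaults=HYPOTHETICAL_DEFAULTS))
```

An explicit foundry takes precedence in this sketch, so queries that already name a foundry are left unchanged.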