KorAP: Annotations
KorAP provides access to multiple levels of annotations originating from multiple resources, so-called foundries.
Base Foundry
The base foundry is available for all corpora and acts as a common ground for document structure annotation in the layer s.
- s
- Document structure supporting the spans: <base/s=s> for sentences, <base/s=p> for paragraphs, and <base/s=t> for the text span.
<base/s=s>
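The span notation above follows the pattern `<foundry/layer=value>`. As a minimal sketch, the three base spans could be assembled as plain strings in Python. The helper `span_query` and the `BASE_SPANS` table are hypothetical names used only for illustration; they are not part of KorAP.

```python
def span_query(foundry: str, layer: str, value: str) -> str:
    """Format a span query string such as <base/s=s> (hypothetical helper)."""
    return f"<{foundry}/{layer}={value}>"


# The three spans listed for the base foundry on this page.
BASE_SPANS = {"sentence": "s", "paragraph": "p", "text": "t"}

# Build one query string per span type in layer s of the base foundry.
queries = {name: span_query("base", "s", value) for name, value in BASE_SPANS.items()}

for name, query in queries.items():
    print(f"{name}: {query}")
```

The same helper would produce `<dereko/s=s>` when called with the `dereko` prefix, since only the foundry part of the string changes.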
DeReKo (dereko)
DeReKo annotations provide the following layer for the dereko prefix:
- s
- Document structure as encoded in the I5 text document.
startsWith(<dereko/s=s>, Fragestunde)
CoreNLP (corenlp)
CoreNLP annotations provide the following layers for the corenlp prefix:
- p
- Part-of-speech information is written in capital letters and is based on STTS
- c
- Constituency information follows the annotations of the negr@ corpus.
- ne
- Contains named entities like I-PER, I-ORG etc.
- ne_hgc_175m_600
- See above
- ne_dewac_175m_600
- See above
[corenlp/ne_dewac_175m_600=I-ORG]
TreeTagger (tt)
TreeTagger annotations provide the following layers for the tt prefix:
- l
- All non-noun lemmas are written in lower case; nouns are capitalized. Compounds stay intact (e.g. Normalbedingung).
- p
- All part-of-speech information is written in capital letters and is based on STTS
[tt/p=ADV]
Malt (malt)
Malt annotations provide the following layer for the malt prefix:
- d
- Dependency information
tt/p="PPOSAT" ->malt/d[func="DET"] node
OpenNLP (opennlp)
OpenNLP annotations provide the following layer for the opennlp prefix:
- p
- All part-of-speech information is written in capital letters and is based on STTS
[opennlp/p=PDAT]
Marmot (marmot)
Marmot annotations provide the following layers for the marmot prefix:
- p
- Part-of-speech information is written in capital letters and is based on STTS
- m
- Includes information about case (acc...), degree (pos), gender (fem...) etc.
[marmot/m=degree:sup & marmot/p=ADJA]
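The token examples on this page share the shape `[foundry/layer=value]`, with several terms joined by `&`. The following Python sketch rebuilds two of them as strings and rejects foundry/layer pairs that are not listed here. The names `FOUNDRY_LAYERS`, `term`, and `token` are assumptions made for this illustration, and the check reflects only what this page lists, not what a KorAP server actually accepts.

```python
# Layers per foundry as listed on this page; the model-specific
# corenlp ne_* layers are left out of this sketch for brevity.
FOUNDRY_LAYERS = {
    "base": {"s"},
    "dereko": {"s"},
    "corenlp": {"p", "c", "ne"},
    "tt": {"l", "p"},
    "malt": {"d"},
    "opennlp": {"p"},
    "marmot": {"p", "m"},
}


def term(foundry: str, layer: str, value: str) -> str:
    """Format one foundry/layer=value term, checking it against the table."""
    if layer not in FOUNDRY_LAYERS.get(foundry, set()):
        raise ValueError(f"unknown foundry/layer: {foundry}/{layer}")
    return f"{foundry}/{layer}={value}"


def token(*terms: str) -> str:
    """Wrap one or more terms in brackets, joined by the & operator."""
    return "[" + " & ".join(terms) + "]"


print(token(term("tt", "p", "ADV")))
print(token(term("marmot", "m", "degree:sup"), term("marmot", "p", "ADJA")))
```

Keeping the table separate from the formatting functions means a new foundry only needs a new dictionary entry.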
Default Foundries
For queries on specific layers without given foundries, KorAP provides default foundries. The default foundries apply to the following layers:
- orth: opennlp
- lemma: tt
- pos: tt
In the Lucene backend, the orth layer can only be bound to a specific foundry, as only one tokenization is supported.
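The default-foundry rule can be pictured as a lookup that is consulted only when a query names a layer without a foundry. The Python sketch below models that behaviour under the mapping listed above. The function `resolve` and the table `DEFAULT_FOUNDRIES` are hypothetical, and the returned pair is an illustration rather than KorAP query syntax.

```python
from typing import Optional, Tuple

# Default foundries per layer, as listed in this section.
DEFAULT_FOUNDRIES = {"orth": "opennlp", "lemma": "tt", "pos": "tt"}


def resolve(layer: str, foundry: Optional[str] = None) -> Tuple[str, str]:
    """Return (foundry, layer), falling back to the default foundry.

    An explicitly given foundry always wins; the default table is only
    consulted when no foundry is supplied.
    """
    if foundry is not None:
        return foundry, layer
    if layer not in DEFAULT_FOUNDRIES:
        raise ValueError(f"no default foundry for layer {layer!r}")
    return DEFAULT_FOUNDRIES[layer], layer


print(resolve("pos"))            # no foundry given, default applies
print(resolve("pos", "marmot"))  # explicit foundry is kept
```

Layers without an entry in the table raise an error in this sketch, so a missing default is reported instead of being guessed.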