KorAP: Annis QL
ANNIS Query Language (Annis QL or AQL) is the query language of the ANNIS corpus search system. It is designed for complex linguistic corpora with multiple annotation layers (e.g. morphology) and various annotation types (e.g. attribute-value pairs, relations). An AQL query describes node elements and the edges between them, where a node element can be a token or an attribute-value pair.
KorAP supports the following keywords by using the default foundries:
node
- a node element
tok
- a token
cat or c
- a constituent
lemma or l
- a lemma annotated node
pos or p
- a part-of-speech annotated node
m
- a morphologically annotated node
KorAP does not yet support in-query metadata constraints in AQL, i.e. the prefix "meta::". In KorAP, metadata constraints should be kept separate from search queries and given as corpus queries that define virtual corpora.
Node elements
Simple tokens
"liebe"
Attribute-value pairs
tok="liebe"
Namespaces in AQL are realized as foundry and layer combinations in KorAP. They can be used to query tokens having a specific layer annotated by a specific parser (foundry), for example coordinating conjunctions (part-of-speech layer) from the TreeTagger foundry.
tt/p="KON"
Regular expressions
tok =/m.*keit/
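As a rough illustration of what the regular expression query above selects, the following Python sketch filters a small hand-made token list. It assumes that the pattern has to match the whole token, as it does in ANNIS; whether KorAP anchors patterns in exactly the same way is an assumption here. The token list and the helper name match_tokens are invented for this example and are not part of KorAP.

```python
import re

def match_tokens(pattern, tokens):
    """Return the tokens that the pattern matches in full (assumed anchoring)."""
    compiled = re.compile(pattern)
    return [t for t in tokens if compiled.fullmatch(t)]

# Hypothetical sample tokens, not taken from any KorAP corpus.
tokens = ["möglichkeit", "menschlichkeit", "keit", "mehrheitlich", "Müdigkeit"]

# Counterpart of the AQL query: tok =/m.*keit/
print(match_tokens(r"m.*keit", tokens))
```

Under this assumption, "keit" is rejected because it does not start with "m", and "Müdigkeit" is rejected because the match is case-sensitive.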
Sequence queries
Two consecutive tokens
"der"."Bär"
Finite verbs indirectly followed by an adverb, where any number of tokens may occur in between.
pos="VVFIN" .* pos="ADV"
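The difference between the two sequence operators can be sketched in Python on a toy sentence. This is only a model of the semantics described above, not of how KorAP evaluates queries; the sentence, its tags, and the helper names direct_pairs and indirect_pairs are made up for illustration.

```python
def direct_pairs(tokens, left, right):
    """Index pairs (i, i + 1) where the operands are adjacent (operator '.')."""
    return [(i, i + 1) for i in range(len(tokens) - 1)
            if left(tokens[i]) and right(tokens[i + 1])]

def indirect_pairs(tokens, left, right):
    """Index pairs (i, j), i < j, with any number of tokens in between (operator '.*')."""
    return [(i, j)
            for i in range(len(tokens)) if left(tokens[i])
            for j in range(i + 1, len(tokens)) if right(tokens[j])]

# Hypothetical annotated sentence: (surface form, part-of-speech tag).
sentence = [("der", "ART"), ("Bär", "NN"), ("schläft", "VVFIN"),
            ("heute", "ADV"), ("sehr", "ADV"), ("oft", "ADV")]

# Counterpart of: "der"."Bär"
print(direct_pairs(sentence, lambda t: t[0] == "der", lambda t: t[0] == "Bär"))

# Counterpart of: pos="VVFIN" .* pos="ADV"
print(indirect_pairs(sentence, lambda t: t[1] == "VVFIN", lambda t: t[1] == "ADV"))
```

The direct query yields a single pair, while the indirect query pairs the finite verb with each of the three adverbs that follow it.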
Negation
In KorAP, negation (for example of a token) is only supported within a sequence query.
"Katze" . pos != "VVFIN"
Pointing relations
Pointing relations describe direct relationships between two node elements, for instance dependency relations.
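One way to picture a pointing relation is as a labelled edge between two token positions. The Python sketch below models this on a toy sentence and filters edges by label, source, and target, in the spirit of the queries in this section. The Edge type, the sample edges, and the helper find_relations are hypothetical and do not reflect KorAP's internal data model.

```python
from typing import NamedTuple

class Edge(NamedTuple):
    source: int  # index of the source token
    target: int  # index of the target token
    func: str    # relation label, e.g. "SUBJ"

# Hypothetical sentence, (surface form, part-of-speech tag), and its edges.
tokens = [("ich", "PPER"), ("habe", "VAFIN"), ("geschlafen", "VVPP")]
edges = [Edge(0, 2, "SUBJ"), Edge(1, 2, "AUX")]

def find_relations(tokens, edges, func, source=None, target=None):
    """Return edges labelled func whose end points pass the optional predicates."""
    return [e for e in edges
            if e.func == func
            and (source is None or source(tokens[e.source]))
            and (target is None or target(tokens[e.target]))]

# Any "SUBJ" relation between two arbitrary nodes.
print(find_relations(tokens, edges, "SUBJ"))

# "SUBJ" relations from the token "ich" to a perfect participle.
print(find_relations(tokens, edges, "SUBJ",
                     source=lambda t: t[0] == "ich",
                     target=lambda t: t[1] == "VVPP"))
```

Leaving a predicate unset plays the role of the unconstrained node operand; setting it narrows the source or target in the same way a token or attribute-value operand does.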
Querying all "SUBJ" dependency relations
node ->malt/d[func="SUBJ"] node
Querying "SUBJ" dependency relations where the source node is token "ich"
"ich" ->malt/d[func="SUBJ"] node
Querying "SUBJ" dependency relations where the source node is token "ich" and the target node is a perfect participle
"ich" ->malt/d[func="SUBJ"] pos="VVPP"
Using references
Node elements may be referred to by using # and the position number of the element. For instance,
"ich" & pos="VVPP" & #1 ->malt/d[func="SUBJ"] #2
"ich" & pos="VVPP" & #1 . #2
Unary operators like arity or tokenarity are not yet implemented in KorAP.
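To make the # references concrete, the following Python sketch substitutes each reference with the node element declared at that position. It is a deliberately naive illustration: it splits the query on "&" and would break on values that themselves contain "&". The function resolve_references is invented for this example and is not a KorAP API.

```python
import re

def resolve_references(query):
    """Replace each #n in the constraints with the n-th declared node element."""
    terms = [t.strip() for t in query.split("&")]
    # Terms that do not start with '#' are treated as node declarations.
    nodes = [t for t in terms if not t.startswith("#")]
    constraints = [t for t in terms if t.startswith("#")]

    def lookup(match):
        # Position numbers are 1-based.
        return nodes[int(match.group(1)) - 1]

    return [re.sub(r"#(\d+)", lookup, c) for c in constraints]

print(resolve_references('"ich" & pos="VVPP" & #1 ->malt/d[func="SUBJ"] #2'))
print(resolve_references('"ich" & pos="VVPP" & #1 . #2'))
```

Read this way, the first referenced query corresponds to the pointing relation query with "ich" as source and a perfect participle as target, and the second to a plain sequence of the two node elements.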