KorAP: FCS-QL
FCS-QL is a query language developed to accommodate advanced search in the CLARIN Federated Content Search (FCS), which allows searching through annotated data. Accordingly, FCS-QL is primarily intended to represent queries involving annotation layers such as part-of-speech and lemma. The FCS-QL grammar closely resembles that of Poliqarp, since it is heavily based on Poliqarp/CQP.
In FCS-QL, foundries are called qualifiers. A qualifier and a layer are joined by a colon; for example, the lemma layer of TreeTagger is written as tt:lemma. KorAP supports the following annotation layers for FCS-QL:
- text: surface text
- lemma: lemmatisation
- pos: part-of-speech
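The qualifier:layer naming scheme can be illustrated with a short Python sketch. The helper `split_layer` and the set `SUPPORTED_LAYERS` are hypothetical names introduced here for illustration; they are not part of KorAP.

```python
# Hypothetical helper, not part of KorAP: splits an FCS-QL layer
# reference such as "tt:lemma" into its qualifier and its layer.
SUPPORTED_LAYERS = {"text", "lemma", "pos"}  # the layers listed above


def split_layer(reference):
    """Return (qualifier, layer); qualifier is None when none is given."""
    qualifier, _, layer = reference.rpartition(":")
    if layer not in SUPPORTED_LAYERS:
        raise ValueError(f"unsupported layer: {layer!r}")
    return (qualifier or None), layer


print(split_layer("tt:lemma"))  # ('tt', 'lemma')
print(split_layer("pos"))       # (None, 'pos')
```

A reference without a qualifier, such as pos, yields `None` for the qualifier, which corresponds to using the default foundry.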
Simple queries
Querying simple terms
"Semmel"
Querying regular expressions
"gie(ss|ß)en"
Querying case-insensitive terms
"essen"/c
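To get a feel for what these three simple queries accept, they can be mimicked with Python's `re` module. This is only an approximation: it assumes that a term has to match a whole token, and KorAP's regular expression dialect is not necessarily identical to Python's. The function `matches` is a hypothetical helper, not a KorAP function.

```python
import re


def matches(pattern, token, ignore_case=False):
    """Approximate an FCS-QL term: the pattern must cover the whole token.

    ignore_case=True plays the role of the /c flag.
    """
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(pattern, token, flags) is not None


print(matches("Semmel", "Semmel"))                  # True
print(matches("gie(ss|ß)en", "gießen"))             # True
print(matches("gie(ss|ß)en", "giessen"))            # True
print(matches("essen", "Essen"))                    # False
print(matches("essen", "Essen", ignore_case=True))  # True
```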
Complex queries
Querying using layers
Querying a simple term using the layer for surface text
[text = "Semmel"]
[text = "essen"/c]
Querying adverbs from the default foundry
[pos="ADV"]
Querying using qualifiers (foundries)
Querying adverbs annotated by OpenNLP
[opennlp:pos="ADV"]
Querying tokens with a lemma from TreeTagger
[tt:lemma = "leben"]
Querying using boolean operators
All tokens with the lemma "leben" that are also finite verbs
[tt:lemma ="leben" & pos="VVFIN"]
All tokens with the lemma "leben" that are also finite verbs or perfect participles
[tt:lemma ="leben" & (pos="VVFIN" | pos="VVPP")]
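When token expressions like these are assembled programmatically, a few small string helpers keep brackets and operators consistent. The following Python sketch rebuilds the last query; `term`, `any_of`, `all_of` and `token` are hypothetical helpers rather than a KorAP API, and they do not escape quotes inside values.

```python
# Hypothetical string builders for FCS-QL token expressions.

def term(layer, value, qualifier=None):
    """Build one comparison, e.g. tt:lemma = "leben"."""
    name = f"{qualifier}:{layer}" if qualifier else layer
    return f'{name} = "{value}"'


def any_of(*exprs):
    """Join alternatives with | and wrap them in parentheses."""
    return "(" + " | ".join(exprs) + ")"


def all_of(*exprs):
    """Join conditions with & (no parentheses are added)."""
    return " & ".join(exprs)


def token(expr):
    """Wrap an expression in the square brackets of a token."""
    return f"[{expr}]"


query = token(
    all_of(
        term("lemma", "leben", qualifier="tt"),
        any_of(term("pos", "VVFIN"), term("pos", "VVPP")),
    )
)
print(query)
# [tt:lemma = "leben" & (pos = "VVFIN" | pos = "VVPP")]
```

Wrapping the alternatives in parentheses inside `any_of` keeps the grouping explicit, so the result does not depend on how & and | are prioritised.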
Sequence queries
Combining two terms in a sequence query
[opennlp:pos="ADJA"] "leben"
Empty token
As in Poliqarp, an empty token is written as [] and matches any token. Because it would return an excessive number of results, the empty token cannot be used on its own, only in combination with other tokens, for instance in a sequence query.
[] "Wolke"
Negation
Like the empty token, negation cannot be used on its own because it would return an excessive number of results. It can, however, be used in a sequence query.
[pos != "ADJA"] "Buch"
Querying using quantifiers
Quantifiers indicate repetition of a term; for instance, they can be used to search for exactly two consecutive occurrences of "die".
"die" {2}
Quantifiers are also useful for finding any tokens that occur near specific tokens, for instance two to three occurrences of any token between "wir" and "leben".
"wir" []{2,3} "leben"
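What the query "wir" []{2,3} "leben" asks for can be spelled out procedurally. The Python sketch below scans a list of tokens for `first`, followed by between `min_gap` and `max_gap` arbitrary tokens, followed by `last`. It only illustrates the semantics and says nothing about how KorAP evaluates queries; `find_gap_matches` is a hypothetical name, and exact, case-sensitive token comparison is assumed.

```python
def find_gap_matches(tokens, first, last, min_gap, max_gap):
    """Return (start, end) index pairs where `first` and `last` are
    separated by min_gap..max_gap arbitrary tokens (inclusive)."""
    matches = []
    for start, tok in enumerate(tokens):
        if tok != first:
            continue
        for gap in range(min_gap, max_gap + 1):
            end = start + gap + 1  # index of the token after the gap
            if end < len(tokens) and tokens[end] == last:
                matches.append((start, end))
    return matches


# Three tokens lie between "wir" and "leben": a match.
print(find_gap_matches("wir wollen hier gut leben".split(),
                       "wir", "leben", 2, 3))  # [(0, 4)]

# Only one token lies in between: no match.
print(find_gap_matches("wir alle leben".split(),
                       "wir", "leben", 2, 3))  # []
```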
Querying a term within a sentence
"Boot" within s