
KorAP: FCS-QL

FCS-QL is a query language developed to support advanced search in the CLARIN Federated Content Search (FCS), which allows searching through annotated data. Accordingly, FCS-QL is primarily intended to represent queries involving annotation layers such as part-of-speech and lemma. Its grammar closely resembles that of Poliqarp because it was largely based on Poliqarp/CQP.

In FCS-QL, foundries are called qualifiers. A qualifier and a layer are joined with a colon; for example, the lemma layer of TreeTagger is written as tt:lemma. KorAP supports the following annotation layers for FCS-QL:

text: surface text
lemma: lemmatisation
pos: part-of-speech
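
The qualifier notation can be assembled mechanically. The following Python sketch builds a single-token expression from a layer, a value, and an optional qualifier. The name `token_expr` and its parameters are hypothetical helpers for illustration only; they are not part of KorAP or of any FCS library, and the sketch does not escape quotation marks inside the value.

```python
from typing import Optional

# Annotation layers listed above for FCS-QL in KorAP.
LAYERS = {"text", "lemma", "pos"}


def token_expr(layer: str, value: str, qualifier: Optional[str] = None) -> str:
    """Build an FCS-QL token expression (hypothetical helper)."""
    if layer not in LAYERS:
        raise ValueError(f"unsupported layer: {layer}")
    # A qualifier (foundry) and a layer are joined with a colon.
    attribute = f"{qualifier}:{layer}" if qualifier else layer
    return f'[{attribute} = "{value}"]'


print(token_expr("lemma", "leben", qualifier="tt"))
print(token_expr("pos", "ADV"))
```

Omitting the qualifier leaves the choice of foundry to the search engine's default.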

Simple queries

Querying simple terms

"Semmel"

Querying regular expressions

"gie(ss|ß)en"

Querying case-insensitive terms

"essen"/c
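
The regular-expression and case-insensitive forms above can be mimicked locally to check which tokens a pattern would accept. This is only a rough sketch: it uses Python's `re` module as a stand-in, assumes that an expression must match the whole token (as in CQP-style languages), and the function `term_matches` is a hypothetical name. KorAP's own regular-expression dialect may differ in details.

```python
import re


def term_matches(pattern: str, token: str, ignore_case: bool = False) -> bool:
    """Approximate a simple FCS-QL term query against one token."""
    # ignore_case=True plays the role of the /c flag.
    flags = re.IGNORECASE if ignore_case else 0
    # fullmatch: the pattern must cover the entire token, not a substring.
    return re.fullmatch(pattern, token, flags) is not None


for word in ["gießen", "giessen", "giesen"]:
    print(word, term_matches("gie(ss|ß)en", word))

for word in ["essen", "Essen", "fressen"]:
    print(word, term_matches("essen", word, ignore_case=True))
```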

Complex queries

Querying using layers

Querying a simple term using the layer for surface text

[text = "Semmel"]
[text = "essen"/c]

Querying adverbs from the default foundry

[pos="ADV"]

Querying using qualifiers (foundries)

Querying adverbs annotated by OpenNLP

[opennlp:pos="ADV"]

Querying tokens with a lemma from TreeTagger

[tt:lemma = "leben"]

Querying using boolean operators

All tokens with lemma "leben" which are also finite verbs

[tt:lemma ="leben" & pos="VVFIN"]

All tokens with lemma "leben" which are also either finite verbs or perfect participles

[tt:lemma ="leben" & (pos="VVFIN" | pos="VVPP")]
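
The meaning of the boolean operators can be illustrated with a toy model in Python. Here a token is assumed to be a dictionary from attribute names to annotation values; this representation, the sample tokens, and the function `is_match` are made up for illustration and do not reflect how KorAP stores annotations.

```python
from typing import Dict, List


def is_match(token: Dict[str, str]) -> bool:
    """Toy equivalent of [tt:lemma = "leben" & (pos = "VVFIN" | pos = "VVPP")]."""
    # & corresponds to `and`; | corresponds to a choice between the two tags.
    return token.get("tt:lemma") == "leben" and (
        token.get("pos") == "VVFIN" or token.get("pos") == "VVPP"
    )


# Hypothetical annotated tokens.
tokens: List[Dict[str, str]] = [
    {"text": "lebt", "tt:lemma": "leben", "pos": "VVFIN"},
    {"text": "gelebt", "tt:lemma": "leben", "pos": "VVPP"},
    {"text": "leben", "tt:lemma": "leben", "pos": "VVINF"},
    {"text": "Leben", "tt:lemma": "Leben", "pos": "NN"},
]

print([t["text"] for t in tokens if is_match(t)])
```

Only the finite verb and the participle pass both conditions; the infinitive shares the lemma but fails the part-of-speech test.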

Sequence queries

Combining two terms in a sequence query

[opennlp:pos="ADJA"] "leben"

Empty token

As in Poliqarp, an empty token is written as [] and matches any token. Because it would produce an excessive number of results, the empty token cannot be used on its own but only in combination with other tokens, for instance in a sequence query.

[] "Wolke"

Negation

Similar to the empty token, negation is not allowed to be used independently due to the excessive number of results. However, it can be used in a sequence query.

[pos != "ADJA"] "Buch"
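
As a rough illustration of how negation behaves inside a sequence, the following Python sketch scans adjacent token pairs. Tokens are modelled as (surface text, part-of-speech) tuples, and both this model and the function `non_adja_before` are assumptions made for the example, not KorAP behaviour.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (surface text, part-of-speech tag)


def non_adja_before(tokens: List[Token], word: str) -> List[Tuple[str, str]]:
    """Toy equivalent of [pos != "ADJA"] "<word>" over adjacent tokens."""
    hits = []
    # Pair every token with its immediate successor.
    for prev, cur in zip(tokens, tokens[1:]):
        if prev[1] != "ADJA" and cur[0] == word:
            hits.append((prev[0], cur[0]))
    return hits


# Hypothetical tagged tokens.
tagged = [
    ("das", "ART"), ("Buch", "NN"), ("und", "KON"),
    ("ein", "ART"), ("gutes", "ADJA"), ("Buch", "NN"),
]
print(non_adja_before(tagged, "Buch"))
```

The second "Buch" is skipped because the token before it is tagged ADJA.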

Querying using quantifiers

Quantifiers indicate repetition of a term. For instance, they can be used to search for exactly two consecutive occurrences of "die".

"die" {2}

Quantifiers can also be applied to the empty token to search for specific tokens separated by a given number of arbitrary tokens, for instance two to three tokens of any kind between "wir" and "leben".

"wir" []{2,3} "leben"
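
The effect of a quantified empty token can be sketched in Python as a search for two words separated by a bounded number of arbitrary tokens. The function `find_with_gap` and the sample sentence are hypothetical, and the comparison is a plain case-sensitive string match rather than a full FCS-QL evaluation.

```python
from typing import List, Tuple


def find_with_gap(tokens: List[str], first: str, last: str,
                  min_gap: int, max_gap: int) -> List[Tuple[int, int]]:
    """Toy equivalent of "<first>" []{min_gap,max_gap} "<last>".

    Returns (start, end) index pairs of the matching spans.
    """
    hits = []
    for i, tok in enumerate(tokens):
        if tok != first:
            continue
        # Try every allowed number of intervening tokens.
        for gap in range(min_gap, max_gap + 1):
            j = i + gap + 1
            if j < len(tokens) and tokens[j] == last:
                hits.append((i, j))
    return hits


sentence = "wir wollen in Frieden leben und wir können hier gut leben".split()
print(find_with_gap(sentence, "wir", "leben", 2, 3))
```

In this sample both occurrences of "wir" are followed by "leben" after exactly three intervening tokens, so both spans are reported.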

Querying a term within a sentence

"Boot" within s