
KorAP: FCSQL

FCS-QL is a query language developed to accommodate advanced search in the CLARIN Federated Content Search (FCS), which allows searching through annotated data. Accordingly, FCS-QL is primarily intended to represent queries involving annotation layers such as part-of-speech and lemma. The FCS-QL grammar is fairly similar to Poliqarp because it is based heavily on Poliqarp/CQP.

In FCS-QL, foundries are called qualifiers. A qualifier (foundry) and a layer are joined with a colon; for example, the lemma layer of TreeTagger is written as tt:lemma. KorAP supports the following annotation layers for FCS-QL:

text: surface text
lemma: lemmatisation
pos: part-of-speech
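
As a rough illustration of the qualifier:layer notation, the following Python sketch assembles single-token expressions as strings. The helper name term and its parameters are hypothetical and not part of KorAP or FCS; the value is inserted as-is, without any escaping.

```python
def term(layer, value, qualifier=None, flags=""):
    """Build one FCS-QL token expression, e.g. [tt:lemma="leben"].

    Hypothetical helper for illustration only: `value` is copied
    verbatim, so the caller is responsible for any escaping.
    """
    # Prefix the layer with the qualifier (foundry) when one is given.
    name = f"{qualifier}:{layer}" if qualifier else layer
    # Flags such as "c" (case-insensitive) follow the quoted value.
    suffix = f"/{flags}" if flags else ""
    return f'[{name}="{value}"{suffix}]'


print(term("lemma", "leben", qualifier="tt"))  # [tt:lemma="leben"]
print(term("pos", "ADV"))                      # [pos="ADV"]
print(term("text", "essen", flags="c"))        # [text="essen"/c]
```

Omitting the qualifier leaves the choice of foundry to the default, as in the unqualified pos queries in this document.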

Simple queries

Querying simple terms

"Semmel"

Querying regular expressions

"gie(ss|ß)en"

Querying case-insensitive terms

"essen"/c

Complex queries

Querying using layers

Querying a simple term using the layer for surface text

[text = "Semmel"]
[text = "essen"/c]
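
The regular-expression and /c queries shown so far can be approximated with Python's re module. This is only an analogy under two assumptions that are not stated in this document: that a pattern must match the whole token, and that Python's regex syntax is close enough to the one used by FCS-QL for these simple patterns. The function name token_matches is hypothetical.

```python
import re


def token_matches(pattern, token, ignore_case=False):
    """Approximate how a quoted FCS-QL term is tested against one token.

    Assumption: the pattern has to cover the whole token, so
    re.fullmatch is used rather than re.search.
    """
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(pattern, token, flags) is not None


# "gie(ss|ß)en" accepts both spellings of the verb.
print(token_matches("gie(ss|ß)en", "gießen"))   # True
print(token_matches("gie(ss|ß)en", "giessen"))  # True

# "essen" is case-sensitive; "essen"/c is not.
print(token_matches("essen", "Essen"))                    # False
print(token_matches("essen", "Essen", ignore_case=True))  # True
```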

Querying adverbs from the default foundry

[pos="ADV"]

Querying using qualifiers (foundries)

Querying adverbs annotated by OpenNLP

[opennlp:pos="ADV"]

Querying tokens with a lemma from TreeTagger

[tt:lemma = "leben"]

Querying using boolean operators

All tokens with lemma "leben" which are also finite verbs

[tt:lemma ="leben" & pos="VVFIN"]

All tokens with lemma "leben" which are also finite verbs or perfect participles

[tt:lemma ="leben" & (pos="VVFIN" | pos="VVPP")]
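
To make the grouping of & and | concrete, the query above can be read as a predicate over a single token. The sketch below is a toy model, not how KorAP evaluates queries: tokens are plain dictionaries, and the sample annotations are made up for illustration.

```python
def matches(token):
    """Mirror [tt:lemma="leben" & (pos="VVFIN" | pos="VVPP")] for one token."""
    return token["tt:lemma"] == "leben" and (
        token["pos"] == "VVFIN" or token["pos"] == "VVPP"
    )


# Made-up sample tokens; the annotation values are illustrative only.
tokens = [
    {"text": "lebt", "tt:lemma": "leben", "pos": "VVFIN"},
    {"text": "gelebt", "tt:lemma": "leben", "pos": "VVPP"},
    {"text": "leben", "tt:lemma": "leben", "pos": "VVINF"},
    {"text": "geht", "tt:lemma": "gehen", "pos": "VVFIN"},
]

hits = [t["text"] for t in tokens if matches(t)]
print(hits)  # ['lebt', 'gelebt']
```

The parentheses matter: both conditions joined by & must hold for the same token, while | only requires one of the alternatives inside the group.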

Sequence queries

Combining two terms in a sequence query

[opennlp:pos="ADJA"] "leben"

Empty token

As in Poliqarp, an empty token is written as [] and matches any token. Because it would return an excessive number of results, an empty token cannot be used on its own; it must be combined with other tokens, for instance in a sequence query.

[] "Wolke"

Negation

Like the empty token, negation cannot be used on its own because it would return an excessive number of results. However, it can be used in a sequence query.

[pos != "ADJA"] "Buch"

Querying using quantifiers

Quantifiers indicate repetition of a term. For instance, they can be used to search for exactly two consecutive occurrences of "die".

"die" {2}

Quantifiers are also useful for finding arbitrary tokens near specific tokens, for instance two to three occurrences of any token between "wir" and "leben".

"wir" []{2,3} "leben"

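The effect of quantifiers in a sequence can be sketched with a small backtracking matcher over a list of surface tokens. This is a toy model under simplifying assumptions (plain string comparison, no annotation layers), not KorAP's implementation; the names word, any_token, match_at and find_all are hypothetical.

```python
def word(form, low=1, high=1):
    """Pattern element: the surface form `form`, repeated low..high times."""
    return (lambda t: t == form, low, high)


def any_token(low=1, high=1):
    """Pattern element: the empty token [], repeated low..high times."""
    return (lambda t: True, low, high)


def match_at(tokens, start, pattern):
    """Return the end index of a match starting at `start`, or None."""
    if not pattern:
        return start
    (predicate, low, high), rest = pattern[0], pattern[1:]
    pos, count = start, 0
    # Consume the mandatory minimum number of repetitions.
    while count < low:
        if pos >= len(tokens) or not predicate(tokens[pos]):
            return None
        pos += 1
        count += 1
    # Try the rest of the pattern, adding optional repetitions one by one.
    while True:
        end = match_at(tokens, pos, rest)
        if end is not None:
            return end
        if count >= high or pos >= len(tokens) or not predicate(tokens[pos]):
            return None
        pos += 1
        count += 1


def find_all(tokens, pattern):
    """Return (start, end) spans for every start position that matches."""
    spans = []
    for start in range(len(tokens)):
        end = match_at(tokens, start, pattern)
        if end is not None:
            spans.append((start, end))
    return spans


# "wir" []{2,3} "leben"
pattern = [word("wir"), any_token(2, 3), word("leben")]
tokens = "wir wollen gut leben und wir leben".split()
print(find_all(tokens, pattern))  # [(0, 4)]

# "die" {2}
print(find_all("die die Katze".split(), [word("die", 2, 2)]))  # [(0, 2)]
```

In the first example, the second "wir" does not start a match because no tokens stand between it and the following "leben", while the quantifier requires at least two.
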
Querying a term within a sentence

"Boot" within s