Blame - templates/doc/ql/fcsql.html.ep - KorAP/Kalamar

blob: f7e50c182ab677cd803893814762668f2e59ecb3 [file] [log] [blame]

Rebecca Wilm	4ce37af	2024-09-17 12:06:42 +0200	[diff] [blame]	1	% layout 'main', title => 'KorAP: FCS-QL';
Akron	ff7811f	2017-12-19 12:40:41 +0100	[diff] [blame]	2
Akron	9490e3b	2019-10-17 12:26:29 +0200	[diff] [blame]	3	%= page_title
Akron	ff7811f	2017-12-19 12:40:41 +0100	[diff] [blame]	4
margaretha	14ce4d6	2019-07-17 18:38:45 +0200	[diff] [blame]	5	<p>FCS-QL is a query language specifically developed to accommodate advanced search in
Rebecca Wilm	4ce37af	2024-09-17 12:06:42 +0200	[diff] [blame]	6	<%= ext_link_to 'CLARIN Federated Content Search (FCS)', "https://www.clarin.eu/content/federated-content-search-clarin-fcs" %>
margaretha	14ce4d6	2019-07-17 18:38:45 +0200	[diff] [blame]	7	that allows searching through annotated data.
				8	Accordingly, FCS-QL is primarily intended to represent queries involving annotation layers
Akron	3cfa26d	2019-10-24 15:17:34 +0200	[diff] [blame]	9	such as part-of-speech and lemma. The FCS-QL grammar is fairly similar to <%= embedded_link_to 'doc', 'Poliqarp', 'ql', 'poliqarp-plus' %> because it is
margaretha	14ce4d6	2019-07-17 18:38:45 +0200	[diff] [blame]	10	largely based on Poliqarp/CQP.</p>
				11
				12	<p>In FCS-QL, foundries are called qualifiers. A foundry and a layer are combined
Rebecca Wilm	4ce37af	2024-09-17 12:06:42 +0200	[diff] [blame]	13	with a colon; for example, the lemma layer of TreeTagger is represented as
margaretha	14ce4d6	2019-07-17 18:38:45 +0200	[diff] [blame]	14	<code>tt:lemma</code>. KorAP supports the following annotation layers for FCS-QL:</p>
				15
				16	<dl>
				17	<dt>text</dt>
				18	<dd>surface text</dd>
				19	<dt>lemma</dt>
				20	<dd>lemmatisation</dd>
				21	<dt>pos</dt>
				22	<dd>part-of-speech</dd>
				23	</dl>
				24
				25	<section id="simple-queries">
				26	<h3>Simple queries</h3>
				27	<p>Querying simple terms</p>
				28	%= doc_query fcsql => '"Semmel"', cutoff => 1
				29
				30	<p>Querying regular expressions</p>
				31	%= doc_query fcsql => '"gie(ss\|ß)en"', cutoff => 1
				32
				33	<p>Querying case-insensitive terms</p>
				34	%= doc_query fcsql => '"essen"/c', cutoff => 1
				35	</section>
				36
				37	<section id="complex-queries">
				38	<h3>Complex queries</h3>
				39
				40	<h4>Querying using layers</h4>
				41
				42	<p>Querying a simple term using the layer for surface text</p>
				43	%= doc_query fcsql => '[text = "Semmel"]', cutoff => 1
				44	%= doc_query fcsql => '[text = "essen"/c]', cutoff => 1
				45
Akron	3cfa26d	2019-10-24 15:17:34 +0200	[diff] [blame]	46	<p>Querying adverbs from the <%= embedded_link_to 'doc', 'default foundry', 'data', 'annotation' %>.</p>
margaretha	14ce4d6	2019-07-17 18:38:45 +0200	[diff] [blame]	47	%= doc_query fcsql => '[pos="ADV"]', cutoff => 1
				48
				49
				50	<h4>Querying using qualifiers (foundries)</h4>
				51
Rebecca Wilm	4ce37af	2024-09-17 12:06:42 +0200	[diff] [blame]	52	<p>Querying adverbs annotated by OpenNLP</p>
margaretha	14ce4d6	2019-07-17 18:38:45 +0200	[diff] [blame]	53	%= doc_query fcsql => '[opennlp:pos="ADV"]', cutoff => 1
				54
Rebecca Wilm	4ce37af	2024-09-17 12:06:42 +0200	[diff] [blame]	55	<p>Querying tokens with a lemma from TreeTagger</p>
margaretha	14ce4d6	2019-07-17 18:38:45 +0200	[diff] [blame]	56	%= doc_query fcsql => '[tt:lemma = "leben"]', cutoff => 1
				57
				58
				59	<h4>Querying using boolean operators</h4>
				60
				61	<p>All tokens with lemma <code>"leben"</code> which are also finite verbs</p>
				62	%= doc_query fcsql => '[tt:lemma ="leben" & pos="VVFIN"]', cutoff => 1
				63
				64	<p>All tokens with lemma <code>"leben"</code> which are also finite verbs or perfect participles</p>
				65	%= doc_query fcsql => '[tt:lemma ="leben" & (pos="VVFIN" \| pos="VVPP")]', cutoff => 1
				66
				67
				68	<h4>Sequence queries</h4>
				69
				70	<p>Combining two terms in a sequence query</p>
				71	%= doc_query fcsql => '[opennlp:pos="ADJA"] "leben"', cutoff => 1
				72
				73
				74	<h4>Empty token</h4>
Rebecca Wilm	4ce37af	2024-09-17 12:06:42 +0200	[diff] [blame]	75	<p>As in <%= embedded_link_to 'doc', 'Poliqarp', 'ql', 'poliqarp-plus' %>, an empty token is written as <code>[]</code>
margaretha	14ce4d6	2019-07-17 18:38:45 +0200	[diff] [blame]	76	and matches any token. Because it would yield an
Rebecca Wilm	4ce37af	2024-09-17 12:06:42 +0200	[diff] [blame]	77	excessive number of results, the empty token cannot be used on its own but only in
margaretha	14ce4d6	2019-07-17 18:38:45 +0200	[diff] [blame]	78	combination with other tokens, for instance in a sequence query.</p>
				79	%= doc_query fcsql => '[] "Wolke"', cutoff => 1
				80
				81
				82	<h4>Negation</h4>
Rebecca Wilm	4ce37af	2024-09-17 12:06:42 +0200	[diff] [blame]	83	<p>Like the empty token, negation cannot be used on its own because it would yield an
margaretha	14ce4d6	2019-07-17 18:38:45 +0200	[diff] [blame]	84	excessive number of results. However, it can be used in a sequence query.</p>
				85	%= doc_query fcsql => '[pos != "ADJA"] "Buch"', cutoff => 1
				86
				87
				88	<h4>Querying using quantifiers</h4>
				89	<p>Quantifiers indicate repetition of a term; for instance, they can be used to search for
				90	exactly two consecutive occurrences of <code>"die"</code>.</p>
				91	%= doc_query fcsql => '"die" {2}', cutoff => 1
				92
				93	<p>Quantifiers are also useful to search for the occurrences of any tokens near other
				94	specific tokens, for instance two to three occurrences of any token between <code>"wir"</code> and
				95	<code>"leben"</code>.</p>
				96	%= doc_query fcsql => '"wir" []{2,3} "leben"', cutoff => 1
				97
				98
				99	<h4>Querying a term within a sentence</h4>
				100	%= doc_query fcsql => '"Boot" within s', cutoff => 1
				101
				102	</section>