% layout 'main', title => 'KorAP: Annis QL';

<h2 id="tutorial-top">Annis QL</h2>

<p><%= doc_ext_link_to 'ANNIS Query Language (Annis QL or AQL)', "https://corpus-tools.org/annis/aql.html" %>
  is the query language of the <%= doc_ext_link_to 'ANNIS corpus search system', "https://corpus-tools.org/annis/" %>,
  designed for complex linguistic corpora with multiple
  annotation layers (e.g. morphology) and various annotation types (e.g. attribute-value
  pairs, relations). Conceptually, an AQL query searches for node elements and the edges
  between them, where a node element can be a token or an attribute-value pair.</p>

<p>KorAP supports the following keywords by using the <%= doc_link_to 'default foundries', 'data', 'annotation' %>:</p>
<dl>
  <dt><code>node</code></dt>
  <dd>a node element</dd>
  <dt><code>tok</code></dt>
  <dd>a token</dd>
  <dt><code>cat</code> or <code>c</code></dt>
  <dd>a constituent</dd>
  <dt><code>lemma</code> or <code>l</code></dt>
  <dd>a lemma-annotated node</dd>
  <dt><code>pos</code> or <code>p</code></dt>
  <dd>a part-of-speech-annotated node</dd>
  <dt><code>m</code></dt>
  <dd>a morphologically annotated node</dd>
</dl>

<blockquote class="warning">
  <p>KorAP does not yet support in-query metadata constraints in AQL, i.e. the prefix &quot;meta::&quot;. In
    KorAP, metadata constraints must be kept separate from search queries and given as corpus
    queries that define virtual corpora.</p>
</blockquote>

<section id="examples">
  <h3>Node elements</h3>

  <p>Simple tokens</p>
  %= doc_query annis => '"liebe"', cutoff => 1

  <p>Attribute-value pairs</p>
  %= doc_query annis => 'tok="liebe"', cutoff => 1

  <p>Namespaces in AQL are realized as foundry and layer combinations in KorAP. They can be used
    to query tokens having a specific layer annotated by a specific parser (foundry), for
    example coordinating conjunctions (part-of-speech layer) from the TreeTagger foundry.</p>
  %= doc_query annis => 'tt/p="KON"', cutoff => 1

  <h3>Regular expressions</h3>
  %= doc_query annis => 'tok =/m.*keit/', cutoff => 1

  <h3>Sequence queries</h3>
  <p>Two consecutive tokens</p>
  %= doc_query annis => '"der"."Bär"', cutoff => 1

  <p>Finite verbs indirectly followed by an adverb, where any number of tokens may occur in
    between.</p>
  %= doc_query annis => 'pos="VVFIN" .* pos="ADV"', cutoff => 1

  <h3>Negation</h3>
  <p>KorAP supports negation, such as negated tokens, only within sequence queries.</p>
  %= doc_query annis => '"Katze" . pos != "VVFIN"', cutoff => 1

  <h3>Pointing relations</h3>
  <p>Pointing relations describe direct relationships between two node elements, for instance
    dependency relations.</p>

  <p>Querying all <code>&quot;SUBJ&quot;</code> dependency relations</p>
  %= doc_query annis => 'node ->malt/d[func="SUBJ"] node', cutoff => 1

  <p>Querying <code>&quot;SUBJ&quot;</code> dependency relations where the source node is token <code>&quot;ich&quot;</code></p>
  %= doc_query annis => '"ich" ->malt/d[func="SUBJ"] node', cutoff => 1

  <p>Querying <code>&quot;SUBJ&quot;</code> dependency relations where the source node is token
    <code>&quot;ich&quot;</code> and the target node is a perfect participle</p>
  %= doc_query annis => '"ich" ->malt/d[func="SUBJ"] pos="VVPP"', cutoff => 1

  <h3>Using references</h3>
  <p>Node elements may be referred to by using <code>#</code> and the position number of the element,
    for instance:</p>
  %= doc_query annis => '"ich" &amp; pos="VVPP" &amp; #1 ->malt/d[func="SUBJ"] #2', cutoff => 1
  %= doc_query annis => '"ich" &amp; pos="VVPP" &amp; #1 . #2', cutoff => 1

  %# Bug in Krill
  %# <p>"ich" & pos="VVFIN" & #1 ->malt/d[func="SUBJ"] #2 & #1 . #2</p>

  <blockquote class="warning">
    <p>Unary operators like <code>arity</code> or <code>tokenarity</code> are not yet implemented in KorAP.</p>
  </blockquote>

  <!-- Not implemented in Krill yet

  <h3>Unary operators</h3>
  <dl>
    <dt>Arity</dt>
    <dd>the number of children directly dominated by a node</dd>
  </dl>
  <p>Querying adverbial phrases having exactly 2 direct children</p>
  <p>cat="AVP" & #1:arity=2</p>

  <dl>
    <dt>Tokenarity</dt>
    <dd>the number of tokens within a node</dd>
  </dl>
  <p>Querying adverbial phrases consisting of exactly 2 tokens</p>
  <p>cat="AVP" & #1:tokenarity=2</p>

  <h3>Searching within a tree</h3>
  <h4>Dominance</h4>
  <p>AQL describes hierarchical relations between nodes in a tree as a concept of dominance.
    Node A dominates node B when A is located in a higher position than node B in a tree.
    Moreover, A <strong>directly dominates</strong> B when A is located exactly above B
    without any other nodes in between.</p>

  <p>Direct dominance</p>
  <p></p>

  <p>Indirect dominance</p>
  <p></p>

  -->
</section>