% layout 'main', title => 'KorAP: Regular Expressions';

<h2 id="tutorial-top">Regular Expressions</h2>

<p>Regular expressions are patterns describing a set of strings.</p>
<p>In the KorAP backend a wide range of operators is supported, but only the following are guaranteed to be stable throughout the system:</p>

<section id="quantifiers">
  <h3>Operators</h3>
  <dl>
    <dt><code>.</code> - Any</dt>
    <dd>Match any single character</dd>
    <dt><code>()</code> - Group</dt>
    <dd>Group operands so they can be treated as a single operand</dd>
    <dt><code>|</code> - Alternation</dt>
    <dd>Match either of the alternative operands</dd>
    <dt><code>[]</code> - Character Class</dt>
    <dd>Match any one of the listed characters</dd>
    <dt><code>\</code> - Escape symbol</dt>
    <dd>Interpret the following character verbatim if it would otherwise be special (i.e. an operator or quantifier)</dd>
  </dl>

  %= doc_query poliqarp => '".eine" Frau', cutoff => 1
  %= doc_query poliqarp => '"Fr..de"', cutoff => 1
  %= doc_query poliqarp => '"Fr(ie|eu)de" []{,3} Eierkuchen', cutoff => 1
  %= doc_query poliqarp => '"Fre[um]de"', cutoff => 1
  %= doc_query poliqarp => '"b.w\."', cutoff => 1
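  <p>To get a feeling for the operators outside of KorAP, the single-token patterns from the queries above can be tried with Python's <code>re</code> module. This is only a rough sketch under assumptions: <code>token_matches</code> is a hypothetical helper and not part of KorAP, and it assumes that a quoted pattern has to cover a complete token. KorAP's own engine may differ in details.</p>

```python
import re


def token_matches(pattern, token):
    """Hypothetical helper: True if the pattern covers the whole token.

    Assumption: a quoted regular expression has to match a complete
    token, which re.fullmatch imitates here.
    """
    return re.fullmatch(pattern, token) is not None


# "." stands for exactly one arbitrary character
assert token_matches(r".eine", "keine")
assert not token_matches(r".eine", "eine")

# "()" and "|" choose between grouped alternatives
assert token_matches(r"Fr(ie|eu)de", "Friede")
assert token_matches(r"Fr(ie|eu)de", "Freude")
assert not token_matches(r"Fr(ie|eu)de", "Fremde")

# "[]" accepts one character out of the listed ones
assert token_matches(r"Fre[um]de", "Fremde")
assert not token_matches(r"Fre[um]de", "Friede")

# The escape symbol turns the special "." into a literal full stop
assert token_matches(r"b.w\.", "bzw.")
assert not token_matches(r"b.w\.", "bzwx")
```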
</section>

<section id="quantifiers">
  <h3>Quantifiers</h3>

  <p>Operands in regular expressions can be quantified,
    meaning they are allowed to occur consecutively a specified number of times.
    The following quantifiers are supported:</p>

  <dl>
    <dt><code>?</code></dt>
    <dd>Match 0 or 1 times</dd>
    <dt><code>*</code></dt>
    <dd>Match 0 or more times</dd>
    <dt><code>+</code></dt>
    <dd>Match 1 or more times</dd>
    <dt><code>{n}</code></dt>
    <dd>Match exactly <code>n</code> times</dd>
    <dt><code>{n,}</code></dt>
    <dd>Match at least <code>n</code> times</dd>
    <dt><code>{n,m}</code></dt>
    <dd>Match at least <code>n</code> times but no more than <code>m</code> times</dd>
  </dl>
  %= doc_query poliqarp => '"Schif+ahrt"', cutoff => 1
  %= doc_query poliqarp => '"kl?eine" Kinder', cutoff => 1
  %= doc_query poliqarp => '"Schlos{2,3}traße"', cutoff => 1
  %= doc_query poliqarp => '"Rha(bar){2}"', cutoff => 1
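  <p>The effect of the quantifiers can be checked in a similar way with Python's <code>re</code> module. Again, this is a sketch under assumptions rather than KorAP's implementation: <code>token_matches</code> is a hypothetical helper, the whole-token matching is an assumption, and the patterns using <code>*</code> and <code>{n,}</code> are illustrative variations of the queries above.</p>

```python
import re


def token_matches(pattern, token):
    """Hypothetical helper: True if the pattern covers the whole token.

    Assumption: a quoted regular expression has to match a complete
    token, which re.fullmatch imitates here.
    """
    return re.fullmatch(pattern, token) is not None


# "+" needs at least one "f", "*" also accepts none at all
assert token_matches(r"Schif+ahrt", "Schiffahrt")
assert not token_matches(r"Schif+ahrt", "Schiahrt")
assert token_matches(r"Schif*ahrt", "Schiahrt")

# "?" makes the preceding "l" optional
assert token_matches(r"kl?eine", "keine")
assert token_matches(r"kl?eine", "kleine")

# "{2,3}" allows two or three "s", "{2,}" two or more
assert token_matches(r"Schlos{2,3}traße", "Schlossstraße")
assert not token_matches(r"Schlos{2,3}traße", "Schlostraße")
assert token_matches(r"Schlos{2,}traße", "Schlosssstraße")

# "{2}" repeats the whole group "bar" exactly twice
assert token_matches(r"Rha(bar){2}", "Rhabarbar")
assert not token_matches(r"Rha(bar){2}", "Rhabar")
```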
</section>