| Akron | 84b9199 | 2019-07-16 11:35:49 +0200 | [diff] [blame] | 1 | % layout 'main', title => 'KorAP: COSMAS II'; | 
| Nils Diewald | a31a515 | 2015-04-17 21:05:23 +0000 | [diff] [blame] | 2 |  | 
| Akron | 9490e3b | 2019-10-17 12:26:29 +0200 | [diff] [blame] | 3 | %= page_title | 
| Nils Diewald | a31a515 | 2015-04-17 21:05:23 +0000 | [diff] [blame] | 4 |  | 
| Akron | 9490e3b | 2019-10-17 12:26:29 +0200 | [diff] [blame] | 5 | <p>The following documentation introduces some features provided by our version of the COSMAS II Query Language. For more information, please visit the <%= ext_link_to 'online help of COSMAS II', "http://www.ids-mannheim.de/cosmas2/web-app/hilfe/suchanfrage/eingabe-zeile/syntax/allgemein.html" %>.</p> | 
| Akron | 84b9199 | 2019-07-16 11:35:49 +0200 | [diff] [blame] | 6 |  | 
|  | 7 | <section id="queryterms"> | 
|  | 8 | <h3>Query Terms</h3> | 
|  | 9 |  | 
|  | 10 | <p>A query term in COSMAS II can be a word, a punctuation symbol, or a number.</p> | 
|  | 11 |  | 
|  | 12 | %= doc_query cosmas2 => 'Baum' | 
|  | 13 | %= doc_query cosmas2 => '4000' | 
|  | 14 |  | 
|  | 15 | <blockquote class="missing"> | 
|  | 16 | <p>Currently, punctuation symbols are not supported by KorAP.</p> |
|  | 17 | </blockquote> | 
|  | 18 |  | 
|  | 19 | <h4>Placeholder Operators</h4> | 
|  | 20 |  | 
|  | 21 | <p>In addition, query terms can contain one or more placeholders: <code>?</code> (exactly one arbitrary symbol), <code>+</code> (one arbitrary symbol or none), and <code>*</code> (any sequence of symbols, including the empty sequence).</p> |
|  | 22 | <%= doc_query cosmas2 => 'Bau?m' %> | 
|  | 23 | <%= doc_query cosmas2 => 'Bau+m' %> | 
|  | 24 | <%= doc_query cosmas2 => 'Bau*m' %> | 
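<p>The placeholder semantics described above can be pictured as a translation into regular expressions. The following is a minimal Python sketch under that assumption; the names <code>PLACEHOLDERS</code>, <code>placeholder_to_regex</code>, and <code>matches</code> are hypothetical and do not reflect how KorAP actually evaluates placeholders.</p>

```python
import re

# Assumed mapping from COSMAS II placeholders to regular-expression syntax,
# derived from the descriptions in the text (not KorAP's implementation).
PLACEHOLDERS = {
    "?": ".",    # exactly one arbitrary symbol
    "+": ".?",   # one arbitrary symbol or none
    "*": ".*",   # any sequence of symbols, including the empty one
}


def placeholder_to_regex(term: str) -> re.Pattern:
    """Compile a term with placeholders into a regular expression."""
    # Every non-placeholder character is escaped so it matches literally.
    parts = [PLACEHOLDERS.get(ch, re.escape(ch)) for ch in term]
    return re.compile("".join(parts))


def matches(term: str, word: str) -> bool:
    """Return True if the whole word matches the placeholder term."""
    return placeholder_to_regex(term).fullmatch(word) is not None
```

<p>Under this reading, <code>matches("Bau+m", "Baum")</code> holds because <code>+</code> may stand for no symbol, while <code>matches("Bau?m", "Baum")</code> does not, because <code>?</code> requires exactly one symbol.</p>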
|  | 25 |  | 
|  | 26 | %# TODO: | 
|  | 27 | %#  <p>To escape placeholder symbols (i.e. to prevent these symbols from being interpreted as placeholders), they need to be prepended by a <code>\</code> symbol.</p> | 
|  | 28 | %#  <%= doc_query cosmas2 => 'Student\*in' %> | 
|  | 29 | %#  <p>To escape the backslash symbol, another backslash is required (<code>\\</code>).</p> | 
|  | 30 |  | 
|  | 31 | <h4>Lemma Operator</h4> | 
|  | 32 |  | 
| Akron | 9490e3b | 2019-10-17 12:26:29 +0200 | [diff] [blame] | 33 | <p>Instead of searching for the surface form of a word, a lemma (as annotated by the <%= embedded_link_to 'default foundry', 'data', 'annotation' %>) can be requested by prepending the term with the <code>&</code> operator. The form of the lemma depends on the annotation.</p> |
| Akron | 84b9199 | 2019-07-16 11:35:49 +0200 | [diff] [blame] | 34 | <%= doc_query cosmas2 => '&laufen' %> | 
|  | 35 |  | 
|  | 36 | <h4>Case Insensitivity Operator</h4> | 
|  | 37 |  | 
|  | 38 | <p>Prepending the term with a <code>$</code> symbol makes the search case-insensitive.</p> |
|  | 39 | <%= doc_query cosmas2 => '$Lauf' %> | 
|  | 40 |  | 
|  | 41 | <h4>Regular Expression Operator</h4> | 
|  | 42 |  | 
| Akron | 9490e3b | 2019-10-17 12:26:29 +0200 | [diff] [blame] | 43 | <p>By using the <code>#REG(...)</code> operator, query terms can be formulated using <%= embedded_link_to 'regular expressions', 'ql', 'regexp' %>.</p> | 
| Akron | 84b9199 | 2019-07-16 11:35:49 +0200 | [diff] [blame] | 44 |  | 
|  | 45 |  | 
|  | 46 | <blockquote class="bug"> | 
| Akron | 9490e3b | 2019-10-17 12:26:29 +0200 | [diff] [blame] | 47 | <p>Regular expressions in COSMAS II are not yet properly implemented in KorAP. If you want to use regular expressions, please refer to <%= embedded_link_to 'Poliqarp', 'ql', 'poliqarp-plus#regexp' %>.</p> | 
| Akron | 84b9199 | 2019-07-16 11:35:49 +0200 | [diff] [blame] | 48 | </blockquote> | 
|  | 49 |  | 
|  | 50 | </section> | 
|  | 51 |  | 
|  | 52 | <section id="logical-operators"> | 
|  | 53 | <h3>Logical Operators</h3> | 
|  | 54 |  | 
|  | 55 | <p>Query terms can be combined in logical operations, using the operators <code>and</code>, <code>or</code>, and <code>not</code>. The German forms are supported as well: <code>und</code>, <code>oder</code>, and <code>nicht</code>.</p> |
|  | 56 | <p>These operators work on the text level, so the following query returns matches for all occurrences where both terms occur anywhere in the same text.</p> | 
|  | 57 | <%= doc_query cosmas2 => 'anscheinend und scheinbar' %> | 
|  | 58 |  | 
|  | 59 | <p>The following query returns matches for all occurrences where at least one of the terms occurs anywhere in the same text.</p> |
|  | 60 | <%= doc_query cosmas2 => 'anscheinend oder scheinbar' %> | 
|  | 61 |  | 
|  | 62 | <p>The following query returns matches for all occurrences of the first term, where the term following the <code>nicht</code> operator does not occur anywhere in the same text.</p> | 
|  | 63 | <%= doc_query cosmas2 => 'Kegel nicht Kind' %> | 
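<p>Because the logical operators work on the text level, they can be pictured as set operations over text identifiers. The following Python sketch assumes a hypothetical toy corpus that maps text IDs to token lists; the helper names are illustrative only, and the sketch returns the selected texts rather than the individual matches that KorAP reports.</p>

```python
from typing import Dict, List, Set

# Hypothetical corpus shape: text ID -> list of tokens in that text.
Corpus = Dict[str, List[str]]


def texts_with(corpus: Corpus, term: str) -> Set[str]:
    """IDs of all texts that contain the term at least once."""
    return {text_id for text_id, tokens in corpus.items() if term in tokens}


def und(corpus: Corpus, a: str, b: str) -> Set[str]:
    """Texts in which both terms occur anywhere."""
    return texts_with(corpus, a) & texts_with(corpus, b)


def oder(corpus: Corpus, a: str, b: str) -> Set[str]:
    """Texts in which at least one of the terms occurs."""
    return texts_with(corpus, a) | texts_with(corpus, b)


def nicht(corpus: Corpus, a: str, b: str) -> Set[str]:
    """Texts in which the first term occurs and the second does not."""
    return texts_with(corpus, a) - texts_with(corpus, b)
```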
|  | 64 |  | 
|  | 65 | <p>To escape terms that would otherwise be interpreted as logical operators, they need to be enclosed in quotation marks.</p> |
|  | 66 | <%= doc_query cosmas2 => 'Mann "und" Maus' %> | 
|  | 67 |  | 
|  | 68 | </section> | 
|  | 69 |  | 
|  | 70 |  | 
|  | 71 | <section id="distance-operators"> | 
|  | 72 | <h3>Distance Operators</h3> | 
|  | 73 |  | 
|  | 74 | <p>Distance operators allow you to search for two operands (search terms or complex search operations) that occur or don't occur at a certain distance from each other in a text. When the two operands should occur together (the operator is prepended by a <code>/</code> symbol), both operands are in the result set. When they shouldn't occur together (the operator is prepended by a <code>%</code> symbol), only the first operand is in the result set.</p> | 
|  | 75 |  | 
|  | 76 | <p>Distance operators accept a prefixing direction parameter. | 
|  | 77 | By prepending the operator with a <code>+</code> symbol (e.g. in <code>/+s0</code>), the second operand is required to occur or not occur after the first operand. | 
|  | 78 | By prepending the operator with a <code>-</code> symbol (e.g. in <code>/-s0</code>), the second operand is required to occur or not occur in front of the first operand. | 
|  | 79 | If the direction parameter is omitted, the operands may occur in either order.</p> |
|  | 80 |  | 
|  | 81 | <p>Distance operators accept the definition of a distance interval by appending numerical values. If only a single numerical value is given (e.g. in <code>/+s4</code>), it is treated as a maximum distance: both operands must (or must not) occur at a distance equal to or lower than the given value. If two numerical values are given, separated by the <code>:</code> symbol (e.g. in <code>/+s4:2</code>), they define the interval in which the distance is valid.</p> |
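<p>The parts of a distance operator described so far (inclusion or exclusion, direction, unit, and interval) can be sketched as a small parser. This Python sketch is illustrative only: the names <code>Distance</code> and <code>parse_distance</code> are hypothetical, the lower bound of 0 for a single value is an assumption, and ordering the two interval values as smaller and greater follows the COSMAS II convention rather than any confirmed KorAP behaviour.</p>

```python
import re
from typing import NamedTuple, Optional


class Distance(NamedTuple):
    exclude: bool             # True for "%", False for "/"
    direction: Optional[str]  # "+", "-", or None when omitted
    unit: str                 # "w" (word), "s" (sentence), "p" (paragraph)
    low: int                  # smallest allowed distance
    high: int                 # largest allowed distance


_OPERATOR = re.compile(r"([/%])([+-]?)([wsp])(\d+)(?::(\d+))?")


def parse_distance(op: str) -> Distance:
    """Split a distance operator such as '/+w4:3' into its components."""
    m = _OPERATOR.fullmatch(op)
    if m is None:
        raise ValueError(f"not a distance operator: {op!r}")
    kind, direction, unit, first, second = m.groups()
    if second is None:
        # A single value is a maximum; a lower bound of 0 is assumed here.
        low, high = 0, int(first)
    else:
        # Assumption: the smaller value is the minimum, the greater the maximum.
        low, high = sorted((int(first), int(second)))
    return Distance(kind == "%", direction or None, unit, low, high)
```

<p>With these assumptions, <code>parse_distance("/+w4:3")</code> yields an inclusive word distance from 3 to 4 with the second operand following the first.</p>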
|  | 82 |  | 
|  | 83 | %#  <blockquote class="warning"> | 
| Akron | 9490e3b | 2019-10-17 12:26:29 +0200 | [diff] [blame] | 84 | %#    <p>Currently, intervals are interpreted as MIN:MAX only, while COSMAS 2 defines intervals as being MAX:MIN, while taking the smaller number as being the minimum value of the interval and the greater number as being the maximum value of the interval. <%= ext_link_to 'KorAP will adopt the behaviour of COSMAS II in the near future', "https://github.com/KorAP/Koral/issues/67" %>.</p> | 
| Akron | 84b9199 | 2019-07-16 11:35:49 +0200 | [diff] [blame] | 85 | %#  </blockquote> | 
|  | 86 |  | 
| Akron | 9490e3b | 2019-10-17 12:26:29 +0200 | [diff] [blame] | 87 | <p>Distance operators rely on the <%= embedded_link_to 'default foundry', 'data', 'annotation' %> annotation for document structures.</p> | 
| Akron | 84b9199 | 2019-07-16 11:35:49 +0200 | [diff] [blame] | 88 |  | 
|  | 89 | <h4>Word Distance Operator</h4> | 
|  | 90 |  | 
|  | 91 | <p>The word distance operator <code>w</code> defines how many words are allowed or are not allowed in-between two search operands.</p> | 
|  | 92 |  | 
|  | 93 | <p>Search for two operands with up to 4 words in-between in arbitrary order:</p> | 
|  | 94 | %= doc_query cosmas2 => 'Gegenwart /w4 Zukunft' | 
|  | 95 |  | 
|  | 96 | <p>Search for two operands with 3 to 4 words in-between with the first operand preceding the second one:</p> |
|  | 97 | %= doc_query cosmas2 => 'Gegenwart /+w4:3 Zukunft' | 
|  | 98 |  | 
|  | 99 | <p>Search for two consecutive operands in the given order:</p> | 
|  | 100 | %= doc_query cosmas2 => 'Gegenwart /+w1:1 Zukunft' | 
|  | 101 |  | 
|  | 102 | <p>Search for a first operand that is neither immediately preceded nor immediately succeeded by a second operand:</p> |
|  | 103 | %= doc_query cosmas2 => 'Gegenwart %w1 die' | 
|  | 104 |  | 
|  | 105 | <h4>Sentence Distance Operator</h4> | 
|  | 106 |  | 
|  | 107 | <p>The sentence distance operator <code>s</code> defines how many sentences are allowed or are not allowed in-between two search operands.</p> | 
| Akron | 9490e3b | 2019-10-17 12:26:29 +0200 | [diff] [blame] | 108 | <p>The sentence distance relies on the <%= embedded_link_to 'default foundry', 'data', 'annotation' %> annotation for document structures.</p> | 
| Akron | 84b9199 | 2019-07-16 11:35:49 +0200 | [diff] [blame] | 109 |  | 
|  | 110 | <p>Search for two operands occurring in the same or a following sentence in arbitrary order:</p> |
|  | 111 | %= doc_query cosmas2 => 'offen /s1 Geschäft' | 
|  | 112 |  | 
|  | 113 | <p>Search for two operands occurring in the same sentence with the first operand preceding the second one:</p> |
|  | 114 | %= doc_query cosmas2 => 'offen /+s0 Geschäft' | 
|  | 115 |  | 
|  | 116 | <p>Search for a first operand that does not occur with a second operand in the same sentence:</p> | 
|  | 117 | %= doc_query cosmas2 => 'Gegenwart %s0 Zukunft' | 
|  | 118 |  | 
|  | 119 | <h4>Paragraph Distance Operator</h4> | 
|  | 120 |  | 
|  | 121 | <p>The paragraph distance operator <code>p</code> defines how many paragraphs are allowed or are not allowed in-between two search operands.</p> | 
| Akron | 9490e3b | 2019-10-17 12:26:29 +0200 | [diff] [blame] | 122 | <p>The paragraph distance relies on the <%= embedded_link_to 'default foundry', 'data', 'annotation' %> annotation for document structures.</p> | 
| Akron | 84b9199 | 2019-07-16 11:35:49 +0200 | [diff] [blame] | 123 |  | 
|  | 124 | <p>Search for two operands occurring in the same or a following paragraph in arbitrary order:</p> |
|  | 125 | %= doc_query cosmas2 => 'offen /p1 Geschäft' | 
|  | 126 |  | 
|  | 127 | <p>Search for two operands occurring in the same paragraph with the first operand preceding the second one:</p> |
|  | 128 | %= doc_query cosmas2 => 'offen /+p0 Geschäft' | 
|  | 129 |  | 
|  | 130 | <p>Search for a first operand that does not occur with a second operand in the same paragraph:</p> | 
|  | 131 | %= doc_query cosmas2 => 'Gegenwart %p0 Zukunft' | 
|  | 132 |  | 
|  | 133 | <blockquote class="warning"> | 
|  | 134 | <p>The KWIC results of queries that include paragraph distance operators will likely exceed the maximum match length supported by KorAP and will therefore be truncated.</p> |
|  | 135 | </blockquote> | 
|  | 136 |  | 
|  | 137 | <h4>Multiple Distance Operators</h4> | 
|  | 138 |  | 
| Akron | 9490e3b | 2019-10-17 12:26:29 +0200 | [diff] [blame] | 139 | %= under_construction | 
| Akron | 84b9199 | 2019-07-16 11:35:49 +0200 | [diff] [blame] | 140 |  | 
|  | 141 | <h4>Nesting of Multiple Distance Operations</h4> |
|  | 142 |  | 
|  | 143 | <p>In case a query contains multiple distance operators, they need to be nested in parentheses.</p> | 
|  | 144 | <%= doc_query cosmas2 => '(Tag /+w2 offenen) /+w1 Tür' %> | 
|  | 145 |  | 
|  | 146 | </section> | 
|  | 147 |  | 
|  | 148 | <section id="annotation-operators"> | 
|  | 149 | <h3>Annotation Operators</h3> | 
| Akron | 9490e3b | 2019-10-17 12:26:29 +0200 | [diff] [blame] | 150 | %= under_construction | 
| Akron | 84b9199 | 2019-07-16 11:35:49 +0200 | [diff] [blame] | 151 | %# MORPH and ELEM | 
|  | 152 | </section> | 
|  | 153 |  | 
|  | 154 | <section id="combination-operators"> | 
|  | 155 | <h3>Combination Operators</h3> | 
| Akron | 9490e3b | 2019-10-17 12:26:29 +0200 | [diff] [blame] | 156 | %= under_construction | 
| Akron | 84b9199 | 2019-07-16 11:35:49 +0200 | [diff] [blame] | 157 | %# IN and OV | 
|  | 158 | </section> | 
|  | 159 |  | 
|  | 160 | <section id="area-operators"> | 
|  | 161 | <h3>Area Operators</h3> | 
| Akron | 9490e3b | 2019-10-17 12:26:29 +0200 | [diff] [blame] | 162 | %= under_construction | 
| Akron | 84b9199 | 2019-07-16 11:35:49 +0200 | [diff] [blame] | 163 | %# LINKS, RECHTS, INKLUSIVE, EXKLUSIVE, BED | 
|  | 164 | </section> |