COSMAS:
- disjunctions with more than two arguments are parsed as nested binary disjunctions by the ANTLR grammar:
  "A oder B oder C" becomes (simplified) "A|(B|C)"
- using stacks has proven to be a good choice for tree processing
- queries of the form "A B C", where A, B and C are terms of type <R> (see http://www.ids-mannheim.de/cosmas2/win-app/hilfe/suchanfrage/eingabe-grafisch/syntax/ARGUMENT_R.html),
  are strictly speaking not well-formed, since there is no linking operator between the individual terms. The Cosmas II web app offers two ways of interpreting
  the missing operators: as a logical OR or as a /+w1 distance operator. In KorAP, the second alternative will be chosen, as it appears to be more intuitive (effectively,
  the query expresses a sequence of terms) and more in line with other QLs (e.g. Poliqarp)
- distance operators allow specifying a minimal and a maximal distance between their arguments, in the form "/wn:m", where n and m are integers. If only one number
  is given (as in "/w2"), Cosmas II interprets it as the maximal distance. Note that it is not possible to specify a minimal but unbounded distance between two
  tokens - for obvious reasons.
- distance operator with 'minus' direction: flip the arguments and change 'minus' to 'plus'
- should operands of distance operators be expressed via shrink/classes?
- #BEG() and #END() are handled via shrink; the shrink argument is not a class but a position (first/last) that selects the first or last word of the matched sequence
- #BED() with two conditions (like 'sa,-pa') can't be mapped to a single position group as implemented -> make two position groups and embed them in an 'and'-group
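The notes on nested disjunctions and on stacks can be illustrated together. The following is only a minimal sketch, not the KorAP implementation: the `Or` node type and the `flatten_or` helper are hypothetical names introduced here, and leaves are plain strings for simplicity. It walks a tree such as "A|(B|C)" with an explicit stack instead of recursion and collects the operands in their original order.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Or:
    """Binary disjunction node (hypothetical type, for illustration only)."""
    left: "Node"
    right: "Node"


Node = Union[str, Or]  # a leaf is just the term's surface string here


def flatten_or(root: Node) -> List[str]:
    """Collect the operands of a nested disjunction, left to right."""
    operands: List[str] = []
    stack: List[Node] = [root]
    while stack:
        node = stack.pop()
        if isinstance(node, Or):
            # Push the right child first so the left child is popped first.
            stack.append(node.right)
            stack.append(node.left)
        else:
            operands.append(node)
    return operands


# "A oder B oder C" as nested by the grammar: A|(B|C)
nested = Or("A", Or("B", "C"))
print(flatten_or(nested))  # ['A', 'B', 'C']
```

Whether the nested form should actually be flattened is a design decision these notes leave open; the sketch only shows that a stack makes the traversal straightforward.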
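The distance operator notes above can be sketched in code as well. This is a hedged illustration under stated assumptions, not the actual parser: `Distance`, `parse_distance` and `normalise` are hypothetical names, only the word unit "w" is handled, and the default minimum used when a single number is given (`default_min=0`) is an assumption, since the notes only say that the single number is the maximum.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Distance(NamedTuple):
    direction: Optional[str]  # "+", "-", or None if no direction is given
    min_dist: int
    max_dist: int


# Matches "/w2", "/+w1", "/-w1:3"; only the word unit "w" is covered here.
_DIST = re.compile(r"^/([+-]?)w(\d+)(?::(\d+))?$")


def parse_distance(op: str, default_min: int = 0) -> Distance:
    """Parse "/wn:m" (n = min, m = max) or "/wn" (n = max)."""
    m = _DIST.match(op)
    if m is None:
        raise ValueError(f"not a word distance operator: {op!r}")
    sign, first, second = m.groups()
    if second is None:
        # A single number is read as the maximal distance.
        # default_min is an assumption, not taken from the notes.
        lo, hi = default_min, int(first)
    else:
        lo, hi = int(first), int(second)
    return Distance(sign or None, lo, hi)


def normalise(op: str, left: str, right: str) -> Tuple[Distance, str, str]:
    """Rewrite a 'minus' operator as 'plus' by flipping the arguments."""
    dist = parse_distance(op)
    if dist.direction == "-":
        return dist._replace(direction="+"), right, left
    return dist, left, right


print(parse_distance("/w2"))
print(normalise("/-w1:3", "A", "B"))
```

With this normalisation, later processing steps only need to handle the 'plus' direction and the undirected case.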
Poliqarp:
- empty tokens as distance operators