COSMAS:
 - disjunctions with more than 2 arguments are parsed as nested by the ANTLR grammar:
   "A oder B oder C" becomes (simplified) "A|(B|C)"
 - using stacks proves to be a good choice for tree processing
 - queries of the form "A B C" where A, B, C are terms of type <R> (see http://www.ids-mannheim.de/cosmas2/win-app/hilfe/suchanfrage/eingabe-grafisch/syntax/ARGUMENT_R.html)
   are, strictly speaking, not well-formed since there is no linking operator between the individual terms. The Cosmas II web app offers two ways to interpret
   such missing operators: as a logical OR or as a /+w1 distance operator. In KorAP, the second alternative will be chosen as it appears to be more intuitive (effectively,
   the query expresses a sequence of terms) and more in line with other QLs (e.g. Poliqarp)
 - distance operators allow specifying a minimal and maximal distance between their arguments, in the form "/wn:m" where n and m are integers. If only one number
   is given (like "/w2"), Cosmas II interprets it as the maximal distance. Note that it is not possible to specify a minimal but possibly infinite distance between two
   tokens - for obvious reasons.
 - distance operator: flip arguments and change 'minus' to 'plus'
 - operands of distance operators to be expressed via shrink/classes?
 - #BEG() and #END() are solved via shrink, with the shrink argument not being a class but a position (first/last) indicating the first or last word in the matched sequence
 - #BED() with two conditions (like 'sa,-pa') can't be mapped to a position group as implemented -> make two position groups and embed them in an 'and'-group
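
Sketch for the distance operator note above: a minimal Python reading of "/wn:m" and "/wn", not the actual KorAP code. The names Distance and parse_distance are hypothetical. It assumes n is the minimal and m the maximal distance, that the minimum defaults to 0 when only one number is given (the note only fixes the maximum), and that an optional +/- direction sign precedes the "w" as in "/+w1".

```python
import re
from typing import NamedTuple


class Distance(NamedTuple):
    direction: str  # "+", "-" or "" when no direction is given
    lo: int         # minimal distance
    hi: int         # maximal distance


# Covers only the simple word-distance forms "/wn" and "/wn:m".
_DISTANCE = re.compile(r"/([+-]?)w(\d+)(?::(\d+))?")


def parse_distance(op: str) -> Distance:
    match = _DISTANCE.fullmatch(op)
    if match is None:
        raise ValueError(f"not a word-distance operator: {op!r}")
    direction, first, second = match.groups()
    if second is None:
        # A single number is the maximal distance; min = 0 is an assumption.
        return Distance(direction, 0, int(first))
    return Distance(direction, int(first), int(second))
```

With these assumptions, parse_distance("/w2") gives a maximum of 2, and parse_distance("/w2:5") gives the range 2 to 5.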
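
Sketch for the first two notes (nested disjunctions, stacks for tree processing): should a flat operand list be wanted, the nested "A|(B|C)" tree can be walked with an explicit stack. This is an illustration only. The tuple representation ("or", left, right) and the name flatten_or are hypothetical and are not the ANTLR tree format.

```python
def flatten_or(tree):
    """Collect the operands of nested binary 'or' nodes in left-to-right order."""
    operands = []
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, tuple) and node[0] == "or":
            # Push children in reverse so the leftmost child is popped first.
            stack.extend(reversed(node[1:]))
        else:
            # Anything that is not an 'or' node counts as an operand.
            operands.append(node)
    return operands
```

For the example from the notes, flatten_or(("or", "A", ("or", "B", "C"))) yields the operands A, B, C in their original order.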