<div style="padding: 20pt">

<p>Links to Blog, FAQ, About, Contact ...</p>

<h2>KorAP-Tutorial</h2>

<!--
<ul>
  <li>Introduction to KorAP</li>
  <li>How to use Poliqarp+ QL?</li>
  <li>How to use Cosmas-II QL?</li>
  <li>How to use CQL?</li>
</ul>
-->

<p>This is a tutorial for KorAP. It may be maintained separately (as a wiki?) and
has some nice features, like embedded example queries. Just click on the queries below:</p>

%= korap_tut_query poliqarp => '[base=baum]'

%= korap_tut_query cosmas2 => 'der /w5 Baum'

<p>And here is a short cheat sheet for foundries and layers:</p>

<ul>
  <li><strong>base</strong>
    <ul>
      <li>Supports two types of spans: <strong>&lt;s&gt;</strong> for sentences and <strong>&lt;p&gt;</strong> for paragraphs. This will likely change in the next index version. These spans lack prefix information!</li>
    </ul>
  </li>
  <li><strong>cnx</strong>
    <ul>
      <li><strong>l</strong> (Token:Lemma): All lemmas are written in lower case. Compounds are split, e.g. the token "Leitfähigkeit" is matched by the lemmas "leit" and "fähigkeit", not by the lemma "leitfähigkeit"</li>
      <li><strong>p</strong> (Token:Part of Speech): All POS tags are written in capital letters and are based on STTS</li>
      <li><strong>syn</strong> (Token:Syntactical information): Includes token-based information like @PREMOD, @NH, @MAIN ...</li>
      <li><strong>m</strong> (Token:Morphosyntactical information): Includes information about tense ("PRES" ...), mood ("IND"), number ("PL" ...) etc.</li>
      <li><strong>c</strong> (Span:Phrases): Only nominal phrases are available, and they are written in lower case ("np")</li>
    </ul>
  </li>
  <li><strong>corenlp</strong>
    <ul>
      <li><strong>ne_hgc_175m_600</strong> (Token:Named Entity): Contains named entities like "I-PER", "I-ORG" etc.</li>
      <li><strong>ne_dewac_175_175m_600</strong> (Token:Named Entity): See above</li>
    </ul>
  </li>
  <li><strong>tt</strong>
    <ul>
      <li><strong>l</strong> (Token:Lemma): Noun lemmas are capitalized, all other lemmas are written in lower case. Compounds stay intact (e.g. "Normalbedingung")</li>
      <li><strong>p</strong> (Token:Part of Speech): All POS tags are written in capital letters and are based on STTS</li>
    </ul>
  </li>
  <li><strong>mate</strong>
    <ul>
      <li><strong>l</strong> (Token:Lemma): All lemmas are written in lower case. Compounds stay intact (e.g. "buchstabenbezeichnung")</li>
      <li><strong>p</strong> (Token:Part of Speech): All POS tags are written in capital letters and are based on STTS</li>
      <li><strong>m</strong> (Token:Morphosyntactical information): Includes information about tense ("tense:pres" ...), mood ("mood:ind"), number ("number:pl" ...), gender ("gender:masc" ...) etc.</li>
    </ul>
  </li>
  <li><strong>opennlp</strong>
    <ul>
      <li><strong>p</strong> (Token:Part of Speech): All POS tags are written in capital letters and are based on STTS</li>
    </ul>
  </li>
  <li><strong>xip</strong>
    <ul>
      <li><strong>l</strong> (Token:Lemma): Noun lemmas are capitalized, all other lemmas are written in lower case. Compounds are split, e.g. the token "Leitfähigkeit" is matched by the lemmas "leiten" and "Fähigkeit", and also by a merged and not very useful "leitenfähigkeit" (this is going to change)</li>
      <li><strong>p</strong> (Token:Part of Speech): All POS tags are written in capital letters and are based on STTS</li>
      <li><strong>c</strong> (Span:Phrases): A set of phrase types that make up sentences, all in upper case ("NP", "NPA", "NOUN", "VERB", "PREP", "AP" ...)</li>
    </ul>
  </li>
</ul>
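
<p>The foundry and layer names above can be combined in Poliqarp+ queries. The embedded queries below are only a sketch: they assume the token form <code>[foundry/layer=value]</code> and the span form <code>&lt;foundry/layer=value&gt;</code>, and the chosen values ("Baum", "NN", "number:pl", "np") are illustrative and may not match the current index version:</p>

%= korap_tut_query poliqarp => '[tt/l=Baum]'

%= korap_tut_query poliqarp => '[opennlp/p=NN]'

%= korap_tut_query poliqarp => '[mate/m=number:pl]'

%= korap_tut_query poliqarp => '<cnx/c=np>'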

</div>