% layout 'main', title => 'KorAP: Corpus Data';

%= page_title

<p>KorAP is being developed as the main access point to
  <%= ext_link_to 'DeReKo', 'http://www1.ids-mannheim.de/kl/projekte/korpora' %>,
  succeeding <%= ext_link_to 'COSMAS II', 'https://cosmas2.ids-mannheim.de/cosmas2-web/' %> in that role.
  KorAP is not tied to any specific corpus, however; for example, it is now also used for the Romanian national corpus <%= ext_link_to 'CoRoLa', 'http://corola.racai.ro/' %>.</p>

<p>In KorAP, corpus texts can carry arbitrary metadata, some of which can be used to create subcorpora (so-called virtual corpora).</p>

<p>KorAP also supports an arbitrary number of <%= embedded_link_to 'doc', 'Annotations', 'data', 'annotation' %> from different sources (called <em>foundries</em>) with different <em>layers</em>.</p>

<p>Annotations of the following kinds are supported:</p>
<dl>
  <dt>Tokens</dt>
  <dd>Annotations associated with single tokens (e.g. words or numbers)</dd>

  <dt>Spans</dt>
  <dd>Annotations spanning a sequence of words or nodes (e.g. sentences, phrases, constituency annotations)</dd>

  <dt>Relations</dt>
  <dd>Annotations of relations between tokens or spans (e.g. dependency annotations)</dd>

  <dt>Attributes</dt>
  <dd>Attribute information for tokens, spans, or relations (e.g. attributes of HTML elements)</dd>
</dl>