Updated annotation documentation
Change-Id: I0f797ca70e0bbe6964a19df00aee8030734b0b55
diff --git a/Changes b/Changes
index 1b44d5c..6ec9c6a 100755
--- a/Changes
+++ b/Changes
@@ -11,6 +11,7 @@
- Introduced documentation on FCS-QL (margaretha).
- Enable experimental proxy via
configuration option 'experimental_proxy'.
+ - Updated documentation on annotations.
0.34 2019-06-26
- Introduced guided tour (hebasta, #19).
diff --git a/dev/scss/main/tutorial.scss b/dev/scss/main/tutorial.scss
index c13d368..c86a7cb 100644
--- a/dev/scss/main/tutorial.scss
+++ b/dev/scss/main/tutorial.scss
@@ -81,7 +81,7 @@
dl {
margin: 0;
- padding-bottom: 2em;
+ padding-bottom: .5em;
dt {
font-weight: bold;
abbr {
diff --git a/t/doc.t b/t/doc.t
index 52e2f1a..3f4d0c1 100644
--- a/t/doc.t
+++ b/t/doc.t
@@ -58,6 +58,14 @@
->status_is(200)
->text_is('#segments pre.query.tutorial:nth-of-type(1) code', 'Baum');
+# Check data
+$t->get_ok('/doc/data/annotation' => { 'Accept-Language' => 'en-US, en, de-DE' })
+ ->status_is(200)
+ ->text_is('#tutorial-top', 'Annotations');
+$t->get_ok('/doc/data/annotation' => { 'Accept-Language' => 'de-DE, en-US, en' })
+ ->status_is(200)
+ ->text_is('#tutorial-top', 'Annotationen');
+
my $app = $t->app;
$app->plugin(
diff --git a/templates/de/doc/data/annotation.html.ep b/templates/de/doc/data/annotation.html.ep
index 4a7c2d7..fc09157 100644
--- a/templates/de/doc/data/annotation.html.ep
+++ b/templates/de/doc/data/annotation.html.ep
@@ -4,13 +4,19 @@
<p>KorAP bietet Zugriff auf mehrere Ebenen von Annotationen, die aus mehreren Ressourcen stammen, so genannten <em>foundries</em>.</p>
+
<section id="base">
<h3>Basis Foundry</h3>
- <p>Die Basis Foundry steht allen Korpora zur Verfügung und dient als gemeinsame Grundlage für die Dokumentenstrukturannotation im layer <code>s</code>. Sie unterstützt drei Arten von Spans: <code>&lt;base/s=s&gt;</code> für Sätze, <code>&lt;base/s=p&gt;</code> für Absätze und <code>&lt;base/s=t&gt;</code> für den gesamten Text.</p>
+ <p>Die Basis Foundry steht allen Korpora zur Verfügung und dient als gemeinsame Grundlage für die Dokumentenstrukturannotation im layer <code>s</code>.</p>
+ <dl>
+ <dt><abbr data-type="token" title="Structure">s</abbr></dt>
+ <dd>Dokumentstruktur, die folgende Spans unterstützt: <code>&lt;base/s=s&gt;</code> für Sätze, <code>&lt;base/s=p&gt;</code> für Absätze und <code>&lt;base/s=t&gt;</code> für die gesamte Textspanne.</dd>
+ </dl>
+
%= doc_query poliqarp => '<base/s=s>', cutoff => 1
</section>
-
+<!--
<section id="cnx">
<h3>Connexor (<code>cnx</code>)</h3>
<p>Connexor-Annotationen liefern die folgenden Layer für das <code>cnx</ code> Präfix:</p>
@@ -28,15 +34,27 @@
</dl>
%= doc_query poliqarp => '[cnx/p=CC]', cutoff => 1
</section>
+-->
+
+<section id="dereko">
+ <h3>DeReKo (<code>dereko</code>)</h3>
+ <p>DeReKo-Annotationen unterstützen die folgenden Layer für das <code>dereko</code> Präfix:</p>
+ <dl>
+ <dt><abbr data-type="token" title="Structure">s</abbr></dt>
+ <dd>Dokumentstruktur, wie sie im <%= doc_ext_link_to 'I5 Textmodell', 'http://www1.ids-mannheim.de/kl/projekte/korpora/textmodell.html' %> definiert ist.</dd>
+ </dl>
+ %= doc_query poliqarp => 'startsWith(<dereko/s=s>, Fragestunde)', cutoff => 1
+</section>
<section id="corenlp">
<h3>CoreNLP (<code>corenlp</code>)</h3>
+ <p>CoreNLP-Annotationen unterstützen die folgenden Layer für das <code>corenlp</code> Präfix:</p>
<dl>
<dt><abbr data-type="token" title="Part-of-Speech">p</abbr></dt>
<dd>Part-of-Speech-Informationen werden in Großbuchstaben geschrieben und basieren auf STTS</dd>
<dt><abbr data-type="token" title="Constituency">c</abbr></dt>
- <dd>Konstituenten Informationen folgen den Annotationen des <a href="http://www.coli.uni-saarland.de/projects/sfb378/negra-corpus/negra-corpus.html">negr@ Korpus</a>.</dd>
+ <dd>Konstituenten Informationen folgen den Annotationen des <%= doc_ext_link_to 'negr@ Korpus', 'http://www.coli.uni-saarland.de/projects/sfb378/negra-corpus/negra-corpus.html' %>.</dd>
<dt><abbr data-type="token" title="Named Entity">ne</abbr></dt>
<dd>Enthält benannte Entitäten wie <code>I-PER</code>, <code>I-ORG</code> etc.</dd>
<dt><abbr data-type="token" title="Named Entity">ne_hgc_175m_600</abbr></dt>
@@ -50,6 +68,7 @@
<section id="tt">
<h3>TreeTagger (<code>tt</code>)</h3>
+ <p>TreeTagger-Annotationen unterstützen die folgenden Layer für das <code>tt</code> Präfix:</p>
<dl>
<dt><abbr data-type="token" title="Lemma">l</abbr></dt>
<dd>Alle Nicht-Nomen-Lemmata sind in Kleinbuchstaben geschrieben, Substantive sind in Großbuchstaben geschrieben. Komposita bleiben intakt (z. B. <code>Normalbedingung</code>).</dd>
@@ -59,7 +78,7 @@
%= doc_query poliqarp => '[tt/p=ADV]', cutoff => 1
</section>
-
+<!--
<section id="mate">
<h3>Mate (<code>mate</code>)</h3>
<dl>
@@ -72,10 +91,22 @@
</dl>
%= doc_query poliqarp => '[mate/m=gender:fem]', cutoff => 1
</section>
+-->
+
+<section id="malt">
+ <h3>Malt (<code>malt</code>)</h3>
+ <p>Malt-Annotationen unterstützen die folgenden Layer für das <code>malt</code> Präfix:</p>
+ <dl>
+ <dt><abbr data-type="token" title="Dependency">d</abbr></dt>
+ <dd>Dependenz-Annotation</dd>
+ </dl>
+ %= doc_query annis => 'tt/p="PPOSAT" ->malt/d[func="DET"] node', cutoff => 1
+</section>
<section id="opennlp">
<h3>OpenNLP (<code>opennlp</code>)</h3>
+ <p>OpenNLP-Annotationen unterstützen die folgenden Layer für das <code>opennlp</code> Präfix:</p>
<dl>
<dt><abbr data-type="token" title="Part-of-Speech">p</abbr></dt>
<dd>Alle Part-of-Speech-Informationen sind in Großbuchstaben geschrieben und basieren auf STTS</dd>
@@ -83,6 +114,19 @@
%= doc_query poliqarp => '[opennlp/p=PDAT]', cutoff => 1
</section>
+
+<section id="marmot">
+ <h3>Marmot (<code>marmot</code>)</h3>
+ <p>Marmot-Annotationen unterstützen die folgenden Layer für das <code>marmot</code> Präfix:</p>
+ <dl>
+ <dt><abbr data-type="token" title="Part-of-Speech">p</abbr></dt>
+ <dd>Alle Part-of-Speech-Informationen sind in Großbuchstaben geschrieben und basieren auf STTS</dd>
+ <dt><abbr data-type="token" title="Morphosyntactical information">m</abbr></dt>
+ <dd>Enthält Annotationen zu case (<code>acc</code> ...), degree (<code>pos</code>), gender (<code>fem</code> ...) etc.</dd>
+ </dl>
+ %= doc_query poliqarp => '[marmot/m=degree:sup & marmot/p=ADJA]', cutoff => 1
+</section>
+
<!--
<section id="xip">
<h3>Xerox Incremental Parser (<code>xip</code>)</h3>
@@ -107,6 +151,9 @@
<li><strong>orth</strong>: <code>opennlp</code></li>
<li><strong>lemma</strong>: <code>tt</code></li>
<li><strong>pos</strong>: <code>tt</code></li>
+ <li>Constituency: <code>corenlp</code></li>
+ <li>Dependency: <code>malt</code></li>
+ <li>Morphology: <code>marmot</code></li>
</ul>
<blockquote>
diff --git a/templates/doc/data/annotation.html.ep b/templates/doc/data/annotation.html.ep
index dbbd534..656323e 100644
--- a/templates/doc/data/annotation.html.ep
+++ b/templates/doc/data/annotation.html.ep
@@ -6,11 +6,17 @@
<section id="base">
<h3>Base Foundry</h3>
- <p>The base foundry is available for all corpora and acts as a common ground for document structure annotation in the layer <code>s</code>. It supports three types of spans: <code>&lt;base/s=s&gt;</code> for sentences, <code>&lt;base/s=p&gt;</code> for paragraphs, and <code>&lt;base/s=t&gt;</code> for the text span</p>
+ <p>The base foundry is available for all corpora and acts as a common ground for document structure annotation in the layer <code>s</code>.</p>
+ <dl>
+ <dt><abbr data-type="token" title="Structure">s</abbr></dt>
+ <dd>Document structure, supporting the following spans: <code>&lt;base/s=s&gt;</code> for sentences, <code>&lt;base/s=p&gt;</code> for paragraphs, and <code>&lt;base/s=t&gt;</code> for the entire text.</dd>
+ </dl>
+
%= doc_query poliqarp => '<base/s=s>', cutoff => 1
</section>
+<!--
<section id="cnx">
<h3>Connexor (<code>cnx</code>)</h3>
<p>Connexor annotations provide the following layer for the <code>cnx</code> prefix:</p>
@@ -28,15 +34,27 @@
</dl>
%= doc_query poliqarp => '[cnx/p=CC]', cutoff => 1
</section>
+-->
+
+<section id="dereko">
+ <h3>DeReKo (<code>dereko</code>)</h3>
+ <p>DeReKo annotations provide the following layer for the <code>dereko</code> prefix:</p>
+ <dl>
+ <dt><abbr data-type="token" title="Structure">s</abbr></dt>
+ <dd>Document structure as defined in the <%= doc_ext_link_to 'I5 text model', 'http://www1.ids-mannheim.de/kl/projekte/korpora/textmodell.html' %>.</dd>
+ </dl>
+ %= doc_query poliqarp => 'startsWith(<dereko/s=s>, Fragestunde)', cutoff => 1
+</section>
<section id="corenlp">
<h3>CoreNLP (<code>corenlp</code>)</h3>
+ <p>CoreNLP annotations provide the following layers for the <code>corenlp</code> prefix:</p>
<dl>
<dt><abbr data-type="token" title="Part-of-Speech">p</abbr></dt>
<dd>Part-of-speech information is written in capital letters and is based on STTS</dd>
<dt><abbr data-type="token" title="Constituency">c</abbr></dt>
- <dd>Constituency information follows the annotations of the <a href="http://www.coli.uni-saarland.de/projects/sfb378/negra-corpus/negra-corpus.html">negr@ corpus</a>.</dd>
+ <dd>Constituency information follows the annotations of the <%= doc_ext_link_to 'negr@ corpus', 'http://www.coli.uni-saarland.de/projects/sfb378/negra-corpus/negra-corpus.html' %>.</dd>
<dt><abbr data-type="token" title="Named Entity">ne</abbr></dt>
<dd>Contains named entities like <code>I-PER</code>, <code>I-ORG</code> etc.</dd>
<dt><abbr data-type="token" title="Named Entity">ne_hgc_175m_600</abbr></dt>
@@ -50,6 +68,7 @@
<section id="tt">
<h3>TreeTagger (<code>tt</code>)</h3>
+ <p>TreeTagger annotations provide the following layers for the <code>tt</code> prefix:</p>
<dl>
<dt><abbr data-type="token" title="Lemma">l</abbr></dt>
<dd>All non-noun lemmas are written in lower case, nouns are written upper case. Composita stay intact (e.g. <code>Normalbedingung</code>)</dd>
@@ -60,6 +79,7 @@
</section>
+<!--
<section id="mate">
<h3>Mate (<code>mate</code>)</h3>
<dl>
@@ -72,10 +92,21 @@
</dl>
%= doc_query poliqarp => '[mate/m=gender:fem]', cutoff => 1
</section>
+-->
+<section id="malt">
+ <h3>Malt (<code>malt</code>)</h3>
+ <p>Malt annotations provide the following layer for the <code>malt</code> prefix:</p>
+ <dl>
+ <dt><abbr data-type="token" title="Dependency">d</abbr></dt>
+ <dd>Dependency information</dd>
+ </dl>
+ %= doc_query annis => 'tt/p="PPOSAT" ->malt/d[func="DET"] node', cutoff => 1
+</section>
<section id="opennlp">
<h3>OpenNLP (<code>opennlp</code>)</h3>
+ <p>OpenNLP annotations provide the following layer for the <code>opennlp</code> prefix:</p>
<dl>
<dt><abbr data-type="token" title="Part-of-Speech">p</abbr></dt>
<dd>All part-of-speech information is written in capital letters and is based on STTS</dd>
@@ -83,6 +114,20 @@
%= doc_query poliqarp => '[opennlp/p=PDAT]', cutoff => 1
</section>
+
+<section id="marmot">
+ <h3>Marmot (<code>marmot</code>)</h3>
+ <p>Marmot annotations provide the following layers for the <code>marmot</code> prefix:</p>
+ <dl>
+ <dt><abbr data-type="token" title="Part-of-Speech">p</abbr></dt>
+ <dd>Part-of-speech information is written in capital letters and is based on STTS</dd>
+ <dt><abbr data-type="token" title="Morphosyntactical information">m</abbr></dt>
+ <dd>Includes information about case (<code>acc</code> ...), degree (<code>pos</code>), gender (<code>fem</code> ...) etc.</dd>
+ </dl>
+ %= doc_query poliqarp => '[marmot/m=degree:sup & marmot/p=ADJA]', cutoff => 1
+</section>
+
+
<!--
<section id="xip">
<h3>Xerox Incremental Parser (<code>xip</code>)</h3>