Improve manpage
Change-Id: I0e5b35ade41ee30d84e4173c2f12d182121183cf
diff --git a/script/tei2korapxml b/script/tei2korapxml
index d50dc24..774c770 100755
--- a/script/tei2korapxml
+++ b/script/tei2korapxml
@@ -1,31 +1,4 @@
#!/usr/bin/env perl
-
-###
-### converts input in TEI P5 format (https://www1.ids-mannheim.de/kl/projekte/korpora/textmodell.html)
-### into output in KorAP-XML format (https://github.com/KorAP/KorAP-XML-Krill -> 'KorAP-XML document')
-###
-### input restrictions:
-###
-### . utf8 encoded
-### . TEI P5 formatted input with certain restrictions:
-### . mandatory: text-header with integrated textsigle, text-body
-### . optional: corp-header with integrated corpsigle, doc-header with integrated docsigle
-###
-### . all tokens inside the primary text (inside $data) may not be newline seperated, because newlines
-### are removed (see below: 'inside text body') and a conversion of newlines into blanks between 2 tokens
-### could lead to additional blanks, where there should be none (e.g.: punctuation characters like ',' or
-### '.' should not be seperated from their predecessor token).
-### - see also '~ whitespace handling ~'
-###
-### . POS and MSD inline annotations handling (see below: expected format)
-### ...
-###
-### notes on the output:
-###
-### . zip file output (default on stdout) with utf8 encoded entries (which together form the KorAP-XML format)
-### ...
-###
-
use strict;
use warnings;
@@ -1503,28 +1476,75 @@
=head1 DESCRIPTION
-C<tei2korapxml> is a script to convert TEI P5 and I5 based documents
-
-to the KorAP-XML format. If no specific input is defined, data is
-
+C<tei2korapxml> is a script to convert TEI P5 and
+L<I5|https://www1.ids-mannheim.de/kl/projekte/korpora/textmodell.html>
+based documents to the
+L<KorAP-XML format|https://github.com/KorAP/KorAP-XML-Krill#about-korap-xml>.
+If no specific input is defined, data is
read from C<STDIN>. If no specific output is defined, data is written
-
to C<STDOUT>.
This program is usually called from inside another script.
+=head1 FORMATS
+
+=head2 Input restrictions
+
+=over 2
+
+=item
+
+UTF-8 encoded
+
+=item
+
+TEI P5 formatted input with certain restrictions:
+
+=over 4
+
+=item
+
+B<mandatory>: text-header with integrated textsigle, text-body
+
+=item
+
+B<optional>: corp-header with integrated corpsigle,
+doc-header with integrated docsigle
+
+=back
+
+=item
+
+all tokens inside the primary text (inside $data) must not be
+newline separated, because newlines are removed
+(see code section C<~ inside text body ~>) and a conversion of newlines
+into blanks between 2 tokens could lead to additional blanks
+where there should be none (e.g. punctuation characters like C<,> or
+C<.> should not be separated from their predecessor token).
+See also code section C<~ whitespace handling ~>.
+
+=back
+
+=head2 Notes on the output
+
+=over 2
+
+=item
+
+ZIP file output (default on C<STDOUT>) with UTF-8 encoded entries
+(which together form the KorAP-XML format)
+
+=back
+
=head1 INSTALLATION
C<tei2korapxml> requires L<libxml2-dev> bindings to build. When
-
these bindings are available, the preferred way to install the script is
-
to use L<cpanm|App::cpanminus>.
$ cpanm https://github.com/KorAP/KorAP-XML-TEI.git
In case everything went well, the C<tei2korapxml> tool will
-
be available on your command line immediately.
Minimum requirement for L<KorAP::XML::TEI> is Perl 5.16.
@@ -1556,17 +1576,12 @@
Contributors: Marc Kupietz, Carsten Schnober, Nils Diewald
L<KorAP::XML::TEI> is developed as part of the L<KorAP|https://korap.ids-mannheim.de/>
-
Corpus Analysis Platform at the
-
L<Leibniz Institute for the German Language (IDS)|http://ids-mannheim.de/>,
-
member of the
-
L<Leibniz-Gemeinschaft|http://www.leibniz-gemeinschaft.de/>.
This program is free software published under the
-
L<BSD-2 License|https://raw.githubusercontent.com/KorAP/KorAP-XML-TEI/master/LICENSE>.
=cut