| 2.5.0 2023-01-24 |
| - Upgrade minimum Perl version to 5.36 to improve |
| Unicode handling. |
| - Upgrade KorAP-Tokenizer to v2.2.5 and Java to 17 to |
| improve Unicode handling. |
| |
| 2.4.4 2023-04-25 |
| - Allow line breaks in text-only lines. |
| |
| 2.4.3 2023-03-02 |
| - Allow closing elements to start with "text". |
| |
| 2.4.2 2023-02-10 |
| - Improve checks for numerical annotation bounds. |
| |
| 2.4.1 2023-02-07 |
| - Fix test. |
| |
| 2.4.0 2023-02-07 |
| - Conversion of standard TEI P5 should now work, at least |
| in some cases. |
| - Option --xmlid-to-textsigle <from-regex>@<to-c/to-d/to-t> |
| added to convert standard P5 text id attributes to I5 |
| sigles with three parts. |
| - Add --no-tokenizer parameter as a requirement |
| for relying on inline tokens only. |
| |
| 2.3.4 2022-11-09 |
| - Improve stability of XML entity replacement. |
| - Check version for script and KorAP-Tokenizer |
| library when requested. |
| |
| 2.3.3 2022-03-30 |
| - Load KorAP-Tokenizer only on request. |
| |
| 2.3.2 2022-03-23 |
| - Do not reference metadata.xml |
| - Remove schema references from header files. |
| - Improve test suite for inability to use |
| KorAP-Tokenizer. |
| |
| 2.3.1 2022-01-14 Release |
| - Improve script handling of broken data |
| - Improve handling of unknown header types |
| - Check for valid sigles to avoid broken directories |
| - Introduce exclusivity for inline token handling. |
| - Use single dash for STDIN. |
| - Update KorAP-Tokenizer to v2.2.2 (single quote, "du." bug fixes) |
| |
| 2.2.0 2021-08-26 Release |
| - Remove unnecessary branch in recursive call |
| - Support inline-structures parameter |
| - Introduce --base-foundry, --data-file, and --header-file parameters |
| - Introduce --tokens-file parameter |
| - Introduce --skip-inline-tokens parameter |
| - Minor cleanups and improvements |
| - Introduce --skip-inline-tags parameter |
| - Introduce KorAP::XML::TEI::Inline class |
| - Introduce --skip-inline-token-annotations parameter |
| - Deprecate KORAPXMLTEI_INLINE environment variable |
| in favor of --skip-inline-token-annotations |
| |
| 1.0.0 2021-02-18 Release |
| - -s option added that uses sentence boundaries |
| provided by the KorAP tokenizer (-tk) |
| - Tokenizer invocation comments removed from KorAP XML output |
| - Indentation of </span> tags fixed |
| - Character entities used in DeReKo are automatically |
| replaced by their corresponding characters |
| - Resources defined in Makefile |
| - Fixed possible IO deadlock with KorAP tokenizer |
| - Simplified debugging by combining with X::C::T line numbers |
| - Support inline-tokens parameter |
| - Move verbose code documentation to trailing |
| script section |
| |
| 0.03 2021-01-12 |
| - Update KorAP-Tokenizer to released 2.0 version |
| - Improve test suite for recent version |
| of Mojolicious. |
| |
| 0.02 2020-11-27 |
| - Update KorAP-Tokenizer to v2.0.0. |
| - Switch input encoding based on XML |
| processing instruction. |
| - Fix handling of UTF-8 in sigles. |
| |
| 0.01 2020-09-28 |
| - Initial release to GitHub. |