        - Conversion of standard TEI P5 should now work, at least
          in some cases.
        - Option --xmlid-to-textsigle <from-regex>@<to-c/to-d/to-t>
          added to convert standard P5 text id attributes to I5
          sigles with three parts.
        - Add --no-tokenizer parameter as a requirement
          for relying on inline tokens only.

2.3.4 2022-11-09
        - Improve stability of XML entity replacement.
        - Check version for script and KorAP-Tokenizer
          library when requested.

2.3.3 2022-03-30
        - Load KorAP-Tokenizer only on request.

2.3.2 2022-03-23
        - Do not reference metadata.xml.
        - Remove schema references from header files.
        - Improve test suite for inability to use
          KorAP-Tokenizer.

2.3.1 2022-01-14 Release
        - Improve script handling of broken data
        - Improve handling of unknown header types
        - Check for valid sigles to avoid broken directories
        - Introduce exclusivity for inline tokens handling.
        - Use single dash for STDIN.
        - Update KorAP-Tokenizer to v2.2.2 (single quote, "du." bug fixes)

2.2.0 2021-08-26 Release
        - Remove unnecessary branch in recursive call
        - Support inline-structures parameter
        - Introduce --base-foundry, --data-file, and --header-file parameters
        - Introduce --tokens-file parameter
        - Introduce --skip-inline-tokens parameter
        - Minor cleanups and improvements
        - Introduce --skip-inline-tags parameter
        - Introduce KorAP::XML::TEI::Inline class
        - Introduce --skip-inline-token-annotations parameter
        - Deprecate KORAPXMLTEI_INLINE environment variable
          in favor of --skip-inline-token-annotations

1.0.0 2021-02-18 Release
        - -s option added that uses sentence boundaries
          provided by the KorAP tokenizer (-tk)
        - Tokenizer invocation comments removed from KorAP XML output
        - Indentation of </span> tags fixed
        - Character entities used in DeReKo are automatically
          replaced by their corresponding characters
        - Resources defined in Makefile
        - Fixed possible IO deadlock with KorAP tokenizer
        - Simplified debugging by combining with X::C::T line numbers
        - Support inline-tokens parameter
        - Move verbose code documentation to trailing
          script section

0.03 2021-01-12
        - Update KorAP-Tokenizer to released 2.0 version
        - Improve test suite for recent version
          of Mojolicious.

0.02 2020-11-27
        - Update KorAP-Tokenizer to v2.0.0.
        - Switch input encoding based on XML
          processing instruction.
        - Fix handling of UTF-8 in sigles.

0.01 2020-09-28
        - Initial release to GitHub.