0.6.1 2023-03-22
- conllu2korapxml:
  - Fix append for filehandle output.

0.6.0 2023-01-13
- korapxml2conllu:
  - the sigle-pattern option now affects the entire sigle
  - handle docid attributes correctly if they are on a different line than their parent element <layer>
  - Improve identification of offset errors

0.5.0 2022-09-29
- korapxml2conllu:
  - --word2vec|lm-training-data option added to print word2vec input format
  - --extract-metadata-regex added to extract some metadata values as context input for language model training
  - by default sentence boundary information is now read from structure.xml files (use --s-bounds-from-morpho otherwise)
  - use morpho.xml if present when run on base zips
  - new option -c <columns>
- conllu2korapxml:
  - ignore _-lemmas
  - handle UDPipe comments
  - ignore non-interpretable comments
  - improve error handling for missing text ids and offsets

0.4.1 2021-07-31
- korapxml2conllu: fix patterns not extracted for last texts in archive

0.4 2021-07-29
- korapxml2conllu option -e <regex> added to extract element/attributes to comments

0.3 2021-02-15
- Provide conllu2korapxml to convert from CoNLL-U to KorAP-XML zip

0.2 2021-02-12
- Also convert KorAP-XML base zips

0.1 2020-09-23
- Initial release to GitHub.