- --word2vec|lm-training-data option added to print word2vec input format
- --extract-metadata-regex added to extract some metadata values as context input for language model training
- by default sentence boundary information is now read from structure.xml files (use --s-bounds-from-morpho otherwise)
- korapxml2conllu: use morpho.xml if present when run on base zips
- korapxml2conllu: new option -c <columns>
- conllu2korapxml: ignore _-lemmas

0.4.1 2021-07-31
- korapxml2conllu: fix patterns not extracted for last texts in archive

0.4 2021-07-29
- korapxml2conllu option -e <regex> added to extract element/attributes to comments

0.3 2021-02-15
- Provide conllu2korapxml to convert from CoNLL-U to KorAP-XML zip

0.2 2021-02-12
- Also convert KorAP-XML base zips

0.1 2020-09-23
- Initial release to GitHub.