| - --word2vec|lm-training-data option added to print word2vec input format |
| - --extract-metadata-regex added to extract some metadata values as context input for language model training |
| - sentence boundary information is now read from structure.xml files by default (use --s-bounds-from-morpho to override) |
| - korapxml2conllu: use morpho.xml if present when run on base zips |
| - korapxml2conllu: new option -c <columns> |
| - conllu2korapxml: ignore lemmas given as "_" |
| |
| 0.4.1 2021-07-31 |
| - korapxml2conllu: fix patterns not extracted for last texts in archive |
| |
| 0.4 2021-07-29 |
| - korapxml2conllu: option -e <regex> added to extract elements/attributes into comments |
| |
| 0.3 2021-02-15 |
| - Provide conllu2korapxml to convert from CoNLL-U to KorAP-XML zip |
| |
| 0.2 2021-02-12 |
| - Also convert KorAP-XML base zips |
| |
| 0.1 2020-09-23 |
| - Initial release to GitHub. |