0.6.3 2024-06-04
 - Trim filenames to fix double space after filename metadata
 - Set permissions of zip contents to 666
 - Allow different foundries for morpho and dependency annotations
 - Readme: Drop early stage warning
 - Drop cherry-pick unfriendly test count prediction
 - conllu2korapxml:
   - escape &, <, >
   - convert upos column to upos features

0.6.2 2024-01-24
 - Bump minimum Perl version to 5.36 to improve Unicode handling.
 - korapxml2conllu:
   - Use implicit default utf8 encoding instead of explicit de/encodes. Speeds up processing by 10%.

0.6.1 2023-03-22
 - conllu2korapxml:
   - Fix append for filehandle output.

0.6.0 2023-01-13
 - korapxml2conllu:
   - the sigle-pattern option now affects the entire sigle
   - handle docid attributes correctly if they are on a different line than their parent element <layer>
   - Improve identification of offset errors

0.5.0 2022-09-29
 - korapxml2conllu:
   - --word2vec|lm-training-data option added to print word2vec input format
   - --extract-metadata-regex added to extract some metadata values as context input for language model training
   - by default sentence boundary information is now read from structure.xml files (use --s-bounds-from-morpho otherwise)
   - use morpho.xml if present when run on base zips
   - new option -c <columns>
 - conllu2korapxml:
   - ignore _-lemmas
   - handle UDPipe comments
   - ignore non-interpretable comments
   - improve error handling for missing text ids and offsets

0.4.1 2021-07-31
 - korapxml2conllu: fix patterns not extracted for last texts in archive

0.4 2021-07-29
 - korapxml2conllu option -e <regex> added to extract element/attributes to comments

0.3 2021-02-15
 - Provide conllu2korapxml to convert from CoNLL-U to KorAP-XML zip

0.2 2021-02-12
 - Also convert KorAP-XML base zips

0.1 2020-09-23
 - Initial release to GitHub.