0.41 2020-05-07
        - Added support for RWK annotations.
        - Improved DGD support.

0.40 2020-03-03
        - Fixed XIP parser.
        - Added example corpus of the
          Redewiedergabe-Korpus.
        - Fixed span offset bug.
        - Fixed milestones behind the last
          token bug.
        - Fixed gap behind last token bug.
        - Fixed <base/s:t> length.

0.39 2020-02-19
        - Added Talismane support.
        - Added "distributor" field to I5 metadata.
        - Added DGD link field to I5 metadata.
        - Improved logging.
        - Added support for DGD pseudo-sentences
          based on anchor milestones.
        - Added brief explanation of the format.
        - Fixed parsing of editionStmt.
        - Added documentation for supported I5 metadata
          fields.
        - Added integrated benchmark mechanism.

0.38 2019-05-22
        - Stop file processing when base tokenization
          is wrong.
        - Added DGD support.

0.37 2019-03-06
        - Support for 'koral:field' array.
        - Support for Koral versioning.
        - Added tests for English sources.
        - Added support for external links for
          Wikipedia resources.
        - Ignore temporary extraction
          on directory archiving.
        - Remove extract_text and extract_doc in
          favor of extract_sigle for archives.

0.36 2019-01-22
        - Support for non-word tokens (fixes #5).

0.35 2018-09-24
        - Lift minimum version of Perl to 5.16, as required
          for the "fc" feature.

0.34 2018-07-19
        - Preliminary support for HNC.

0.33 2018-02-01
        - Added LWC support.
        - Fixed TreeTagger certainties.

0.32 2017-10-24
        - Fixed tar building process in script.
        - Support file extensions in base tokenization parameter.

0.31 2017-06-30
        - Fixed exit codes in script.
        - Use CORE::fc for case folding.

0.30 2017-06-19
        - Fixed permission handling in test suite.
        - Added preliminary CMC support.

0.29 2017-04-23
        - Support --to-tar flag.

0.28 2017-04-12
        - Improved overwriting behaviour for unzip.
        - Introduced --sequential-extraction flag.

0.27 2017-04-10
        - Support configuration files.
        - Support temporary extraction.
        - Support serial conversion.
        - Support input-base.

0.26 2017-04-06
        - Support wildcards on input.

0.25 2017-03-14
        - Updated to Mojolicious 7.20
        - Fixed meta treatment in case analytic and monogr
          are available
        - Added DRuKoLa support to script
        - Liberated document and text sigle handling to be
          compliant with CoRoLa.
        - Added support for pagebreak annotations.
        - Renamed "pages" to "srcPages".
        - Fixed handling of prefixes for text sigles.
        - Support for MarMoT.
        - Fix case insensitivity.
        - Added preliminary support for diacritic insensitivity.

0.24 2016-12-21
        - Added --base-sentences and --base-paragraphs options

0.23 2016-11-03
        - Added wildcard support for document extraction
        - Fixed archive iteration to not duplicate the first archive
        - Added parallel extraction for document sigles
        - Improved return value for existing files
        - Don't warn on recursion in CoreNLP/Constituency

0.22 2016-10-26
        - Added support for document extraction
        - Fixed archive naming

0.21 2016-10-24
        - Improved Windows support

0.20 2016-10-15
        - Fixed treatment of temporary folders in script

0.19 2016-08-17
        - Added test for direct I5 support.
        - Fixed support for Mojolicious 7.
        - Added script test.
        - Fixed setting multiple annotations in
          script.
        - Fixed output of version and help messages.
        - Added script test for extraction.
        - Fixed extraction with multiple archives and prefix
          negation support.
        - Added script test for archives.

0.18 2016-07-08
        - Added REI test.
        - Added multiple archive support to korapxml2krill.
        - Added support for prefix negation in korapxml2krill.
        - Added support for Malt#Dependency.
        - Improved test suite for caching and REI.
        - Added support for MDParser annotation.
        - Added batch processing class for documents.

0.17 2016-03-22
        - Rewrite sigles to use slashes as separators.
        - Zip listing optimized. No longer works with primary data
          in text.xml files.

0.16 2016-03-18
        - Added caching mechanism for
          metadata.

0.15 2016-03-17
        - Modularized metadata handling.
        - Simplified metadata handling.
        - Added --meta option to script.
        - Removed deprecated --human option from script.

0.14 2016-03-15
        - Renamed ::Index to ::Annotate and ::Field to ::Index.
        - Renamed 'allow' to 'anno' as parameters of the script.
        - Added readme.

0.13 2016-03-10
        - Removed korapxml2krill_dir.
        - Renamed dependency nodes.
        - Made dependency relations more effective (trimmed down TUIs)
          ! This is currently very slow !

0.12 2016-02-28
        - Added extract method to korapxml2krill.
        - Fixed Mate/Dependency.
        - Fixed skip flag in korapxml2krill.
        - Ignore spans outside the token range
          (i.e. character offsets end before tokens have started).

0.11 2016-02-23
        - Merged korapxml2krill and korapxml2krill_dir.

0.10 2016-02-15
        - Added EXPERIMENTAL support for parallel jobs.

0.09 2016-02-15
        - Fixed temporary directory handling in scripts.
        - Improved skipping for archive handling in scripts.

0.08 2016-02-14
        - Added support for archive streaming.
        - Improved scripts.

0.07 2016-02-13
        - Improved support for Schreibgebrauch meta data
          (IDS flavour).

0.06 2016-02-11
        - Improved support for Schreibgebrauch meta data
          (Duden flavour).

0.05 2016-02-04
        - Changed KorAP::Document to KorAP::XML::Krill.
        - Renamed "Schreibgebrauch" to "Sgbr".
        - Preparation for GitHub release.

0.04 2016-01-28
        - Added PTI to all payloads.
        - Added support for empty elements.
        - Added support for element attributes in struct.
        - Added meta data support for Schreibgebrauch.
        - Fixed test suite for meta data.

0.03 2014-11-03
        - Added new metadata scheme.
        - Fixed a minor bug in the constituency tree building.
        - Sorted terms in tokens a priori.

0.02 2014-07-21
        - Sentence annotations for all providing foundries
        - Starting subtokenization

0.01 2014-04-15
        - [bugfix] for first token annotations
        - Sentences are now available from all foundries that have them
        - <>:p is now <>:base/para
        - Added <>:base/text