| 0.47 2022-07-27 |
| - Support for preferred language transformation. |
| - Support for NKJP taxonomies. |
| |
| 0.46 2022-07-21 |
| - Support NKJP Meta, Morpho and NamedEntities. |
| |
| 0.45 2022-03-04 |
| - Due to problems installing Archive::Tar::Builder |
| in certain environments, it is now optional, |
| with a pure Perl archiver as fallback. |
| - Support externalLink and internalLink universally in |
| I5 metadata. |
| |
| 0.44 2022-02-17 |
| - Improve Gingko Metadata support. |
| - Fix data URIs by always referring to UTF-8. |
| - Warn on wrong token order. |
| - Updated all dependencies. |
| |
| 0.43 2022-01-17 |
| - Fix temporary extract handling when defined |
| in a config file. |
| - Improve handling of invalid certainty values |
| in TreeTagger. |
| - Add log slimming function. |
| |
| 0.42 2021-10-11 |
| - Replaced Log4perl with Log::Any. |
| - Ignore level < 0 structures in DeReKo, but support |
| them for base annotations. |
| - Define resources in Makefile. |
| - Add GitHub action for CI. |
| - Remove MANIFEST file from repo. |
| - Introduce Gingko support. |
| - Fix data URIs to always use percent-encoding. |
| |
| 0.41 2020-08-10 |
| - Added support for RWK annotations. |
| - Improved DGD support. |
| - Fixed bug in RWK support that broke on |
| some KorAP-XML files. |
| - Separate "real data" test suite from artificial |
| tests to prepare for CPAN release. |
| - Optimizations and cleanup based on profiling. |
| - Remove MultiTerm->add() in favor of |
| MultiTerm->add_by_term(). |
| - Optimization by reducing calls to _offset(). |
| - Introduced add_span() method to MultiTermToken. |
| - Removed deprecated 'primary' flag. |
| - Removed deprecated 'pretty' flag. |
| - Fix RWK paragraph handling. |
| - Updated 'Clone' dependency in Makefile. |
| - Make Sys::Info optional. |
| - Fixed a bug in XIP::Dependency and added |
| dependency checks for all annotation libraries. |
| |
| 0.40 2020-03-03 |
| - Fixed XIP parser. |
| - Added example corpus of the |
| Redewiedergabe-Korpus. |
| - Fixed span offset bug. |
| - Fixed bug with milestones behind the last |
| token. |
| - Fixed bug with a gap behind the last token. |
| - Fixed <base/s:t> length. |
| |
| 0.39 2020-02-19 |
| - Added Talismane support. |
| - Added "distributor" field to I5 metadata. |
| - Added DGD link field to I5 metadata. |
| - Improve logging. |
| - Added support for DGD pseudo-sentences |
| based on anchor milestones. |
| - Added brief explanation of the format. |
| - Fixed parsing of editionStmt. |
| - Added documentation for supported I5 metadata |
| fields. |
| - Added integrated benchmark mechanism. |
| |
| 0.38 2019-05-22 |
| - Stop file processing when base tokenization |
| is wrong. |
| - Added DGD support. |
| |
| 0.37 2019-03-06 |
| - Support for 'koral:field' array. |
| - Support for Koral versioning. |
| - Added tests for English sources. |
| - Added support for external links for |
| Wikipedia resources. |
| - Ignore temporary extraction |
| on directory archiving. |
| - Remove extract_text and extract_doc in |
| favor of extract_sigle for archives. |
| |
| 0.36 2019-01-22 |
| - Support for non-word tokens (fixes #5). |
| |
| 0.35 2018-09-24 |
| - Raise minimum version of Perl to 5.16, as required |
| for the "fc" feature. |
| |
| 0.34 2018-07-19 |
| - Preliminary support for HNC. |
| |
| 0.33 2018-02-01 |
| - Added LWC support. |
| - Fixed TreeTagger certainties. |
| |
| 0.32 2017-10-24 |
| - Fixed tar building process in script. |
| - Support file extensions in base tokenization parameter. |
| |
| 0.31 2017-06-30 |
| - Fixed exit codes in script. |
| - Use CORE::fc for case folding. |
| |
| 0.30 2017-06-19 |
| - Fixed permission handling in test suite. |
| - Added preliminary CMC support. |
| |
| 0.29 2017-04-23 |
| - Support --to-tar flag. |
| |
| 0.28 2017-04-12 |
| - Improved overwriting behaviour for unzip. |
| - Introduced --sequential-extraction flag. |
| |
| 0.27 2017-04-10 |
| - Support configuration files. |
| - Support temporary extraction. |
| - Support serial conversion. |
| - Support input-base. |
| |
| 0.26 2017-04-06 |
| - Support wildcards on input. |
| |
| 0.25 2017-03-14 |
| - Updated to Mojolicious 7.20 |
| - Fixed metadata treatment in case both analytic and |
| monogr are available |
| - Added DRuKoLa support to script |
| - Liberalized document and text sigle handling to be |
| compliant with CoRoLa. |
| - Added support for pagebreak annotations. |
| - Renamed "pages" to "srcPages". |
| - Fixed handling of prefixes for text sigles. |
| - Support for MarMoT. |
| - Fix case insensitivity. |
| - Added preliminary support for diacritic insensitivity. |
| |
| 0.24 2016-12-21 |
| - Added --base-sentences and --base-paragraphs options |
| |
| 0.23 2016-11-03 |
| - Added wildcard support for document extraction |
| - Fixed archive iteration to not duplicate the first archive |
| - Added parallel extraction for document sigles |
| - Improved return value for existing files |
| - Don't warn on recursion in CoreNLP/Constituency |
| |
| 0.22 2016-10-26 |
| - Added support for document extraction |
| - Fixed archive naming |
| |
| 0.21 2016-10-24 |
| - Improved Windows support |
| |
| 0.20 2016-10-15 |
| - Fixed treatment of temporary folders in script |
| |
| 0.19 2016-08-17 |
| - Added test for direct I5 support. |
| - Fixed support for Mojolicious 7. |
| - Added script test. |
| - Fixed setting multiple annotations in |
| script. |
| - Fixed output of version and help messages. |
| - Added script test for extraction. |
| - Fixed extraction with multiple archives and prefix |
| negation support. |
| - Added script test for archives. |
| |
| 0.18 2016-07-08 |
| - Added REI test. |
| - Added multiple archive support to korapxml2krill. |
| - Added support for prefix negation in korapxml2krill. |
| - Added support for Malt#Dependency. |
| - Improved test suite for caching and REI. |
| - Added support for MDParser annotation. |
| - Added batch processing class for documents. |
| |
| 0.17 2016-03-22 |
| - Rewrite sigles to use slashes as separators. |
| - Optimized zip listing. This no longer works with primary data |
| in text.xml files. |
| |
| 0.16 2016-03-18 |
| - Added caching mechanism for |
| metadata. |
| |
| 0.15 2016-03-17 |
| - Modularized metadata handling. |
| - Simplified metadata handling. |
| - Added --meta option to script. |
| - Removed deprecated --human option from script. |
| |
| 0.14 2016-03-15 |
| - Renamed ::Index to ::Annotate and ::Field to ::Index. |
| - Renamed 'allow' to 'anno' as parameters of the script. |
| - Added readme. |
| |
| 0.13 2016-03-10 |
| - Removed korapxml2krill_dir. |
| - Renamed dependency nodes. |
| - Made dependency relations more effective (trimmed down TUIs). |
| Warning: this is currently very slow. |
| |
| 0.12 2016-02-28 |
| - Added extract method to korapxml2krill. |
| - Fixed Mate/Dependency. |
| - Fixed skip flag in korapxml2krill. |
| - Ignore spans outside the token range |
| (i.e. character offsets end before tokens have started). |
| |
| 0.11 2016-02-23 |
| - Merged korapxml2krill and korapxml2krill_dir. |
| |
| 0.10 2016-02-15 |
| - Added EXPERIMENTAL support for parallel jobs. |
| |
| 0.09 2016-02-15 |
| - Fixed temporary directory handling in scripts. |
| - Improved skipping for archive handling in scripts. |
| |
| 0.08 2016-02-14 |
| - Added support for archive streaming. |
| - Improved scripts. |
| |
| 0.07 2016-02-13 |
| - Improved support for Schreibgebrauch meta data |
| (IDS flavour). |
| |
| 0.06 2016-02-11 |
| - Improved support for Schreibgebrauch meta data |
| (Duden flavour). |
| |
| 0.05 2016-02-04 |
| - Changed KorAP::Document to KorAP::XML::Krill. |
| - Renamed "Schreibgebrauch" to "Sgbr". |
| - Preparation for GitHub release. |
| |
| 0.04 2016-01-28 |
| - Added PTI to all payloads. |
| - Added support for empty elements. |
| - Added support for element attributes in struct. |
| - Added meta data support for Schreibgebrauch. |
| - Fixed test suite for meta data. |
| |
| 0.03 2014-11-03 |
| - Added new metadata scheme. |
| - Fixed a minor bug in the constituency tree building. |
| - Sorted terms in tokens a priori. |
| |
| 0.02 2014-07-21 |
| - Sentence annotations for all providing foundries |
| - Starting subtokenization |
| |
| 0.01 2014-04-15 |
| - [bugfix] for first token annotations |
| - Sentences are now available from all foundries that have them |
| - <>:p is now <>:base/para |
| - Added <>:base/text |