0.40 2020-03-01
- Fixed XIP parser.
- Added example corpus of the Redewiedergabe-Korpus.
- Fixed span offset bug.
- Fixed bug with milestones behind the last token.
0.39 2020-02-19
- Added Talismane support.
- Added "distributor" field to I5 metadata.
- Added DGD link field to I5 metadata.
- Improved logging.
- Added support for DGD pseudo-sentences based on anchor milestones.
- Added brief explanation of the format.
- Fixed parsing of editionStmt.
- Added documentation for supported I5 metadata fields.
- Added integrated benchmark mechanism.
0.38 2019-05-22
- Stop file processing when base tokenization is wrong.
- Added DGD support.
0.37 2019-03-06
- Support for 'koral:field' array.
- Support for Koral versioning.
- Added tests for English sources.
- Added support for external links for Wikipedia resources.
- Ignore temporary extraction on directory archiving.
- Remove extract_text and extract_doc in favor of extract_sigle for archives.
0.36 2019-01-22
- Support for non-word tokens (fixes #5).
0.35 2018-09-24
- Lifted minimum version of Perl to 5.16, as required for the "fc" feature.
0.34 2018-07-19
- Preliminary support for HNC.
0.33 2018-02-01
- Added LWC support.
- Fixed TreeTagger certainties.
0.32 2017-10-24
- Fixed tar building process in script.
- Support file extensions in base tokenization parameter.
0.31 2017-06-30
- Fixed exit codes in script.
- Use CORE::fc for case folding.
0.30 2017-06-19
- Fixed permission handling in test suite.
- Added preliminary CMC support.
0.29 2017-04-23
- Support --to-tar flag.
0.28 2017-04-12
- Improved overwriting behaviour for unzip.
- Introduced --sequential-extraction flag.
0.27 2017-04-10
- Support configuration files.
- Support temporary extraction.
- Support serial conversion.
- Support input-base.
0.26 2017-04-06
- Support wildcards on input.
0.25 2017-03-14
- Updated to Mojolicious 7.20
- Fixed meta treatment in case analytic and monogr are available
- Added DRuKoLa support to script
- Relaxed document and text sigle handling to be compliant with CoRoLa.
- Added support for pagebreak annotations.
- Renamed "pages" to "srcPages".
- Fixed handling of prefixes for text sigles.
- Support for MarMoT.
- Fixed case insensitivity.
- Added preliminary support for diacritic insensitivity.
0.24 2016-12-21
- Added --base-sentences and --base-paragraphs options
0.23 2016-11-03
- Added wildcard support for document extraction
- Fixed archive iteration to not duplicate the first archive
- Added parallel extraction for document sigles
- Improved return value for existing files
- Don't warn on recursion in CoreNLP/Constituency
0.22 2016-10-26
- Added support for document extraction
- Fixed archive naming
0.21 2016-10-24
- Improved Windows support
0.20 2016-10-15
- Fixed treatment of temporary folders in script
0.19 2016-08-17
- Added test for direct I5 support.
- Fixed support for Mojolicious 7.
- Added script test.
- Fixed setting multiple annotations in script.
- Fixed output of version and help messages.
- Added script test for extraction.
- Fixed extraction with multiple archives and prefix negation support.
- Added script test for archives.
0.18 2016-07-08
- Added REI test.
- Added multiple archive support to korapxml2krill.
- Added support for prefix negation in korapxml2krill.
- Added support for Malt#Dependency.
- Improved test suite for caching and REI.
- Added support for MDParser annotation.
- Added batch processing class for documents.
0.17 2016-03-22
- Rewrote sigles to use slashes as separators.
- Optimized zip listing. No longer works with primary data in text.xml files.
0.16 2016-03-18
- Added caching mechanism for metadata.
0.15 2016-03-17
- Modularized metadata handling.
- Simplified metadata handling.
- Added --meta option to script.
- Removed deprecated --human option from script.
0.14 2016-03-15
- Renamed ::Index to ::Annotate and ::Field to ::Index.
- Renamed 'allow' to 'anno' as parameters of the script.
- Added readme.
0.13 2016-03-10
- Removed korapxml2krill_dir.
- Renamed dependency nodes.
- Made dependency relations more effective (trimmed down TUIs)
! This is currently very slow !
0.12 2016-02-28
- Added extract method to korapxml2krill.
- Fixed Mate/Dependency.
- Fixed skip flag in korapxml2krill.
- Ignore spans outside the token range (i.e. character offsets end before tokens have started).
0.11 2016-02-23
- Merged korapxml2krill and korapxml2krill_dir.
0.10 2016-02-15
- Added EXPERIMENTAL support for parallel jobs.
0.09 2016-02-15
- Fixed temporary directory handling in scripts.
- Improved skipping for archive handling in scripts.
0.08 2016-02-14
- Added support for archive streaming.
- Improved scripts.
0.07 2016-02-13
- Improved support for Schreibgebrauch meta data (IDS flavour).
0.06 2016-02-11
- Improved support for Schreibgebrauch meta data (Duden flavour).
0.05 2016-02-04
- Changed KorAP::Document to KorAP::XML::Krill.
- Renamed "Schreibgebrauch" to "Sgbr".
- Preparation for GitHub release.
0.04 2016-01-28
- Added PTI to all payloads.
- Added support for empty elements.
- Added support for element attributes in struct.
- Added meta data support for Schreibgebrauch.
- Fixed test suite for meta data.
0.03 2014-11-03
- Added new metadata scheme.
- Fixed a minor bug in the constituency tree building.
- Sorted terms in tokens a priori.
0.02 2014-07-21
- Sentence annotations for all providing foundries
- Starting subtokenization
0.01 2014-04-15
- [bugfix] for first token annotations
- Sentences are now available from all foundries that have them
- <>:p is now <>:base/para
- Added <>:base/text