Changelog
[2.1.0]
- added script 'GeneratePseudonymKey.groovy' to compute pseudonyms
- added script 'Pseudonymize.groovy' to pseudonymize tokens (and lemmas)
[2.0] - 2021-10-07
- for '.*\\.(freq|tsv)(\\.gz)?' input files, automatically cumulate frequencies
- -N option added to sort keys with same frequency numerically
- --pad option added to optionally add padding symbols at text edges
- jar is now called totalngrams-2.0.jar
- support xz compression for input and output (single-threaded and slow)
- let number of folds default to 1 (-F option)