commit | 1d65f9467ab04537821c0d6efd565c49ac3649fb | |
---|---|---|
author | Peter Harders <harders@ids-mannheim.de> | Wed Jul 22 23:31:00 2020 +0200 |
committer | Peter Harders <harders@ids-mannheim.de> | Fri Jul 24 20:23:35 2020 +0200 |
tree | 3e69613385c58635bd43f1cab4e0506f1e7a2d21 | |
parent | 5fb5e8d0fe8f3b16277a77a68b732dd42a80657b | |

Extend tests for wikipedia.txt in t/tokenization.t

- Extended testing for wikipedia.txt so that UTF-8 characters are read.
- Fixed a bug related to UTF-8.
- TODO: testing is very slow after the bugfix.

Change-Id: I7d63e1b87c10bab85789098b3b7ce63f359dc49e