commit | 854a11503d3c0e9d22f322c26572d2e032192fc0 | |
---|---|---|
author | Peter Harders <harders@ids-mannheim.de> | Wed Jul 22 22:48:02 2020 +0200 |
committer | Peter Harders <harders@ids-mannheim.de> | Fri Jul 24 20:24:20 2020 +0200 |
tree | 39ea0c4db5401d1097ec1a12ed33d27d217376df | |
parent | 1d65f9467ab04537821c0d6efd565c49ac3649fb | |
Bugfixes in Conservative.pm

1. Identified a wrong tokenization caused by a wrong pattern match ($3); wrote a test in t/tokenization.t that demonstrates the wrong tokenization.
2. Removed the wrong pattern match ($3) and adjusted the test in t/tokenization.t.
3. Cleaned up the code (also changed some comments).
4. Fixed the missing tokenization of the first punctuation character.
5. Replaced [^A-Za-z0-9] with [\p{Punct}\s] (TODO: no appropriate test found yet).

Change-Id: Ib494c79c3e6971a57ad874fc62583c625095cf28
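
The practical effect of swapping the character class in item 5 can be illustrated with a small standalone Perl sketch. It does not reproduce the code in Conservative.pm: the helper names `matches_old` and `matches_new` and the sample characters are hypothetical, and only the two character classes are taken from the commit message.

```perl
use strict;
use warnings;
use utf8;    # this file contains literal non-ASCII characters

# Old class from the commit message: anything that is not an ASCII letter or digit.
# (Hypothetical helper, not part of Conservative.pm.)
sub matches_old {
    my ($ch) = @_;
    return $ch =~ /[^A-Za-z0-9]/ ? 1 : 0;
}

# New class from the commit message: Unicode punctuation or whitespace.
# (Hypothetical helper, not part of Conservative.pm.)
sub matches_new {
    my ($ch) = @_;
    return $ch =~ /[\p{Punct}\s]/ ? 1 : 0;
}

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical sample characters, not taken from t/tokenization.t.
for my $ch ('ä', 'ß', '.', ',', ' ', 'a', '7') {
    printf "'%s'  old=%d  new=%d\n", $ch, matches_old($ch), matches_new($ch);
}
```

In this sketch the non-ASCII letters `ä` and `ß` match the old class but not the new one, while punctuation and whitespace match both. A test for item 5 could plausibly be built around an input of that kind, although whether it triggers a visible difference depends on how the class is used in the tokenizer.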