| commit | 854a11503d3c0e9d22f322c26572d2e032192fc0 | |
|---|---|---|
| author | Peter Harders <harders@ids-mannheim.de> | Wed Jul 22 22:48:02 2020 +0200 |
| committer | Peter Harders <harders@ids-mannheim.de> | Fri Jul 24 20:24:20 2020 +0200 |
| tree | 39ea0c4db5401d1097ec1a12ed33d27d217376df | |
| parent | 1d65f9467ab04537821c0d6efd565c49ac3649fb | |
Fix tokenization bugs in Conservative.pm
1. Identified incorrect tokenization caused by a wrong pattern match ($3)
(added a test in t/tokenization.t that demonstrates the wrong tokenization)
2. Removed the wrong pattern match ($3) and adjusted the test in t/tokenization.t
3. Cleaned up the code (also changed some comments)
4. Fixed missing tokenization of the first punctuation character
5. Replaced [^A-Za-z0-9] with [\p{Punct}\s]
(TODO: no appropriate test found yet)
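To illustrate item 5, here is a minimal Perl sketch of how the two character classes differ. The helper `class_matches` is hypothetical and not taken from Conservative.pm; it only reports which of the two classes matches a single character, and might serve as a starting point for the missing test.

```perl
use strict;
use warnings;
use utf8;    # the source below contains a literal non-ASCII character

# Hypothetical helper (an assumption, not part of Conservative.pm):
# returns a hash ref telling whether the old and the new character
# class from item 5 match the given single character.
sub class_matches {
    my ($char) = @_;
    return {
        old => ( $char =~ /[^A-Za-z0-9]/   ? 1 : 0 ),    # anything that is not an ASCII letter or digit
        new => ( $char =~ /[\p{Punct}\s]/ ? 1 : 0 ),    # Unicode punctuation or whitespace only
    };
}

binmode STDOUT, ':encoding(UTF-8)';
for my $char ( 'a', '.', ' ', '+', 'ü' ) {
    my $result = class_matches($char);
    printf "'%s' old=%d new=%d\n", $char, $result->{old}, $result->{new};
}
```

Under Perl's Unicode rules, `ü` and `+` match the old class but not the new one, while `.` and a space match both. The old class therefore treats non-ASCII letters as non-word characters, which the new class does not.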
Change-Id: Ib494c79c3e6971a57ad874fc62583c625095cf28