korap.ids-mannheim.de / KorAP / Datok / 8e803936b0e70ac53e86c48614f149e2c101a425 / testdata / tokenizer_de.matok
0139bc5 · Introduce the english model as being on the same level as german · by Akron · 1 year, 6 months ago · [Renamed from testdata/tokenizer.matok]
6c92763 · New build · by Akron · 2 years ago
b98e4cf · Improve Emoticons · by Akron · 2 years, 11 months ago · v0.1.5
b428755 · Support punctuation after quotes · by Akron · 2 years, 11 months ago · v0.1.4
4222ac8 · Improve handling of ellipsis · by Akron · 3 years ago
e200841 · Further improve speech rule for eos with more quotation marks · by Akron · 3 years ago
e96895f · Improve handling of sentence splits including speech · by Akron · 3 years ago
4ec8cec · Prepare first official release · by Akron · 3 years, 3 months ago · v0.1.0
fac8abc · Reorder longest match operator and update models · by Akron · 3 years, 3 months ago
17984c8 · Improving time parsing · by Akron · 3 years, 4 months ago
f6bdfdb · Add trimming at the beginning of a text · by Akron · 3 years, 4 months ago
a854faa · Introduce EOT (end-of-transmission) marker · by Akron · 3 years, 4 months ago
094a4e8 · Use serialized matrix representation in test suite · by Akron · 3 years, 5 months ago