Tool Version Sen Tok Model Tokens/ms effi1x - 100 runs Tokens/ms effi10x - 100 runs
KorAP-Tokenizer 72.90 199.28
Datok x x datok 614.72 2304.13
 x x matok 1041.63 2798.78
BlingFire 0.1.8 x wbd.bin 431.92 1697.73
 x sbd.bin 417.10 1908.87
Cutter 2.5 x x 0.38
JTok 2.1.19 31.19 117.22
OpenNLP x Simple 290.71 1330.23
 x Tokenizer 74.65 145.08
 x SentenceD 247.84 853.01
SoMaJo x x P=1 8.15 8.41
 x x P=8 27.32 39.91
SpaCy x Tokenizer 19.73 44.40
 x Sentencizer 16.94
 x Statistical 4.90
 x Dependency 2.24
Stanford x 75.47 156.24
 x x T,S,M 46.95 91.56
Syntok x segmenter 59.66 61.07
 x tokenizer 103.90 108.40
Waste 2.0.20-1 x x 141.07 144.95
Elephant x 8.57 8.68
TreeTagger x 69.92 72.98
Deep-EOS x bi-lstm-de 0.25
 x cnn-de 0.27
 x lstm-de 0.29
NNsplit x 0.90