diff --git a/Readme.md b/Readme.md
index ee3a9c8..b5dfca6 100644
--- a/Readme.md
+++ b/Readme.md
@@ -5,7 +5,8 @@
 ![Introduction to Datok](https://raw.githubusercontent.com/KorAP/Datok/master/misc/introducing-datok.gif)
 
 Implementation of a finite state automaton for
-high-performance natural language tokenization, based on a finite state
+high-performance large-scale natural language tokenization,
+based on a finite state
 transducer generated with [Foma](https://fomafst.github.io/).
 
 The library contains precompiled tokenizer models for
@@ -13,6 +14,10 @@
 - [german](testdata/tokenizer_de.matok)
 - [english](testdata/tokenizer_en.matok)
 
+The focus of development is on the tokenization of
+[DeReKo](https://www.ids-mannheim.de/digspra/kl/projekte/korpora),
+the German Reference Corpus.
+
 ## Performance
 
 ![Speed comparison of german tokenizers](https://raw.githubusercontent.com/KorAP/Datok/master/misc/benchmarks.svg)
diff --git a/testdata/tokenizer_en.fst b/testdata/tokenizer_en.fst
index 011934a..ee312d2 100644
--- a/testdata/tokenizer_en.fst
+++ b/testdata/tokenizer_en.fst
Binary files differ
