Improve Readme
diff --git a/Readme.md b/Readme.md
index ce710d9..c2f967e 100644
--- a/Readme.md
+++ b/Readme.md
@@ -1,4 +1,4 @@
-# Datok - Matrix or Double Array FSA based Tokenizer
+# Datok - Finite State Tokenizer
This is an implementation of an FSA for natural language
tokenization, either in form of a matrix representation
@@ -6,7 +6,8 @@
The system accepts a finite state transducer (FST)
describing a tokenizer generated by
[Foma](https://fomafst.github.io/)
-that needs to follow some rules as described below.
+that needs to follow some conventions as described
+below.
# Conventions
@@ -113,9 +114,12 @@
# Technology
The double array representation (Aoe 1989) of all transitions
-in the FST is
-implemented as an extended FSA following Mizobuchi et al. (2000)
-and implementation details following Kanda et al. (2018).
+in the FST is implemented as an extended DFA following Mizobuchi
+et al. (2000), with implementation details following Kanda et al. (2018).
+
+Both representations mark all non-word-character targets with a
+leading bit. The transduction is greedy with a single backtracking
+option to the last ε transition.
The german tokenizer shipped is based on work done by the
[Lucene project](https://github.com/apache/lucene-solr)