Update Readme
Change-Id: I4540324c978e64bacf300be4c552ac3b4fbd4444
diff --git a/Readme.md b/Readme.md
index bf80b22..b8f5339 100644
--- a/Readme.md
+++ b/Readme.md
@@ -1,5 +1,7 @@
# conllu-gender
+
+
Reads CoNLL-U format from stdin and annotates German gender-sensitive personal nouns, gendered determiners/pronouns, and neo-pronouns with correct **POS** (UPOS and XPOS/STTS), **lemma**, and **morphological features**. Writes CoNLL-U format to stdout.
Existing annotations for matched tokens are **replaced**; all other tokens pass through unchanged.
@@ -110,6 +112,7 @@
### Known limitations
+- The current version is fully recall-oriented for the covered phenomena and uses no contextual disambiguation or machine learning. Precision is therefore expected to be very low, especially for determiner/pronoun annotation, and also when `<word>:in` forms are tokenized incorrectly.
- **Binnen-I with non-final capital** (e.g. `jedEn`, `jedEr`): these forms embed the capital letter at a non-final position; detection requires morphological analysis beyond simple pattern matching and is not currently supported.
- **Gendered adjectives** (e.g. `begeisterte*n`): not yet annotated (occur in ~5 % of gendered NP elements per Ochs 2026, §7.3.2).
- **Inflected case suffixes** on gendered nouns (e.g. genitive `Lehrers*in`, dative plural extra marking): rare and not detected.
@@ -119,13 +122,13 @@
```shell
# Annotate CoNLL-U input
-korapxml2conllu doc.zip | conllu-gender
+conllu-gender < ./test/data/gender.conllu
# Sparse output (only annotated tokens, with their sentence headers)
-korapxml2conllu doc.zip | conllu-gender -s
+conllu-gender -s < ./test/data/gender.conllu
-# Pipe with other KorAP annotation tools
-korapxml2conllu doc.zip | conllu-cmc | conllu-gender | conllu2korapxml > doc.annotated.zip
+# Create annotations for a KorAP XML ZIP archive
+korapxmltool -A "conllu-gender -s" -t zip wdd24.zip
```
### Options