| commit | 1a42266168a011af3e8c592221c4c9f8118700d0 | [log] [tgz] |
|---|---|---|
| author | Marc Kupietz <kupietz@ids-mannheim.de> | Sat Mar 16 09:34:10 2024 +0100 |
| committer | Marc Kupietz <kupietz@ids-mannheim.de> | Sat Mar 16 09:34:10 2024 +0100 |
| tree | 920b02f23fcc68246d301a036ce41aa0fb4fe7ad | |
| parent | 7747c11f1152645b938c487f38e16b6b91b12d4f | [diff] |
Add first working conversion pipeline
make target/dnb.i5.xml
Prerequisite: KorAP-XML-CoNLL-U
make target/dnb.zip
make target/dnb.spacy.zip target/dnb.tree_tagger.zip
2024-03-16: first working pipeline for EPub ⮕ TEI I5 ⮕ KorAP-XML ⮕ (UDPipe+TreeTagger+spaCy) ⮕ Krill ⮕ KorAP-JSON
2024-03-15: DNB test data added
2024-03-08: example EPub and I5 from the DeReKo KJL corpus added in the folder test/resources/ (Christiane F.; Kai Hermann; Horst Rieck: Wir Kinder vom Bahnhof Zoo) – do not distribute (copyrighted data)