
EPub to KorAP (via TEI I5) conversion

Run

To generate the I5 corpus

make target/dnb.i5.xml

To generate the KorAP-XML ZIP

Prerequisite: KorAP-XML-CoNLL-U

make target/dnb.zip

To generate the annotations

make target/dnb.spacy.zip target/dnb.tree_tagger.zip
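
The three steps above can be chained into one call. The following is only a sketch: the function name `run_pipeline` and the `MAKE` override are illustrative assumptions and not part of this repository. It assumes it is run from the repository root and that KorAP-XML-CoNLL-U is installed for the ZIP step.

```shell
# Hypothetical helper (not part of this repository): runs the documented
# make targets in order and stops at the first failing step.
run_pipeline() {
  # MAKE is an assumed override, e.g. MAKE=echo to print the targets only.
  make_cmd="${MAKE:-make}"

  # 1. EPub -> TEI I5 corpus
  $make_cmd target/dnb.i5.xml &&
  # 2. TEI I5 -> KorAP-XML ZIP (needs KorAP-XML-CoNLL-U)
  $make_cmd target/dnb.zip &&
  # 3. spaCy and TreeTagger annotation ZIPs
  $make_cmd target/dnb.spacy.zip target/dnb.tree_tagger.zip
}
```

Calling `run_pipeline` builds everything. `MAKE=echo run_pipeline` only lists the targets each step would request, without building anything.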

News

  • 2024-03-16

    • CI/CD pipeline added
    • first working pipeline for EPub ⮕ TEI I5 ⮕ KorAP-XML ⮕ (UDPipe+TreeTagger+spaCy) ⮕ Krill ⮕ KorAP-JSON
  • 2024-03-15: DNB test data added

  • 2024-03-08: example EPub and I5 from the DeReKo KJL corpus added to the folder test/resources/ (Christiane F.; Kai Hermann; Horst Rieck: Wir Kinder vom Bahnhof Zoo). Do not distribute (copyrighted data).