| commit | a5538653b534161b612f6205a4532138e506ee26 | |
|---|---|---|
| author | Marc Kupietz <kupietz@ids-mannheim.de> | Sun Apr 21 15:49:30 2024 +0200 |
| committer | Marc Kupietz <kupietz@ids-mannheim.de> | Sun Apr 21 15:51:11 2024 +0200 |
| tree | 5987ab9ec0770d99fc898c52d4f2be7da38298d0 | |
| parent | 09745e10923d0aa7e904edf5b6adb5f2110b200c | |
Add topic domain classification in XSLT pass2

Generated with mallet based on the old training data in /vol/work/TE via calling a Java function from XSLT.

Resolves #6
make -j $(nproc) target/dnb18.i5.xml SRC_DIR=test/resources/DNB YEARS=18
make -j $(nproc) i5 SRC_DIR=test/resources/DNB
Prerequisite: KorAP-XML-CoNLL-U
make -j $(nproc) target/dnb18.zip SRC_DIR=test/resources/DNB YEARS=18
Adjust the following line in your korap4dnb-compose.yml to point to your index (by default it is in target/dnb.index, but it is better to copy it to a safe place):
- "${PWD}/target/dnb.index:/kustvakt/index:z"
and start the Docker containers:
docker compose -p korap4dnb --profile=lite -f korap4dnb-compose.yml up -d
docker compose -p korap4dnb down
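The advice above to keep the index outside target/ could be scripted roughly as follows. This is only a sketch: stash_index is a hypothetical helper that is not part of the repository, the destination is whatever location you choose, and it assumes the index is a directory. On success it prints the volume line to put into korap4dnb-compose.yml.

```shell
# Hypothetical helper (not part of the repository): copy the index directory
# to a safer location and print the matching compose volume line.
stash_index() {
  src=$1   # index built by make, e.g. target/dnb.index
  dest=$2  # destination chosen by you; must not exist yet
  if [ ! -d "$src" ]; then
    echo "no index directory at $src" >&2
    return 1
  fi
  if [ -e "$dest" ]; then
    echo "$dest already exists, refusing to overwrite" >&2
    return 1
  fi
  mkdir -p "$(dirname "$dest")" || return 1
  cp -Rp "$src" "$dest" || return 1
  printf '%s\n' "- \"$dest:/kustvakt/index:z\""
}

# Example call (the destination path is an assumption):
# stash_index target/dnb.index "$HOME/korap/dnb.index"
```

Refusing to overwrite an existing destination keeps a previously saved index from being silently merged with a new one.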
Install prerequisite korap/conllu2treetagger and korap/conllu2spacy docker images if not present:
docker image inspect korap/conllu2treetagger:latest || curl -Ls 'https://gitlab.ids-mannheim.de/KorAP/CoNLL-U-Treetagger/-/jobs/artifacts/master/raw/conllu2treetagger.xz?job=build-docker-image' | docker load
docker image inspect korap/conllu2spacy:latest || curl -Ls https://corpora.ids-mannheim.de/tools/conllu2spacy.tar.xz | docker load
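The check-then-load idiom above can be factored into a small helper. The following is a minimal sketch, not part of the repository: the function name ensure_image is hypothetical, and it assumes docker and curl are on the PATH. Unlike the one-liners, it suppresses the output of docker image inspect and reports when an image is already present.

```shell
# Hypothetical helper: load IMAGE from URL unless it is already present locally.
ensure_image() {
  image=$1
  url=$2
  if docker image inspect "$image" >/dev/null 2>&1; then
    echo "$image already present"
  else
    # Download the image archive and pipe it straight into docker load.
    curl -Ls "$url" | docker load
  fi
}

# Example calls, using the image names and URLs given above:
# ensure_image korap/conllu2treetagger:latest 'https://gitlab.ids-mannheim.de/KorAP/CoNLL-U-Treetagger/-/jobs/artifacts/master/raw/conllu2treetagger.xz?job=build-docker-image'
# ensure_image korap/conllu2spacy:latest https://corpora.ids-mannheim.de/tools/conllu2spacy.tar.xz
```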
Make annotations for dnb20:
make -j $(nproc) target/dnb20.marmot-malt.zip target/dnb20.spacy.zip target/dnb20.tree_tagger.zip
Build everything for KorAP, up to the deployable index:
make -j $(nproc) all
2024-04-19
2024-04-15
2024-04-10
make YY=22 to select 2022
2024-03-24
2024-03-18
make deploy to install new index and restart local KorAP@DNB instance (also available as ci target)
show-server-logs and show-server-status make targets to monitor the local KorAP@DNB instance
2024-03-17
make all to build all targets, including the index
2024-03-16
2024-03-15: DNB test data added
2024-03-08: example EPub and I5 added from DeReKo KJL corpus: Christiane F. ; Kai Hermann ; Horst Rieck: Wir Kinder vom Bahnhof Zoo in the folder test/resources/ – do not distribute (copyrighted data)