| commit | 6e73bdd1b93183bd6bfefddd365510833fc26e51 | |
|---|---|---|
| author | Marc Kupietz <kupietz@ids-mannheim.de> | Sun Nov 02 19:58:32 2025 +0100 |
| committer | Marc Kupietz <kupietz@ids-mannheim.de> | Sun Nov 02 19:58:32 2025 +0100 |
| tree | b8ff1e42a3b34318a9d112f72e7c736945b3247d | |
| parent | 3b6edcd90143dbb6dba2cb19c75c0ae8256a49df | |
Print indexing progress
make -j $(nproc) test
make -j $(nproc) test index
INDEX=./target/dnb.index docker compose -p korap4dnb --profile=lite -f korap4dnb-compose.yml up -d
xdg-open 'http://localhost:4000/?q=Test'
docker compose -p korap4dnb down
Install the prerequisite korap/conllu2treetagger and korap/conllu2spacy Docker images if they are not already present:
docker image inspect korap/conllu2treetagger:latest || curl -Ls 'https://gitlab.ids-mannheim.de/KorAP/CoNLL-U-Treetagger/-/jobs/artifacts/master/raw/conllu2treetagger.xz?job=build-docker-image' | docker load
docker image inspect korap/conllu2spacy:latest || curl -Ls https://corpora.ids-mannheim.de/tools/conllu2spacy.tar.xz | docker load
Make annotations for dnb20:
make -j $(nproc) target/dnb20.marmot-malt.zip target/dnb20.spacy.zip target/dnb20.tree_tagger.zip
make -j $(( $(nproc) / 2 )) index
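The job count in the command above evaluates to 0 on a single-core machine, and GNU make expects a positive integer after -j. A minimal sketch of a guard, assuming a POSIX shell; the helper name half_cores is hypothetical and not part of the Makefile:

```shell
# Hypothetical helper (not part of this repository): print half the number
# of CPU cores, but never less than 1.
# An explicit core count can be passed as $1; otherwise nproc is queried.
half_cores() {
  cores="${1:-$(nproc)}"
  n=$(( cores / 2 ))
  if [ "$n" -lt 1 ]; then
    n=1
  fi
  echo "$n"
}
```

With the function defined, the index build could be started as make -j "$(half_cores)" index.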
By default, all directories in ./DeLiKo@DNB are used as source directories. Note that, due to a bug in the Makefile, the EPUB files must be at a nesting depth of exactly 2. You can check which files will be converted by running ls DeLiKo@DNB/*/*.epub.
The new index will be built as target/dnb.index.
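Given the depth restriction, EPUB files placed directly in the source directory or nested more deeply would presumably be skipped. A small sketch for spotting such files, assuming a find that supports -mindepth and -maxdepth (GNU and BSD find do); the function name misplaced_epubs is hypothetical:

```shell
# Hypothetical helper (not part of this repository): list EPUB files under a
# source directory that are NOT at nesting depth 2, i.e. files that the glob
# SRC_DIR/*/*.epub would not match.
misplaced_epubs() {
  src_dir="${1:-./DeLiKo@DNB}"
  # Too shallow: EPUB files directly inside the source directory (depth 1).
  find "$src_dir" -mindepth 1 -maxdepth 1 -type f -name '*.epub'
  # Too deep: EPUB files at depth 3 or below.
  find "$src_dir" -mindepth 3 -type f -name '*.epub'
}
```

Running misplaced_epubs ./DeLiKo@DNB prints one path per misplaced file; empty output means every EPUB file found sits at depth 2.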
make clean && time make -j $(( $(nproc) / 2 )) index SRC_DIR=./Buchpreis
The index will be in target/dnb.index.
Then start the Docker containers:
INDEX=./target/dnb.index docker compose -p korap4dnb --profile=lite -f korap4dnb-compose.yml up -d
docker compose -p korap4dnb down
docker compose -p korap4dnb --profile=lite restart
2024-05-26
2024-05-08
idno elements with all IDs given by the DNB SRU API
2024-04-19
2024-04-15
2024-04-10
make YY=22 to select 2022
2024-03-24
2024-03-18
make deploy to install the new index and restart the local KorAP@DNB instance (also available as ci target)
show-server-logs and show-server-status make targets to monitor the local KorAP@DNB instance
2024-03-17
make all to build all targets, including the index
2024-03-16
2024-03-15: DNB test data added
2024-03-08: example EPub and I5 added from DeReKo KJL corpus: Christiane F. ; Kai Hermann ; Horst Rieck: Wir Kinder vom Bahnhof Zoo in the folder test/resources/ – do not distribute (copyrighted data)