commit | f06f7fa3ce2d4b9ebc9f5aebb3d0554a8589d23c | |
---|---|---|
author | Akron <nils@diewald-online.de> | Mon Mar 28 19:44:10 2022 +0200 |
committer | Akron <nils@diewald-online.de> | Mon Mar 28 19:44:10 2022 +0200 |
tree | 579b98bb843ce5d75ab0de7a30c994a8a9e359c1 | |
parent | 15d4a1e05404635e0d36d6b46515992f721e4af0 | |
Add description for Korapxml2krill
Change-Id: I600bbf7b409dfb3cfdd4a7053c1eadc2deac0f13
Install docker and docker compose.
To download, initialize, and run KorAP with a specific index directory (in this example, myindex in the local directory), run
$ INDEX=./myindex docker-compose up
This makes the frontend available at localhost:64543.
Depending on its format, the corpus data may have to be converted before it can be indexed. For a conversion from the TEI p5/i5 format, the required tools have already been installed by the command above.
In the following, we assume that an i5 file mycorpus.i5.xml is located in the local folder.
The command ...
$ docker run --rm -v ${PWD}:/data korap/kalamar tei2korapxml --input /data/mycorpus.i5.xml > mycorpus.zip
... will convert the i5 file into a KorAP-XML archive (mycorpus.zip) using tei2korapxml.
To convert the KorAP-XML archive into individual Krill JSON files in a second step, the following command ...
$ docker run --rm -u root \
    -v ${PWD}/:/kalamar/data/ korap/kalamar korapxml2krill archive \
    -z -i /kalamar/data/mycorpus.zip -o ./data/
... will use korapxml2krill.
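The two conversion steps can be chained in a small wrapper script. The following is only a sketch and not part of KorAP: the names CORPUS, DRY_RUN, run and run_to are hypothetical helpers introduced for this example, and the docker commands are the same as the ones shown above. By default the script only prints the commands it would run, so they can be checked before anything is executed.

```shell
#!/bin/sh
# Sketch only: chains the two conversion commands from this README.
# CORPUS, DRY_RUN, run and run_to are hypothetical names for this example.
set -eu

CORPUS="${CORPUS:-mycorpus}"  # expects ./mycorpus.i5.xml in the current directory
DRY_RUN="${DRY_RUN:-1}"       # 1 = only print the commands, 0 = run them

# Run a command, or print it when DRY_RUN=1.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Like run, but redirect standard output to the file given as first argument.
run_to() {
  out=$1
  shift
  if [ "$DRY_RUN" = 1 ]; then
    printf '%s > %s\n' "$*" "$out"
  else
    "$@" > "$out"
  fi
}

# Step 1: TEI i5 file -> KorAP-XML archive.
run_to "${CORPUS}.zip" \
  docker run --rm -v "${PWD}:/data" korap/kalamar \
  tei2korapxml --input "/data/${CORPUS}.i5.xml"

# Step 2: KorAP-XML archive -> individual Krill JSON files.
run docker run --rm -u root \
  -v "${PWD}/:/kalamar/data/" korap/kalamar korapxml2krill archive \
  -z -i "/kalamar/data/${CORPUS}.zip" -o ./data/
```

Saved under a name of your choice (for example convert.sh, also hypothetical), it could be called as `DRY_RUN=0 CORPUS=mycorpus sh convert.sh` to actually perform both steps.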