commit | b040897b226b57cc84a628e884807f2cfb437f51 | |
---|---|---|
author | Akron <nils@diewald-online.de> | Mon Mar 07 11:36:17 2022 +0100 |
committer | Akron <nils@diewald-online.de> | Mon Mar 07 11:36:17 2022 +0100 |
tree | 9be858c68006e3ec029bebbc4085d972997c8d36 | |
parent | c261642121b09f09e62bf913951371597b244638 | |
Cleanup output of stanford tokenizer+sentencesplitter

Change-Id: I4d620d319b0546aef21a0f7070c4ab5c5356d646
To build the Docker image, run
$ docker build -f Dockerfile -t korap/euralex22 .
This builds an image of approximately 6 GB on your system.
During the build, the following tokenizers are downloaded and installed into the image:
...
To run the evaluation suite ...
...
To run the benchmark, call
$ docker run --rm -it \
    -v ${PWD}/benchmarks:/euralex/benchmarks \
    -v ${PWD}/corpus:/euralex/corpus \
    korap/euralex22 benchmarks/benchmark.pl
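The benchmark command mounts the `benchmarks` and `corpus` directories from the current working directory, so it should be called from a directory that contains both. Docker's `-v` option usually creates a missing host directory instead of failing, which can hide a wrong working directory. The following is a minimal sketch of a pre-flight check; `check_mounts` is a hypothetical helper and not part of this repository.

```shell
# Hypothetical helper (an assumption, not shipped with the repository):
# verify that the two directories mounted by the benchmark command
# exist below a given root (defaults to the current directory).
check_mounts() {
  root="${1:-$PWD}"
  for dir in benchmarks corpus; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing directory: $root/$dir" >&2
      return 1
    fi
  done
  echo "ok: $root"
}

# Possible usage: only start the benchmark when both directories exist.
#   check_mounts && docker run --rm -it ... korap/euralex22 benchmarks/benchmark.pl
```

The check only inspects the host side of the mounts; it makes no assumption about what the container expects to find inside them.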
For TreeTagger: Please read the license terms before you download the software. By downloading the software, you agree to the terms stated there.
When running this benchmark using Docker, you may need to run all processes in privileged mode to get meaningful results.
docker run --privileged -v