commit    899ba2a5f2cd9d1a3e5c79a5a64ad3649541a9f3
author    Marc Kupietz <kupietz@ids-mannheim.de>  Thu Mar 17 21:50:22 2022 +0100
committer Marc Kupietz <kupietz@ids-mannheim.de>  Thu Mar 24 18:40:57 2022 +0100
tree      1ce8d8c0764ebbcd371082f1f4a8253ae04da0b1
parent    049e52606bcd1fb192789d49110607042a55e2f8
Add R scripts to plot performance charts

Change-Id: I20d33a47722f515897c79e9e58c49f6391e48eac
To build the Docker image, run
$ docker build -f Dockerfile -t korap/euralex22 .
This will create an image of approximately 6 GB on your system, downloading and installing the following tokenizers into it:
...
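As a quick sanity check after the build (standard Docker CLI, not specific to this project), you can verify that the image exists locally:

$ docker images korap/euralex22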
To run the evaluation suite ...
...
To run the benchmark, call
$ docker run --rm -i \
    -v ${PWD}/benchmarks:/euralex/benchmarks \
    -v ${PWD}/corpus:/euralex/corpus \
    korap/euralex22 benchmarks/[BENCHMARK-SCRIPT]
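For instance, to start the performance benchmark with the benchmark.pl script from the list below:

$ docker run --rm -i \
    -v ${PWD}/benchmarks:/euralex/benchmarks \
    -v ${PWD}/corpus:/euralex/corpus \
    korap/euralex22 benchmarks/benchmark.pl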
The supported benchmark scripts are:
benchmark.pl
Performance measurements of the tools. See the tools section below for remarks that need to be taken into account. Accepts two numerical parameters:
empirist.pl
To run the EmpiriST evaluation suite, you first need to download the EmpiriST gold standard corpus and tooling and extract them into the corpus directory:
$ wget https://sites.google.com/site/empirist2015/home/shared-task-data/empirist_gold_cmc.zip
$ unzip empirist_gold_cmc.zip -d corpus
$ wget https://sites.google.com/site/empirist2015/home/shared-task-data/empirist_gold_web.zip
$ unzip empirist_gold_web.zip -d corpus
Quality measurements based on EmpiriST 2015.
To investigate the output, start the benchmark with mounted output folders:

    -v ${PWD}/output_cmc:/euralex/empirist_cmc \
    -v ${PWD}/output_web:/euralex/empirist_web
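Combining the generic command above with these extra mounts, a full invocation might look as follows:

$ docker run --rm -i \
    -v ${PWD}/benchmarks:/euralex/benchmarks \
    -v ${PWD}/corpus:/euralex/corpus \
    -v ${PWD}/output_cmc:/euralex/empirist_cmc \
    -v ${PWD}/output_web:/euralex/empirist_web \
    korap/euralex22 benchmarks/empirist.pl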
ud_tokens.pl
To run the token evaluation suite against the Universal Dependencies corpus, first install the EmpiriST tooling as explained above and download the corpus:
$ wget https://github.com/UniversalDependencies/UD_German-GSD/raw/master/de_gsd-ud-train.conllu \
    -O corpus/de_gsd-ud-train.conllu
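Once the corpus file is in place, the suite can be started like the other benchmarks:

$ docker run --rm -i \
    -v ${PWD}/benchmarks:/euralex/benchmarks \
    -v ${PWD}/corpus:/euralex/corpus \
    korap/euralex22 benchmarks/ud_tokens.pl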
ud_sentences.pl
To run the sentence evaluation suite, first download the corpus as explained above.
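The invocation is analogous to the token suite:

$ docker run --rm -i \
    -v ${PWD}/benchmarks:/euralex/benchmarks \
    -v ${PWD}/corpus:/euralex/corpus \
    korap/euralex22 benchmarks/ud_sentences.pl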
All tools are run using pipelining, which introduces some overhead that needs to be taken into account.
For TreeTagger: please read the license terms before you download the software! By downloading it, you agree to the terms stated there.
When running this benchmark using Docker, you may need to run all processes privileged to get meaningful results, i.e. pass the --privileged flag to docker run.
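A complete privileged invocation might then look as follows, reusing the mounts and the benchmark.pl script from above:

$ docker run --privileged --rm -i \
    -v ${PWD}/benchmarks:/euralex/benchmarks \
    -v ${PWD}/corpus:/euralex/corpus \
    korap/euralex22 benchmarks/benchmark.pl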