Scripts for converting NKJP-XML to KorAP-XML.

NKJP2KorAP

Tools for converting NKJP-XML format to KorAP-XML

Installation

The test suite is based on XSpec. To install XSpec, please follow the Installation Guide. Make sure that either xspec.bat (Windows) or xspec.sh (Linux, macOS) is available on the command line, that is, on your PATH.

To run the test suite, execute the following command from the root of the repository:

$ xspec.sh test/nkjp2korap.xspec

Afterwards, the generated report is available in test/xspec/nkjp2korap-result.html.
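The check and the test run described above can be wrapped in a small script. The following is only a minimal sketch, not part of this package: the function names find_xspec and run_tests are hypothetical, and it assumes a POSIX shell started in the root of the repository.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with NKJP2KorAP): look for an XSpec
# launcher on PATH and run the test suite named in this README.

# Print the name of the first XSpec launcher found on PATH.
# Returns a non-zero status if neither launcher is available.
find_xspec() {
    for candidate in xspec.sh xspec.bat; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Run the test suite and print the report location given in this README.
run_tests() {
    launcher=$(find_xspec) || {
        echo "Neither xspec.sh nor xspec.bat was found on PATH" >&2
        return 1
    }
    "$launcher" test/nkjp2korap.xspec &&
        echo "Report: test/xspec/nkjp2korap-result.html"
}
```

After sourcing the script, calling run_tests starts the test suite. If no launcher is found, it prints a message to standard error and returns a non-zero status instead of failing with a less helpful "command not found" error.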

Example Usage

Development and License

Copyright (c) 2021, Leibniz Institute for the German Language, Mannheim, Germany

This package is developed as part of the KorAP Corpus Analysis Platform at the Leibniz Institute for the German Language (IDS).

It is published under the BSD-2 License.

Contributions

Contributions are very welcome!

Your contributions should ideally be committed via our Gerrit server to facilitate reviewing (see Gerrit Code Review - A Quick Introduction if you are not familiar with Gerrit). However, we are also happy to accept comments and pull requests via GitHub.

Please note that, unless you explicitly state otherwise, any contribution intentionally submitted for inclusion into this software shall – as this software itself – be under the BSD-2 License.
