korap.ids-mannheim.de / KorAP / KorAP-XML-Krill / 94262cef2bc4a40aa372d4a46a6e7de6583379a1 / script / korapxml2krill
94262ce
Renamed Institute for the German Language to Leibniz Institute for the German Language
by Akron
· 6 years ago
955b75b
Remove extract_text and extract_doc in favor of extract_sigle
by Akron
· 6 years ago
31a08cb
Add extract_sigle method to archive
by Akron
· 6 years ago
63d03ee
Ignore temporary-extraction on directory archiving
by Akron
· 6 years ago
263274c
Support koral versioning
by Akron
· 6 years ago
ed9baf0
Support non-word-tokens (fixes #5)
by Akron
· 6 years ago
6eff23b
Updated minimum perl
by Akron
· 6 years ago
ea1aed5
Activate HNC by default
by Akron
· 6 years ago
5fdc7e1
Fixed last change info in --version
by Akron
· 6 years ago
f73ffb6
Fixed readme by mentioning preference regarding configuration parameters
by Akron
· 7 years ago
4c67919
Support for LWC dependency annotations
by Akron
· 7 years ago
3c56f50
Support file extensions in base tokenization file
by Akron
· 7 years ago
28c4e54
Fix missing command issue
by Akron
· 7 years ago
d5643ad
Warn on missing output parameter in extract
by Akron
· 7 years ago
9a062ce
Fix tarring to include only filenames
by Akron
· 7 years ago
3abc03e
Fixed exit codes in script
by Akron
· 7 years ago
ce125b6
Improved documentation on new features
by Akron
· 8 years ago
da3097e
Finished tar flag
by Akron
· 8 years ago
486f9ab
Improved tar support
by Akron
· 8 years ago
081639e
Added preliminary tar support
by Akron
· 8 years ago
9ec8887
Introduced sequential extraction flag to circumvent troubles with parallel extraction
by Akron
· 8 years ago
bd3adda
Fixing behaviour for existing output directories
by Akron
· 8 years ago
63f20d4
Support serial conversion and input-base
by Akron
· 8 years ago
8150010
Introduced temporary extraction
by Akron
· 8 years ago
636aa11
Added configuration to script
by Akron
· 8 years ago
821db3d
Add wildcard support for inputs
by Akron
· 8 years ago
c11f798
Add auto-core-calculation
by Akron
· 8 years ago
3bd942f
Added marmot-support
by Akron
· 8 years ago
60a8caa
Treat prefixes correctly for text sigles
by Akron
· 8 years ago
636bd9c
Fixed pagebreak treatment in script
by Akron
· 8 years ago
41ac10b
Added pagebreak annotations (with '~'-prefix)
by Akron
· 8 years ago
4fa37c3
Added DRuKoLa support to korapxml2krill script
by Akron
· 8 years ago
3ec0a1c
Updated to Mojolicious 7.20
by Akron
· 8 years ago
3741f8b
Added base-sentences and base-paragraphs options
by Akron
· 8 years ago
f1a1de9
Improved readme
by Akron
· 8 years ago
13d5662
Improved 'already processed' message
by Akron
· 8 years ago
2812ba2
Fixed archive handling and support multiple jobs for extraction
by Akron
· 8 years ago
2fd402b
Added support for wildcards in document sigles
by Akron
· 8 years ago
a76d835
Improved documentation (thx @margaretha)
by Akron
· 8 years ago
b4bbec7
Fixed naming scheme for folder archives
by Akron
· 8 years ago
2080758
Added extraction method for documents in archives
by Akron
· 8 years ago
7606afa
Improved documentation to be more precise regarding non-argument calls (thx @margaretha)
by Akron
· 8 years ago
a93d51b
Improved Readme
by Akron
· 8 years ago
0e48977
Fixed windows support
by Nils Diewald
· 8 years ago
bdf434a
Added note on optimization
by Akron
· 8 years ago
4c0cf31
Fixed treatment of temporary files
by Akron
· 8 years ago
7438151
Added minimum requirements
by Akron
· 8 years ago
af38698
Fixed default tokenization
by Akron
· 8 years ago
3ec4897
Added archive test for directories and parallel processing
by Akron
· 8 years ago
7d4cdd8
Added archive test script
by Akron
· 8 years ago
651cb8d
Fix extraction of multiple archives
by Akron
· 8 years ago
03b24db
Added test for sigles support in extract
by Akron
· 8 years ago
e2b902d
Fixed output of version and help screens
by Akron
· 8 years ago
5f51d42
Fixed annotation bug in script
by Akron
· 8 years ago
e1dbc38
Added test for script calls
by Akron
· 8 years ago
cdf0e00
Added batch processing class for documents
by Akron
· 8 years ago
405f0c5
Test file processing for batch processing
by Akron
· 8 years ago
8b99052
Start splitting script file for better testing
by Akron
· 8 years ago
b0c88db
Added caching test
by Akron
· 8 years ago
0c3e375
Test multiple archives
by Akron
· 9 years ago
6255760
Fixed overwrite flag
by Akron
· 9 years ago
f3f0c94
Added malt dependency resource
by Akron
· 9 years ago
2cfe809
Added prefix negation to multiple archive support
by Akron
· 9 years ago
29866ac
Fixed archive attachments
by Akron
· 9 years ago
08385f6
First step to multi-archive support
by Akron
· 9 years ago
e8adfcc
Optimize performance of text listing
by Akron
· 9 years ago
11c8030
Add metadata caching
by Akron
· 9 years ago
35db6e3
Simplified and modularized metadata processing
by Akron
· 9 years ago
f7ad89e
Improved readme
by Akron
· 9 years ago
c13a170
Removed BRZ and added Readme
by Akron
· 9 years ago
75ba57d
TUIs are now optional if not set
by Akron
· 9 years ago
ee13019
Minor information adjustment in script
by Akron
· 9 years ago
e10ad32
Added 'extract' method support
by Akron
· 9 years ago
941c1a6
Merged executables
by Akron
· 9 years ago
c1babed
Fixed tempdir issue in script
by Akron
· 9 years ago
150b29e
Added archive support to korapxml2krill_dir
by Akron
· 9 years ago
069bd71
Fixed skip and overwrite flags in scripts
by Akron
· 9 years ago
93d620e
Update scripts and sgbr test suite
by Akron
· 9 years ago
[Renamed (91%) from script/prepare_index.pl]
14ca9f0
Introduced dependency relations for MATE
by Akron
· 9 years ago
93a01db
Added overwrite protection
by Nils Diewald
· 10 years ago
59094f2
Added overwrite protection
by Nils Diewald
· 10 years ago
02d100e
Update indexer script
by Nils Diewald
· 10 years ago
32e30f0
Fix script for new index (including new foundries)
by Nils Diewald
· 10 years ago
840c924
Meta data update for corpus indexing
by Nils Diewald
· 10 years ago
7b84722
Added text marker, added sentences from multiple foundries, changed paragraphs to base/para some tests, some bugfixes
by Nils Diewald
· 11 years ago
3ece630
Fixed tiny offset issue for documents ending with non-tokens
by Nils Diewald
· 11 years ago
092178e
Fix dealing with no-span layers|Improve error messages for bughunting
by Nils Diewald
· 11 years ago
37e5b57
changed sentence foundry, foundry selector and textClass serialization
by Nils Diewald
· 11 years ago
7364d1f
Indexation script finished
by Nils Diewald
· 11 years ago
2db9ad0
Lucene field indexer written in perl
by Nils Diewald
· 11 years ago