korap.ids-mannheim.de / KorAP / KorAP-XML-Krill / abb36900540f712ac8d7fc6090fc3c61bb085921 / lib / KorAP / XML
abb3690
Introduce support for Gingko
by Akron
· 3 years, 9 months ago
6882d7d
Define resources in Makefile
by Akron
· 4 years, 5 months ago
0ffbd52
Ignore negative level structures in DeReKo#structure.xml
by Akron
· 4 years, 5 months ago
b9c3381
Replaced Log4perl with Log::Any
by Akron
· 4 years, 9 months ago
650e20e
Quick fix for XIP::Dependency
by Akron
· 4 years, 8 months ago
56deacb
Fix RWK paragraph handling
by Akron
· 5 years ago
6aed056
Removed deprecated 'pretty' flag
by Akron
· 5 years ago
11daf96
Removed deprecated 'primary' flag
by Akron
· 5 years ago
41127e3
Move get_file() as a function to Krill.pm
by Akron
· 5 years ago
be67c85
Minor cleanups for annotation procedures
by Akron
· 5 years ago
6a4cb16
Added add_span() method to MultiTermToken
by Akron
· 5 years ago
47426f0
Optimize performance slightly by reducing calls to _offset()
by Akron
· 5 years ago
c4ec093
Move get_file_from_glob() as a function to Krill.pm and test it
by Akron
· 5 years ago
129e441
Remove MultiTerm->add() in favor of MultiTerm->add_by_term()
by Akron
· 5 years ago
47f07db
Replace deprecated MultiTerm add() method from Talismane parser
by Akron
· 5 years ago
d6bcf5d
Replace deprecated MultiTerm add() method from HNC parser
by Akron
· 5 years ago
926731f
Replace deprecated MultiTerm add() method from RWK parser
by Akron
· 5 years ago
0b2eed6
Replace deprecated MultiTerm add() method from Schreibgebrauch parser
by Akron
· 5 years ago
0157c2e
Replace deprecated MultiTerm add() method from MarMot parser
by Akron
· 5 years ago
9dda1cb
Replace deprecated MultiTerm add() method from LWC parser
by Akron
· 5 years ago
f42866d
Replace deprecated MultiTerm add() method from CMC parser
by Akron
· 5 years ago
6630859
Replace deprecated MultiTerm add() method from XIP parser
by Akron
· 5 years ago
c2a910b
Replace deprecated MultiTerm add() method from OpenNLP parser
by Akron
· 5 years ago
030de56
Replace deprecated MultiTerm add() method from MDParser
by Akron
· 5 years ago
0b6dce4
Replace deprecated MultiTerm add() method from Glemm parser
by Akron
· 5 years ago
b65e909
Replace deprecated MultiTerm add() method from Mate parser
by Akron
· 5 years ago
bdc3fe5
Replace deprecated MultiTerm add() method from Connexor parser
by Akron
· 5 years ago
65c5671
Add micro optimizations based on profiling in Tokenizer::Units
by Akron
· 5 years ago
740bb54
Add micro optimizations based on profiling for sorting and escaping MultiTerms
by Akron
· 5 years ago
39df7ce
Optimize annotations to use term based Multiterm construction
by Akron
· 5 years ago
4701d09
Optimize annotations to not use hash multiterms
by Akron
· 5 years ago
b555b60
Clean up primary data handling
by Akron
· 5 years ago
fa82f04
Minor improvements by introducing getters and setters instead of combinators in tokenizer
by Akron
· 5 years ago
72e671f
Minor improvements by introducing getters and setters instead of combinators
by Akron
· 5 years ago
e3e0536
Fixed bug in RWK that broke on certain KorAP-XML documents
by Akron
· 5 years ago
1cdbc9d
Improve DGD support
by Akron
· 5 years ago
07e2477
Include RWK annotations in script
by Akron
· 5 years ago
c403644
Improve RWK morphology parser to support multiple morphological key:value pairs
by Akron
· 5 years ago
85eb5aa
Improve RWK structure parser for *-milestone elements
by Akron
· 5 years ago
28299f4
Introduce special RWK structure parser
by Akron
· 5 years ago
8ff5879
Introduce special RWK morphology parser
by Akron
· 5 years ago
158bd50
Catch if files are not writable to output
by Akron
· 5 years ago
0a187b9
Fix I5 Meta documentation to not confuse dot-separated tag names in
by Akron
· 5 years ago
dec4312
Fixed gap behind last token and <base/s:t> length
by Akron
· 5 years ago
b62d92a
Fixed span position offset bug and fixed milestones behind last token bug
by Akron
· 5 years ago
a0d5af3
Fixed legacy XIP parser
by Akron
· 5 years ago
d4c5c10
Added documentation for supported I5 metadata fields
by Akron
· 5 years ago
57799fc
Fix editionStmt metadata parsing
by Akron
· 5 years ago
f1849aa
Support non-verbal annotations
by Akron
· 6 years ago
c29b8e1
Added support for DGD pseudo-sentences based on anchor milestones
by Akron
· 6 years ago
67b6eda
Support 'FOLK' as corpus sigle for DGD associated corpora
by Akron
· 6 years ago
b05b842
Improve logging
by Akron
· 6 years ago
2029455
Added external link for AGD data in I5 meta
by Akron
· 6 years ago
0d68a4b
Added 'distributor' field to I5 metadata
by Akron
· 6 years ago
7d5e638
Added support for Talismane
by Akron
· 6 years ago
57510c1
Added DGD support
by Akron
· 7 years ago
9b04f60
Update version
by Akron
· 6 years ago
f021ad6
Improve error handling
by Akron
· 6 years ago
eaffe93
Fail hard on tokenization problems now
by Akron
· 6 years ago
955b75b
Remove extract_text and extract_doc in favor of extract_sigle
by Akron
· 6 years ago
31a08cb
Add extract_sigle method to archive
by Akron
· 6 years ago
6bf3cc9
Added links for wikipedia resources
by Akron
· 6 years ago
263274c
Support koral versioning
by Akron
· 6 years ago
c526e75
Include field serialization in versioned json output
by Akron
· 6 years ago
5eb3aa0
Set field types and serialize as koral:fields
by Akron
· 6 years ago
ed9baf0
Support non-word-tokens (fixes #5)
by Akron
· 6 years ago
6eff23b
Updated minimum perl
by Akron
· 7 years ago
dd1c0f1
Updated version
by Akron
· 7 years ago
c893ac3
Added tests and minor metadata parsing adjustments for HNC
by Akron
· 7 years ago
28dc17f
Fix certainty values in TreeTagger output
by Akron
· 7 years ago
0426176
Remove certainty value on lemmata in Treetagger
by Akron
· 7 years ago
4c67919
Support for LWC dependency annotations
by Akron
· 7 years ago
3c56f50
Support file extensions in base tokenization file
by Akron
· 8 years ago
9b67b93
Fix attribute generation for DeReKo
by Akron
· 8 years ago
9a062ce
Fix tarring to include only filenames
by Akron
· 8 years ago
0a6cce1
Remove non-core fc
by Akron
· 8 years ago
3abc03e
Fixed exit codes in script
by Akron
· 8 years ago
0f9b93a
Fixed minor issue in I5 meta parsing
by Akron
· 8 years ago
403934d
Fixed CMC for empty features
by Akron
· 8 years ago
36d4627
Fixed feature treatment in CMC morpho
by Akron
· 8 years ago
ce125b6
Improved documentation on new features
by Akron
· 8 years ago
e599379
Added treatment of CMC data
by Akron
· 8 years ago
918ce42
Fixed primary data handling for data with white space at the beginning and at the end
by Akron
· 8 years ago
a308c71
Start testing with DCK
by Akron
· 8 years ago
da3097e
Finished tar flag
by Akron
· 8 years ago
9ec8887
Introduced sequential extraction flag to circumvent troubles with parallel extraction
by Akron
· 8 years ago
3a486f8
Another unzip flag update (-uo)
by Akron
· 8 years ago
86db52e
Improved unzip overwriting mechanism
by Akron
· 8 years ago
0278ca2
Test zip overwriting
by Akron
· 8 years ago
8150010
Introduced temporary extraction
by Akron
· 8 years ago
636aa11
Added configuration to script
by Akron
· 8 years ago
821db3d
Add wildcard support for inputs
by Akron
· 8 years ago
55778f0
Added preliminary support for diacritic insensitivity support
by Akron
· 8 years ago
5809fea
Fixed casefolding for case insensitivity
by Akron
· 8 years ago
c11f798
Add auto-core-calculation
by Akron
· 8 years ago
3bd942f
Added marmot-support
by Akron
· 8 years ago
60a8caa
Treat prefixes correct for text sigles
by Akron
· 8 years ago
08d5445
Changed meta name for pages
by Akron
· 8 years ago
41ac10b
Added pagebreak annotations (with '~'-prefix)
by Akron
· 8 years ago
0465de5
Improved handling of weird metadata stuff
by Akron
· 8 years ago