korap.ids-mannheim.de / KorAP / deliko
fe5cce526837123f177a43e3ca12afabdbaf0b86
fe5cce5
Merge branch '15-make-sure-to-get-the-right-metadata-from-the-dnb-sru-api' into 'main'
by Marc Kupietz
· 2 years ago
f614a4b
Only get metadata for the given record
by Rebecca Wilm
· 2 years ago
9d87e9d
Add genre classification based on metadata keywords
by Marc Kupietz
· 2 years ago
0c24663
Distinguish between idno, isbn and dnbidno
by Marc Kupietz
· 2 years ago
ed3cc3a
Simplify testing
by Marc Kupietz
· 2 years ago
8467724
Get rid of mallet warnings about missing log props
by Marc Kupietz
· 2 years ago
4dc1163
Update textclassifier
by Marc Kupietz
· 2 years ago
5df1b16
Fix missing ERROR inc
by Marc Kupietz
· 2 years ago
54ec28b
Fix missing spaces at <br/> elements
by Marc Kupietz
· 2 years ago
15e7d61
Handle editors
by Marc Kupietz
· 2 years ago
eaa9013
Handle translator
by Marc Kupietz
· 2 years ago
fb0f2c3
Handle span/@class='it'
by Marc Kupietz
· 2 years ago
ad4d446
Handle span/@class='norm'
by Marc Kupietz
· 2 years ago
73a26bf
Count errors in script and don't ignore them in make test
by Marc Kupietz
· 2 years ago
5b734ce
CI: only dnb I5 files are important artefacts
by Marc Kupietz
· 2 years ago
edce85c
CI: build dnb13.i5.xml
by Marc Kupietz
· 2 years ago
41c4238
Add publication year to corpus title
by Marc Kupietz
· 2 years ago
fddbb51
Add test with [Übersetzer] in metadata
by Marc Kupietz
· 2 years ago
de2ca53
Pick authors only from dc:creator fields that contain [Verfasser] or nothing in brackets
by Marc Kupietz
· 2 years ago
73422ce
CI: use bash explicitly for assert tests
by Marc Kupietz
· 2 years ago
ab0a733
Add framework for semantic CI tests
by Marc Kupietz
· 2 years ago
3460d26
CI: install jre-17 and fix test
by Marc Kupietz
· 2 years ago
8d47303
Disallow robots
by Marc Kupietz
· 2 years ago
3146b63
Increase heap again for marmot+malt annotation
by Marc Kupietz
· 2 years ago
52aa505
Allow doc sigles to be only 2 chars long
by Marc Kupietz
· 2 years ago
398b596
Make doc sigles even safer
by Marc Kupietz
· 2 years ago
ec784c8
CI: load domain classificator
by Marc Kupietz
· 2 years ago
3989c74
Handle h4-h6 and strong
by Marc Kupietz
· 2 years ago
77b6aa9
Make doc sigle more fail safe
by Marc Kupietz
· 2 years ago
ccf0904
Fix marmot and malt output
by Marc Kupietz
· 2 years ago
15144ad
Make: let test depend on models/dereko_domains_s.classifier
by Marc Kupietz
· 2 years ago
8a1e465
Add models/dereko_domains_s.classifier to dependencies
by Marc Kupietz
· 2 years ago
a553865
Add topic domain classification in XSLT pass2
by Marc Kupietz
· 2 years ago
09745e1
Error out on invalid text sigles
by Marc Kupietz
· 2 years ago
d653bb8
Make: drop too slow spacy for now
by Marc Kupietz
· 2 years ago
cd32598
Make: increase heap for annotation tasks
by Marc Kupietz
· 2 years ago
d70fe26
Fix surname initial of second author
by Marc Kupietz
· 2 years ago
66618ca
Fix class attribute conditions
by Marc Kupietz
· 2 years ago
33d1128
Resolve hi priorities in pass3
by Marc Kupietz
· 2 years ago
ad1f3b8
Fix title -> doc sigle stop words
by Marc Kupietz
· 2 years ago
2d15922
Fix getting second author initial
by Marc Kupietz
· 2 years ago
568240f
Use last 5 digits of ISBN as text number
by Marc Kupietz
· 2 years ago
2badfb1
Error out if no author
by Marc Kupietz
· 2 years ago
815cc6c
Error out if no title
by Marc Kupietz
· 2 years ago
d320f99
CI fix global SRC_DIR, YEAR variables
by Marc Kupietz
· 2 years ago
cfda5f0
Update .gitignore
by Marc Kupietz
· 2 years ago
e97a4ef
Update Readme.md
by Marc Kupietz
· 2 years ago
3c72db8
Let external kalamar port default to 80
by Marc Kupietz
· 2 years ago
74fb31d
Ignore faulty xhtml input files and conversion errors
by Marc Kupietz
· 2 years ago
5652c81
Keep more intermediate files as long as we debug
by Marc Kupietz
· 2 years ago
b7a4f6c
Die with ERROR if no year could be extracted from DNB DC metadata
by Marc Kupietz
· 2 years ago
13e2858
Make: default SRC_DIR to production sample
by Marc Kupietz
· 2 years ago
8e4e23e
ISBN checksum 10 is encoded as X
by Marc Kupietz
· 2 years ago
ea684b8
Old ISBN numbers only have 10 digits
by Marc Kupietz
· 2 years ago
38019b1
Catch two more common span classes i and b
by Marc Kupietz
· 2 years ago
ab1f3ac
Build index for all parts
by Marc Kupietz
· 2 years ago
2fe36fa
Fix CI test
by Marc Kupietz
· 2 years ago
deb9546
Sanitize Makefile by dropping YY - use YEARS instead
by Marc Kupietz
· 2 years ago
d059d2d
Fix heap calculation for annotations
by Marc Kupietz
· 2 years ago
0961994
Give enough heap space to marmot and malt
by Marc Kupietz
· 2 years ago
8759751
Improve korapxml2krill performance
by Marc Kupietz
· 2 years ago
958df03
Join adjacent hi elements
by Marc Kupietz
· 2 years ago
8d29363
Delete empty and only nbsp p and div elements
by Marc Kupietz
· 2 years ago
6bcec63
Remove empty idsTexts and idsDocs
by Marc Kupietz
· 2 years ago
5e87311
Ignore highlighting for "regular" classes
by Marc Kupietz
· 2 years ago
13c986a
Let final I5 depend on pass 2 and 3 stylesheets
by Marc Kupietz
· 2 years ago
164a283
Fix last I5 validity errors
by Marc Kupietz
· 2 years ago
28f48e1
Run XSLT passes 2 and 3 on the whole year volumes
by Marc Kupietz
· 2 years ago
f62bc90
Turn off attribute expansion
by Marc Kupietz
· 2 years ago
1a370e0
Start with 2nd XSLT pass
by Marc Kupietz
· 2 years ago
bf47ae7
Fix headings nested in ps
by Marc Kupietz
· 2 years ago
10903f3
Use map in country code function
by Marc Kupietz
· 2 years ago
d8599fc
Fix country codes for (all?) non-German cities
by Marc Kupietz
· 2 years ago
a5d0118
Make sure that the publication year has 4 digits
by Marc Kupietz
· 2 years ago
d101e47
Make: split i5 list into 10 parts to fit the command line max
by Marc Kupietz
· 2 years ago
b926f25
Makefile: handle arbitrarily long lists of documents
by Marc Kupietz
· 2 years ago
2a3e4ee
Add also xhtml1 dtd to local catalog
by Marc Kupietz
· 2 years ago
43cbb11
Use local xhtml DTDs to avoid W3C traffic
by Marc Kupietz
· 2 years ago
4b1f595
Improve div and p nesting validity
by Marc Kupietz
· 2 years ago
a516762
Makefile: Fix yy substitution in for loop
by Marc Kupietz
· 2 years ago
975996e
Makefile: add alli5 target
by Marc Kupietz
· 2 years ago
9d15683
be happy with any *.opf file – not just content.opf
by Marc Kupietz
· 2 years ago
51714ed
Fix permissions in of contents
by Marc Kupietz
· 2 years ago
157e079
Makefile: add i5 and i5valid targets
by Marc Kupietz
· 2 years ago
6cb5223
HACK: Always use just the first metadata element
by Marc Kupietz
· 2 years ago
72a3423
Update korapxml2x tool
by Marc Kupietz
· 2 years ago
c3e858b
Update .gitignore
by Marc Kupietz
· 2 years ago
f73a91d
Use again master branches of KorAP XML tools
by Marc Kupietz
· 2 years ago
94bbe6b
Add preliminary support for split into annual volumes
by Marc Kupietz
· 2 years ago
49124fa
Drop subtitles from title (after '|')
by Marc Kupietz
· 2 years ago
30cc080
catch more highlighting elements
by Marc Kupietz
· 2 years ago
ea38737
Drop [Erzähler] from authors
by Marc Kupietz
· 2 years ago
1e6bfd1
Drop nav and audio elements
by Marc Kupietz
· 2 years ago
8f72730
Exclude more boilerplate pages from conversion
by Marc Kupietz
· 2 years ago
0bf97da
Find content.opf anywhere inside the zip
by Marc Kupietz
· 2 years ago
9b27722
Update Readme.md
by Marc Kupietz
· 2 years, 1 month ago
3274f87
CI: fix missing cache timestamps
by Marc Kupietz
· 2 years, 1 month ago
3b374db
CI: Update cache
by Marc Kupietz
· 2 years, 1 month ago
e8c80f3
Silence curls
by Marc Kupietz
· 2 years, 1 month ago
48c6a68
CI: Reduce threads for maltparsing
by Marc Kupietz
· 2 years, 1 month ago