- b93fabb Introduce --no-tokenizer parameter by Akron · 1 year, 11 months ago
- dafaa7a Reduce indentation level and test for missing text ids by Akron · 3 years, 9 months ago
- 8a954e5 Automatically replace entities with their corresponding characters by Marc Kupietz · 3 years, 9 months ago
- fd0e6a9 Do not escape double quotes inside raw_text elements by Marc Kupietz · 4 years, 3 months ago
- 19c6c35 Fix bug in comment removal procedure by Akron · 4 years, 4 months ago
- 0465e9e Add exportable XML escape function by Akron · 4 years, 4 months ago
- 42e18a6 Allow specifying both tokenizations (external and internal) by Peter Harders · 4 years, 4 months ago
- 5fb5e8d Simplify and centralize temporary file creation by Akron · 4 years, 4 months ago
- 57c884e Remove temp. files from tests by default by Peter Harders · 4 years, 4 months ago
- 95bc98a Rename delHTMLcom to be in line with other naming conventions and make the function exportable by Akron · 4 years, 5 months ago
- 7fab93b Replace recursion and non-essential regexes with index/substr by Akron · 4 years, 5 months ago
- 2d547bc Fix a bug in delHTMLcom where comments were left open by Akron · 4 years, 5 months ago
- 4f67cd4 Atomize and test comment stripping by Akron · 4 years, 5 months ago