1. b93fabb Introduce --no-tokenizer parameter by Akron · 1 year, 11 months ago
  2. dafaa7a Reduce indentation level and test for missing text ids by Akron · 3 years, 9 months ago
  3. 8a954e5 Automatically replace entities with their corresponding characters by Marc Kupietz · 3 years, 9 months ago
  4. fd0e6a9 Do not escape double quotes inside raw_text elements by Marc Kupietz · 4 years, 3 months ago
  5. 19c6c35 Fix bug in comment removal procedure by Akron · 4 years, 4 months ago
  6. 0465e9e Add exportable XML escape function by Akron · 4 years, 4 months ago
  7. 42e18a6 Allow specifying both tokenizations (external and internal) by Peter Harders · 4 years, 4 months ago
  8. 5fb5e8d Simplify and centralize temporary file creation by Akron · 4 years, 4 months ago
  9. 57c884e Remove temp. files from tests by default by Peter Harders · 4 years, 4 months ago
  10. 95bc98a Rename delHTMLcom to be in line with other naming conventions and make the function exportable by Akron · 4 years, 5 months ago
  11. 7fab93b Replace recursion and non-essential regexes with index/substr by Akron · 4 years, 5 months ago
  12. 2d547bc Fix a bug in delHTMLcom where comments were left open by Akron · 4 years, 5 months ago
  13. 4f67cd4 Atomize and test comment stripping by Akron · 4 years, 5 months ago