  1. cae3911 Fix buffer bug in token writer by Akron · 1 year, 8 months ago
  2. d0dfea8 Added context rule for I by Akron · 1 year, 8 months ago
  3. be3d366 Introduce english tokenizer by Akron · 1 year, 8 months ago
  4. 8a5596a Merge "Added update command" by Nils Diewald · 1 year, 10 months ago
  5. 96c6548 Added update command by Akron · 1 year, 10 months ago
  6. 8413f84 Bump github.com/stretchr/testify from 1.7.0 to 1.8.2 (fixes #4) by dependabot[bot] · 1 year, 10 months ago
  7. e995f76 Bump github.com/alecthomas/kong from 0.5.0 to 0.7.1 (closes #3) by dependabot[bot] · 1 year, 10 months ago
  8. 6c92763 New build by Akron · 1 year, 10 months ago
  9. 0597b27 Introduce dependabot support by Akron · 1 year, 10 months ago
  10. a25a7d5 Update pages for EURALEX publication by Akron · 2 years, 5 months ago
  11. 49ebd91 Add paper link (2) by Akron · 2 years, 6 months ago
  12. 656934d Add paper link by Akron · 2 years, 6 months ago
  13. fd120d3 Update dependencies by Akron · 2 years, 7 months ago
  14. 7e4b780 Move references to the end of the readme by Akron · 2 years, 7 months ago
  15. 79ec995 Add references by Akron · 2 years, 7 months ago
  16. a44944d Merge "Add notification regarding load factor" by Nils Diewald · 2 years, 8 months ago
  17. 6a4ce18 Add notification regarding load factor by Akron · 2 years, 8 months ago
  18. b15acb9 Rename token_symbol to token_bound by Akron · 2 years, 9 months ago
  19. d47c67e Add minor rules for XML support by Akron · 2 years, 9 months ago
  20. 6dcb6ce Add arrows by Akron · 2 years, 9 months ago
  21. 3b6c7fb Add Zenodo DOI Badge by Akron · 2 years, 9 months ago
  22. 78f6714 Split tokenizer rules into language-specific and language-dependent by Akron · 2 years, 9 months ago
  23. 61948ef Restructure XFST sources by Akron · 2 years, 9 months ago
  24. 7aa1cbe Improve sentence endings further by Akron · 2 years, 9 months ago
  25. b98e4cf Improve Emoticons by Akron · 2 years, 9 months ago v0.1.5
  26. f94b9ce check parentheses at the end of sentences by Akron · 2 years, 9 months ago
  27. b428755 Support punctuation after quotes by Akron · 2 years, 9 months ago v0.1.4
  28. df27581 Make tokenizer robust and never failing by Akron · 2 years, 9 months ago
  29. 4222ac8 Improve handling of ellipsis by Akron · 2 years, 10 months ago
  30. ece3f01 Support quote combinations at the end of sentences by Akron · 2 years, 10 months ago
  31. e200841 Further improve speech rule for eos with more quotation marks by Akron · 2 years, 10 months ago
  32. e96895f Improve handling of sentence splits including speech by Akron · 2 years, 10 months ago
  33. b02ad07 Improve handling of apostrophes by Akron · 3 years ago
  34. 9a59471 Test single quote handling by Akron · 3 years ago
  35. 54ed7e7 Fix handling of "z.B." by Akron · 3 years ago
  36. d0c6e10 Fix datok tests to be more robust regarding tokenizer changes by Akron · 3 years, 1 month ago
  37. 936c0f5 Support Plusampersand words in compounds by Akron · 3 years, 1 month ago
  38. 00cecd1 Initialize identity for sigma < 256 by Akron · 3 years, 1 month ago
  39. 4880fb6 Improve rune2symbol conversion by Akron · 3 years, 1 month ago
  40. e62e8eb Introducing Plusampersand-Compounds by Akron · 3 years, 1 month ago
  41. 22c565a Fix out of range bug by reverting buffer rewind improvement by Akron · 3 years, 1 month ago v0.1.1
  42. 4ec8cec Prepare first official release by Akron · 3 years, 1 month ago v0.1.0
  43. e87906b Minor improvements by Akron · 3 years, 1 month ago
  44. 90aa45b Minor code simplifications by Akron · 3 years, 1 month ago
  45. fac8abc Reorder longest match operator and update models by Akron · 3 years, 2 months ago
  46. 3976804 Add benchmark rule to Makefile by Akron · 3 years, 2 months ago
  47. 65c0f21 Simplify tokenizer whitespace handling by Akron · 3 years, 2 months ago
  48. c840636 Separate xml rule from main script by Akron · 3 years, 2 months ago
  49. 289414f Update benchmarks by Akron · 3 years, 2 months ago
  50. 7198645 Speed up build by Akron · 3 years, 2 months ago
  51. 6742b96 Add XML entities by Akron · 3 years, 2 months ago
  52. 7e75ef0 Add makefile by Akron · 3 years, 2 months ago
  53. 11a05d9 Extend tokenizer fileending by Akron · 3 years, 2 months ago
  54. 9135b20 Test IPv4 handling by Akron · 3 years, 2 months ago
  55. f1106ec Add single character abbreviations by Akron · 3 years, 2 months ago
  56. 4a6e0ff Fix newline after eot behaviour by Akron · 3 years, 2 months ago
  57. 274600e Fix buffer flushing to work with tei2korapxml by Akron · 3 years, 2 months ago
  58. 9c3bf7f Change fmt to log for easier writing to STDOUT by Akron · 3 years, 2 months ago
  59. 3d31453 Add introduction video to readme by Akron · 3 years, 2 months ago
  60. 6792bd2 Improve Readme example by Akron · 3 years, 2 months ago
  61. 15bb13d Introduce dash flag for STDIN and input file handling for tokenization by Akron · 3 years, 2 months ago
  62. 17984c8 Improving time parsing by Akron · 3 years, 2 months ago
  63. 78dba06 Add time format to transducer by Akron · 3 years, 2 months ago
  64. 066d99c Fix XML empty element handling by Akron · 3 years, 2 months ago
  65. 04335c6 Update tests by Akron · 3 years, 2 months ago
  66. 9fb63af Optimize tests by avoiding reload of tokenizers by Akron · 3 years, 2 months ago
  67. 7035d2e Fix sentence_pos handling by Akron · 3 years, 2 months ago
  68. 96fdc9b Fix TokenWriter regarding sentence boundaries and remove simple TokenWriter by Akron · 3 years, 2 months ago
  69. 2612f99 Improve command help page by Akron · 3 years, 2 months ago
  70. 685861a Improve Readme by Akron · 3 years, 2 months ago
  71. 0f087ea Parse command line options as bit flags by Akron · 3 years, 2 months ago
  72. fceddb6 Add sentence flags (for printing and offsets) by Akron · 3 years, 2 months ago
  73. a9e0c42 Introduce --[no]-tokens flag by Akron · 3 years, 2 months ago
  74. e9431ec Ignore newline after EOT with a flag by Akron · 3 years, 2 months ago
  75. 8cc2dd9 Fix buffer rewind at end of transmission by Akron · 3 years, 2 months ago
  76. 4f6b28c Support token offsets in token writer by Akron · 3 years, 2 months ago
  77. 32416ce Support offsets in token writer by Akron · 3 years, 2 months ago
  78. 98fbfef Improve offset handling in buffers by Akron · 3 years, 2 months ago
  79. f6bdfdb Add trimming at the beginning of a text by Akron · 3 years, 2 months ago
  80. c9c0eae Rename tests to better comply with Go test tool by Akron · 3 years, 2 months ago
  81. a854faa Introduce EOT (end-of-transmission) marker by Akron · 3 years, 2 months ago
  82. ce018e1 Merge "Introduce token_writer object" by Akron · 3 years, 2 months ago
  83. e396a93 Introduce token_writer object by Akron · 3 years, 2 months ago
  84. e0dffe0 Improve readme by Akron · 3 years, 2 months ago
  85. e7751b8 Added License by Akron · 3 years, 3 months ago
  86. 842bc65 Improve Readme by Akron · 3 years, 3 months ago
  87. abcb6a5 Add equivalence test for matrix and DA representations by Akron · 3 years, 3 months ago
  88. 34eb74c Cleanup by Akron · 3 years, 3 months ago
  89. 094a4e8 Use serialized matrix representation in test suite by Akron · 3 years, 3 months ago
  90. 28031b7 Introduce matrix serialization and deserialization by Akron · 3 years, 3 months ago
  91. 941f215 Support both matrix and da in the command by Akron · 3 years, 3 months ago
  92. 16c312e Serialize and deserialize matrix representation by Akron · 3 years, 3 months ago
  93. 5c82a92 Add sentence end detection to matrix by Akron · 3 years, 3 months ago
  94. 1c34ce6 Introduce alternative matrix representation by Akron · 3 years, 3 months ago
  95. 0d0daa2 Split Foma parser from datok by Akron · 3 years, 3 months ago
  96. 7f1097f Rename datokenizer to datok by Akron · 3 years, 3 months ago
  97. 29e306f Combine Niu et al. (2013) and Morita et al. (2001) by Akron · 3 years, 4 months ago
  98. 679b486 Add skip-method proposed by Morita et al. (2001) by Akron · 3 years, 4 months ago
  99. 7b1faa6 Add xCheck() improvement proposed by Niu (2013) by Akron · 3 years, 4 months ago
  100. df37a55 Fixed benchmark tests by Akron · 3 years, 4 months ago