- 936c0f5 Support Plusampersand words in compounds by Akron · 3 years ago
- 00cecd1 Initialize identity for sigma < 256 by Akron · 3 years ago
- 4880fb6 Improve rune2symbol conversion by Akron · 3 years ago
- e62e8eb Introducing Plusampersand-Compounds by Akron · 3 years ago
- 22c565a Fix out of range bug by reverting buffer rewind improvement by Akron · 3 years ago · v0.1.1
- 4ec8cec Prepare first official release by Akron · 3 years ago · v0.1.0
- e87906b Minor improvements by Akron · 3 years ago
- 90aa45b Minor code simplifications by Akron · 3 years ago
- fac8abc Reorder longest match operator and update models by Akron · 3 years, 1 month ago
- 3976804 Add benchmark rule to Makefile by Akron · 3 years, 1 month ago
- 65c0f21 Simplify tokenizer whitespace handling by Akron · 3 years, 1 month ago
- c840636 Separate xml rule from main script by Akron · 3 years, 1 month ago
- 289414f Update benchmarks by Akron · 3 years, 1 month ago
- 7198645 Speed up build by Akron · 3 years, 1 month ago
- 6742b96 Add XML entities by Akron · 3 years, 1 month ago
- 7e75ef0 Add makefile by Akron · 3 years, 1 month ago
- 11a05d9 Extend tokenizer fileending by Akron · 3 years, 1 month ago
- 9135b20 Test IPv4 handling by Akron · 3 years, 1 month ago
- f1106ec Add single character abbreviations by Akron · 3 years, 1 month ago
- 4a6e0ff Fix newline after EOT behaviour by Akron · 3 years, 1 month ago
- 274600e Fix buffer flushing to work with tei2korapxml by Akron · 3 years, 1 month ago
- 9c3bf7f Change fmt to log for easier writing to STDOUT by Akron · 3 years, 1 month ago
- 3d31453 Add introduction video to readme by Akron · 3 years, 1 month ago
- 6792bd2 Improve Readme example by Akron · 3 years, 1 month ago
- 15bb13d Introduce dash flag for STDIN and input file handling for tokenization by Akron · 3 years, 1 month ago
- 17984c8 Improving time parsing by Akron · 3 years, 1 month ago
- 78dba06 Add time format to transducer by Akron · 3 years, 1 month ago
- 066d99c Fix XML empty element handling by Akron · 3 years, 1 month ago
- 04335c6 Update tests by Akron · 3 years, 1 month ago
- 9fb63af Optimize tests by avoiding reload of tokenizers by Akron · 3 years, 1 month ago
- 7035d2e Fix sentence_pos handling by Akron · 3 years, 1 month ago
- 96fdc9b Fix TokenWriter regarding sentence boundaries and remove simple TokenWriter by Akron · 3 years, 1 month ago
- 2612f99 Improve command help page by Akron · 3 years, 1 month ago
- 685861a Improve Readme by Akron · 3 years, 1 month ago
- 0f087ea Parse command line options as bit flags by Akron · 3 years, 1 month ago
- fceddb6 Add sentence flags (for printing and offsets) by Akron · 3 years, 1 month ago
- a9e0c42 Introduce --[no]-tokens flag by Akron · 3 years, 1 month ago
- e9431ec Ignore newline after EOT with a flag by Akron · 3 years, 1 month ago
- 8cc2dd9 Fix buffer rewind at end of transmission by Akron · 3 years, 1 month ago
- 4f6b28c Support token offsets in token writer by Akron · 3 years, 1 month ago
- 32416ce Support offsets in token writer by Akron · 3 years, 1 month ago
- 98fbfef Improve offset handling in buffers by Akron · 3 years, 1 month ago
- f6bdfdb Add trimming at the beginning of a text by Akron · 3 years, 1 month ago
- c9c0eae Rename tests to better comply with Go test tool by Akron · 3 years, 1 month ago
- a854faa Introduce EOT (end-of-transmission) marker by Akron · 3 years, 1 month ago
- ce018e1 Merge "Introduce token_writer object" by Akron · 3 years, 1 month ago
- e396a93 Introduce token_writer object by Akron · 3 years, 1 month ago
- e0dffe0 Improve readme by Akron · 3 years, 1 month ago
- e7751b8 Added License by Akron · 3 years, 2 months ago
- 842bc65 Improve Readme by Akron · 3 years, 2 months ago
- abcb6a5 Add equivalence test for matrix and DA representations by Akron · 3 years, 2 months ago
- 34eb74c Cleanup by Akron · 3 years, 2 months ago
- 094a4e8 Use serialized matrix representation in test suite by Akron · 3 years, 2 months ago
- 28031b7 Introduce matrix serialization and deserialization by Akron · 3 years, 2 months ago
- 941f215 Support both matrix and da in the command by Akron · 3 years, 2 months ago
- 16c312e Serialize and deserialize matrix representation by Akron · 3 years, 2 months ago
- 5c82a92 Add sentence end detection to matrix by Akron · 3 years, 2 months ago
- 1c34ce6 Introduce alternative matrix representation by Akron · 3 years, 2 months ago
- 0d0daa2 Split Foma parser from datok by Akron · 3 years, 2 months ago
- 7f1097f Rename datokenizer to datok by Akron · 3 years, 2 months ago
- 29e306f Combine Niu et al. (2013) and Morita et al. (2001) by Akron · 3 years, 3 months ago
- 679b486 Add skip-method proposed by Morita et al. (2001) by Akron · 3 years, 3 months ago
- 7b1faa6 Add xCheck() improvement proposed by Niu (2013) by Akron · 3 years, 3 months ago
- df37a55 Fixed benchmark tests by Akron · 3 years, 3 months ago
- 4c2a1ad Introduce XML tests by Akron · 3 years, 3 months ago
- 34dbe97 Ignore MCS transitions instead of failing by Akron · 3 years, 3 months ago
- 0630be5 Fix parsing of end states by Akron · 3 years, 3 months ago
- 235ea12 Update generated tokenizers by Akron · 3 years, 3 months ago
- 92704eb Ignore tokenend accepting transitions by Akron · 3 years, 3 months ago
- 4fa28b3 Introduce TransCount method by Akron · 3 years, 3 months ago
- 31f3c06 Ignore MCS in sigma if not used in the transducer by Akron · 3 years, 3 months ago
- de18e90 Minor optimization on edges by Akron · 3 years, 3 months ago
- 6f1c16c Added benchmark for double array creation by Akron · 3 years, 3 months ago
- 3de361e Improved newline and abbreviation handling by Akron · 3 years, 3 months ago
- ea46e8a Add ASCII fast lookup to sigma by Akron · 3 years, 3 months ago
- f1a1650 Turn uint32 array in bc array by Akron · 3 years, 3 months ago
- e61380b Added some minor comments by Akron · 3 years, 3 months ago
- 91bd715 Add more reference to Readme by Akron · 3 years, 3 months ago
- 31cc307 Added readme file by Akron · 3 years, 4 months ago
- 1e10d00 Remove dir/Dir from abbreviation file by Akron · 3 years, 4 months ago
- 527c10c Replace zerolog with log by Akron · 3 years, 4 months ago
- bb4aac5 Optimize loading of datok files by Akron · 3 years, 4 months ago
- 7e269d4 Added conversion to the command line tool by Akron · 3 years, 4 months ago
- 8e1d69b Introduced command line tool by Akron · 3 years, 4 months ago
- 01912fc Remove unnecessary allocation for buffer recasting by Akron · 3 years, 4 months ago
- 4db3ecf Change exit operations to returning nil by Akron · 3 years, 4 months ago
- bd40680 Added transducing benchmark by Akron · 3 years, 4 months ago
- e184a91 Add new generated automata by Akron · 3 years, 4 months ago
- ec835ad Remove Match() method by Akron · 3 years, 4 months ago
- 57d0161 Add known terms with special characters by Akron · 3 years, 4 months ago
- e8837b5 Add file scheme by Akron · 3 years, 4 months ago
- fd92d7e Update abbreviations according to KorAP-Tokenizer by Akron · 3 years, 4 months ago
- a0bded5 Add ordinals by Akron · 3 years, 4 months ago
- 4af79f1 Added support for streetnames by Akron · 3 years, 4 months ago
- 310905f Add foma sources by Akron · 3 years, 4 months ago
- 03ca425 Adopt tokenizer tests from KorAP-Tokenizer by Akron · 3 years, 4 months ago
- 6e70dc8 Fix sentence splitting tests by Akron · 3 years, 4 months ago
- 1594cb8 Fix sentence splitting by Akron · 3 years, 4 months ago
- c5d8d43 Fix check on final states by Akron · 3 years, 4 months ago
- b7e1f13 Simplify transducer (single test broken) by Akron · 3 years, 4 months ago