Thesis on performance benefits of compiling RegEx
These datasets are exclusively used for benchmarking purposes.
Datasets to test RegEx on
RegEx or parsable into RegEx
- Optimizations
- Generated Code
- Character sequences replacing character concatenation ("ab" should be tried as "ab", not "a" then "b")
- Byte automata instead of using .chars().nth()
- Use MIN_LEN constant to boundary check strings
- Compilation
- Reduce string duplication/copying
- Generated Code
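
The generated-code optimizations listed above (matching character sequences as a unit, byte access instead of .chars().nth(), and a MIN_LEN boundary check) could look roughly like the sketch below. It is only an illustration: the pattern "ab[0-9]", the function names matches_chars and matches_bytes, and the value of MIN_LEN are assumptions made for this example and are not taken from the project's actual generated code. Both functions perform an anchored prefix match, and the byte version relies on the pattern being ASCII-only.

```rust
// Hypothetical generated matcher for the pattern "ab[0-9]".
// All names and the pattern itself are illustrative assumptions.

/// Shortest input the pattern can match. One length check up front
/// stands in for a bounds check before every single character access.
const MIN_LEN: usize = 3;

/// Baseline: every `.chars().nth(i)` call decodes the UTF-8 string
/// again from the start, so each access costs O(i).
pub fn matches_chars(input: &str) -> bool {
    input.chars().nth(0) == Some('a')
        && input.chars().nth(1) == Some('b')
        && input.chars().nth(2).map_or(false, |c| c.is_ascii_digit())
}

/// Optimized variant: operate on the byte slice, check the length once,
/// and compare the literal "ab" as one sequence rather than 'a' then 'b'.
pub fn matches_bytes(input: &str) -> bool {
    let bytes = input.as_bytes();
    if bytes.len() < MIN_LEN {
        return false;
    }
    // After the MIN_LEN check, index 2 is guaranteed to be in range.
    bytes.starts_with(b"ab") && bytes[2].is_ascii_digit()
}
```

Calling matches_bytes("ab7") and matches_chars("ab7") gives the same result, as does any other input for this ASCII-only pattern. The difference is that the byte version indexes in constant time and checks the length once.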