ucto — Rule-Based Unicode Tokenizer | Open Awesome