Understanding UTF-8
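Continuation bytes always have the form 10xxxxxx: the top two bits are 10, and the remaining six bits carry payload from the code point. No other UTF-8 byte starts with 10, so a decoder can tell from any single byte whether it is in the middle of a multi-byte sequence.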
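As a minimal sketch of that bit pattern, the check below masks the top two bits of a byte and compares them with 10. The helper name is_continuation and the sample string are assumptions chosen for illustration; they do not come from any particular library.

```python
def is_continuation(byte: int) -> bool:
    """Return True if byte has the UTF-8 continuation pattern 10xxxxxx."""
    # 0xC0 is 11000000: it keeps only the top two bits.
    # 0x80 is 10000000: the pattern those two bits must match.
    return (byte & 0xC0) == 0x80


# U+00E9 encodes to 2 bytes (C3 A9), U+20AC encodes to 3 bytes (E2 82 AC).
encoded = "\u00e9\u20ac".encode("utf-8")

# One flag per byte: False for a lead byte, True for a continuation byte.
flags = [is_continuation(b) for b in encoded]
print([f"{b:08b}" for b in encoded])
print(flags)
```

Every byte flagged True here belongs to the sequence started by the nearest preceding lead byte, which is why a decoder that lands mid-stream can skip forward to the next byte that is not a continuation.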