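Instead of looping over the chars in a String, the logic should loop over code points. A Java char is a single UTF-16 code unit, so any character outside the Basic Multilingual Plane (most emoji, for example) is stored as a surrogate pair of two chars, and a char-by-char loop sees each half separately. See https://wiki.sei.cmu.edu/confluence/display/java/STR01-J.+Do+not+assume+that+a+Java+char+fully+represents+a+Unicode+code+point for reference.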
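
A rough sketch of what the code point loop could look like, not taken from the code under discussion. The class name CodePointLoop and the helper codePointsOf are hypothetical and exist only for illustration; String.codePointAt and Character.charCount are standard JDK methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of the code under discussion.
class CodePointLoop {

    // Hypothetical helper: returns the code points of s in order.
    static List<Integer> codePointsOf(String s) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < s.length(); ) {
            // Reads a full code point, combining a surrogate pair if one starts at i.
            int cp = s.codePointAt(i);
            result.add(cp);
            // Advance by 1 char for BMP code points, 2 for supplementary ones.
            i += Character.charCount(cp);
        }
        return result;
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (encoded as the surrogate pair D83D DE00), 'b'
        String s = "a\uD83D\uDE00b";
        System.out.println(s.length());             // 4 chars
        System.out.println(codePointsOf(s).size()); // 3 code points
    }
}
```

If no index is needed, s.codePoints() returns an IntStream over the same values and may be a simpler fit.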