Closed
Description
Bug description
When working with a UTF-8 flat file that contains extended characters, the line is tokenized incorrectly because String.substring is used instead of working with byte arrays. This happens because a character such as "è" is encoded as two bytes (so it occupies two "positions" in the file), but when working with a String it counts as a single position, so the resulting token is wrong.
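As a minimal illustration of the root cause (plain Java, not tied to any particular tokenizer implementation), the character count and the UTF-8 byte count of a value containing "è" differ, so column boundaries defined in bytes no longer line up with String.substring indexes:

```java
import java.nio.charset.StandardCharsets;

public class Utf8WidthDemo {
    public static void main(String[] args) {
        String value = "aleè";

        // As a String, "è" is a single char, so the value occupies 4 positions.
        System.out.println("char length: " + value.length());          // 4

        // Encoded as UTF-8, "è" takes two bytes, so the value occupies 5 bytes on disk.
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        System.out.println("byte length: " + bytes.length);            // 5

        // A field defined as "bytes 0-4" in the file layout therefore does not
        // correspond to value.substring(0, 5), which would run past the end of
        // the String (or steal a character from the next field on a longer record).
    }
}
```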
Environment
All versions
Steps to reproduce
Use a fixed-length file with some fields and put a text like "aleè" in one field's data.
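A hedged sketch of the effect, assuming a hypothetical fixed-length layout of two 5-byte fields (the field names and widths are illustrative, not taken from the actual file spec). Byte-based slicing returns the intended fields, while substring-based slicing of the decoded line shifts every boundary after the "è":

```java
import java.nio.charset.StandardCharsets;

public class TokenizeRepro {
    public static void main(String[] args) {
        // Hypothetical layout: field1 = bytes 0-4, field2 = bytes 5-9.
        // The record on disk contains "aleè" (5 bytes in UTF-8) followed by "12345".
        byte[] record = "aleè12345".getBytes(StandardCharsets.UTF_8);   // 10 bytes

        // Byte-based slicing matches the layout:
        String f1 = new String(record, 0, 5, StandardCharsets.UTF_8);   // "aleè"
        String f2 = new String(record, 5, 5, StandardCharsets.UTF_8);   // "12345"

        // Substring-based slicing of the decoded line (what the bug describes)
        // is off by one, because the decoded line is only 9 chars long:
        String line = new String(record, StandardCharsets.UTF_8);
        String wrong1 = line.substring(0, 5);                           // "aleè1"
        String wrong2 = line.substring(5, Math.min(10, line.length())); // "2345"

        System.out.println(f1 + " | " + f2);
        System.out.println(wrong1 + " | " + wrong2);
    }
}
```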