I had a file whose first 1024 bytes contained no Unicode (non-ASCII) characters, but which did contain one somewhere after the 1024th byte. An example of such a file is:
http://exercism.io/submissions/1e341848768141cf8eba94c6af6e55a7
Submitting this file mangles the Unicode character.
In contrast, this file has a Unicode character within the first 1024 bytes, and it is submitted correctly (even the Unicode that appears after the first 1024 bytes is preserved):
http://exercism.io/submissions/dce4e3ddf0294034ad987ce7b86cdb38
(These are just example submissions in Hello World, but this affected my submission for a real exercise too: Counter in xgo.)
I tracked this down to readFileAsUTF8String in api/iteration.go. It uses the DetermineEncoding function (https://godoc.org/golang.org/x/net/html/charset#DetermineEncoding) to determine the encoding, and that function only looks at the first 1024 bytes.
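To make the triggering condition concrete, here is a minimal sketch of the kind of input that hits this. It only uses the standard library and does not call DetermineEncoding itself. The names sniffLen, firstNonASCII, escapesSniffing, and sample are hypothetical helpers made up for illustration, and the 1024-byte window is taken from the description above.

```go
package main

import (
	"bytes"
	"fmt"
)

// sniffLen is the prefix length that the detection step inspects,
// as described in this report (assumption: exactly 1024 bytes).
const sniffLen = 1024

// firstNonASCII returns the offset of the first byte >= 0x80,
// or -1 if the content is pure ASCII.
func firstNonASCII(b []byte) int {
	for i, c := range b {
		if c >= 0x80 {
			return i
		}
	}
	return -1
}

// escapesSniffing reports whether content contains non-ASCII bytes,
// but none of them within the first sniffLen bytes. Files like this
// are the ones affected by the problem described here.
func escapesSniffing(content []byte) bool {
	return firstNonASCII(content) >= sniffLen
}

// sample builds hypothetical file content: pad ASCII bytes
// followed by a single UTF-8 encoded "é".
func sample(pad int) []byte {
	return append(bytes.Repeat([]byte("a"), pad), []byte("é")...)
}

func main() {
	// Non-ASCII appears only after the inspected prefix.
	fmt.Println(escapesSniffing(sample(2000)))
	// Non-ASCII appears inside the inspected prefix.
	fmt.Println(escapesSniffing(sample(10)))
}
```

A check like this could be used to build a regression test input, since any file for which escapesSniffing returns true should reproduce the mangling.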
I'm not really sure what the right solution is here. I know that function was created for #182 to solve exercism/exercism#2303, so there is obviously a legitimate reason behind all this. I guess we now just need to figure out how to deal with this case as well. I don't yet have a good solution, so I'll file this first and sleep on it for a bit.