The default length semantics for all data files (except UTF-16) are bytes. So in your case you have a CHAR of 3500 bytes, not 3500 characters. Your file contains several multibyte characters, so the 2624 characters take up more than 3500 bytes, hence the (misleading) error message.
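To make the byte-versus-character difference concrete, here is a small Python sketch. The sample string is invented for illustration (the real file contents are not known here), the `fits` helper is hypothetical, and Python's `len` on a string only approximates the database's notion of a character:

```python
def fits(text: str, limit: int, semantics: str = "byte") -> bool:
    """Return True if `text` fits in a field of `limit` units.

    semantics="byte" measures the UTF-8 encoded size,
    semantics="char" measures the number of characters (code points).
    """
    if semantics == "byte":
        size = len(text.encode("utf-8"))
    elif semantics == "char":
        size = len(text)
    else:
        raise ValueError("semantics must be 'byte' or 'char'")
    return size <= limit


# Made-up sample: 2624 characters, 1000 of which need 2 bytes in UTF-8.
sample = "\u00e9" * 1000 + "a" * 1624

print(len(sample))                  # number of characters
print(len(sample.encode("utf-8")))  # number of bytes
print(fits(sample, 3500, "byte"))   # too long when counted in bytes
print(fits(sample, 3500, "char"))   # fits when counted in characters
```

With this sample the same 2624 characters occupy 3624 bytes, so a 3500-byte limit rejects a value that a 3500-character limit accepts.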
You can fix this by using character length semantics instead.
Change this line in the control file:
characterset UTF8
to this:
characterset UTF8 length semantics char
and it will then work in characters for CHAR fields (and some other types), the same way your table is set up: 3500 characters, each of which can be up to four bytes long.
For more information, see the section on character-length semantics in the Utilities guide.