TurboSFV - Blog
TurboSFV
2020-06-14 11:06:28
TurboSFV v8.60 - UTF-8 encoding for checksum files
Notes on TurboSFV v8.60:
In addition to the Unicode encoding UTF-16, TurboSFV now supports UTF-8, which encodes Unicode characters in a different way. While UTF-16 uses two or four bytes per Unicode code point, UTF-8 uses one to four bytes. The total size of a text therefore differs between the two encodings, depending on which characters it contains.
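The byte counts can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `encoded_sizes` and the sample characters are chosen for this example and are not part of TurboSFV.

```python
def encoded_sizes(ch):
    """Return (UTF-8 bytes, UTF-16 bytes) for a single code point."""
    # "utf-16-le" avoids the 2-byte byte order mark that "utf-16" would prepend.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


# Sample characters: ASCII, Latin-1, a currency sign, a CJK ideograph, an emoji.
samples = ["A", "é", "€", "日", "😀"]
for ch in samples:
    u8, u16 = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: UTF-8 = {u8} bytes, UTF-16 = {u16} bytes")
```

Only code points outside the Basic Multilingual Plane, such as the emoji, need four bytes in UTF-16.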
Looking at UTF-8, you will notice that the first 128 characters of this scheme match the ASCII character set, which is based on the English alphabet. Because these are the only characters that occupy a single byte, an English text encoded in UTF-8 takes up about half the space it would in UTF-16. On the other hand, for a text in an Asian language, UTF-16 is probably the better choice if its characters can be encoded with 2 bytes in UTF-16 while UTF-8 needs 3 bytes. Characters that need 4 bytes in UTF-8 also need 4 bytes in UTF-16.
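As a rough illustration of this trade-off, the following Python sketch compares the encoded size of a short English string and a short Japanese string. The sample strings and the helper `size_in` are assumptions made for this example.

```python
def size_in(text, encoding):
    """Return the number of bytes `text` occupies in the given encoding."""
    return len(text.encode(encoding))


english = "checksum file"   # 13 ASCII characters
japanese = "チェックサム"     # 6 katakana characters, all in the Basic Multilingual Plane

for label, text in (("English", english), ("Japanese", japanese)):
    # "utf-16-le" is used so that no byte order mark is counted.
    print(label, size_in(text, "utf-8"), size_in(text, "utf-16-le"))
```

The English string needs 13 bytes in UTF-8 and 26 in UTF-16, while the Japanese string needs 18 bytes in UTF-8 and only 12 in UTF-16.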
TurboSFV saves file and folder names in checksum files. If most of the characters used are covered by ASCII, you should choose UTF-8, because the checksum file becomes smaller.
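To get a feeling for the effect on a checksum file, here is a small Python sketch. It assumes a simplified SFV-style layout with one "name CRC32" line per file, which is not necessarily the exact layout TurboSFV writes. The function `sfv_listing` and the file names are hypothetical, and a byte order mark is not counted.

```python
import zlib


def sfv_listing(files):
    """Build SFV-style text: one 'name CRC32' line per entry."""
    lines = [f"{name} {zlib.crc32(data):08X}" for name, data in files.items()]
    return "\r\n".join(lines) + "\r\n"


# Hypothetical file names with ASCII-only paths, mapped to their contents.
files = {"readme.txt": b"hello", "docs/manual.pdf": b"world"}
listing = sfv_listing(files)

utf8_size = len(listing.encode("utf-8"))
utf16_size = len(listing.encode("utf-16-le"))
print(f"UTF-8: {utf8_size} bytes, UTF-16: {utf16_size} bytes")
```

With ASCII-only names, the UTF-16 version is exactly twice as large as the UTF-8 version.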
If you have enabled auto detection in the configuration, TurboSFV looks for Unicode characters and then uses either ANSI (extended ASCII, depending on your local code page) or a Unicode-capable encoding, UTF-8 or UTF-16. A new flag in the configuration lets you prioritize one of the two Unicode encoding schemes.
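The idea behind such an auto detection could look roughly like the Python sketch below. This is not TurboSFV's actual logic. The function `choose_encoding`, its parameters, and the rule "use ANSI if every name fits into the code page" are assumptions made for illustration, with Windows-1252 standing in for the local code page.

```python
def choose_encoding(names, ansi_codepage="cp1252", prefer="utf-8"):
    """Pick the ANSI code page if every name fits in it, else a Unicode encoding.

    `prefer` plays the role of the priority flag: "utf-8" or "utf-16".
    """
    if prefer not in ("utf-8", "utf-16"):
        raise ValueError("prefer must be 'utf-8' or 'utf-16'")
    try:
        for name in names:
            # Raises UnicodeEncodeError if a character is not in the code page.
            name.encode(ansi_codepage)
    except UnicodeEncodeError:
        return prefer
    return ansi_codepage


print(choose_encoding(["readme.txt", "café.txt"]))            # fits into cp1252
print(choose_encoding(["readme.txt", "日本語.txt"]))           # needs Unicode
print(choose_encoding(["日本語.txt"], prefer="utf-16"))        # priority flag set to UTF-16
```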
TurboSFV Cologne, Germany