
I'm developing a multi-language piece of software (first time working on anything other than English).

I've written code that reads in multiple localization files; the user then selects their language, and the corresponding localization file is used.

This all works fine and dandy, but when I try to display text from other languages (like Korean), the correct characters do not show up.

Is there something special I need to do to store Chinese, Korean, Japanese, etc. in strings? One of my Korean localization files looks like this...

[Labels]
Username=사용자 이름
Password=암호

So in my code I have a function that gets the designated string like this...

const std::string& UsernameLabel = GetLocalizationString("Korean", "Labels", "Username");
const std::string& PasswordLabel = GetLocalizationString("Korean", "Labels", "Password");
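
For reference, the loading code boils down to something like this (a simplified sketch, not my exact code; the container layout and names are just illustrative)...

#include <fstream>
#include <map>
#include <string>

// language -> section -> key -> translated text, stored as raw bytes
std::map<std::string, std::map<std::string,
         std::map<std::string, std::string>>> g_Strings;

void LoadLocalizationFile(const std::string& language, const std::string& path)
{
    std::ifstream file(path);
    std::string line, section;
    while (std::getline(file, line))
    {
        if (line.empty())
            continue;
        if (line.front() == '[' && line.back() == ']')
        {
            section = line.substr(1, line.size() - 2); // e.g. "Labels"
        }
        else
        {
            std::string::size_type eq = line.find('=');
            if (eq != std::string::npos)
                g_Strings[language][section][line.substr(0, eq)] = line.substr(eq + 1);
        }
    }
}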
  • How do you display them? By printing to standard output?
  • "Foreign language" seems not to be the real issue. The real issue is probably the handling of non-ASCII characters. Also, your foreign language is someone's native language.
  • You may want to look into std::wstring, but I'm not sure what encoding you are working with.
  • How are these strings encoded? Maybe UTF-8 or ISO-2022-KR? And what is the encoding of the terminal? What is shown instead of the correct symbols?
  • UTF-8 is probably what you are looking for: stackoverflow.com/questions/3011082/…

2 Answers


The root of the issue is std::string itself, since it deals in chars (each exactly one byte). As soon as you plan to develop multi-language software, you have to do one of the following:

  • Use std::wstring, which deals in "wide chars" (wchar_t is 2 bytes on Windows, 4 bytes on most Unix-like systems). Easy to do, and it covers most cases.
  • Step away from standard string classes and use UTF-8 (or UTF-32, etc.) encoding to represent UI text. This means working with byte buffers, not character arrays, because some symbols are encoded with multiple bytes, and some bytes are not symbols at all (like the emoji modifiers for skin colour, gender, etc.). This is the most correct approach, though it may be time-consuming; see the sketch after this list.
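
For the second option, here is a minimal sketch. It assumes the source file (and your localization files) are saved as UTF-8; SetConsoleOutputCP comes from <windows.h>, and most Linux/macOS terminals accept UTF-8 output without any setup:

#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

int main()
{
    // Raw UTF-8 bytes; this only works if this source file itself is
    // saved as UTF-8 (otherwise spell the bytes out with \x escapes).
    std::string usernameLabel = "사용자 이름"; // Korean for "Username"

#ifdef _WIN32
    SetConsoleOutputCP(CP_UTF8); // make the Windows console decode UTF-8
#endif

    // std::string is just a byte buffer here: size() reports 16 bytes,
    // even though the label is only 6 code points.
    std::printf("%s (%zu bytes)\n", usernameLabel.c_str(), usernameLabel.size());
}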

Update: also, you may find this discussion useful: std::wstring VS std::string


5 Comments

Interesting side note, according to the C++ standard, "The sizeof operator yields the number of bytes" and "sizeof(char), sizeof(signed char) and sizeof(unsigned char) are 1" (Quoting N4700, [expr.sizeof]) so I'm pretty sure that means no matter how many bits are in a char, it is still one byte.
@user4581301 As far as the C++ standard is concerned, a byte and a char are pretty much the same thing.
"wide chars" have some of the same problems as UTF-8 - i.e. some symbols are represented by more than one 16-bit "wide chars"; so if you want to do it right you gain very little by doing that.
@HansOlsson, as long as you work with UTF-8 as a byte stream/buffer, "there is no char". Just let system-level API to render that bytes on the screen.
@YurySchkatula agreed for UTF-8, my point is that this also holds for "wide chars", and you cannot just drop the right-most "wide character" if the string is too long or assume that 10 wide characters are twice as long as 5 wide characters (on the screen).
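
To make the surrogate-pair point above concrete, a small sketch; the count differs per platform because wchar_t is 16 bits on Windows but 32 bits on most Unix-like systems:

#include <iostream>
#include <string>

int main()
{
    // U+1F600 (a grinning-face emoji) lies outside the Basic Multilingual
    // Plane, so on Windows it takes two 16-bit wchar_t units (a surrogate
    // pair) and size() reports 2; on Linux (32-bit wchar_t) it reports 1.
    std::wstring face = L"\U0001F600";
    std::cout << face.size() << "\n"; // counts wchar_t units, not characters
}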

Wide characters (Unicode strings such as std::wstring) should suit your situation. I am from one of the countries mentioned above; I hope this helps.
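
A minimal sketch of that route, assuming Windows (where wide literals are UTF-16) and the CRT's _setmode/_O_U16TEXT for console output:

#include <cstdio>
#include <fcntl.h>
#include <io.h>
#include <iostream>
#include <string>

int main()
{
    std::wstring usernameLabel = L"사용자 이름"; // wide literal, UTF-16 on Windows

    // Put stdout into UTF-16 text mode so wcout can emit the Korean
    // characters to the console (Windows-specific).
    _setmode(_fileno(stdout), _O_U16TEXT);
    std::wcout << usernameLabel << L"\n";
}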

