Answer edit history

2 fix

yohhoy (score 4813)

Posted 2018/12/25 17:13

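Even in the C++17 standard library specification as it stands after the adoption of [P0618R0 Deprecating <codecvt>](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0618r0.html), which cites security requirements as its rationale, the specializations `codecvt<char16/32_t, char, mbstate_t>` and `codecvt_byname<char16/32_t, char, mbstate_t>` remain valid. In other words, at that point there appears to have been (and still to be) a consensus that "UTF-8 ⇔ UTF-16/32 encoding conversion is fine (or at least there is no reason to deprecate it)".
Quoting from [C++17 [locale.codecvt]/paragraph 3](https://timsong-cpp.github.io/cppwp/n4659/locale.codecvt#3) (the bold emphasis is mine):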
> The specializations required in Table 69 (25.3.1.1.1) convert the implementation-defined native character set. `codecvt<char, char, mbstate_t>` implements a degenerate conversion; it does not convert at all. **The specialization `codecvt<char16_t, char, mbstate_t>` converts between the UTF-16 and UTF-8 encoding forms, and the specialization codecvt `<char32_t, char, mbstate_t>` converts between the UTF-32 and UTF-8 encoding forms.** `codecvt<wchar_t, char, mbstate_t>` converts between the native character sets for narrow and wide characters. Specializations on `mbstate_t` perform conversion between encodings known to the library implementer. Other encodings can be converted by specializing on a user-defined `stateT` type. Objects of type `stateT` can contain any state that is useful to communicate to or from the specialized `do_in` or `do_out` members.
> The specializations required in Table 69 (25.3.1.1.1) convert the implementation-defined native character set. `codecvt<char, char, mbstate_t>` implements a degenerate conversion; it does not convert at all. **The specialization `codecvt<char16_t, char, mbstate_t>` converts between the UTF-16 and UTF-8 encoding forms, and the specialization `codecvt <char32_t, char, mbstate_t>` converts between the UTF-32 and UTF-8 encoding forms.** `codecvt<wchar_t, char, mbstate_t>` converts between the native character sets for narrow and wide characters. Specializations on `mbstate_t` perform conversion between encodings known to the library implementer. Other encodings can be converted by specializing on a user-defined `stateT` type. Objects of type `stateT` can contain any state that is useful to communicate to or from the specialized `do_in` or `do_out` members.
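
To make the emphasized sentence concrete, here is a minimal sketch of a UTF-8 to UTF-16 conversion that goes through the `codecvt<char16_t, char, mbstate_t>` facet directly. It assumes a C++17 toolchain where this specialization is not yet deprecated; the helper name `utf8_to_utf16` is hypothetical and not part of the standard library, and treating anything other than `ok` as an error is a simplification.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts a UTF-8 byte string to UTF-16 code units
// using the codecvt specialization required by C++17 [locale.codecvt].
std::u16string utf8_to_utf16(const std::string& in) {
    using Facet = std::codecvt<char16_t, char, std::mbstate_t>;
    // The facet is required to be present in every locale, so the classic
    // "C" locale is enough to obtain it.
    const Facet& cvt = std::use_facet<Facet>(std::locale::classic());

    std::mbstate_t state{};
    // UTF-16 never needs more code units than UTF-8 needs bytes.
    std::u16string out(in.size(), u'\0');

    const char* from_begin = in.data();
    const char* from_end = from_begin + in.size();
    const char* from_next = from_begin;
    char16_t* to_begin = &out[0];
    char16_t* to_end = to_begin + out.size();
    char16_t* to_next = to_begin;

    // in() converts from the external (char) to the internal (char16_t) type.
    const std::codecvt_base::result res =
        cvt.in(state, from_begin, from_end, from_next, to_begin, to_end, to_next);
    if (res != std::codecvt_base::ok) {
        // Simplification: truncated and invalid input are both rejected.
        throw std::runtime_error("UTF-8 to UTF-16 conversion failed");
    }

    assert(to_next >= to_begin && to_next <= to_end);
    out.resize(static_cast<std::size_t>(to_next - to_begin));
    return out;
}
```

For example, the two-byte sequence `"\xC3\xA9"` (U+00E9) becomes a single UTF-16 code unit, while the four-byte sequence `"\xF0\x9F\x98\x80"` (U+1F600) becomes a surrogate pair.
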
----
The purpose of [P0482R0 char8_t: A type for UTF-8 characters and strings](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0482r0.html) is "to introduce a `char8_t` type that represents UTF-8", so isn't this simply an adjustment made in line with that aim (replacing `char` with `char8_t`)?
1 add link to C++17 paragraph

yohhoy (score 4813)

Posted 2018/12/25 16:04

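Even in the C++17 standard library specification as it stands after the adoption of [P0618R0 Deprecating <codecvt>](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0618r0.html), which cites security requirements as its rationale, the specializations `codecvt<char16/32_t, char, mbstate_t>` and `codecvt_byname<char16/32_t, char, mbstate_t>` remain valid. In other words, at that point there appears to have been (and still to be) a consensus that "UTF-8 ⇔ UTF-16/32 encoding conversion is fine (or at least there is no reason to deprecate it)".
Quoting from C++17 [locale.codecvt]/paragraph 3 (the bold emphasis is mine):
Quoting from [C++17 [locale.codecvt]/paragraph 3](https://timsong-cpp.github.io/cppwp/n4659/locale.codecvt#3) (the bold emphasis is mine):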
> The specializations required in Table 69 (25.3.1.1.1) convert the implementation-defined native character set. `codecvt<char, char, mbstate_t>` implements a degenerate conversion; it does not convert at all. **The specialization `codecvt<char16_t, char, mbstate_t>` converts between the UTF-16 and UTF-8 encoding forms, and the specialization codecvt `<char32_t, char, mbstate_t>` converts between the UTF-32 and UTF-8 encoding forms.** `codecvt<wchar_t, char, mbstate_t>` converts between the native character sets for narrow and wide characters. Specializations on `mbstate_t` perform conversion between encodings known to the library implementer. Other encodings can be converted by specializing on a user-defined `stateT` type. Objects of type `stateT` can contain any state that is useful to communicate to or from the specialized `do_in` or `do_out` members.
----
The purpose of [P0482R0 char8_t: A type for UTF-8 characters and strings](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0482r0.html) is "to introduce a `char8_t` type that represents UTF-8", so isn't this simply an adjustment made in line with that aim (replacing `char` with `char8_t`)?
