title Byte Classification | Microsoft Docs
ms.custom
ms.date 11/04/2016
ms.reviewer
ms.suite
ms.technology
cpp-standard-libraries
ms.tgt_pltfrm
ms.topic article
f1_keywords
c.types.bytes
dev_langs
C++
helpviewer_keywords
code page 932
byte classification routines
bytes, testing
ms.assetid 1cb52d71-fb0c-46ca-aad7-6472c1103370
caps.latest.revision 12
author corob-msft
ms.author corob
manager ghogen
translation.priority.ht
cs-cz
de-de
es-es
fr-fr
it-it
ja-jp
ko-kr
pl-pl
pt-br
ru-ru
tr-tr
zh-cn
zh-tw

Byte Classification

Each of these routines tests a specified byte of a multibyte character against a condition. Except where specified otherwise, the result depends on the LC_CTYPE category setting of the locale; see setlocale for more information. The versions of these functions without the _l suffix use the current locale for this locale-dependent behavior; the versions with the _l suffix are identical except that they use the locale parameter passed in instead.

Note

By definition, the ASCII characters between 0 and 127 are a subset of all multibyte-character sets. For example, the Japanese katakana character set includes ASCII as well as non-ASCII characters.

The predefined constants in the following table are defined in CTYPE.H.

Multibyte-Character Byte-Classification Routines

Routine                       Byte Test Condition
isleadbyte, _isleadbyte_l     Lead byte; test result depends on LC_CTYPE category setting of current locale
_ismbbalnum, _ismbbalnum_l    isalnum || _ismbbkalnum
_ismbbalpha, _ismbbalpha_l    isalpha || _ismbbkalnum
_ismbbgraph, _ismbbgraph_l    Same as _ismbbprint, but _ismbbgraph does not include the space character (0x20)
_ismbbkalnum, _ismbbkalnum_l  Non-ASCII text symbol other than punctuation. For example, in code page 932 only, _ismbbkalnum tests for katakana alphanumeric
_ismbbkana, _ismbbkana_l      Katakana (0xA1 - 0xDF), code page 932 only
_ismbbkprint, _ismbbkprint_l  Non-ASCII text or non-ASCII punctuation symbol. For example, in code page 932 only, _ismbbkprint tests for katakana alphanumeric or katakana punctuation (range: 0xA1 - 0xDF)
_ismbbkpunct, _ismbbkpunct_l  Non-ASCII punctuation. For example, in code page 932 only, _ismbbkpunct tests for katakana punctuation
_ismbblead, _ismbblead_l      First byte of multibyte character. For example, in code page 932 only, valid ranges are 0x81 - 0x9F, 0xE0 - 0xFC
_ismbbprint, _ismbbprint_l    isprint || _ismbbkprint. _ismbbprint includes the space character (0x20)
_ismbbpunct, _ismbbpunct_l    ispunct || _ismbbkpunct
_ismbbtrail, _ismbbtrail_l    Second byte of multibyte character. For example, in code page 932 only, valid ranges are 0x40 - 0x7E, 0x80 - 0xEC
_ismbslead, _ismbslead_l      Lead byte (in string context)
_ismbstrail, _ismbstrail_l    Trail byte (in string context)
_mbbtype, _mbbtype_l          Return byte type based on previous byte
_mbsbtype, _mbsbtype_l        Return type of byte within string
mbsinit                       Tracks the state of a multibyte character conversion
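
The difference between testing a byte in isolation and testing it "in string context" can be illustrated with a small portable sketch. The helpers below (cp932_is_lead, cp932_is_kana, cp932_role_at, and the ByteRole enum) are hypothetical names introduced for illustration; they hard-code the code page 932 ranges quoted in the table and do not consult the locale, so they are not the CRT routines themselves and may differ from them in edge cases.

```cpp
#include <cstddef>

// Hypothetical helpers, not part of the CRT. The ranges are the
// code page 932 values quoted in the table above.

// Lead byte of a two-byte character: 0x81-0x9F or 0xE0-0xFC.
inline bool cp932_is_lead(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Single-byte katakana: 0xA1-0xDF.
inline bool cp932_is_kana(unsigned char b) {
    return b >= 0xA1 && b <= 0xDF;
}

enum class ByteRole { Single, Lead, Trail };

// Classify the byte at index pos by scanning from the start of the buffer.
// A byte value alone is ambiguous: 0x41 is 'A' on its own, but it is a
// trail byte when it follows a lead byte.
// Assumptions of this sketch: a lead-range byte with no following byte is
// reported as Single, and an out-of-range pos also yields Single.
ByteRole cp932_role_at(const unsigned char* s, std::size_t len, std::size_t pos) {
    std::size_t i = 0;
    while (i < len) {
        const bool lead = cp932_is_lead(s[i]) && i + 1 < len;
        if (i == pos) {
            return lead ? ByteRole::Lead : ByteRole::Single;
        }
        if (lead) {
            if (i + 1 == pos) {
                return ByteRole::Trail;
            }
            i += 2;  // skip the lead byte and its trail byte
        } else {
            i += 1;
        }
    }
    return ByteRole::Single;
}
```

For the byte sequence 0x41 0x83 0x41 0x42, the sketch reports index 1 as a lead byte and index 2 as a trail byte, even though the value 0x41 at index 0 is an ordinary single-byte character. This is why the string-context routines need the start of the string rather than a single byte.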

The MB_LEN_MAX macro, defined in LIMITS.H, expands to the maximum length in bytes that any multibyte character can have. MB_CUR_MAX, defined in STDLIB.H, expands to the maximum length in bytes of any multibyte character in the current locale.
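
As a minimal sketch of how the two macros are typically used together, the helpers below size a buffer with the compile-time bound MB_LEN_MAX and query the run-time bound MB_CUR_MAX. The function names current_locale_max_bytes and worst_case_buffer_bytes are hypothetical and exist only for this illustration.

```cpp
#include <climits>   // MB_LEN_MAX
#include <cstddef>   // std::size_t
#include <cstdlib>   // MB_CUR_MAX

// Hypothetical helper: maximum bytes per multibyte character in the
// current locale. MB_CUR_MAX is evaluated at run time and can change
// after a call to setlocale.
inline std::size_t current_locale_max_bytes() {
    return static_cast<std::size_t>(MB_CUR_MAX);
}

// Hypothetical helper: bytes needed to hold `chars` multibyte characters
// in any supported locale, plus one byte for a terminating null.
// MB_LEN_MAX is a compile-time constant, so this works for static buffers.
constexpr std::size_t worst_case_buffer_bytes(std::size_t chars) {
    return chars * MB_LEN_MAX + 1;
}
```

Because MB_CUR_MAX never exceeds MB_LEN_MAX, a buffer sized with worst_case_buffer_bytes is large enough whichever locale is active when it is filled.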

See Also

Run-Time Routines by Category