Determines if a Unicode codepoint is valid.
Description
The definition of a valid Unicode codepoint is taken from the XML definition:
Characters
… Legal characters are tab, carriage return, line feed, and the legal characters of Unicode and ISO/IEC 10646.
… Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
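The `Char` production above can also be applied to a whole string rather than a single codepoint. The following is only a sketch, not part of WordPress: the helper name `is_xml_char_string()` is hypothetical, and it assumes the input is meant to be UTF-8 and that the PCRE extension is available.

```php
<?php
/**
 * Hypothetical helper (not a WordPress function): returns true when every
 * character of a UTF-8 string matches the XML Char production quoted above.
 */
function is_xml_char_string( $text ) {
	// The /u modifier makes PCRE treat both pattern and subject as UTF-8.
	$pattern = '/^[\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]*\z/u';

	// preg_match() returns 1 on a match, 0 on no match, and false on error
	// (for example, malformed UTF-8), so compare strictly against 1.
	return 1 === preg_match( $pattern, (string) $text );
}
```

With this sketch, a string containing a control character such as U+0001, or bytes that are not valid UTF-8, is rejected, while tabs and line breaks are accepted.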
Parameters
$i int required - Unicode codepoint.
Source
function valid_unicode( $i ) {
	$i = (int) $i;

	return (
		0x9 === $i || // U+0009 HORIZONTAL TABULATION (HT)
		0xA === $i || // U+000A LINE FEED (LF)
		0xD === $i || // U+000D CARRIAGE RETURN (CR)
		/*
		 * The valid Unicode characters according to the XML specification:
		 *
		 * > any Unicode character, excluding the surrogate blocks, FFFE, and FFFF.
		 */
		( 0x20 <= $i && $i <= 0xD7FF ) ||
		( 0xE000 <= $i && $i <= 0xFFFD ) ||
		( 0x10000 <= $i && $i <= 0x10FFFF )
	);
}
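As a rough illustration of how the ranges behave at their edges, the sketch below calls `valid_unicode()` on a handful of boundary values. The sample list is an assumption chosen for illustration, and the guarded stand-in copy of the function is there only so the snippet runs outside WordPress.

```php
<?php
// Stand-in copy of the function shown above, used only when WordPress
// has not already defined it.
if ( ! function_exists( 'valid_unicode' ) ) {
	function valid_unicode( $i ) {
		$i = (int) $i;

		return (
			0x9 === $i || 0xA === $i || 0xD === $i ||
			( 0x20 <= $i && $i <= 0xD7FF ) ||
			( 0xE000 <= $i && $i <= 0xFFFD ) ||
			( 0x10000 <= $i && $i <= 0x10FFFF )
		);
	}
}

// Hypothetical sample values sitting on or next to each range boundary.
$samples = array(
	0x0, 0x9, 0x1F, 0x20, 0xD7FF, 0xD800,
	0xE000, 0xFFFD, 0xFFFE, 0x10000, 0x10FFFF, 0x110000,
);

foreach ( $samples as $codepoint ) {
	// Print each codepoint in U+XXXX form with the function's verdict.
	printf(
		"U+%04X: %s\n",
		$codepoint,
		valid_unicode( $codepoint ) ? 'valid' : 'invalid'
	);
}
```

Because the argument is cast with `(int)`, a numeric string such as `'65'` is treated as codepoint 65, while a non-numeric string is cast to 0 and reported as invalid.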
Changelog
| Version | Description |
|---|---|
| 2.7.0 | Introduced. |