_wp_utf8_codepoint_count( string $text, ?int $byte_offset = 0, ?int $max_byte_length = PHP_INT_MAX ): int

This function’s access is marked private. This means it is not intended for use by plugin or theme developers, only by core. It is listed here for completeness.

Returns how many code points are found in the given UTF-8 string.

Description

Each invalid span of bytes counts as a single code point, according to the maximal subpart rule. This function serves as a fallback for mb_strlen( $text, 'UTF-8' ).

When a negative value is provided for the byte offset or the length, this always reports zero code points.

Example:

4  === _wp_utf8_codepoint_count( 'text' );

// Groups are 'test', "\x90" as '�', 'wp', "\xE2\x80" as '�', "\xC0" as '�', and 'test'.
13 === _wp_utf8_codepoint_count( "test\x90wp\xE2\x80\xC0test" );
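
The maximal subpart rule used in the example above can be illustrated with a small standalone counter. This is a minimal sketch, not the core implementation: sketch_utf8_codepoint_count() is a hypothetical name, it ignores the byte offset and length parameters, and it assumes the well-formed byte ranges given in the Unicode Standard. Every lead byte starts one counted unit, and continuation bytes are absorbed only while they still fit a well-formed sequence.

```php
<?php
/**
 * Hypothetical helper, not part of WordPress.
 *
 * Counts code points in a byte string, counting each maximal subpart of an
 * ill-formed UTF-8 sequence as one code point.
 */
function sketch_utf8_codepoint_count( string $text ): int {
	$count = 0;
	$at    = 0;
	$end   = strlen( $text );

	while ( $at < $end ) {
		$lead = ord( $text[ $at ] );
		++$at;
		++$count; // Every lead byte, valid or not, starts one counted unit.

		// Work out how many continuation bytes may follow, and the range
		// allowed for the first of them.
		if ( $lead <= 0x7F ) {
			continue; // ASCII.
		} elseif ( $lead >= 0xC2 && $lead <= 0xDF ) {
			$need = 1;
			$lo   = 0x80;
			$hi   = 0xBF;
		} elseif ( $lead >= 0xE0 && $lead <= 0xEF ) {
			$need = 2;
			$lo   = 0xE0 === $lead ? 0xA0 : 0x80; // Rejects overlong forms.
			$hi   = 0xED === $lead ? 0x9F : 0xBF; // Rejects surrogates.
		} elseif ( $lead >= 0xF0 && $lead <= 0xF4 ) {
			$need = 3;
			$lo   = 0xF0 === $lead ? 0x90 : 0x80; // Rejects overlong forms.
			$hi   = 0xF4 === $lead ? 0x8F : 0xBF; // Rejects values above U+10FFFF.
		} else {
			continue; // 0x80-0xC1 and 0xF5-0xFF can never start a sequence.
		}

		// Absorb continuation bytes while they keep the sequence well-formed.
		// A byte that does not fit is left to start the next unit.
		while ( $need > 0 && $at < $end ) {
			$byte = ord( $text[ $at ] );
			if ( $byte < $lo || $byte > $hi ) {
				break;
			}
			++$at;
			--$need;
			$lo = 0x80;
			$hi = 0xBF;
		}
	}

	return $count;
}

echo sketch_utf8_codepoint_count( "test\x90wp\xE2\x80\xC0test" ), "\n"; // 13
```

In the sample string, "\xE2\x80" is a valid prefix of a three-byte sequence, so it forms one unit, while "\xC0" can never begin a sequence and forms a unit on its own.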

Parameters

$text string required
Count code points in this string.
$byte_offset ?int optional
Start counting after this many bytes in $text. Must not be negative.

Default: 0
$max_byte_length ?int optional
Stop counting after having scanned past this many bytes. Default is to scan until the end of the string. Must not be negative.

Default: PHP_INT_MAX
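
Because the offset and length are measured in bytes rather than code points, a multi-byte character shifts the window. The sketch below shows this with a hypothetical stand-in, sketch_count_in_byte_window(), built on substr() and preg_match_all(). It is not the core function. It assumes valid UTF-8 input and a window that does not split a character, and it makes no attempt to reproduce the handling of invalid bytes.

```php
<?php
/**
 * Hypothetical stand-in, not part of WordPress.
 *
 * Counts code points inside a byte window of a string that is assumed to be
 * valid UTF-8 and aligned on character boundaries.
 */
function sketch_count_in_byte_window( string $text, int $byte_offset = 0, int $max_byte_length = PHP_INT_MAX ): int {
	// Mirror the documented behaviour: negative inputs report zero.
	if ( $byte_offset < 0 || $max_byte_length < 0 ) {
		return 0;
	}

	// substr() works in bytes, matching the offset and length parameters.
	$window = (string) substr( $text, $byte_offset, $max_byte_length );

	// With the "u" modifier, each "." matches one whole code point.
	$matched = preg_match_all( '/./su', $window );

	return false === $matched ? 0 : $matched;
}

$text = "h\xC3\xA9llo"; // "héllo": 6 bytes, 5 code points.

echo sketch_count_in_byte_window( $text ), "\n";       // 5: the whole string.
echo sketch_count_in_byte_window( $text, 3 ), "\n";    // 3: "llo" starts at byte 3, not character 3.
echo sketch_count_in_byte_window( $text, 1, 2 ), "\n"; // 1: the two bytes of "é".
echo sketch_count_in_byte_window( $text, -1 ), "\n";   // 0: negative offset.
```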

Return

int How many code points were found.

Source

function _wp_utf8_codepoint_count( string $text, ?int $byte_offset = 0, ?int $max_byte_length = PHP_INT_MAX ): int {
	if ( $byte_offset < 0 ) {
		return 0;
	}

	$count           = 0;
	$at              = $byte_offset;
	$end             = strlen( $text );
	$invalid_length  = 0;
	$max_byte_length = min( $end - $at, $max_byte_length );

	while ( $at < $end && ( $at - $byte_offset ) < $max_byte_length ) {
		$count += _wp_scan_utf8( $text, $at, $invalid_length, $max_byte_length - ( $at - $byte_offset ) );
		$count += $invalid_length > 0 ? 1 : 0;
		$at    += $invalid_length;
	}

	return $count;
}

Changelog

Version    Description
6.9.0      Introduced.
