_wp_scrub_utf8_fallback( string $bytes ): string

This function’s access is marked private. This means it is not intended for use by plugin or theme developers, only by core. It is listed here for completeness.

Fallback mechanism for replacing invalid spans of UTF-8 bytes.

Description

Example:

'Pi�a' === _wp_scrub_utf8_fallback( "Pi\xF1a" ); // “ñ” is 0xF1 in Windows-1252.
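
To make the example above concrete, here is a minimal sketch in plain PHP (no WordPress required) of why the input counts as invalid and what the replacement looks like at the byte level. It does not call the core function. The variable names are illustrative, and the `preg_match()` check with the `/u` modifier is just one common way to test for well-formed UTF-8, not necessarily what core does.

```php
<?php
// "Pi" + the single byte 0xF1 + "a". In UTF-8, 0xF1 opens a four-byte
// sequence, but here it is followed by ASCII "a", so the span is invalid.
$bytes = "Pi\xF1a";

// preg_match() with the /u modifier returns false for a subject that is not valid UTF-8.
$is_valid = false !== preg_match( '//u', $bytes );
var_dump( $is_valid ); // bool(false)

// The expected scrubbed result holds U+FFFD, encoded in UTF-8 as EF BF BD.
$expected = "Pi\u{FFFD}a";
echo bin2hex( $bytes ), "\n";    // 5069f161
echo bin2hex( $expected ), "\n"; // 5069efbfbd61
```

The one invalid byte becomes the three bytes of the replacement character, so a scrubbed string can be longer than its input.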

Parameters

$bytes string required
UTF-8 encoded string which might contain spans of invalid bytes.

Return

string Input string with spans of invalid bytes swapped with the replacement character.

Source

function _wp_scrub_utf8_fallback( string $bytes ): string {
	$bytes_length   = strlen( $bytes );
	$next_byte_at   = 0;
	$was_at         = 0;
	$invalid_length = 0;
	$scrubbed       = '';

	while ( $next_byte_at <= $bytes_length ) {
		_wp_scan_utf8( $bytes, $next_byte_at, $invalid_length );

		if ( $next_byte_at >= $bytes_length ) {
			if ( 0 === $was_at ) {
				return $bytes;
			}

			return $scrubbed . substr( $bytes, $was_at, $next_byte_at - $was_at - $invalid_length );
		}

		$scrubbed .= substr( $bytes, $was_at, $next_byte_at - $was_at );
		$scrubbed .= "\u{FFFD}";

		$next_byte_at += $invalid_length;
		$was_at        = $next_byte_at;
	}

	return $scrubbed;
}
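
The loop above can be traced outside WordPress with a stand-in scanner. The sketch below is an approximation, not the core implementation: `demo_scan_utf8()` and `demo_scrub_utf8()` are hypothetical names, and the scanner contract (advance the offset over valid bytes, then report the length of the invalid span found there) is inferred from how the source uses `_wp_scan_utf8()`. As a simplification, the stand-in always reports an invalid span of one byte, so its output may differ from core for multi-byte invalid spans.

```php
<?php
/**
 * Hypothetical stand-in for _wp_scan_utf8(); not the core implementation.
 * Moves $at forward over well-formed UTF-8. On return, $at is either the end
 * of the string or the offset of an invalid byte, and $invalid_length is the
 * number of bytes to skip (always 1 here, which is a simplification).
 */
function demo_scan_utf8( string $bytes, int &$at, int &$invalid_length ): void {
	// A run of well-formed UTF-8 characters, anchored at the offset by \G.
	$valid_run = '/\G(?:[\x00-\x7F]|[\xC2-\xDF][\x80-\xBF]'
		. '|\xE0[\xA0-\xBF][\x80-\xBF]|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'
		. '|\xED[\x80-\x9F][\x80-\xBF]|\xF0[\x90-\xBF][\x80-\xBF]{2}'
		. '|[\xF1-\xF3][\x80-\xBF]{3}|\xF4[\x80-\x8F][\x80-\xBF]{2})++/';

	$invalid_length = 0;
	if ( 1 === preg_match( $valid_run, $bytes, $match, 0, $at ) ) {
		$at += strlen( $match[0] );
	}
	if ( $at < strlen( $bytes ) ) {
		$invalid_length = 1;
	}
}

// Same control flow as the source above, using the stand-in scanner.
function demo_scrub_utf8( string $bytes ): string {
	$length         = strlen( $bytes );
	$at             = 0;
	$was_at         = 0;
	$invalid_length = 0;
	$scrubbed       = '';

	while ( $at <= $length ) {
		demo_scan_utf8( $bytes, $at, $invalid_length );

		if ( $at >= $length ) {
			// Nothing was replaced: return the original string untouched.
			if ( 0 === $was_at ) {
				return $bytes;
			}
			// Append the valid tail that follows the last replacement.
			return $scrubbed . substr( $bytes, $was_at, $at - $was_at );
		}

		// Copy the valid run, then emit one U+FFFD for the invalid span.
		$scrubbed .= substr( $bytes, $was_at, $at - $was_at );
		$scrubbed .= "\u{FFFD}";

		$at    += $invalid_length;
		$was_at = $at;
	}

	return $scrubbed;
}

var_dump( "Pi\u{FFFD}a" === demo_scrub_utf8( "Pi\xF1a" ) ); // bool(true)
var_dump( 'valid' === demo_scrub_utf8( 'valid' ) );         // bool(true)
```

The early return when `$was_at` is still zero means a fully valid string is returned as-is and the function never builds a copy of it.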

Changelog

Version    Description
6.9.0      Introduced.
