Skip to content

HTML API: Refactor wp_kses_hair()#9248

Closed
dmsnell wants to merge 1 commit intoWordPress:trunkfrom
dmsnell:html-api/refactor-wp-kses-hair-take-3
Closed

HTML API: Refactor wp_kses_hair()#9248
dmsnell wants to merge 1 commit intoWordPress:trunkfrom
dmsnell:html-api/refactor-wp-kses-hair-take-3

Conversation

@dmsnell
Copy link
Copy Markdown
Member

@dmsnell dmsnell commented Jul 11, 2025

Trac ticket: Core-63724

Replaces #7407, dmsnell#5
Coordination in #9256

Review feedback

  • Historically the value and whole properties of the returned array indicate the raw parsed bytes from the HTML (with some exceptions). This means that HTML character references are not decoded. This represents an abstraction leak between the HTML and structural return value.
    • Should this refactor leave the messy return values in place or should it decode the attribute values to enforce the view of the world developers are imagining when calling it? (that all values are normal PHP strings and not HTML text node strings)?

Implementation

wp_kses_hair() is built around an impressive state machine for parsing the $attr of an HTML tag, that is, the span of text after the tag name and before the closing >. Unfortunately, that parsing code doesn’t fully-implement the HTML specification and may be prone to mis-parsing.

This patch replaces the existing state machine with a straight-forward use of the HTML API to parse the attributes for us, constructing a shell tag for the $attr string and reading the attributes structurally. This shell is necessary because a previous stage of the pipeline has already separated what it thinks is the so-called “attribute list” from a tag.

Dependencies

@github-actions
Copy link
Copy Markdown

github-actions bot commented Jul 11, 2025

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props dmsnell, jonsurrell, westonruter.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 11, 2025
Trac ticket: Core-63694

`wp_kses_hair()` is built around an impressive state machine for parsing
the `$attr` of an HTML tag, that is, the span of text after the tag name
and before the closing `>`. Unfortunately, that parsing code doesn’t
fully-implement the HTML specification and may be prone to mis-parsing.

This patch replaces the existing state machine with a straight-forward
use of the HTML API to parse the attributes for us, constructing a shell
take for the `$attr` string and reading the attributes structurally.
This shell is necessary because a previous stage of the pipeline has
already separated what it thinks is the so-called “attribute list” from
a tag.

Props: dmsnell
@dmsnell dmsnell force-pushed the html-api/refactor-wp-kses-hair-take-3 branch from 68c7746 to b476339 Compare July 11, 2025 22:37
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 11, 2025
Trac ticket: Core-63694

`wp_kses_hair()` is built around an impressive state machine for parsing
the `$attr` of an HTML tag, that is, the span of text after the tag name
and before the closing `>`. Unfortunately, that parsing code doesn’t
fully-implement the HTML specification and may be prone to mis-parsing.

This patch replaces the existing state machine with a straight-forward
use of the HTML API to parse the attributes for us, constructing a shell
take for the `$attr` string and reading the attributes structurally.
This shell is necessary because a previous stage of the pipeline has
already separated what it thinks is the so-called “attribute list” from
a tag.

Props: dmsnell
@dmsnell dmsnell force-pushed the html-api/refactor-wp-kses-hair-take-3 branch from b476339 to 6146ecd Compare July 11, 2025 22:45
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 11, 2025
Trac ticket: Core-63694

`wp_kses_hair()` is built around an impressive state machine for parsing
the `$attr` of an HTML tag, that is, the span of text after the tag name
and before the closing `>`. Unfortunately, that parsing code doesn’t
fully-implement the HTML specification and may be prone to mis-parsing.

This patch replaces the existing state machine with a straight-forward
use of the HTML API to parse the attributes for us, constructing a shell
take for the `$attr` string and reading the attributes structurally.
This shell is necessary because a previous stage of the pipeline has
already separated what it thinks is the so-called “attribute list” from
a tag.

Props: dmsnell
@dmsnell dmsnell force-pushed the html-api/refactor-wp-kses-hair-take-3 branch from 6146ecd to d64f56e Compare July 11, 2025 22:46
@github-actions
Copy link
Copy Markdown

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 12, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 13, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 13, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 13, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 13, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 13, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 13, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 13, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 13, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 13, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 13, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 13, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 13, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 13, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 13, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 15, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 15, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 15, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 18, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 18, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 18, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 18, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 18, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 18, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 18, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 18, 2025
Prep work for WordPress#9248.

# Conflicts:
#	tests/phpunit/tests/comment.php
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 18, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 18, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 18, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 18, 2025
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Sep 30, 2025
Trac ticket: Core-63694

`wp_kses_hair()` is built around an impressive state machine for parsing
the `$attr` of an HTML tag, that is, the span of text after the tag name
and before the closing `>`. Unfortunately, that parsing code doesn’t
fully-implement the HTML specification and may be prone to mis-parsing.

This patch replaces the existing state machine with a straight-forward
use of the HTML API to parse the attributes for us, constructing a shell
take for the `$attr` string and reading the attributes structurally.
This shell is necessary because a previous stage of the pipeline has
already separated what it thinks is the so-called “attribute list” from
a tag.

Props: dmsnell
@dmsnell dmsnell force-pushed the html-api/refactor-wp-kses-hair-take-3 branch from d119749 to e525b75 Compare September 30, 2025 20:59
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Oct 1, 2025
Trac ticket: Core-63694

`wp_kses_hair()` is built around an impressive state machine for parsing
the `$attr` of an HTML tag, that is, the span of text after the tag name
and before the closing `>`. Unfortunately, that parsing code doesn’t
fully-implement the HTML specification and may be prone to mis-parsing.

This patch replaces the existing state machine with a straight-forward
use of the HTML API to parse the attributes for us, constructing a shell
take for the `$attr` string and reading the attributes structurally.
This shell is necessary because a previous stage of the pipeline has
already separated what it thinks is the so-called “attribute list” from
a tag.

Props: dmsnell
@dmsnell dmsnell force-pushed the html-api/refactor-wp-kses-hair-take-3 branch from e525b75 to 9ae464a Compare October 1, 2025 23:14
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Oct 2, 2025
Trac ticket: Core-63694

`wp_kses_hair()` is built around an impressive state machine for parsing
the `$attr` of an HTML tag, that is, the span of text after the tag name
and before the closing `>`. Unfortunately, that parsing code doesn’t
fully-implement the HTML specification and may be prone to mis-parsing.

This patch replaces the existing state machine with a straight-forward
use of the HTML API to parse the attributes for us, constructing a shell
take for the `$attr` string and reading the attributes structurally.
This shell is necessary because a previous stage of the pipeline has
already separated what it thinks is the so-called “attribute list” from
a tag.

Props: dmsnell
@dmsnell dmsnell force-pushed the html-api/refactor-wp-kses-hair-take-3 branch from 9ae464a to 5e878cf Compare October 2, 2025 19:39
@sirreal
Copy link
Copy Markdown
Member

sirreal commented Jan 9, 2026

This function had quirks that change with this PR and I want to understand them.

I created a test suite for wp_kses_hair(), then I merged this branch and updated to get a diff of test changes. I also looked at several of the most popular results from WP Directory to understand usage.

My review of the most common usages on suggest that this change is safe to make and would not negatively impact plugin authors.

  • Historically the value and whole properties of the returned array indicate the raw parsed bytes from the HTML (with some exceptions). This means that HTML character references are not decoded. This represents an abstraction leak between the HTML and structural return value.
    • Should this refactor leave the messy return values in place or should it decode the attribute values to enforce the view of the world developers are imagining when calling it? (that all values are normal PHP strings and not HTML text node strings)?

This is a tricky question. It doesn't seem like folks rely on specifics of the input representation being present in the output, however it's certainly possible.

In one of the examples from plugins, esc_attr() is called on the attribute value to construct a new HTML string. This should be perfectly fine because the original HTML was re-encoded in this PR and esc_attr() will avoid double-encoding. They also statically wrap with ", which made the esc_attr() necessary because the attribute value could have contained "!

After some reflection, I believe the behavior you've implemented here is a good decision. Consider that the input is HTML and the output (value and whole) have always been some form of HTML. The difference here is a normalization of the HTML in the output.


behavior diff
diff --git a/tests/phpunit/tests/kses/wpKsesHair.php b/tests/phpunit/tests/kses/wpKsesHair.php
index 2ed83679f2e3d..05d573bc070bc 100644
--- a/tests/phpunit/tests/kses/wpKsesHair.php
+++ b/tests/phpunit/tests/kses/wpKsesHair.php
@@ -57,7 +57,7 @@ public function data_attribute_parsing() {
 				'title' => array(
 					'name'  => 'title',
 					'value' => 'My Title',
-					'whole' => "title='My Title'",
+					'whole' => 'title="My Title"',
 					'vless' => 'n',
 				),
 			),
@@ -188,8 +188,8 @@ public function data_attribute_parsing() {
 			array(
 				'title' => array(
 					'name'  => 'title',
-					'value' => '<test>',
-					'whole' => 'title="<test>"',
+					'value' => '<test>',
+					'whole' => 'title="<test>"',
 					'vless' => 'n',
 				),
 			),
@@ -200,8 +200,8 @@ public function data_attribute_parsing() {
 			array(
 				'title' => array(
 					'name'  => 'title',
-					'value' => '<hex>',
-					'whole' => 'title="<hex>"',
+					'value' => '<hex>',
+					'whole' => 'title="<hex>"',
 					'vless' => 'n',
 				),
 			),
@@ -212,8 +212,8 @@ public function data_attribute_parsing() {
 			array(
 				'title' => array(
 					'name'  => 'title',
-					'value' => '<HEX>',
-					'whole' => 'title="<HEX>"',
+					'value' => '<HEX>',
+					'whole' => 'title="<HEX>"',
 					'vless' => 'n',
 				),
 			),
@@ -224,8 +224,8 @@ public function data_attribute_parsing() {
 			array(
 				'title' => array(
 					'name'  => 'title',
-					'value' => '&invalid; &#; &#x;',
-					'whole' => 'title="&invalid; &#; &#x;"',
+					'value' => '&invalid; &#; &#x;',
+					'whole' => 'title="&invalid; &#; &#x;"',
 					'vless' => 'n',
 				),
 			),
@@ -249,7 +249,7 @@ public function data_attribute_parsing() {
 				'data-text' => array(
 					'name'  => 'data-text',
 					'value' => 'Single quoted value',
-					'whole' => "data-text='Single quoted value'",
+					'whole' => 'data-text="Single quoted value"',
 					'vless' => 'n',
 				),
 			),
@@ -267,7 +267,7 @@ public function data_attribute_parsing() {
 				'alt'   => array(
 					'name'  => 'alt',
 					'value' => 'single',
-					'whole' => "alt='single'",
+					'whole' => 'alt="single"',
 					'vless' => 'n',
 				),
 				'id'    => array(
@@ -284,8 +284,8 @@ public function data_attribute_parsing() {
 			array(
 				'title' => array(
 					'name'  => 'title',
-					'value' => "It's working",
-					'whole' => 'title="It\'s working"',
+					'value' => 'It's working',
+					'whole' => 'title="It's working"',
 					'vless' => 'n',
 				),
 			),
@@ -296,8 +296,8 @@ public function data_attribute_parsing() {
 			array(
 				'title' => array(
 					'name'  => 'title',
-					'value' => 'He said "hello"',
-					'whole' => 'title=\'He said "hello"\'',
+					'value' => 'He said "hello"',
+					'whole' => 'title="He said "hello""',
 					'vless' => 'n',
 				),
 			),
@@ -327,12 +327,32 @@ public function data_attribute_parsing() {
 
 		yield 'invalid attribute name starting with number' => array(
 			'1invalid="value"',
-			array(),
+			array(
+				'1invalid' => array(
+					'name'  => '1invalid',
+					'value' => 'value',
+					'whole' => '1invalid="value"',
+					'vless' => 'n',
+				),
+			),
 		);
 
 		yield 'invalid attribute name special chars' => array(
 			'@invalid="value" $bad="value"',
-			array(),
+			array(
+				'@invalid' => array(
+					'name'  => '@invalid',
+					'value' => 'value',
+					'whole' => '@invalid="value"',
+					'vless' => 'n',
+				),
+				'$bad'     => array(
+					'name'  => '$bad',
+					'value' => 'value',
+					'whole' => '$bad="value"',
+					'vless' => 'n',
+				),
+			),
 		);
 
 		yield 'duplicate attributes first wins' => array(
@@ -355,7 +375,20 @@ public function data_attribute_parsing() {
 
 		yield 'malformed unclosed double quote' => array(
 			'title="unclosed class="test"',
-			array(),
+			array(
+				'title' => array(
+					'name'  => 'title',
+					'value' => 'unclosed class=',
+					'whole' => 'title="unclosed class="',
+					'vless' => 'n',
+				),
+				'test"' => array(
+					'name'  => 'test"',
+					'value' => '',
+					'whole' => 'test"',
+					'vless' => 'y',
+				),
+			),
 		);
 
 		yield 'very long attribute value' => array(
@@ -610,7 +643,7 @@ public function data_attribute_parsing() {
 				'alt'   => array(
 					'name'  => 'alt',
 					'value' => '',
-					'whole' => "alt=''",
+					'whole' => 'alt=""',
 					'vless' => 'n',
 				),
 				'class' => array(
@@ -625,7 +658,7 @@ public function data_attribute_parsing() {
 		yield 'forward slashes between attributes' => array(
 			'att / att2=2 /// att3="3"',
 			array(
-				'att'   => array(
+				'att'  => array(
 					'name'  => 'att',
 					'value' => '',
 					'whole' => 'att',
@@ -652,13 +685,13 @@ public function data_attribute_parsing() {
 				'att'  => array(
 					'name'  => 'att',
 					'value' => 'val',
-					'whole' => "att='val'",
+					'whole' => 'att="val"',
 					'vless' => 'n',
 				),
 				'att2' => array(
 					'name'  => 'att2',
 					'value' => 'val2',
-					'whole' => "att2='val2'",
+					'whole' => 'att2="val2"',
 					'vless' => 'n',
 				),
 			),
@@ -670,13 +703,13 @@ public function data_attribute_parsing() {
 				'att'  => array(
 					'name'  => 'att',
 					'value' => 'val',
-					'whole' => "att='val'",
+					'whole' => 'att="val"',
 					'vless' => 'n',
 				),
 				'att2' => array(
 					'name'  => 'att2',
 					'value' => 'val2',
-					'whole' => "att2='val2'",
+					'whole' => 'att2="val2"',
 					'vless' => 'n',
 				),
 			),
@@ -688,13 +721,13 @@ public function data_attribute_parsing() {
 				'att'  => array(
 					'name'  => 'att',
 					'value' => 'val',
-					'whole' => "att='val'",
+					'whole' => 'att="val"',
 					'vless' => 'n',
 				),
 				'att2' => array(
 					'name'  => 'att2',
 					'value' => 'val2',
-					'whole' => "att2='val2'",
+					'whole' => 'att2="val2"',
 					'vless' => 'n',
 				),
 			),
@@ -706,13 +739,13 @@ public function data_attribute_parsing() {
 				'att'  => array(
 					'name'  => 'att',
 					'value' => 'val',
-					'whole' => "att='val'",
+					'whole' => 'att="val"',
 					'vless' => 'n',
 				),
 				'att2' => array(
 					'name'  => 'att2',
 					'value' => 'val2',
-					'whole' => "att2='val2'",
+					'whole' => 'att2="val2"',
 					'vless' => 'n',
 				),
 			),
@@ -739,34 +772,67 @@ public function data_attribute_parsing() {
 		// Malformed Equals Patterns.
 		yield 'multiple equals signs' => array(
 			'att=="val"',
-			array(),
+			array(
+				'att' => array(
+					'name'  => 'att',
+					'value' => '="val"',
+					'whole' => 'att="="val""',
+					'vless' => 'n',
+				),
+			),
 		);
 
 		yield 'equals with strange spacing' => array(
 			'att= ="val"',
-			array(),
+			array(
+				'att' => array(
+					'name'  => 'att',
+					'value' => '="val"',
+					'whole' => 'att="="val""',
+					'vless' => 'n',
+				),
+			),
 		);
 
 		yield 'triple equals signs' => array(
 			'att==="val"',
-			array(),
+			array(
+				'att' => array(
+					'name'  => 'att',
+					'value' => '=="val"',
+					'whole' => 'att="=="val""',
+					'vless' => 'n',
+				),
+			),
 		);
 
 		yield 'equals echo pattern' => array(
 			"att==echo 'something'",
 			array(
-				'att' => array(
+				'att'         => array(
 					'name'  => 'att',
 					'value' => '=echo',
 					'whole' => 'att="=echo"',
 					'vless' => 'n',
 				),
+				"'something'" => array(
+					'name'  => "'something'",
+					'value' => '',
+					'whole' => "'something'",
+					'vless' => 'y',
+				),
 			),
 		);
 
 		yield 'attribute starting with equals' => array(
 			'= bool k=v',
 			array(
+				'='    => array(
+					'name'  => '=',
+					'value' => '',
+					'whole' => '=',
+					'vless' => 'y',
+				),
 				'bool' => array(
 					'name'  => 'bool',
 					'value' => '',
@@ -785,18 +851,43 @@ public function data_attribute_parsing() {
 		yield 'mixed quotes and equals chaos' => array(
 			'k=v ="' . "' j=w",
 			array(
-				'k' => array(
+				'k'        => array(
 					'name'  => 'k',
 					'value' => 'v',
 					'whole' => 'k="v"',
 					'vless' => 'n',
 				),
+				'="' . "'" => array(
+					'name'  => '="' . "'",
+					'value' => '',
+					'whole' => '="' . "'",
+					'vless' => 'y',
+				),
+				'j'        => array(
+					'name'  => 'j',
+					'value' => 'w',
+					'whole' => 'j="w"',
+					'vless' => 'n',
+				),
 			),
 		);
 
 		yield 'triple equals quoted whitespace' => array(
 			'==="  "',
-			array(),
+			array(
+				'=' => array(
+					'name'  => '=',
+					'value' => '="',
+					'whole' => '=="=""',
+					'vless' => 'n',
+				),
+				'"' => array(
+					'name'  => '"',
+					'value' => '',
+					'whole' => '"',
+					'vless' => 'y',
+				),
+			),
 		);
 
 		yield 'boolean with contradictory value' => array(
@@ -820,7 +911,13 @@ public function data_attribute_parsing() {
 		yield 'empty attribute name with value' => array(
 			'="value" class="test"',
 			array(
-				'class' => array(
+				'="value"' => array(
+					'name'  => '="value"',
+					'value' => '',
+					'whole' => '="value"',
+					'vless' => 'y',
+				),
+				'class'    => array(
 					'name'  => 'class',
 					'value' => 'test',
 					'whole' => 'class="test"',
@@ -890,7 +987,7 @@ public function data_protocol_filtering() {
 				'href' => array(
 					'name'  => 'href',
 					'value' => 'alert(1)',
-					'whole' => "href='alert(1)'",
+					'whole' => 'href="alert(1)"',
 					'vless' => 'n',
 				),
 			),
@@ -925,8 +1022,8 @@ public function data_protocol_filtering() {
 			array(
 				'src' => array(
 					'name'  => 'src',
-					'value' => 'text/html,<script>alert(1)</script>',
-					'whole' => 'src="text/html,<script>alert(1)</script>"',
+					'value' => 'text/html,&lt;script&gt;alert(1)&lt;/script&gt;',
+					'whole' => 'src="text/html,&lt;script&gt;alert(1)&lt;/script&gt;"',
 					'vless' => 'n',
 				),
 			),

Here are two examples from the most most popular plugins in the WP Directory search:

From YITH (this appears to be part of the yith library used in many of their plugins):

	function yith_plugin_fw_html_attributes_to_string( $attributes = array(), $echo = false ) {
		$html_attributes = '';
		if ( ! ! $attributes ) {
			if ( is_string( $attributes ) ) {
				$parsed_attrs = wp_kses_hair( $attributes, wp_allowed_protocols() );
				$attributes   = array();
				foreach ( $parsed_attrs as $attr ) {
					$attributes[ $attr['name'] ] = 'n' === $attr['vless'] ? $attr['value'] : null;
				}
			}

			if ( is_array( $attributes ) ) {
				$html_attributes = array();
				foreach ( $attributes as $key => $value ) {
					if ( ! is_null( $value ) ) {
						$html_attributes[] = esc_attr( $key ) . '="' . esc_attr( $value ) . '"';
					} else {
						$html_attributes[] = esc_attr( $key );
					}
				}
				$html_attributes = implode( ' ', $html_attributes );
			}
		}

		if ( $echo ) {
			// Already escaped above.
			echo $html_attributes; // phpcs:ignore WordPress.Security.EscapeOutput.OutputNotEscaped
		}

		return $html_attributes;
	}

And Jetpack:

				$params = wp_kses_hair( $params, array( 'http' ) );


				$width  = isset( $params['width'] ) ? (int) $params['width']['value'] : 0;
				$height = isset( $params['height'] ) ? (int) $params['height']['value'] : 0;
				$wh     = '';


				if ( $width && $height ) {
					$wh = "&w=$width&h=$height";
				}


				$url = esc_url_raw( "https://www.youtube.com/watch?v={$match[3]}{$wh}" );

Copy link
Copy Markdown
Member

@sirreal sirreal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a good improvement.

* );
*
* @since 1.0.0
* @since 6.9.0 Rebuilt on HTML API
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @since 6.9.0 Rebuilt on HTML API
* @since 7.0.0 Rebuilt on HTML API

Copy link
Copy Markdown
Member

@westonruter westonruter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Props to Gemini)

Trac ticket: Core-63694

`wp_kses_hair()` is built around an impressive state machine for parsing
the `$attr` of an HTML tag, that is, the span of text after the tag name
and before the closing `>`. Unfortunately, that parsing code doesn’t
fully-implement the HTML specification and may be prone to mis-parsing.

This patch replaces the existing state machine with a straight-forward
use of the HTML API to parse the attributes for us, constructing a shell
take for the `$attr` string and reading the attributes structurally.
This shell is necessary because a previous stage of the pipeline has
already separated what it thinks is the so-called “attribute list” from
a tag.

Props: adamziel, dmsnell, jonsurrell, jorbin, westonruter.

Co-authored-by: Adam Zieliński <zieladam@git.wordpress.org>
Co-authored-by: Aaron Jorbin <jorbin@git.wordpress.org>
Co-authored-by: Jon Surrell <jonsurrell@git.wordpress.org>
CO-authored-by: Weston Ruter <westonruter@git.wordpress.org>
Github-PR: 9248
Github-PR-URL: WordPress#9248
Trac-Ticket: 63694
Trac-Ticket-URL: https://core.trac.wordpress.org/ticket/63724
Git-Branch: html-api/refactor-wp-kses-hair-take-3
@github-actions
Copy link
Copy Markdown

A commit was made that fixes the Trac ticket referenced in the description of this pull request.

SVN changeset: 61467
GitHub commit: b5daf3e

This PR will be closed, but please confirm the accuracy of this and reopen if there is more work to be done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants