HTML API: Reliably parse HTML in wp_html_split(). #9270
Draft
dmsnell wants to merge 2 commits into WordPress:trunk from
Test using WordPress Playground
The changes in this pull request can be previewed and tested using a WordPress Playground instance. WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser. Some things to be aware of
For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.
1410116 to 0440d83
I believe this would fix https://core.trac.wordpress.org/ticket/45387.
@github-actions why don’t I come in and mess with all of your work unsolicited, huh?
0440d83 to 4a0e1a2
This was referenced Sep 11, 2025
5fe2dcd to 3f806d8
dmsnell commented Sep 11, 2025
	$regex = get_html_split_regex();
	$result = benchmark_pcre_backtracking( $regex, $input, 'split' );
	return $this->assertLessThan( 200, $result );
}
There is no longer a PCRE used in wp_html_split() and therefore no backtracking.
d319642 to c5b62b8
a51cc59 to 8a5805f
e17de28 to 479b18a
a52a8f9 to fcac561
c06e2c8 to e4f3798
fcb6b14 to f8a1e05
f8a1e05 to c33e78c
c33e78c to a8bd532
a8bd532 to fb69bf2
Trac ticket: Core-63694
Replaces #6651
See: (#9270), #9850, #9851
Status
Design feedback
<[[gallery]]> to be an escaped shortcode inside an HTML tag, but HTML considers it plaintext instead of a tag (because the starting character after the initial < is not a letter).
a. Is this actually a shortcode inside a tag to be ignored?
b. Is this a shortcode inside a text node?
<[gallery]> and the [gallery] shortcode translated into a tag name then this entire thing would become a tag on replacement.
Implementation
This probably improves the performance in terms of both CPU time and memory compared to the old PCRE-based approach.
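The design questions above turn on one HTML rule: a `<` only begins a tag when the character right after it is an ASCII letter (or `/` followed by an ASCII letter, for a closing tag). The following is a minimal sketch of that check in plain PHP, without PCRE. The function names `is_ascii_letter()` and `starts_tag()` are hypothetical and exist only for illustration; they are not part of this PR or of WordPress, and the sketch deliberately ignores comments, DOCTYPE declarations, and other `<!` or `<?` syntax.

```php
<?php
/**
 * Hypothetical helper (not a WordPress function): reports whether
 * $c is exactly one ASCII letter, without relying on locale or PCRE.
 */
function is_ascii_letter( string $c ): bool {
	return 1 === strlen( $c )
		&& 1 === strspn( $c, 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ' );
}

/**
 * Hypothetical helper (not a WordPress function): reports whether the
 * byte at offset $at is a "<" that begins a start tag or an end tag.
 *
 * Comments, DOCTYPE, and other "<!" / "<?" syntax are intentionally
 * out of scope, so they are reported as "not a tag" here.
 */
function starts_tag( string $html, int $at ): bool {
	if ( '<' !== ( $html[ $at ] ?? '' ) ) {
		return false;
	}

	$next = $html[ $at + 1 ] ?? '';

	// "<" + ASCII letter opens a start tag, e.g. "<div".
	if ( is_ascii_letter( $next ) ) {
		return true;
	}

	// "</" + ASCII letter opens an end tag, e.g. "</p".
	if ( '/' === $next ) {
		return is_ascii_letter( $html[ $at + 2 ] ?? '' );
	}

	// Anything else, including "<[", is not a tag opener.
	return false;
}

var_dump( starts_tag( '<[[gallery]]>', 0 ) ); // "[" follows "<", so this is text.
var_dump( starts_tag( '<div>', 0 ) );         // A letter follows "<", so this is a tag.
```

Under this rule both `<[[gallery]]>` and `<[gallery]>` are text as written, which is why the second question matters: the input is not a tag before shortcode replacement, but could become one afterwards if the shortcode output begins with a letter.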