Formatting: Strip CJK punctuation from slugs in sanitize_title_with_dashes() #9701
himanshupathak95 wants to merge 4 commits into WordPress:trunk
Trac ticket: https://core.trac.wordpress.org/ticket/22402
Currently, `sanitize_title_with_dashes()` only strips ASCII non-alphanumeric characters from slugs but preserves multi-byte punctuation marks. As a result, non-Western (and some Western) punctuation appears in URL slugs as percent-encoded characters.

This patch adds common CJK punctuation marks to the existing character blacklist.

Before: "Hello World。" -> slug is `hello-world%e3%80%82`
After: "Hello World。" -> slug is `hello-world`

The fix follows the existing pattern of explicitly listing problematic characters and only affects the 'save' context.
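
To illustrate the before/after behavior, here is a minimal Python sketch of the general approach: percent-encode non-ASCII characters, then drop the encoded sequences that appear on an explicit list. It is not the WordPress implementation. The names `CJK_PUNCTUATION`, `percent_encode_non_ascii`, and `slug_sketch` are hypothetical, and the three characters listed are an assumed subset chosen for the example, not the list from this patch.

```python
import re

# Assumption: an illustrative subset of CJK punctuation, not the patch's list.
CJK_PUNCTUATION = [
    "\u3002",  # ideographic full stop
    "\u3001",  # ideographic comma
    "\uff01",  # fullwidth exclamation mark
]


def percent_encode_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with its lowercase %xx UTF-8 bytes."""
    return "".join(
        ch if ord(ch) < 128 else "".join(f"%{b:02x}" for b in ch.encode("utf-8"))
        for ch in text
    )


def slug_sketch(title: str, strip_cjk: bool = True) -> str:
    """Simplified slug builder; strip_cjk toggles the listed-character removal."""
    slug = percent_encode_non_ascii(title.lower())
    if strip_cjk:
        # Remove the encoded form of every listed punctuation mark.
        for ch in CJK_PUNCTUATION:
            slug = slug.replace(percent_encode_non_ascii(ch), "")
    slug = re.sub(r"[^%a-z0-9 _-]", "", slug)  # drop remaining ASCII punctuation
    slug = re.sub(r"\s+", "-", slug)           # whitespace becomes dashes
    slug = re.sub(r"-+", "-", slug)            # collapse repeated dashes
    return slug.strip("-")


print(slug_sketch("Hello World\u3002", strip_cjk=False))
print(slug_sketch("Hello World\u3002"))
```

The first call models the behavior without the listed characters and the second models the behavior with them, so the pair mirrors the "Before" and "After" lines above. Matching on the encoded byte sequence rather than the raw character mirrors the explicit-list pattern the description refers to.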