-
Notifications
You must be signed in to change notification settings - Fork 8.1k
Webcmdlets cleanup #19308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Webcmdlets cleanup #19308
Conversation
| private static Regex s_linkRegex; | ||
| private static Regex s_tagRegex; | ||
| [GeneratedRegex(@"([^""'>/=\s\p{Cc}]+)(?:\s*=\s*(?:""(.*?)""|'(.*?)'|([^'"">\s]+)))?", RegexOptions.Singleline | RegexOptions.IgnoreCase)] | ||
| private static partial Regex s_attribNameValueRegex(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the point of this change. The original Regex is compiled and works just as fast, but it is more compact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks to this change we can remove EnsureHtmlParser();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And what are the benefits? The final code is still noticeably larger, but the performance does not increase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will revert the regex stuff, move the remaining changes to the other webcmdlet cleanup PR and close this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's significant difference between RegexOptions.Compiled and GeneratedRegex. Regex Source Generator will not only reduce startup time but also increase performance.
I can't simply put it into words how much better generated regex is, thankfully Stephen Toub has great post about them here https://devblogs.microsoft.com/dotnet/regular-expression-improvements-in-dotnet-7/#source-generation (In fact, .NET team wants people to switch to GeneratedRegex so bad that they want to intercept every new Regex(..) call and delegate them to source generated one instead)
Stephen also talks about them here https://www.youtube.com/live/rwfNDyBBgks?t=1549
Nick Chapsas has also a quick summary of them https://www.youtube.com/watch?v=WosEhlHATOk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't simply put it into words how much better generated regex is,
When it comes to code performance neither words nor emotions work - you have to prove with tests that a particular code will become smaller and faster.
In this particular case, the code will only get bigger because PowerShell is dynamic and can never completely get rid of the Regex compiler. The initial improvements won't be noticeable since the initialization of the cmdlet takes a significant amount of time.
Where we would like Regex SG and where it is justified is the pwsh startup scenario. And there are indeed places where this SG can be applied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@iSazonov if you pointed me to such cases I could try to apply Regex SG
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When it comes to code performance neither words nor emotions work - you have to prove with tests that a particular code will become smaller and faster.
You're correct here. I should have provided data for my claims.
In this particular case, the code will only get bigger because PowerShell is dynamic and can never completely get rid of the Regex compiler. The initial improvements won't be noticeable since the initialization of the cmdlet takes a significant amount of time.
Where we would like Regex SG and where it is justified is the pwsh startup scenario. And there are indeed places where this SG can be applied.
I didn't realize this code wasn't used in the pwsh itself, I admit my mistake. In this case delaying parsing and doing all the complex IL emitting is justified, if this code is not called in a cmdlet.
This also makes me want to compare the size of the IL emitted for RegexOptions.Compiled and the IL compiled by Roslyn for the source generated one.. interesting.
|
This PR has Quantification details
Why proper sizing of changes matters
Optimal pull request sizes drive a better predictable PR flow as they strike a
What can I do to optimize my changes
How to interpret the change counts in git diff output
Was this comment helpful? 👍 :ok_hand: :thumbsdown: (Email) |
|
this was a huge dump of different files and programs to see if it could be handled thanks for your feedback !!!Sent from my iPhoneOn Mar 10, 2023, at 12:02 PM, Ilya ***@***.***> wrote:
@iSazonov commented on this pull request.
In src/Microsoft.PowerShell.Commands.Utility/commands/utility/WebCmdlet/Common/BasicHtmlWebResponseObject.Common.cs:
{
#region Private Fields
- private static Regex s_attribNameValueRegex;
- private static Regex s_attribsRegex;
- private static Regex s_imageRegex;
- private static Regex s_inputFieldRegex;
- private static Regex s_linkRegex;
- private static Regex s_tagRegex;
+ [GeneratedRegex(@"([^""'>/=\s\p{Cc}]+)(?:\s*=\s*(?:""(.*?)""|'(.*?)'|([^'"">\s]+)))?", RegexOptions.Singleline | RegexOptions.IgnoreCase)]
+ private static partial Regex s_attribNameValueRegex();
I don't see the point of this change. The original Regex is compiled and works just as fast, but it is more compact.
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>
|
PR Summary
EnsureHtmlParser()(no longer needed)PR Context
PR Checklist
.h,.cpp,.cs,.ps1and.psm1files have the correct copyright headerWIP:or[ WIP ]to the beginning of the title (theWIPbot will keep its status check atPendingwhile the prefix is present) and remove the prefix when the PR is ready.(which runs in a different PS Host).