Improve performance of CheckSuspiciousContent #18665

iSazonov · 2022-11-26T13:45:03Z

PR Summary

By commit:

- Use MIN_LEN and MAX_LEN constants for better readability.
- Skip hash searches for length < MIN_LEN.
  - Previously we searched for strings of length 1, 2, and 3, but the shortest pattern (Emit) has length 4.
- Use Span<> instead of an array.
  - Avoids array allocations.
  - Removes boundary checks in loops.
- Replace multiplication with more efficient operations.
  - Multiplying a value by 31 is the same as multiplying it by 32 and subtracting the value once, so the multiplication is replaced with a shift and a subtraction. (The compiler knows how to replace multiplication by a power of two with a shift, but we do this explicitly to be safe.)
- Improve to-lower-case.
  - Before: 4-8 operations, 6 in most cases.
  - After: 3-5 operations, 3 in most cases.
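
The multiply-by-31 rewrite from the list above can be checked in isolation. The following is a minimal sketch, not the PR's actual code: the helper names `Mul31` and `Mul31Shift` are hypothetical, and it only shows that the two forms agree for `uint` values, including ones that wrap around.

```csharp
using System;

// Hypothetical helpers for illustration; the PR's hashing code is not shown in this thread.
static uint Mul31(uint h) => unchecked(h * 31);

// 31 * h == 32 * h - h, and 32 * h is a left shift by 5.
static uint Mul31Shift(uint h) => unchecked((h << 5) - h);

// Compare both forms on a few values, including ones that overflow 32 bits.
foreach (uint h in new uint[] { 0, 1, 97, 0x7FFFFFFF, uint.MaxValue })
{
    Console.WriteLine($"{h}: {Mul31(h)} vs {Mul31Shift(h)}");
}
```

Both forms wrap modulo 2^32 in the same way, so previously computed hash values would stay valid under this rewrite alone.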

Results for Match("aaaaaaaEmitaaaaaaa"):

BenchmarkDotNet=v0.13.2, OS=Windows 10 (10.0.19044.2251/21H2/November2021Update)
Intel Core i5-2410M CPU 2.30GHz (Sandy Bridge), 1 CPU, 4 logical and 2 physical cores
.NET SDK=7.0.100
  [Host]     : .NET 7.0.0 (7.0.22.51805), X64 RyuJIT AVX
  DefaultJob : .NET 7.0.0 (7.0.22.51805), X64 RyuJIT AVX

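| Method | text | Mean | Ratio | Code Size | Allocated | Alloc Ratio |
|--------|------|-----:|------:|----------:|----------:|------------:|
| Orig | aaaaaaaEmitaaaaaaa | 423.8 ns | 1.00 | 4,445 B | 144 B | 1.00 |
| Fast | aaaaaaaEmitaaaaaaa | 266.0 ns | 0.63 | 4,493 B | - | 0.00 |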

PR Context

PR Checklist

SteveL-MSFT · 2022-11-28T17:30:27Z

LookUpHash() uses a large switch statement, would it be faster to change to a dictionary?

iSazonov · 2022-11-29T03:55:02Z

> LookUpHash() uses a large switch statement, would it be faster to change to a dictionary?

I tested this and found that the switch statement is faster (judging by the asm code, I guess the C# compiler tries to do a binary search).

daxian-dbw · 2022-11-29T23:50:47Z

src/System.Management.Automation/engine/runtime/CompiledScriptBlock.cs

Will blindly doing h |= 0x20 cause any problems? h could be any character, not just an ASCII one.
The original code is far more readable than this, and I doubt the perf gain is worth sacrificing the readability.

It is not a problem because we reject everything except A-Z, a-z, and -.
The perf win of the change is ~10% because:

- the old code is optimized for upper-case chars;
- the new code performs fewer operations for any char.

We are talking about improvements at the nanosecond level. I don't think it's the bottleneck for overall script execution, and because of that, I prefer to keep the original code for readability reasons.
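
The `h |= 0x20` question above can be illustrated with a small sketch. It is not the PR's code: `TryNormalize` is a hypothetical helper, and the only assumption taken from the thread is that everything except A-Z, a-z, and `-` is rejected. Setting bit 0x20 maps A-Z onto a-z and leaves a-z and `-` (0x2D) unchanged, and any other input falls outside the accepted range after the OR.

```csharp
using System;

// Hypothetical helper: returns true when 'c' is A-Z, a-z, or '-',
// and outputs the case-folded code unit in 'lower'.
static bool TryNormalize(char c, out uint lower)
{
    // Unconditionally set bit 0x20: 'A'..'Z' become 'a'..'z'; 'a'..'z' and '-' are unchanged.
    lower = (uint)c | 0x20;

    // Only values that land in 'a'..'z' after the OR are letters; '-' is checked on the raw input.
    return (lower >= 'a' && lower <= 'z') || c == '-';
}

foreach (char sample in new[] { 'E', 'm', '-', '@', '\u00C0' })
{
    bool accepted = TryNormalize(sample, out uint folded);
    Console.WriteLine($"U+{(int)sample:X4}: accepted={accepted}, value=0x{folded:X}");
}
```

In this sketch a non-ASCII character such as U+00C0 still has its bit 0x20 set, but the range check rejects it, so the folded value is never used for hashing.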

daxian-dbw · 2022-11-29T23:52:15Z

src/System.Management.Automation/engine/runtime/CompiledScriptBlock.cs

Why do a Slice here? Can't we use runningHash directly in the loop below?

The change allows the compiler to remove boundary checks in the loop.

Please don't use Slice here, so as to make a separate fix easier. Just replacing the array with a span would already be a great change.

I removed the Slice, but this stops the compiler from removing boundary checks in the loop.

We can have the Slice with comment:

// We need to cut the unused tail. Since 'i' is the current position, the actual length is 'i + 1'.
// E.g., if i = 0 (first char in the string) the span length is 1, if i = 1 (second char) the span length is 2, and so on.
Span<uint> rh = runningHash.Slice(0, Math.Min(i + 1, runningHash.Length));

Given that rh[j - 1] is used in the loop, I doubt the boundary check on that can be eliminated too.

Yes, this works for rh[j] only.
I was thinking about vectorization, but I assume that you will be against the complication of the code. :-)

Since we have #18693 with tests, can we bring the span Slice back?

Nope, let's not use Slice here, just to keep the code simple and readable.

If you want, ~~you can eliminate the call to Math.Min by doing j = val1 < val2 ? val1 : val2 directly before the loop.~~
Or better than that, have a local Min method that implements Math.Min. So, using Min(a, b) allows the jitter to inline the method as needed, while still giving us the same readable code.

TravisEz13 · 2022-11-30T16:30:05Z

src/System.Management.Automation/engine/runtime/CompiledScriptBlock.cs

If you are going to change how the hashes are generated, you need to update the hashes in the code.

still used on line 2095

TravisEz13 · 2022-11-30T16:30:40Z

src/System.Management.Automation/engine/runtime/CompiledScriptBlock.cs

still used on line 2095

daxian-dbw · 2022-11-30T18:30:18Z

src/System.Management.Automation/engine/runtime/CompiledScriptBlock.cs

Given this is not a for (i = 0; i < span.Length; i++) pattern, are you sure the boundary check will be eliminated?

Yes, this works. https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgEYBYAKGIGYACMhgYQYG8aHunHiUGAWQAUASg5ceUyADtcGBgEsZCwQEkAcgH0AMgFENDALwMUAbklTuAN2xQGAC2MMZMAO4MwDuwG1yABn8AXQtqKx4AM2gGYVt7RWd1bX1DOAZyMyUGAB5HADodGBkAcwwHTMUAakrRSytOMPCrYnIATmEHH0Ug0VCmhgBfOqHqAaA==

That doesn't reflect the actual situation of this method. This does.

Having Slice here moves the boundary check from within the loop to outside the loop, so it is better. Also, it doesn't affect readability, so it's good to have this change.
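
To make the Slice discussion above concrete, here is a minimal sketch of a running-hash table kept in a stack-allocated span and trimmed with `Slice` before the inner loop. It is not the code under review: `RollingSuffixHash`, `DirectHash`, and the window size `MaxLen = 8` are assumptions for illustration, and whether the JIT actually removes the bounds checks depends on the runtime version and is not asserted here.

```csharp
using System;

// Reference implementation: hash of a whole string with multiplier 31.
static uint DirectHash(string s)
{
    uint h = 0;
    foreach (char c in s)
    {
        h = unchecked(h * 31 + c);
    }
    return h;
}

// Returns the hash of the last 'len' characters of 'text' via the running-hash table.
// Requires 1 <= len <= Math.Min(text.Length, MaxLen).
static uint RollingSuffixHash(string text, int len)
{
    const int MaxLen = 8; // assumed window size, for illustration only
    Span<uint> runningHash = stackalloc uint[MaxLen]; // no heap allocation

    for (int i = 0; i < text.Length; i++)
    {
        uint h = text[i];

        // Cut the unused tail: at position i, at most i + 1 slots hold data.
        Span<uint> rh = runningHash.Slice(0, Math.Min(i + 1, runningHash.Length));

        // Walk from the longest window down so rh[j - 1] still holds the previous step's value.
        for (int j = rh.Length - 1; j > 0; j--)
        {
            rh[j] = unchecked((rh[j - 1] << 5) - rh[j - 1] + h);
        }

        rh[0] = h; // window of length 1 is just the current character
    }

    return runningHash[len - 1];
}

Console.WriteLine(RollingSuffixHash("aaaaaaaEmit", 4) == DirectHash("Emit"));
```

After each character, slot `j` holds the hash of the last `j + 1` characters, so a lookup for a pattern of a given length only needs to read one slot.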

ghost · 2022-12-09T02:00:56Z

This pull request has been automatically marked as Review Needed because there has not been any activity for 7 days.
Maintainer, please provide feedback and/or mark it as Waiting on Author.

iSazonov · 2022-12-16T17:51:52Z

@daxian-dbw I changed the direction of the loop so that all the boundary checks are removed. Previously, I removed all allocations and added a quick character filter. So this is now the fastest version.
In addition, I added a result length check, which should noticeably reduce the number of collisions in real-world scenarios. If Travis sees it in his tests, maybe you (the team) can approve this PR. Otherwise just close it.

daxian-dbw · 2022-12-16T22:59:30Z

@iSazonov @TravisEz13's comments were not addressed. Also, I doubt Travis still has his original tests available, but I will let Travis confirm.

pull-request-quantifier-deprecated · 2022-12-17T08:20:41Z

This PR has 29 quantified lines of changes. In general, a change size of up to 200 lines is ideal for the best PR experience!

Quantification details

Label      : Extra Small
Size       : +16 -13
Percentile : 11.6%

Total files changed: 1

Change summary by file extension:
.cs : +16 -13

Change counts above are quantified counts, based on the PullRequestQuantifier customizations.

iSazonov · 2022-12-17T08:21:25Z

The LOG const has been removed entirely.

ghost · 2022-12-24T14:00:55Z

This pull request has been automatically marked as Review Needed because there has not been any activity for 7 days.
Maintainer, please provide feedback and/or mark it as Waiting on Author.

iSazonov added the CL-Performance label Nov 26, 2022

iSazonov requested review from PaulHigin, SteveL-MSFT and daxian-dbw November 26, 2022 13:45

iSazonov assigned PaulHigin Nov 26, 2022

pull-request-quantifier-deprecated bot added the Extra Small label Nov 26, 2022

iSazonov force-pushed the perf-suspi branch from 9c40c0f to 9347244 November 26, 2022 16:50

daxian-dbw reviewed Nov 29, 2022

daxian-dbw reviewed Nov 29, 2022

PowerShell deleted a comment from iSazonov Nov 30, 2022

daxian-dbw reviewed Nov 30, 2022

TravisEz13 reviewed Nov 30, 2022

TravisEz13 requested changes Nov 30, 2022

ghost added the Waiting on Author label Nov 30, 2022

daxian-dbw reviewed Nov 30, 2022

ghost removed the Waiting on Author label Dec 1, 2022

ghost added the Review - Needed label Dec 9, 2022

iSazonov and others added 10 commits December 16, 2022 17:44

Add MIN_LEN and MAX_LEN constants

5e82951

Skip all before shortest suspicious string

42d8d40

Exclude allocations replacing array with Span

8759e13

Replace multiplication with more efficient operations

06ba42a

Improve to-lower-case

29d29ff

Apply suggestions from code review

0caf6ee

Co-authored-by: Dongbo Wang <dongbow@microsoft.com>

Remove Span

65a7438

Revert "Remove Span"

abbea70

This reverts commit c5ef64148cce901daf06476d95b07b12b286238b.

Use Span

50caf76

Revert loop and check length

213242e

iSazonov force-pushed the perf-suspi branch from c5ef641 to 213242e December 16, 2022 12:46

Fix length

abc97f3

ghost removed the Review - Needed label Dec 16, 2022

Replace LCG with 31

432845b

ghost added the Review - Needed label Dec 24, 2022

iSazonov closed this Apr 23, 2023

ghost removed the Review - Needed label Apr 23, 2023
