@daxian-dbw
Member

PR Summary

There has been refactoring of the code base to replace string.Format with string.Create(<interpolated-string>). The list of relevant PRs is captured in #18974 (comment).

After reviewing those PRs, I found that not all of the changes are appropriate, and some of them need to be reverted for one of the following 2 reasons:

  1. it's important to make the string template clear and easy to read.
  2. the resulting interpolated string contains method calls or long ternary expressions, which makes it way less readable.

Note that, since there are tons of changes, my review cannot be thorough enough to catch all those that need to be reverted. If anyone notices any changes from those PRs that are questionable but not captured in this PR, please leave a comment or submit a new PR.

PR Checklist

Collaborator

@iSazonov iSazonov left a comment

@daxian-dbw I'm sorry if this annoys you.


  1. it's important to make the string template clear and easy to read.

I don't remember anyone changing these formatting lines; it happens very rarely. I mean, this code is not something anybody has to read and analyze all the time. On the other hand, there are the benefits of reduced allocations and increased productivity.

  2. the resulting interpolated string contains method calls or long ternary expressions, which makes it way less readable.

I agree that they could have been taken out of the format strings where possible.


Instead of reverting I have another suggestion. We could add comments with the old formatting lines before the new code. @daxian-dbw What do you think about this compromise? We could ask CarloToso to do this.

New code:

// Format: "(?<{0}>){1}"
patterns.Add(string.Create(CultureInfo.InvariantCulture, $"(?<{FullTextRuleGroupName}>){ValuePattern}"));

Old code:

patterns.Add(string.Format(CultureInfo.InvariantCulture, "(?<{0}>){1}", FullTextRuleGroupName, ValuePattern));
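For comparison, the two forms above can be placed side by side in a small, self-contained sketch. The class name PatternDemo, the parameter names, and the sample values "FullText" and ".+" are hypothetical stand-ins for the constants in the real code; the sketch assumes .NET 6 or later, where the string.Create overload that takes an interpolated string is available.

```csharp
using System;
using System.Globalization;

// Both calls build the same regex fragment from the same two inputs.
Console.WriteLine(PatternDemo.ViaFormat("FullText", ".+")); // (?<FullText>).+
Console.WriteLine(PatternDemo.ViaCreate("FullText", ".+")); // (?<FullText>).+

static class PatternDemo
{
    // Old style: the template "(?<{0}>){1}" is visible in one piece.
    public static string ViaFormat(string groupName, string valuePattern) =>
        string.Format(CultureInfo.InvariantCulture, "(?<{0}>){1}", groupName, valuePattern);

    // New style: the arguments are inlined into the template.
    public static string ViaCreate(string groupName, string valuePattern) =>
        string.Create(CultureInfo.InvariantCulture, $"(?<{groupName}>){valuePattern}");
}
```

The output is identical; the disagreement in this thread is only about which form is easier to read and how much the allocation difference matters.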

Comment on lines 2537 to 2543
sb.AppendFormat(
CultureInfo.InvariantCulture,
" {0}{1} {2}{3};\n",
MapAttributesToMof(enumNames, attributes, embeddedInstanceType),
mofType,
member.Name,
arrayAffix);
Collaborator

This will only increase the number of allocations in the parser and make it slower.

Member Author

@daxian-dbw daxian-dbw Jan 24, 2023

In this particular case, sb.AppendFormat will incur an array allocation because there are 4 objects passed in.
Given that this is in a loop, I updated the code to continue using sb.Append(...), but in a readable manner.
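The updated code itself is not shown in this thread, so the following is only a hypothetical sketch of what a readable sb.Append chain for the same template could look like. The class MofDemo, its parameter names, and the sample values are assumptions; the method-call argument from the original snippet is replaced by a plain string.

```csharp
using System;
using System.Globalization;
using System.Text;

Console.Write(MofDemo.WithAppendFormat("[key] ", "string", "Name", "[]"));
Console.Write(MofDemo.WithAppend("[key] ", "string", "Name", "[]"));

static class MofDemo
{
    // Four format arguments: this call binds to the params overload,
    // which packs the arguments into a temporary array.
    public static string WithAppendFormat(string attributes, string mofType, string memberName, string arrayAffix)
    {
        var sb = new StringBuilder();
        sb.AppendFormat(
            CultureInfo.InvariantCulture,
            " {0}{1} {2}{3};\n",
            attributes,
            mofType,
            memberName,
            arrayAffix);
        return sb.ToString();
    }

    // One Append per piece of the template: no argument array and no
    // format-string parsing, and the output layout stays easy to follow.
    public static string WithAppend(string attributes, string mofType, string memberName, string arrayAffix)
    {
        var sb = new StringBuilder();
        sb.Append(' ')
          .Append(attributes)
          .Append(mofType)
          .Append(' ')
          .Append(memberName)
          .Append(arrayAffix)
          .Append(";\n");
        return sb.ToString();
    }
}
```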

Collaborator

Such allocations will be removed in .NET 8.0 (for all APIs with params), so you can revert to sb.AppendFormat if you prefer.

Member Author

Since the updated code is already readable, I'm fine keeping it as is.

Member Author

But a question, @iSazonov: if .NET 8 is going to support 4 or more formatting objects in string.Format or StringBuilder.AppendFormat, then will most of the changes to string.Create(<interpolated-string>) be in vain?

Collaborator

This will only remove the array allocation for the arguments. It is one more optimization.
Interpolated strings already use InterpolatedStringHandler, which utilizes stackalloc to avoid allocations entirely in the formatting process. It also avoids parsing the format string again and again. (A new API, CompositeFormat, was added this week to cache format parsing.)
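As a rough illustration of the CompositeFormat idea mentioned above, the sketch below parses a template once and reuses it, which keeps the template readable in one place while avoiding repeated parsing. It assumes .NET 8 or later, where System.Text.CompositeFormat and the matching string.Format overloads exist; the class name CachedTemplateDemo and the sample values are hypothetical and not taken from the PowerShell code base.

```csharp
using System;
using System.Globalization;
using System.Text;

Console.WriteLine(CachedTemplateDemo.Build("FullText", ".+")); // (?<FullText>).+

static class CachedTemplateDemo
{
    // The template is parsed once, when the class is initialized,
    // and stays visible as a single string.
    private static readonly CompositeFormat s_pattern =
        CompositeFormat.Parse("(?<{0}>){1}");

    // Each call reuses the pre-parsed template instead of parsing it again.
    public static string Build(string groupName, string valuePattern) =>
        string.Format(CultureInfo.InvariantCulture, s_pattern, groupName, valuePattern);
}
```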

Contributor

@PaulHigin PaulHigin left a comment

LGTM
I agree that this is much more readable and maintainable. @daxian-dbw thanks for making the changes.

@daxian-dbw
Member Author

daxian-dbw commented Jan 24, 2023

@iSazonov I totally understand the good intention behind those changes -- some temporary array allocations will be saved for those string.Format usages with more than 3 formatting arguments. However, readability and stability are equally important (if not more important), and therefore, the perf change should not sacrifice readability and stability.

For remoting connection strings, message packets, and WSMan XML, it's best to not touch this code unnecessarily for stability reasons, and certainly not to make it less readable.

Instead of reverting I have another suggestion. We could add comments with the old formatting lines before the new code.

Comments easily get out of sync with the code. Some string.Format instances don't have the overhead of temp array allocation. Others that do have the overhead may not be important to optimize (such as the ToString overloads in Binders). If it's more important to keep the template clear and readable, it's OK to not optimize it with the interpolated string. This micro-optimization is a good add-on when it's appropriate, but it's not necessary to apply it everywhere unconditionally.

@daxian-dbw daxian-dbw added the CL-CodeCleanup Indicates that a PR should be marked as a Code Cleanup change in the Change Log label Jan 24, 2023
@daxian-dbw daxian-dbw closed this Jan 24, 2023
@daxian-dbw daxian-dbw reopened this Jan 24, 2023
@pull-request-quantifier-deprecated

This PR has 380 quantified lines of changes. In general, a change size of up to 200 lines is ideal for the best PR experience!


Quantification details

Label      : Large
Size       : +298 -82
Percentile : 78%

Total files changed: 35

Change summary by file extension:
.cs : +298 -82

Change counts above are quantified counts, based on the PullRequestQuantifier customizations.


@daxian-dbw daxian-dbw merged commit b195088 into PowerShell:master Jan 25, 2023
@daxian-dbw daxian-dbw deleted the revert branch January 25, 2023 17:46
@daxian-dbw daxian-dbw assigned daxian-dbw and unassigned TravisEz13 Jan 25, 2023
@ghost

ghost commented Mar 14, 2023

🎉 v7.4.0-preview.2 has been released, which incorporates this pull request. 🎉

@trackd

trackd commented Oct 19, 2023

I'm not sure if this is by design, but it would not be my preference.

commit

    # 7.3.8
    PS7.3> pwsh -nop -c '(Get-Date | Group-Object).Name'
    2023-10-19 13:28:30
    # 7.4.0-preview.6
    PS7.4> pwsh -nop -c '(Get-Date | Group-Object).Name'
    10/19/2023 13:28:26

I don't think datetime values should be converted to InvariantCulture in general; it's a bit confusing for anyone outside the US.

Perhaps a more "common" use case:
gci . | group CreationTime
now returns all dates in the InvariantCulture format (MM/dd/yyyy HH:mm:ss) as well.

I'm not sure if it's affecting other commands.

@jhoneill

The use of Invariant culture needs enormous care

As @trackd mentions above, it results in "least significant in the middle" date formatting, which is considered broken outside the US, and it uses "." as a decimal separator, which is not used in France, Germany, etc. (Worse, "." is the thousands separator in some cultures.)

Any replacement of a .ToString(), which uses the local culture, with something that is effectively .ToString(US Culture) will appear to non-US users as a regression.
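The effect described above can be reproduced in isolation with a small sketch. It is not the code changed by this PR; the class CultureDemo and its method names are hypothetical. It only shows how the same value renders under CultureInfo.InvariantCulture versus a named culture. The de-DE lines assume that culture data is available at run time (that is, globalization-invariant mode is off), and their output is deliberately not stated here.

```csharp
using System;
using System.Globalization;

var when = new DateTime(2023, 10, 19, 13, 28, 30);
double amount = 1234.5;

// Invariant culture: month first, "." as the decimal separator.
Console.WriteLine(CultureDemo.InvariantDate(when));     // 10/19/2023 13:28:30
Console.WriteLine(CultureDemo.InvariantNumber(amount)); // 1,234.5

// A named culture: output depends on that culture's conventions.
Console.WriteLine(CultureDemo.DateFor(when, "de-DE"));
Console.WriteLine(CultureDemo.NumberFor(amount, "de-DE"));

static class CultureDemo
{
    // Equivalent to interpolating the value with an explicit invariant provider.
    public static string InvariantDate(DateTime value) =>
        string.Create(CultureInfo.InvariantCulture, $"{value}");

    public static string InvariantNumber(double value) =>
        value.ToString("N1", CultureInfo.InvariantCulture);

    // Format with the conventions of the culture named by cultureName.
    public static string DateFor(DateTime value, string cultureName) =>
        value.ToString(CultureInfo.GetCultureInfo(cultureName));

    public static string NumberFor(double value, string cultureName) =>
        value.ToString("N1", CultureInfo.GetCultureInfo(cultureName));
}
```

A plain value.ToString() or $"{value}" without a provider uses CultureInfo.CurrentCulture, which is why swapping it for an invariant-culture call changes what users outside the US see.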

Labels

CL-CodeCleanup Indicates that a PR should be marked as a Code Cleanup change in the Change Log Large

6 participants