Optimization batch 10: avoid detecting even more irrelevant renames #853
newren wants to merge 8 commits into gitgitgadget:temporary/ort-perf-batch-9
/submit
This patch series was integrated into seen via git@2c6478c.
This patch series was integrated into seen via git@40cd16d.
There was a status update about the branch: I made a mistake of picking these up before they got sufficient exposure to the reviewers and ended up a source of potential mess when it turns out that any of the earlier ones need rewriting (I probably should stop picking up nested series that exceeds reviewer bandwidth), but how ready is this and subsequent topics?
This patch series was integrated into seen via git@8f5d9e1.
This patch series was integrated into seen via git@be266b8.
This patch series was integrated into seen via git@36fe10e.
This patch series was integrated into seen via git@41a1fe2.
This patch series was integrated into seen via git@eb724bc.
This patch series was integrated into seen via git@16bde12.
This patch series was integrated into seen via git@4218b45.
There was a status update about the branch: Various rename detection optimization to help "ort" merge strategy backend.
This patch series was integrated into seen via git@b12fdbd.
This patch series was integrated into seen via git@1523dff.
This patch series was integrated into next via git@9826d89.
This patch series was integrated into seen via git@d154185.
This patch series was integrated into seen via git@946bfd0.
There was a status update about the branch: Various rename detection optimization to help "ort" merge strategy backend. Will merge to 'master'.
This patch series was integrated into seen via git@0104f53.
There was a status update in the "Cooking" section about the branch: Various rename detection optimization to help "ort" merge strategy backend. Will merge to 'master'.
This patch series was integrated into seen via git@01216b2.
This patch series was integrated into seen via git@1381089.
There was a status update in the "Cooking" section about the branch: Various rename detection optimization to help "ort" merge strategy backend. Will merge to 'master'.
This patch series was integrated into seen via git@e2e1a03.
This patch series was integrated into next via git@e2e1a03.
This patch series was integrated into master via git@e2e1a03.
Closed via e2e1a03.
This series depends on ort-perf-batch-9.
=== Basic Optimization idea ===
This series adds additional special cases where detection of renames
is irrelevant, where the irrelevance is due to the fact that the merge
machinery will arrive at the same result regardless of whether a
rename is detected for any of those paths. That high-level wording
makes it sound the same as ort-perf-batch-9, and basically it is; this
series just tries to take the optimization a step further.
As noted in the last series, there are two reasons that the merge
machinery needs renames:
ort-perf-batch-9 provided a rough approximation for the second
criterion that was good enough, but which still left us detecting more
renames than necessary. This series focuses further on that criterion
and finds ways to avoid the need to detect as many renames while still
detecting directory renames identically to before. Thus, this series
is an improvement on "Optimization #2" from my Git Merge 2020 talk[1].
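To make the notion of an "irrelevant" rename concrete, here is a minimal sketch of how a pre-filter on rename sources might look. It is not the code from this series. The function name, its parameters, and both relevance rules are assumptions made for illustration: a deleted path is treated as relevant only if the other side modified it (so a content merge would be needed), or if the other side added a file to the same directory (so a directory rename could change where that new file ends up). For brevity the sketch only looks at the immediate parent directory.

```python
import posixpath


def relevant_rename_sources(deleted_on_side, modified_on_other, added_on_other):
    """Return the deleted paths for which rename detection could change the merge.

    Hypothetical helper, not a Git API. The two relevance rules below are
    illustrative assumptions, not the exact rules used by merge-ort.
    """
    # Directories that received new files on the other side of history.
    # Under the assumption above, only these directories can be affected
    # by a directory rename on this side.
    dirs_with_additions = {posixpath.dirname(p) for p in added_on_other}

    relevant = set()
    for path in deleted_on_side:
        # Assumed rule 1: the other side changed this file's content.
        needs_content_merge = path in modified_on_other
        # Assumed rule 2: the other side added a file next to this one.
        needs_dir_rename = posixpath.dirname(path) in dirs_with_additions
        if needs_content_merge or needs_dir_rename:
            relevant.add(path)
    return relevant


if __name__ == "__main__":
    deleted = ["src/a.c", "src/b.c", "docs/x.txt", "lib/y.c"]
    print(sorted(relevant_rename_sources(deleted, {"src/a.c"}, ["docs/new.txt"])))
```

Every path filtered out this way never has to be compared against candidate rename targets, which is where the time savings would come from in such a scheme.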
=== Results ===
For the testcases mentioned in commit 557ac03 ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
the changes in just this series improve the performance as follows:
While those results may look somewhat meager, it is important to note
that the previous optimizations have already reduced rename detection
time to nearly 0 for these particular testcases so there just isn't
much left to improve. The final patch in the series shows an
alternate testcase where the previous optimizations aren't as
effective (a simple cherry-pick of a commit that simply adds one new
empty file), where there was a speedup factor of approximately 3 due
to this series:
There was also another testcase at $DAYJOB where I saw a factor of 7
improvement from this particular optimization, so it certainly has the
potential to help when the previous optimizations are not quite
enough.
As a reminder, before any merge-ort/diffcore-rename performance work,
the performance results we started with (as noted in the same commit
message) were:
[1] https://github.com/newren/presentations/blob/pdfs/merge-performance/merge-performance-slides.pdf
cc: Derrick Stolee <dstolee@microsoft.com>
cc: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
cc: Jonathan Tan <jonathantanmy@google.com>
cc: Taylor Blau <me@ttaylorr.com>
cc: Derrick Stolee <stolee@gmail.com>
cc: Elijah Newren <newren@gmail.com>