Skip to content

Conversation

@bergmeister
Copy link
Contributor

@bergmeister bergmeister commented May 27, 2018

PR Summary

Related: #6944

Reduce PR build time by 5 minutes by:

  • Having Packaging as a separate build job in a matrix -> runs in parallel in PR builds because the Microsoft account is a paid account that allows that (at no additional costs)
  • Not caching the dotnet folder any more, which is too large and the overhead of zip/unzip/upload/download does not pay off (and fails in forked builds that are on a free AppVeyor account due to the size).
  • Setting the environment variable DOTNET_SKIP_FIRST_TIME_EXPERIENCE to 1 because the initialization of the dotnet cli cache (1 minute) does not pay off for the whole build.

The total build time of builds on a fork that is on a free AppVeyor account and therefore do not have parallelism, remains the same due to the time saving of redundant caching.

This is just a simple example of what we can easily achieve, we could continue this pattern and split the test runs as per the referenced issue to bring PR builds down to 10 minutes (but this will incur an increase for fork builds on free AppVeyor accounts)

PR Checklist

@iSazonov
Copy link
Collaborator

iSazonov commented May 28, 2018

Seems we don't want publish packages if tests failed.
And

matrix:
  fast_finish: true

doesn't help.

@iSazonov
Copy link
Collaborator

Restoring build cache take 4 minutes! Unzipping - can we cache unzipped data?

… forks for a fail-fast approach and to get deliverables early)

-Remove caching dotnet, which does not seem to pay off due to high zipping and unzipping costs
-Set DOTNET_SKIP_FIRST_TIME_EXPERIENCE to 1 to save 1 minute for unnecessary cache initialization since now we always install the sdk from fresh
@bergmeister
Copy link
Contributor Author

bergmeister commented May 28, 2018

@iSazonov fast_finish: true would still be helpful in the sense that running tests don't continue to run if the packaging build is broken. I don't see a reason for not publishing the packages even if some tests fail later on because if it is something that is just a prototype/experiment then I would not want to be blocked on that, in fact, the packages are the first thing that I want from a build.
I do think that trying to cache dotnet does not pay off the zip/unzip/upload/download costs, instead of just installing the required dotnet SDK anyway (which takes only 12 seconds to download, extract and install). Especially in forks, the cache is too big for the free version, therefore that time is wasted because it fails at the end.
I now set an environment variable DOTNET_SKIP_FIRST_TIME_EXPERIENCE to 1 to avoid unnecessary cache initialisation in this case, which saves around one minute

@bergmeister bergmeister changed the title WIP PoC of using Appveyor matrix for faster builds Use Appveyor matrix for faster PR builds May 28, 2018
@iSazonov
Copy link
Collaborator

Make sense set DOTNET_SKIP_FIRST_TIME_EXPERIENCE in second job too?

… it makes the 2nd build slower (it did before but maybe it was a bad run
@iSazonov
Copy link
Collaborator

Seems we could decouple still two job - for admin tests and xUnit tests. If first we start CI-tagged tests and then packaging, admin-tagged tests and xUnit tests we get minimal time about 14 minutes.

@bergmeister
Copy link
Contributor Author

bergmeister commented May 28, 2018

Yes, that's why this PR is not supposed to close the issue but yes, this is the way to go, I have outlined ideas like that in the issue. This PR is just the first step to show the direction without disadvantages for builds on a fork with a free account. Bear in mind that for every additional build job, people on non-free AppVeyor accounts incur a 2 minute penalty on their fork builds (i.e. their own builds before they have submitted a PR) due to it having to bootstrap the environment.
In general, I'd prefer to drip-feed those improvements one by one into master instead of having one big PR. This way we can also split the work should it get more technical with XUnit/Admin tests.
From my point of view this PR is complete as it leads to an overall improvement on both sides (PR and fork builds) without any disadvantages.

@TravisEz13
Copy link
Member

I'm not sure this will speed things up. It may actually slow things down as we often hit our maximum number of jobs. We would need more budget to make this actually work. @SteveL-MSFT What do you think?

@bergmeister
Copy link
Contributor Author

bergmeister commented May 31, 2018

I ran this also on my fork (which has only a concurrency of 1 because I am cheap 😛 ), the time was the same (due to time savings of dumping the dotnet caching and skipping the dotnet initialization). Therefore at least, when job pipelines are available (which probably applies more to me due due to my difference in time-zone and more frequent weekend work), then there will be a speedup. Of course a more efficient split-up would be to do something more expensive than packaging in the 2nd build but this is just the first step to show the future path


# cache version - netcoreapp.2.1-sdk.2.1.300-rc1
cache:
- '%LocalAppData%\Microsoft\dotnet -> appveyor.yml'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is here for reliability. Please don't remove

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying the download of the SDK sometimes fails? Usually cache makes things less reliable due to leftovers. The reason for removal was the amount of time it was taking and the fact that it often also fails due to its size.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any download sometimes fails.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but then shouldn't we rather cache just what we have downloaded instead of caching the appdata folder of dotnet?

@TravisEz13
Copy link
Member

Does DOTNET_SKIP_FIRST_TIME_EXPERIENCE mean dotnet cli doesn't download the files? We weren't caching the dotnet cache for perf but for reliability.

@bergmeister
Copy link
Contributor Author

bergmeister commented May 31, 2018

@TravisEz13
DOTNET_SKIP_FIRST_TIME_EXPERIENCE skips the dotnet internal initial cache performance optimisation (that is only useful for long term performance but not when one uses dotnet only a couple of times in a build). It only happens on the very first dotnet command after a fresh install. This process takes about a minute, which is time that we never get back.
For details see e.g. here:
http://donovanbrown.com/post/Stop-wasting-time-during-NET-Core-builds

# version is set in tools\appveyor.psm1 - Invoke-AppVeyorInstall
environment:
POWERSHELL_TELEMETRY_OPTOUT: 1
DOTNET_SKIP_FIRST_TIME_EXPERIENCE: 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @DarqueWarrior 😄

Copy link
Member

@TravisEz13 TravisEz13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good if we add a comment about what the skip first-time experience variable means.

…o AppVeyorMatrix

and add requested comment

# Conflicts:
#	appveyor.yml
@bergmeister
Copy link
Contributor Author

bergmeister commented Jun 4, 2018

@TravisEz13 Ok, done. What are your thoughts on the concern about reliability due to removing the CLI cache and instead always just installing it from scratch. PSScriptAnalyzer does this as well and I have not seen a sporadic failure due to that yet.

@TravisEz13
Copy link
Member

@bergmeister I think we can try it and see what happens.

matrix:
fast_finish: true

# cache version - netcoreapp.2.1-sdk.2.1.300
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please leave this comment in place? It's needed to flush the nuget package cache when we update the sdk or runtime for the build.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I didn't notice this. Yes, please add this back

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, added back

appveyor.yml Outdated
# version is set in tools\appveyor.psm1 - Invoke-AppVeyorInstall
environment:
POWERSHELL_TELEMETRY_OPTOUT: 1
DOTNET_SKIP_FIRST_TIME_EXPERIENCE: 1 # avoid expensive initialization of dotnet cli, see: http://donovanbrown.com/post/Stop-wasting-time-during-NET-Core-builds
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is better to place comments on new line.

image: Visual Studio 2017
# Stop all jobs in the matrix if any one them fail
matrix:
fast_finish: true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we split jobs should we fast stop?

Copy link
Contributor Author

@bergmeister bergmeister Jun 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was: if the build itself is broken, then both jobs will fail at the same stage anyway because they both need to build. The only undesired case would be when adding/updating new 3rd party packages and packaging is broken then the tests would abort early. I am happy to revert this if you wish

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Closed.

@iSazonov
Copy link
Collaborator

iSazonov commented Jun 6, 2018

@bergmeister Thanks for the contribution!

TravisEz13 pushed a commit that referenced this pull request Jun 11, 2018
… time by another 5 minutes (#7021)

Closes #6944

Following PR #6945, further, reduce the total build time (without any disadvantages) by around 5 minutes by making sure there is a more even split between the 2 build jobs (the 2nd build job used to be much shorter).
Therefore this PR moves also the xUnit and Pester-Admin tests into the 2nd build job. If it is a daily/feature test commit, then the feature tests will also happen (only) in the 2nd build job. Because both jobs now run tests, the failfast option was removed. The final question from my side is whether running tests in 2 build jobs is OK for the daily build, which uploads code coverage results?

The time to wait for the AppVeyor build results is now 15 +/- 2 minutes, which is a huge improvement to what used to be around 28 minutes before the build matrix was introduced.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants