Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #62451 +/- ##
=======================================
Coverage 89.70% 89.71%
=======================================
Files 691 691
Lines 213935 213963 +28
Branches 41050 41047 -3
=======================================
+ Hits 191907 191950 +43
- Misses 14095 14106 +11
+ Partials 7933 7907 -26
🚀 New features to boost your workflow:
|
| assert.strictEqual(ports[ports.length - 1], port); | ||
| makeRequest(url, agent, common.mustCall((port) => { | ||
| assert.strictEqual(ports[ports.length - 1], port); | ||
| server.closeAllConnections(); |
There was a problem hiding this comment.
Why isn't agent.destroy() below sufficient?
There was a problem hiding this comment.
That's part of what I'm investigating. I'm running a stress test of this change on windows to determine if that deals with the flakiness. If it does, a follow up will be to figure out why the destroy isn't sufficient.
What's apparent is that there is a race condition on cleanup. Where and exactly why still needs to be determined along with a long term fix.
| }, common.mustCall((res) => { | ||
| assert.strictEqual(res.statusCode, 200); | ||
| res.on('end', () => { | ||
| server.closeAllConnections(); |
There was a problem hiding this comment.
What is preventing the socket from being closed normally? Is the keep-alive timeout longer than the test timeout? and if it is the case, can keep-alive be disable? I don't think it is needed for this test.
There was a problem hiding this comment.
Thinking through this... disabling keep-alive might address the issue for the various test-http(s)-* tests but there are tcp/tls tests also failing with the same crash that follow the same pattern, suggesting that the issue is a bit more fundamental than just using keep-alive.
This comment was marked as outdated.
This comment was marked as outdated.
|
Hmm... the CI stress test job seems to not actually be working correctly on most of the Windows configurations. |
This comment was marked as outdated.
This comment was marked as outdated.
|
@joyeecheung or @addaleax ... curious if either of you have thoughts on the root cause here. The flakes are just |
|
I don't have a Windows machine at hand at the moment but can you reproduce locally or is it only in CI? Did you try with |
|
I haven't even booted my windows machine in over two years. Once I get it back up and updated I'll see if I can reproduce locally and, if so, I'll try to get a usable dump from from it. |
8297908 to
89cd5ab
Compare
This is an attempt to fix the flaky failures on http(s) and tcp tests on Windows.
89cd5ab to
374cc83
Compare
The test-https-server-options-incoming-message and
test-https-server-options-server-response tests can
be flaky on Windows.
Opencode/Opus is helping figure out why.
If this PR makes the tests non-flaky it's just a bandaid. There appears to be a race condition on cleanup on windows where shutting down connections races with shutting down the server. We'd still need to identify specifically where that race is occurring and where to fix it specifically. Right now, however, my goal is to unblock CI flakes.
The flakiness does seem generalized to any
test-http(s)-*and not just one... which makes me strongly suspect the race is quite low in the stack, possibly even in libuv but that's yet to be determined precisely.