Skip to content

Conversation

@bwoebi
Copy link
Contributor

@bwoebi bwoebi commented Feb 18, 2020

Consider the following scenario:

uv_poll_init(loop, poll, fd);
uv_poll_start(poll, UV_READABLE, cb);
// the cb gets invoked etc.
uv_poll_stop(poll);

// some time later:
uv_poll_init(loop, otherpoll, fd);
uv_poll_start(otherpoll, UV_READABLE, cb);

uv_close(poll); // uv__io_stop: Assertion `loop->watchers[w->fd] == w' failed.

Reasons to change this:

  • This was hard to track down. (In particular for users using the bindings for other languages, they don't know to debug the C state properly.)
  • This turned out to be a very dangerous issue due to the poll handle being closed as part of a delayed garbage collection. The point in time was very much non-deterministic and it would or would maybe not happen for a long time, depending on whether there was another poll handle alive at the time or not at this fd number.
  • Closing an already stopped handle should not be a failure case in general.

Fixes downstream issue discovered with the PHP bindings for libuv: amphp/ext-uv#81

@bwoebi bwoebi requested a review from bnoordhuis February 18, 2020 09:56
@bwoebi bwoebi changed the title Do not stop the poll watcher twice fix: Do not stop the poll watcher twice Feb 18, 2020
@saghul
Copy link
Member

saghul commented Feb 18, 2020

Creating a second poll handle over the same fd is not supported. Currently we do give an error if the handle is running, but we cannot detect it if the handle was stopped (maybe we could iterate the loop handle list?) so you are dancing in undefined behavior I think. Nothing prevents you from uv_poll_start -ing the old poll handle after you stopped it (this is allowed) so things would break.

The typical recommendattion here is to keep a map of fd-poll-handle at the application layer and rreuse the handle for the same fd.

@santigimeno
Copy link
Member

santigimeno commented Feb 18, 2020

Refs: #1172 (comment) with a description of possible solutions to the issue.

@bnoordhuis
Copy link
Member

Thanks for the PR but the scenario you describe is a bug in libuv usage: you're not supposed to have two uv_poll_t handles with overlapping lifetimes that reference the same file descriptor.

I can see how your change "fixes" the immediate issue but it really just hides an application bug. it'd be better to catch it in uv_poll_start().

uv_poll_init() already has an "is this fd in use?" check but that only catches bugs when the other poll handle is already started. Can you check if this patch works for your use case?

diff --git a/src/unix/poll.c b/src/unix/poll.c
index 3d5022b2..e43f4ec9 100644
--- a/src/unix/poll.c
+++ b/src/unix/poll.c
@@ -116,12 +116,22 @@ int uv_poll_stop(uv_poll_t* handle) {
 
 
 int uv_poll_start(uv_poll_t* handle, int pevents, uv_poll_cb poll_cb) {
+  uv__io_t** watchers;
+  uv__io_t* w;
   int events;
 
   assert((pevents & ~(UV_READABLE | UV_WRITABLE | UV_DISCONNECT |
                       UV_PRIORITIZED)) == 0);
   assert(!uv__is_closing(handle));
 
+  watchers = handle->loop->watchers;
+  w = &handle->io_watcher;
+
+  if ((unsigned) w->fd < handle->loop->nwatchers)
+    if (watchers[w->fd] != NULL)
+      if (watchers[w->fd] != w)
+        return UV_EEXIST;
+
   uv__poll_stop(handle);
 
   if (pevents == 0)

@bwoebi
Copy link
Contributor Author

bwoebi commented Feb 18, 2020

@saghul Which is sort of okay, if it wasn't for file descriptor reuse from the OS. I.e. the cycle collector closes the fd, a destructor creates a new fd and watches it. The cycle collector closes the poll handle and it explodes.
It's nothing users of PHP can have an influence on and thus if this is not handled, just makes using libuv a ticking time bomb in PHP.

@bwoebi
Copy link
Contributor Author

bwoebi commented Feb 18, 2020

@bnoordhuis This does not help - after all the original file descriptor has been stopped, just not closed yet. But it really does not help with the edge case where a file descriptor gets reused by bad luck.

The true alternative for me would be guarding the explicit uv_close by a destructor with a check whether the pevents == 0. But it's private API, so I cannot rely on that...

@bnoordhuis
Copy link
Member

But it really does not help with the edge case where a file descriptor gets reused by bad luck.

I don't think that's libuv job to fix. You'll have to make php-uv conform to libuv's API contract.

@bwoebi
Copy link
Contributor Author

bwoebi commented Feb 18, 2020

What really is unsupported, judging from the code, is having two poll handles active at the same fd at the same time. Everything else can be easily allowed without breaking anything.

@bwoebi
Copy link
Contributor Author

bwoebi commented Feb 18, 2020

It is not okay to have multiple active poll handles for the same socket, this can cause libuv to busyloop or otherwise malfunction. ~ http://docs.libuv.org/en/v1.x/poll.html

@bnoordhuis In this case the docs are at best misleading. Docs explicitly state active poll handles. Nothing about just having two initialized poll handles of which one is inactive. (Which, I see no reason for to malfunction.)

@bwoebi
Copy link
Contributor Author

bwoebi commented Feb 18, 2020

Additionally, the docs state:

However the fd can be safely closed immediately after a call to uv_poll_stop() or uv_close(). ~ http://docs.libuv.org/en/v1.x/poll.html

But it turns out I cannot, due to the behavior exposed. The current behavior requires me to first uv_close() before a close(2) call; uv_poll_stop() is not enough. (After all the docs tell me to always call uv_close() as well, if I don't want to leak memory.)

FWIW @saghul I cannot keep a file descriptor to handle map in PHP, because PHP does not expose file descriptors natively. I only can keep a map to a resource (which we do). But the resource number is not related to the fd number.

@bnoordhuis
Copy link
Member

Valid observation. Is this the behavior you're after?

diff --git a/src/unix/core.c b/src/unix/core.c
index 1e50f798..2315105e 100644
--- a/src/unix/core.c
+++ b/src/unix/core.c
@@ -899,13 +899,12 @@ void uv__io_stop(uv_loop_t* loop, uv__io_t* w, unsigned int events) {
   if (w->pevents == 0) {
     QUEUE_REMOVE(&w->watcher_queue);
     QUEUE_INIT(&w->watcher_queue);
+    w->events = 0;
 
-    if (loop->watchers[w->fd] != NULL) {
-      assert(loop->watchers[w->fd] == w);
+    if (w == loop->watchers[w->fd]) {
       assert(loop->nfds > 0);
       loop->watchers[w->fd] = NULL;
       loop->nfds--;
-      w->events = 0;
     }
   }
   else if (QUEUE_EMPTY(&w->watcher_queue))

@saghul
Copy link
Member

saghul commented Feb 18, 2020

FWIW @saghul I cannot keep a file descriptor to handle map in PHP, because PHP does not expose file descriptors natively. I only can keep a map to a resource (which we do). But the resource number is not related to the fd number.

But yiou could expose it in your libuv wrapper can't you? Someone must know it in order to passs it along to uv_poll_init.

@bwoebi
Copy link
Contributor Author

bwoebi commented Feb 18, 2020

@bnoordhuis Yes, looks good.

I think we should additionally add your uv_poll_start() change. So that the safeguard is still in place (and checking it at poll start is a much better time than at poll stop).

@bwoebi
Copy link
Contributor Author

bwoebi commented Feb 18, 2020

@saghul I could add a function to get the fd, yes. But the point of the PHP I/O layer is abstracting file descriptors away. Currently the fd is extracted from the PHP resource within the uv_poll_init() function of the PHP bindings, opaquely to the user.

@bwoebi
Copy link
Contributor Author

bwoebi commented Feb 18, 2020

@bnoordhuis Shall I still do anything about it / change my PR to incorporate your patch suggestions? Or will you handle it fully from here on?

Anyway, thanks for the quick and interactive replies :-)

@bnoordhuis
Copy link
Member

It's going to need regression tests so if you're up for writing those you're welcome to take my patches. If not, I'll get to it eventually but probably not today.

@bwoebi
Copy link
Contributor Author

bwoebi commented Feb 18, 2020

I'll give it a go at writing some regression test.

@bwoebi bwoebi force-pushed the patch-1 branch 2 times, most recently from 524de8f to 9557213 Compare February 18, 2020 15:13
@bwoebi
Copy link
Contributor Author

bwoebi commented Feb 18, 2020

@bnoordhuis Applied your patches and added a regression test for it.

@bwoebi
Copy link
Contributor Author

bwoebi commented Mar 5, 2020

@bnoordhuis bump ?

Copy link
Member

@bnoordhuis bnoordhuis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM apart from some stylistic issues.

Comment on lines 55 to 58
uv_poll_init_socket(uv_default_loop(), &first_poll_handle, sock);
uv_poll_init_socket(uv_default_loop(), &second_poll_handle, sock);

uv_poll_start(&first_poll_handle, UV_READABLE, NULL);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check the return values?

@bwoebi bwoebi force-pushed the patch-1 branch 2 times, most recently from 8776932 to 374ddca Compare March 17, 2020 11:16
@bwoebi
Copy link
Contributor Author

bwoebi commented Mar 17, 2020

@bnoordhuis Applied/fixed your suggestions and comments - should be fine now I hope :-)

Copy link
Member

@bnoordhuis bnoordhuis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bwoebi
Copy link
Contributor Author

bwoebi commented Mar 17, 2020

@bnoordhuis fixed the issue on linux, on windows apparently close_cb() is not called at all?
Is that a bug (in libuv in general) or expected windows behavior?

I was under the assumption that close gets always called on a next tick for polls, but I have no knowledge about the windows internals… Seems to be a bit delayed as windows seems to force a wait before closing? Can you please have a look how to best handle that in the test? Or just make the whole test unix only? EDIT: Or issue uv_stop() in close cb and not use run_once?

@bnoordhuis
Copy link
Member

Sorry, missed the notification. Here's a CI with this PR rebased against v1.x's tip, let's see how it fares: https://ci.nodejs.org/job/libuv-test-commit/1924/

@bnoordhuis
Copy link
Member

Seems there was some infra issue. New run: https://ci.nodejs.org/job/libuv-test-commit/1927/

Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests fail on Windows, otherwise, LGTM

Comment on lines 86 to 88
ASSERT(1 == uv_run(uv_default_loop(), UV_RUN_ONCE));
ASSERT(close_cb_called == 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ASSERT(1 == uv_run(uv_default_loop(), UV_RUN_ONCE));
ASSERT(close_cb_called == 1);
uv_unref(&second_poll_handle);
ASSERT(1 == uv_run(uv_default_loop(), UV_RUN_DEFAULT));
ASSERT(close_cb_called == 1);
uv_ref(&second_poll_handle);

Or add uv_stop to the close_cb, instead of doing unref

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This, or equivalent, is still needed (to make Win32 pass)

Copy link
Contributor Author

@bwoebi bwoebi Dec 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes the test fail on linux at least:
# Assertion failed in test/test-poll-multiple-handles.c on line 87: 1 == uv_run(uv_default_loop(), UV_RUN_DEFAULT)

I guess that's normal then given the poll ref disappearing. Needs to check == 0 then.

Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI failed on Windows, but otherwise, LGTM

@enumag
Copy link

enumag commented Oct 19, 2020

what's the status here?

@bwoebi
Copy link
Contributor Author

bwoebi commented Dec 27, 2020

@vtjnash Is there any chance to get help from you libuv maintainers to fix the CI for windows?

@vtjnash
Copy link
Member

vtjnash commented Dec 27, 2020

Rebase and I can re-run CI. I try to sweep up old issues every 6 months, if I can get the time. But often depend on others doing the work, so review can be quicker.

@bwoebi
Copy link
Contributor Author

bwoebi commented Dec 27, 2020

@vtjnash Thanks for the quick reply: rebased it now.

Consider the following scenario:

uv_poll_init(loop, poll, fd);
uv_poll_start(poll, UV_READABLE, cb);
// the cb gets invoked etc.
uv_poll_stop(poll);

close(fd);
fd = allocate_new_socket(); // allocate_new_socket() is assigned the same fd by "bad luck" from the OS

// some time later:
uv_poll_init(loop, otherpoll, fd);
uv_poll_start(otherpoll, UV_READABLE, cb);

uv_close(poll); // uv__io_stop: Assertion `loop->watchers[w->fd] == w' failed.

According to documentation, "however the fd can be safely closed
immediately after a call to uv_poll_stop() or uv_close()."
Though, in this scenario, we close()'d our file descriptor, and by
bad luck we got the same file descriptor again and register a new
handle for it and start polling.

Previously that would lead to an assertion failure, if we were to
properly free the original handle via uv_close().

This commit fixes that by moving the check whether a only a single
poll handle is active to uv_poll_start() instead of the stopping
routines.
@bwoebi
Copy link
Contributor Author

bwoebi commented Dec 28, 2020

@vtjnash Applied your suggested fix - can you please re-run CI now?

Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vtjnash vtjnash merged commit c9406ba into libuv:v1.x Dec 28, 2020
cjihrig added a commit to cjihrig/libuv that referenced this pull request Jan 5, 2021
cjihrig added a commit to cjihrig/libuv that referenced this pull request Jan 5, 2021
cjihrig added a commit that referenced this pull request Jan 16, 2021
Refs: #2686
PR-URL: #3088
Reviewed-By: Jameson Nash <vtjnash@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
JeffroMF pushed a commit to JeffroMF/libuv that referenced this pull request May 16, 2022
Consider the following scenario:

uv_poll_init(loop, poll, fd);
uv_poll_start(poll, UV_READABLE, cb);
// the cb gets invoked etc.
uv_poll_stop(poll);

close(fd);
fd = allocate_new_socket(); // allocate_new_socket() is assigned the same fd by "bad luck" from the OS

// some time later:
uv_poll_init(loop, otherpoll, fd);
uv_poll_start(otherpoll, UV_READABLE, cb);

uv_close(poll); // uv__io_stop: Assertion `loop->watchers[w->fd] == w' failed.

According to documentation, "however the fd can be safely closed
immediately after a call to uv_poll_stop() or uv_close()."
Though, in this scenario, we close()'d our file descriptor, and by
bad luck we got the same file descriptor again and register a new
handle for it and start polling.

Previously that would lead to an assertion failure, if we were to
properly free the original handle via uv_close().

This commit fixes that by moving the check whether a only a single
poll handle is active to uv_poll_start() instead of the stopping
routines.

Fixes: libuv#1172
Fixes: amphp/ext-uv#81
Fixes: https://github.com/b2wdigital/aiologger/issues/82
Fixes: JuliaDatabases/LibPQ.jl#140
PR-URL: libuv#2686
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Jameson Nash <vtjnash@gmail.com>
JeffroMF pushed a commit to JeffroMF/libuv that referenced this pull request May 16, 2022
Refs: libuv#2686
PR-URL: libuv#3088
Reviewed-By: Jameson Nash <vtjnash@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
liujinye-sys pushed a commit to open-vela/apps_system_libuv that referenced this pull request Jul 23, 2025
Consider the following scenario:

uv_poll_init(loop, poll, fd);
uv_poll_start(poll, UV_READABLE, cb);
// the cb gets invoked etc.
uv_poll_stop(poll);

close(fd);
fd = allocate_new_socket(); // allocate_new_socket() is assigned the same fd by "bad luck" from the OS

// some time later:
uv_poll_init(loop, otherpoll, fd);
uv_poll_start(otherpoll, UV_READABLE, cb);

uv_close(poll); // uv__io_stop: Assertion `loop->watchers[w->fd] == w' failed.

According to documentation, "however the fd can be safely closed
immediately after a call to uv_poll_stop() or uv_close()."
Though, in this scenario, we close()'d our file descriptor, and by
bad luck we got the same file descriptor again and register a new
handle for it and start polling.

Previously that would lead to an assertion failure, if we were to
properly free the original handle via uv_close().

This commit fixes that by moving the check whether a only a single
poll handle is active to uv_poll_start() instead of the stopping
routines.

Fixes: libuv/libuv#1172
Fixes: amphp/ext-uv#81
Fixes: https://github.com/b2wdigital/aiologger/issues/82
Fixes: JuliaDatabases/LibPQ.jl#140
PR-URL: libuv/libuv#2686
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Jameson Nash <vtjnash@gmail.com>
liujinye-sys pushed a commit to open-vela/apps_system_libuv that referenced this pull request Jul 23, 2025
Refs: libuv/libuv#2686
PR-URL: libuv/libuv#3088
Reviewed-By: Jameson Nash <vtjnash@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
liujinye-sys pushed a commit to open-vela/apps_system_libuv that referenced this pull request Dec 16, 2025
Consider the following scenario:

uv_poll_init(loop, poll, fd);
uv_poll_start(poll, UV_READABLE, cb);
// the cb gets invoked etc.
uv_poll_stop(poll);

close(fd);
fd = allocate_new_socket(); // allocate_new_socket() is assigned the same fd by "bad luck" from the OS

// some time later:
uv_poll_init(loop, otherpoll, fd);
uv_poll_start(otherpoll, UV_READABLE, cb);

uv_close(poll); // uv__io_stop: Assertion `loop->watchers[w->fd] == w' failed.

According to documentation, "however the fd can be safely closed
immediately after a call to uv_poll_stop() or uv_close()."
Though, in this scenario, we close()'d our file descriptor, and by
bad luck we got the same file descriptor again and register a new
handle for it and start polling.

Previously that would lead to an assertion failure, if we were to
properly free the original handle via uv_close().

This commit fixes that by moving the check whether a only a single
poll handle is active to uv_poll_start() instead of the stopping
routines.

Fixes: libuv/libuv#1172
Fixes: amphp/ext-uv#81
Fixes: https://github.com/b2wdigital/aiologger/issues/82
Fixes: JuliaDatabases/LibPQ.jl#140
PR-URL: libuv/libuv#2686
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Jameson Nash <vtjnash@gmail.com>
liujinye-sys pushed a commit to open-vela/apps_system_libuv that referenced this pull request Dec 16, 2025
Refs: libuv/libuv#2686
PR-URL: libuv/libuv#3088
Reviewed-By: Jameson Nash <vtjnash@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

not-stale Issues that should never be marked stale

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants