Commit 18e2862

libct/nsenter: fix extra runc re-exec on tmpfs
After adding some debug info to cloned_binary.c I found out that is_self_cloned() does not work correctly when the runc binary is on tmpfs, resulting in one extra re-exec of runc.

With some added debug:

    $ mkdir bin
    $ sudo mount -t tmpfs tmp bin
    $ sudo cp runc bin
    $ sudo ./bin/runc --debug exec xxx true
    DEBU[0000] nsexec[763590]: => is_self_cloned
    DEBU[0000] nsexec[763590]: got seals 1 (want 15)
    DEBU[0000] nsexec[763590]: <= is_self_cloned, is_cloned = 0
    DEBU[0000] nsexec[763590]: try_bindfd: 5
    DEBU[0000] nsexec[763590]: re-exec itself...
    DEBU[0000] nsexec[763590]: => is_self_cloned
    DEBU[0000] nsexec[763590]: got seals 1 (want 15)
    DEBU[0000] nsexec[763590]: <= is_self_cloned, is_cloned = 0
    DEBU[0000] nsexec[763590]: try_bindfd: -1
    DEBU[0000] nsexec[763590]: fallback to make_execfd: 5
    DEBU[0000] nsexec[763590]: re-exec itself...
    DEBU[0000] nsexec[763590]: => is_self_cloned
    DEBU[0000] nsexec[763590]: got seals 15 (want 15)
    DEBU[0000] nsexec[763590]: <= is_self_cloned, is_cloned = 1

From the above, we can see that
 - `is_self_cloned` returns 0,
 - `try_bindfd` is called and succeeds,
 - runc re-execs itself,
 - the second call to `is_self_cloned` returns 0 again
   (because F_GET_SEALS returns 1),
 - runc falls back to `make_execfd`, and re-execs again,
 - finally, the third `is_self_cloned` returns 1.

I guess that the code relied on the following (quoting fcntl(2)):

> Currently, file seals can be applied only to a file descriptor
> returned by memfd_create(2) (if the MFD_ALLOW_SEALING was employed).
> On other filesystems, all fcntl() operations that operate on seals
> will return EINVAL.

It looks like in the case of a file on tmpfs it returns 1 (F_SEAL_SEAL).

With the fix:

    DEBU[0000] nsexec[768367]: => is_self_cloned
    DEBU[0000] nsexec[768367]: got seals 1 (want 15)
    DEBU[0000] nsexec[768367]: no CLONED_BINARY_ENV
    DEBU[0000] nsexec[768367]: <= is_self_cloned, is_cloned = 0
    DEBU[0000] nsexec[768367]: try_bindfd: 5
    DEBU[0000] nsexec[768367]: re-exec itself...

    DEBU[0000] nsexec[768367]: => is_self_cloned
    DEBU[0000] nsexec[768367]: got seals 1 (want 15)
    DEBU[0000] nsexec[768367]: fstatfs says ro = 1
    DEBU[0000] nsexec[768367]: fstat says nlink = 1
    DEBU[0000] nsexec[768367]: <= is_self_cloned, is_cloned = 1

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
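The seal query discussed in the commit message can be tried outside runc. The following is a minimal sketch, not runc code: `has_exact_seals` and `demo_sealed_memfd` are hypothetical helper names, and `DEMO_MEMFD_SEALS` is assumed to mirror `RUNC_MEMFD_SEALS` (its four bits add up to the 15 shown as "want 15" in the log). The fallback constants are taken from `<linux/fcntl.h>` and are only used when the libc headers do not provide them.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for older libc headers; values come from <linux/fcntl.h>. */
#ifndef F_ADD_SEALS
#  define F_ADD_SEALS 1033
#  define F_GET_SEALS 1034
#endif
#ifndef F_SEAL_SEAL
#  define F_SEAL_SEAL   0x0001
#  define F_SEAL_SHRINK 0x0002
#  define F_SEAL_GROW   0x0004
#  define F_SEAL_WRITE  0x0008
#endif

/* Assumption: mirrors runc's RUNC_MEMFD_SEALS; the bits sum to 15. */
#define DEMO_MEMFD_SEALS \
	(F_SEAL_SEAL | F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE)

/*
 * Hypothetical helper: returns 1 only if fd carries exactly the wanted
 * seals. A failed fcntl() (-1) and a partial mask such as 1 both yield 0.
 */
static int has_exact_seals(int fd, int want)
{
	return fcntl(fd, F_GET_SEALS) == want;
}

/* Hypothetical helper: create a fully sealed memfd, or return -1. */
static int demo_sealed_memfd(void)
{
	/* 0x1 is MFD_CLOEXEC, 0x2 is MFD_ALLOW_SEALING. */
	int fd = (int)syscall(SYS_memfd_create, "demo", 0x3U);

	if (fd < 0)
		return -1;
	if (fcntl(fd, F_ADD_SEALS, DEMO_MEMFD_SEALS) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Calling `has_exact_seals()` on a descriptor opened from the tmpfs copy of the binary would be one way to reproduce the "got seals 1 (want 15)" case from the log; that step needs a tmpfs mount and is left out of the sketch.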
1 parent eddf35e commit 18e2862

2 files changed: +8 -5 lines changed

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -11,6 +11,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 * When Intel RDT feature is not available, its initialization is skipped,
   resulting in slightly faster `runc exec` and `runc run`. (#3306)
 
+### Fixed
+
+* In case the runc binary resides on tmpfs, `runc init` no longer re-execs
+  itself twice. (#3342)
+
 ## [1.1.0] - 2022-01-14
 
 > A plan depends as much upon execution as it does upon concept.

libcontainer/nsenter/cloned_binary.c

Lines changed: 3 additions & 5 deletions
@@ -137,7 +137,7 @@ static void *must_realloc(void *ptr, size_t size)
  */
 static int is_self_cloned(void)
 {
-	int fd, ret, is_cloned = 0;
+	int fd, is_cloned = 0;
 	struct stat statbuf = { };
 	struct statfs fsbuf = { };
 
@@ -153,11 +153,9 @@ static int is_self_cloned(void)
 	 * sharing it isn't a bad thing -- and an admin could bind-mount a sealed
 	 * memfd to /usr/bin/runc to allow re-use).
 	 */
-	ret = fcntl(fd, F_GET_SEALS);
-	if (ret >= 0) {
-		is_cloned = (ret == RUNC_MEMFD_SEALS);
+	is_cloned = (fcntl(fd, F_GET_SEALS) == RUNC_MEMFD_SEALS);
+	if (is_cloned)
 		goto out;
-	}
 
 	/*
 	 * All other forms require CLONED_BINARY_ENV, since they are potentially
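The control-flow change in this hunk can be modelled without any system calls. The sketch below is an illustration only, under the assumption that the wanted seal mask is 15 as printed in the debug log; `old_check_is_final`, `new_check_is_final` and `DEMO_WANT_SEALS` are hypothetical names that do not appear in runc. Each function answers one question: given the value returned by `fcntl(fd, F_GET_SEALS)`, does the seal check jump to `out` and skip the later checks?

```c
#include <stdbool.h>

/* Assumption: stands in for RUNC_MEMFD_SEALS ("want 15" in the log). */
enum { DEMO_WANT_SEALS = 15 };

/*
 * Pre-fix shape: any non-negative F_GET_SEALS result ends the check,
 * so a tmpfs file reporting seals == 1 is declared "not cloned" at once.
 */
static bool old_check_is_final(int seals_ret)
{
	return seals_ret >= 0;
}

/*
 * Post-fix shape: only an exact seal match ends the check; every other
 * result, including -1 and partial masks, falls through to later checks.
 */
static bool new_check_is_final(int seals_ret)
{
	return seals_ret == DEMO_WANT_SEALS;
}
```

For the value 1 that the log reports for a tmpfs-backed binary, the old shape stops with `is_cloned = 0`, while the new shape continues to the `CLONED_BINARY_ENV`, `fstatfs` and `fstat` checks that appear in the "With the fix" log.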
