Merged
bpo-34321: Add a trackfd parameter to mmap.mmap()
If *trackfd* is False, the file descriptor specified by *fileno*
will not be duplicated.
ZackerySpytz committed Apr 15, 2021
commit d42762cd6cf9c076b8e5f56c85f03ac61b03115c
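As a rough illustration of the behaviour described in the commit message, the sketch below maps a file with `trackfd=False` and keeps using the mapping after the original file object has been closed. `map_file` is a hypothetical helper, not part of the patch, and the `TypeError` fallback is an assumption to cope with interpreters whose `mmap.mmap()` does not accept the `trackfd` keyword.

```python
import mmap
import os
import tempfile


def map_file(path, trackfd=False):
    """Map *path* read-only (hypothetical helper, not part of the patch)."""
    with open(path, "rb") as f:
        try:
            # trackfd is the keyword proposed by this commit.
            return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ,
                             trackfd=trackfd)
        except TypeError:
            # Assumption: this interpreter lacks the keyword, so fall back
            # to the default behaviour, which duplicates the descriptor.
            return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.write(fd, b"spam")
    os.close(fd)
    m = map_file(path)
    # The file object opened inside map_file() is already closed here;
    # the mapping itself stays readable.
    print(m[:])
    m.close()
    os.remove(path)
```

The mapping remains usable because closing a descriptor does not unmap the region it was used to create; with `trackfd=False` the mmap object simply keeps no descriptor of its own.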
9 changes: 8 additions & 1 deletion Doc/library/mmap.rst
@@ -69,7 +69,8 @@ To map anonymous memory, -1 should be passed as the fileno along with the length

 .. audit-event:: mmap.__new__ fileno,length,access,offset mmap.mmap
 
-.. class:: mmap(fileno, length, flags=MAP_SHARED, prot=PROT_WRITE|PROT_READ, access=ACCESS_DEFAULT[, offset])
+.. class:: mmap(fileno, length, flags=MAP_SHARED, prot=PROT_WRITE|PROT_READ, \
+                access=ACCESS_DEFAULT[, offset], trackfd=True)
    :noindex:
 
    **(Unix version)** Maps *length* bytes from the file specified by the file
@@ -100,10 +101,16 @@ To map anonymous memory, -1 should be passed as the fileno along with the length
    defaults to 0. *offset* must be a multiple of :const:`ALLOCATIONGRANULARITY`
    which is equal to :const:`PAGESIZE` on Unix systems.
 
+   If *trackfd* is ``False``, the file descriptor specified by *fileno* will
Member

I'd love to have some idea of why I might want to use this parameter. Right now it only describes the downsides.

Contributor

On Windows, the internally duplicated handle probably references an open that lacks delete access. It thus prevents deleting the file, even if the mapped section otherwise allows it (e.g. the section is mapped readonly). For example:

>>> import mmap, os
>>> f = open('spam.txt')
>>> m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
>>> f.close()
>>> os.remove('spam.txt')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'spam.txt'

>>> # I manually closed the internal handle via Process Explorer.
>>> os.remove('spam.txt')
>>> m[:]
b'spam'

Member

That sounds like a reason to at least add the argument for all platforms, which I'm generally in favour of anyway. It can have more appropriate semantics on Windows if needed (i.e. "doesn't hold an extra HANDLE" rather than "FD").

It's probably actually pretty useful to be able to immediately delete the file but keep the mapping open (which will keep the file on disk on Windows at least, so you can't reuse the name while it's in use). And it looks like the mapping doesn't lock out deletes, so I guess it'll work as intended.

I'm not going to hold up this PR for it though. All I'll say is that if we ever do add that option, it should be trackfd=False to "activate" it, for consistency between platforms.
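
For comparison, a minimal sketch of the same delete-while-mapped pattern on a POSIX system, where unlinking a mapped file is permitted regardless of who still holds a descriptor. `map_then_delete` is a hypothetical helper, the sketch assumes a Unix-like platform, and it uses only the default `mmap.mmap()` arguments rather than the new parameter.

```python
import mmap
import os
import tempfile


def map_then_delete(data):
    """Write *data* to a temporary file, map it, then delete the file.

    Hypothetical helper; assumes a POSIX system, where a file can be
    unlinked while a mapping of it is still open.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, data)
        m = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
    finally:
        os.close(fd)
    # The directory entry goes away, but the mapped pages stay valid
    # until the mapping is closed.
    os.remove(path)
    return m, path


if __name__ == "__main__":
    m, path = map_then_delete(b"spam")
    print(os.path.exists(path), m[:])
    m.close()
```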

Contributor

> That sounds like a reason to at least add the argument for all platforms, which I'm generally in favour of anyway. It can have more appropriate semantics on Windows if needed (i.e. "doesn't hold an extra HANDLE" rather than "FD").

I think trackfd would be fine on Windows. The fileno parameter is a C file descriptor, not a native OS handle.

> It's probably actually pretty useful to be able to immediately delete the file but keep the mapping open (which will keep the file on disk on Windows at least, so you can't reuse the name while it's in use). And it looks like the mapping doesn't lock out deletes, so I guess it'll work as intended.

NTFS supports POSIX delete, in which a deleted file gets renamed to a reserved system directory until all references to the file object have been closed. That includes the internal pointer reference to a file object that's held by the memory manager for the mapped section. The internal file reference doesn't count toward the file's share mode, i.e. a memory-mapped file can be deleted even if the source open didn't share delete access. Actually, I just checked that the delete is allowed nowadays even if the mapped section has write access to the file, so my assumption was wrong that it would only work for a readonly mapping.

You can observe this in Process Explorer. Switch the lower-pane view to DLLs (file- and pagefile-backed memory mappings), and add the name and path columns to the view. You'll see that the backing file gets moved to the "\$Extend\$Deleted" system directory on the volume after the file is 'deleted'.

+   not be duplicated.
+
    To ensure validity of the created memory mapping the file specified
    by the descriptor *fileno* is internally automatically synchronized
    with physical backing store on Mac OS X and OpenVMS.
 
+   .. versionchanged:: 3.10
+      The *trackfd* parameter was added.
+
 This example shows a simple way of using :class:`~mmap.mmap`::
 
     import mmap
7 changes: 7 additions & 0 deletions Doc/whatsnew/3.10.rst
@@ -831,6 +831,13 @@ linecache
 When a module does not define ``__loader__``, fall back to ``__spec__.loader``.
 (Contributed by Brett Cannon in :issue:`42133`.)
 
+mmap
+----
+
+:class:`mmap.mmap` now has a *trackfd* parameter on Unix; if it is ``False``,
+the file descriptor specified by *fileno* will not be duplicated.
+(Contributed by Zackery Spytz in :issue:`34321`.)
+
 os
 --
 
10 changes: 10 additions & 0 deletions Lib/test/test_mmap.py
@@ -246,6 +246,16 @@ def test_access_parameter(self):
                 self.assertRaises(TypeError, m.write_byte, 0)
                 m.close()
 
+    @unittest.skipIf(os.name == 'nt', 'trackfd not present on Windows')
+    def test_trackfd_parameter(self):
+        size = 64
+        with open(TESTFN, "wb") as f:
+            f.write(b"a"*size)
+        with open(TESTFN, "r+b") as f:
+            m = mmap.mmap(f.fileno(), size, trackfd=False)
+            self.assertEqual(len(m), size)
+            m.close()
+
     def test_bad_file_desc(self):
         # Try opening a bad file descriptor...
         self.assertRaises(OSError, mmap.mmap, -2, 4096)
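
The new test only checks the length of the mapping. A possible complementary check, sketched below under stated assumptions, counts open descriptors to see whether the mapping holds one of its own after the original file is closed. `count_open_fds` and `extra_fds_held` are hypothetical helpers, the probing via `os.fstat()` assumes a Unix-like system with fewer than 1024 descriptors in use, and `extra_fds_held` returns `None` when the interpreter does not accept the `trackfd` keyword.

```python
import mmap
import os
import tempfile


def count_open_fds(limit=1024):
    """Count descriptors below *limit* that os.fstat() accepts."""
    count = 0
    for fd in range(limit):
        try:
            os.fstat(fd)
        except OSError:
            continue
        count += 1
    return count


def extra_fds_held(trackfd):
    """Return how many descriptors a mapping holds once its file is closed.

    Hypothetical helper. Returns None if mmap.mmap() rejects the
    trackfd keyword on this interpreter.
    """
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"a" * 64)
        tmp.flush()
        before = count_open_fds()
        with open(tmp.name, "r+b") as f:
            try:
                m = mmap.mmap(f.fileno(), 64, trackfd=trackfd)
            except TypeError:
                return None
        # f is closed here, so only a duplicated descriptor can remain.
        held = count_open_fds() - before
        m.close()
        return held


if __name__ == "__main__":
    for flag in (True, False):
        print(flag, extra_fds_held(flag))
```

Counting descriptors is a blunt instrument, but it observes the effect of the `_Py_dup()` call directly instead of relying on the internals of the mmap object.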
@@ -0,0 +1,2 @@
+:class:`mmap.mmap` now has a *trackfd* parameter on Unix; if it is
+``False``, the file descriptor specified by *fileno* will not be duplicated.
15 changes: 10 additions & 5 deletions Modules/mmapmodule.c
@@ -1136,15 +1136,17 @@ new_mmap_object(PyTypeObject *type, PyObject *args, PyObject *kwdict)
     off_t offset = 0;
     int fd, flags = MAP_SHARED, prot = PROT_WRITE | PROT_READ;
     int devzero = -1;
-    int access = (int)ACCESS_DEFAULT;
+    int access = (int)ACCESS_DEFAULT, trackfd = 1;
     static char *keywords[] = {"fileno", "length",
                                "flags", "prot",
-                               "access", "offset", NULL};
+                               "access", "offset", "trackfd", NULL};
 
-    if (!PyArg_ParseTupleAndKeywords(args, kwdict, "in|iii" _Py_PARSE_OFF_T, keywords,
+    if (!PyArg_ParseTupleAndKeywords(args, kwdict,
+                                     "in|iii" _Py_PARSE_OFF_T "p", keywords,
                                      &fd, &map_size, &flags, &prot,
-                                     &access, &offset))
+                                     &access, &offset, &trackfd)) {
         return NULL;
+    }
     if (map_size < 0) {
         PyErr_SetString(PyExc_OverflowError,
                         "memory mapped length must be positive");
@@ -1265,13 +1267,16 @@ new_mmap_object(PyTypeObject *type, PyObject *args, PyObject *kwdict)
         }
 #endif
     }
-    else {
+    else if (trackfd) {
         m_obj->fd = _Py_dup(fd);
         if (m_obj->fd == -1) {
             Py_DECREF(m_obj);
             return NULL;
         }
     }
+    else {
+        m_obj->fd = -1;
+    }
 
     m_obj->data = mmap(NULL, map_size,
                        prot, flags,