Commit 40aa5dd
committed
Make _bt_killitems drop pins it acquired itself.
Teach nbtree's _bt_killitems to leave the so->currPos page that it sets
LP_DEAD items on in whatever state it was in when _bt_killitems was
called. In particular, make sure that so->dropPin scans don't acquire a
pin whose reference is saved in so->currPos.buf.
Allowing _bt_killitems to change so->currPos.buf like this is wrong.
The immediate consequence of allowing it is that code in _bt_steppage
(that copies so->currPos into so->markPos) will behave as if the scan is
a !so->dropPin scan. so->markPos will therefore retain the buffer pin
indefinitely, even though _bt_killitems only needs to acquire a pin
(along with a lock) for long enough to mark known-dead items LP_DEAD.
This issue came to light following a report of a failure of an assertion
from recent commit e6eed40. The test case in question involves the use
of mark and restore. An initial call to _bt_killitems takes place that
leaves so->currPos.buf in a state that is inconsistent with the scan
being so->dropPin. A subsequent call to _bt_killitems for the same
position (following so->currPos being saved in so->markPos, and then
restored as so->currPos) resulted in the failure of an assertion that
tests that so->currPos.buf is InvalidBuffer when the scan is so->dropPin
(non-assert builds got a "resource was not closed" WARNING instead).
The same problem exists on earlier releases, though the issue is far
more subtle there. Recent commit e6eed40 introduced the so->dropPin
field as a partial replacement for testing so->currPos.buf directly.
Earlier releases won't get an assertion failure (or buffer pin leak),
but they will allow the second _bt_killitems call from the test case to
behave as if a buffer pin was consistently held since the original call
to _bt_readpage. This is wrong; there will have been an initial window
during which no pin was held on the so->currPos page, and yet the second
_bt_killitems call will neglect to check if so->currPos.lsn continues to
match the page's now-current LSN.
As a result of all this, it's just about possible that _bt_killitems
will set the wrong items LP_DEAD (on release branches). This could only
happen with merge joins (the sole user of nbtree mark/restore support),
when a concurrently inserted index tuple used a recently-recycled TID
(and only when the new tuple was inserted onto the same page as a
distinct concurrently-removed tuple with the same TID). This is exactly
the scenario that _bt_killitems' check of the page's now-current LSN
against the LSN stashed in currPos was supposed to prevent.
A follow-up commit will make nbtree completely stop conditioning whether
or not a position's pin needs to be dropped on whether the 'buf' field
is set. All call sites that might need to drop a still-held pin will be
taught to rely on the scan-level so->dropPin field recently introduced
by commit e6eed40. That will make bugs of the same general nature as
this one impossible (or make them much easier to detect, at least).
Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/545be1e5-3786-439a-9257-a90d30f8b849@gmail.com
Backpatch-through: 131 parent 30e0d9e commit 40aa5dd
1 file changed
+20
-14
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4144 | 4144 | | |
4145 | 4145 | | |
4146 | 4146 | | |
4147 | | - | |
4148 | | - | |
4149 | | - | |
| 4147 | + | |
| 4148 | + | |
| 4149 | + | |
4150 | 4150 | | |
4151 | 4151 | | |
4152 | 4152 | | |
| |||
4178 | 4178 | | |
4179 | 4179 | | |
4180 | 4180 | | |
4181 | | - | |
| 4181 | + | |
| 4182 | + | |
4182 | 4183 | | |
4183 | 4184 | | |
4184 | 4185 | | |
| |||
4197 | 4198 | | |
4198 | 4199 | | |
4199 | 4200 | | |
4200 | | - | |
4201 | | - | |
4202 | | - | |
| 4201 | + | |
| 4202 | + | |
4203 | 4203 | | |
4204 | 4204 | | |
4205 | 4205 | | |
4206 | | - | |
| 4206 | + | |
4207 | 4207 | | |
4208 | 4208 | | |
4209 | 4209 | | |
4210 | 4210 | | |
4211 | 4211 | | |
4212 | | - | |
4213 | | - | |
4214 | | - | |
4215 | | - | |
| 4212 | + | |
| 4213 | + | |
| 4214 | + | |
| 4215 | + | |
4216 | 4216 | | |
4217 | 4217 | | |
4218 | 4218 | | |
4219 | 4219 | | |
4220 | 4220 | | |
| 4221 | + | |
| 4222 | + | |
4221 | 4223 | | |
4222 | 4224 | | |
| 4225 | + | |
4223 | 4226 | | |
4224 | 4227 | | |
4225 | 4228 | | |
| |||
4336 | 4339 | | |
4337 | 4340 | | |
4338 | 4341 | | |
4339 | | - | |
| 4342 | + | |
4340 | 4343 | | |
4341 | 4344 | | |
4342 | | - | |
| 4345 | + | |
| 4346 | + | |
| 4347 | + | |
| 4348 | + | |
4343 | 4349 | | |
4344 | 4350 | | |
4345 | 4351 | | |
| |||
0 commit comments