Commit 6c61c69

Amit Kapila committed
Fix LOCK_TIMEOUT handling in slotsync worker.
Previously, the slotsync worker relied on SIGINT for graceful shutdown during promotion. However, SIGINT is also used by the LOCK_TIMEOUT handler to cancel queries. Since the slotsync worker can lock catalog tables while parsing libpq tuples, this overlap caused it to ignore LOCK_TIMEOUT signals and potentially wait indefinitely on locks.

This patch replaces the slotsync worker's SIGINT handler with StatementCancelHandler to correctly process query-cancel interrupts. Additionally, the startup process now uses SIGUSR1 to signal the slotsync worker to stop during promotion. The worker exits after detecting that the shared memory flag stopSignaled is set.

Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/TY4PR01MB169078F33846E9568412D878C94A2A@TY4PR01MB16907.jpnprd01.prod.outlook.com
1 parent a59b039 commit 6c61c69
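
The signalling scheme described in the commit message can be illustrated outside PostgreSQL with a minimal, single-process sketch. It is not the patch's code: every name below (`stop_signaled`, `cancel_pending`, `wakeup_pending`, `request_worker_stop`, `process_interrupts`, `WorkerAction`) is hypothetical, a plain static variable stands in for the shared-memory flag, and `raise()` stands in for the cross-process `kill()`. The point it shows is only that SIGINT is left to mean "cancel the current query", while a stop request travels as a flag plus a SIGUSR1 wakeup.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical stand-in for the shared-memory stopSignaled flag. */
static volatile sig_atomic_t stop_signaled = 0;
/* Set by the SIGINT handler: "cancel the current query", not "shut down". */
static volatile sig_atomic_t cancel_pending = 0;
/* Set by the SIGUSR1 handler: "wake up and re-check state". */
static volatile sig_atomic_t wakeup_pending = 0;

typedef enum
{
    WORKER_CONTINUE,
    WORKER_CANCEL_QUERY,
    WORKER_EXIT
} WorkerAction;

static void on_sigint(int signo)  { (void) signo; cancel_pending = 1; }
static void on_sigusr1(int signo) { (void) signo; wakeup_pending = 1; }

/* Worker side: SIGINT means query cancel, SIGUSR1 is only a wakeup. */
static void install_worker_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);

    sa.sa_handler = on_sigint;
    assert(sigaction(SIGINT, &sa, NULL) == 0);

    sa.sa_handler = on_sigusr1;
    assert(sigaction(SIGUSR1, &sa, NULL) == 0);
}

/* Requester side: publish the stop flag first, then wake the worker. */
static void request_worker_stop(void)
{
    stop_signaled = 1;
    raise(SIGUSR1);     /* assumption: stands in for kill(worker_pid, SIGUSR1) */
}

/* Worker side: handle a pending cancel first, then look at the stop flag. */
static WorkerAction process_interrupts(void)
{
    wakeup_pending = 0;

    if (cancel_pending)
    {
        cancel_pending = 0;
        return WORKER_CANCEL_QUERY;
    }
    if (stop_signaled)
        return WORKER_EXIT;

    return WORKER_CONTINUE;
}
```

In this sketch a SIGINT yields `WORKER_CANCEL_QUERY` once and then the worker keeps running, whereas `request_worker_stop()` makes the next `process_interrupts()` call return `WORKER_EXIT`. Because the stop request no longer shares a signal with query cancellation, a cancel raised on a lock timeout cannot be mistaken for a shutdown request, or the other way round.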

1 file changed: +10 -5 lines changed

src/backend/replication/logical/slotsync.c

Lines changed: 10 additions & 5 deletions
@@ -1175,10 +1175,10 @@ ProcessSlotSyncInterrupts(WalReceiverConn *wrconn)
 {
 	CHECK_FOR_INTERRUPTS();
 
-	if (ShutdownRequestPending)
+	if (SlotSyncCtx->stopSignaled)
 	{
 		ereport(LOG,
-				errmsg("replication slot synchronization worker is shutting down on receiving SIGINT"));
+				errmsg("replication slot synchronization worker is shutting down because promotion is triggered"));
 
 		proc_exit(0);
 	}
@@ -1409,7 +1409,7 @@ ReplSlotSyncWorkerMain(const void *startup_data, size_t startup_data_len)
 
 	/* Setup signal handling */
 	pqsignal(SIGHUP, SignalHandlerForConfigReload);
-	pqsignal(SIGINT, SignalHandlerForShutdownRequest);
+	pqsignal(SIGINT, StatementCancelHandler);
 	pqsignal(SIGTERM, die);
 	pqsignal(SIGFPE, FloatExceptionHandler);
 	pqsignal(SIGUSR1, procsignal_sigusr1_handler);
@@ -1516,7 +1516,8 @@ ReplSlotSyncWorkerMain(const void *startup_data, size_t startup_data_len)
 
 	/*
 	 * The slot sync worker can't get here because it will only stop when it
-	 * receives a SIGINT from the startup process, or when there is an error.
+	 * receives a stop request from the startup process, or when there is an
+	 * error.
 	 */
 	Assert(false);
 }
@@ -1601,8 +1602,12 @@ ShutDownSlotSync(void)
 
 	SpinLockRelease(&SlotSyncCtx->mutex);
 
+	/*
+	 * Signal slotsync worker if it was still running. The worker will stop
+	 * upon detecting that the stopSignaled flag is set to true.
+	 */
 	if (worker_pid != InvalidPid)
-		kill(worker_pid, SIGINT);
+		kill(worker_pid, SIGUSR1);
 
 	/* Wait for slot sync to end */
 	for (;;)
