Alexey Ermakov <alexey.ermakov@dataegret.com> writes:
> On 9/23/19 01:06, Tom Lane wrote:
>> If you have a test case, could you collect stack traces from each
>> of the stuck processes? That would eliminate a lot of hypothesizing.
> I reproduced Sergei's test case on PostgreSQL 11.5; the replica hung
> almost immediately after pgbench ran.
> Stack trace of startup process 9907:
> #0 0x00007fdff4a025b3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:84
> #1 0x00005608fde0c6cd in pg_usleep (microsec=<optimized out>) at /build/postgresql-11-d6c2wG/postgresql-11-11.5/build/../src/port/pgsleep.c:56
> #2 0x00005608fdc94126 in WaitExceedsMaxStandbyDelay () at /build/postgresql-11-d6c2wG/postgresql-11-11.5/build/../src/backend/storage/ipc/standby.c:201
> #3 ResolveRecoveryConflictWithVirtualXIDs (waitlist=0x5608fe46a450, reason=reason@entry=PROCSIG_RECOVERY_CONFLICT_SNAPSHOT) at /build/postgresql-11-d6c2wG/postgresql-11-11.5/build/../src/backend/storage/ipc/standby.c:262
This does not look like a deadlock: the startup process is just biding
its time until the standby delay timeout elapses, after which it's
going to kill the conflicting queries.
It is, perhaps, arguable that it's a damn bad idea to allow
max_standby_streaming_delay or max_standby_archive_delay to be
set to "forever" (that is, -1), since then the timeout never elapses.
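To illustrate the behavior behind frames #1 through #3, here is a rough
Python model of the wait loop. It is only a sketch, not the actual
standby.c code: the function names, the 1 ms initial sleep, the 1 s cap,
and the max_iterations guard are assumptions made for the example. What
it shows is that a negative delay yields no deadline, so the loop just
keeps sleeping for as long as the conflict persists.

```python
import time


def standby_limit_time(wal_receipt_time, max_standby_delay_ms):
    """Return the deadline in seconds, or None when the delay is negative.

    A negative delay models the "wait forever" setting (-1).
    """
    if max_standby_delay_ms < 0:
        return None
    return wal_receipt_time + max_standby_delay_ms / 1000.0


def wait_exceeds_max_standby_delay(limit, state, now, sleep):
    """Return True once the deadline has passed; otherwise sleep a bit."""
    if limit is not None and now() >= limit:
        return True
    sleep(state["wait_us"] / 1_000_000)
    # Back off progressively, but never sleep more than one second.
    state["wait_us"] = min(state["wait_us"] * 2, 1_000_000)
    return False


def resolve_conflict(limit, conflict_still_present,
                     now=time.monotonic, sleep=time.sleep,
                     max_iterations=None):
    """Wait for conflicting queries; report what the startup process does.

    max_iterations is a hypothetical guard so the example can terminate
    even when limit is None; the real process has no such escape hatch.
    """
    state = {"wait_us": 1000}  # illustrative initial sleep of 1 ms
    iterations = 0
    while conflict_still_present():
        if wait_exceeds_max_standby_delay(limit, state, now, sleep):
            return "cancel"  # timeout elapsed: cancel conflicting queries
        iterations += 1
        if max_iterations is not None and iterations >= max_iterations:
            return "still waiting"
    return "resolved"  # the conflicting queries finished on their own
```

With a 30000 ms delay, resolve_conflict() returns "cancel" once the clock
passes the deadline. With a delay of -1 and a query that never finishes,
it never returns on its own, which is what the stack trace above looks
like from the outside.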
regards, tom lane