Re: BUG #13657: Some kind of undetected deadlock between query and "startup process" on replica. - Mailing list pgsql-bugs

From Maxim Boguk
Subject Re: BUG #13657: Some kind of undetected deadlock between query and "startup process" on replica.
Date
Msg-id CAK-MWwR9q1EKh5=R7oSPqHgqu-uOcVRfjpxm4eFM_Jao7N9s4Q@mail.gmail.com
In response to Re: BUG #13657: Some kind of undetected deadlock between query and "startup process" on replica.  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: BUG #13657: Some kind of undetected deadlock between query and "startup process" on replica.  (Maxim Boguk <maxim.boguk@gmail.com>)
List pgsql-bugs
On Fri, Oct 2, 2015 at 4:58 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:

>
>
> On Fri, Oct 2, 2015 at 2:14 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote:
>
>> >
>> This backtrace is not indicating that this process is waiting on a
>> relation lock, it is resolving a recovery conflict while removing tuples,
>> killing the virtual transaction depending on if max_standby_streaming_delay
>> or max_standby_archive_delay are set if the conflict gets longer. Did you
>> change the default of those parameters, which is 30s, to -1? This would
>> mean that the standby waits indefinitely.
>>
>>
>> The problem is that the startup process has a conflict with a query which
>> is itself blocked on (waiting for) the startup process (the query cannot
>> proceed because it is waiting for a lock held by the startup process, and
>> the startup process is waiting for this query to finish). So it's an
>> undetected deadlock condition here (as I understand the situation).
>>
>> PS: there is no other activity on the database during that problem
>> except the blocked query.
>>
>
> Don't you have other queries running in parallel of the one you are
> defining as "stuck" on the standby that prevent replay to move on? Like a
> long-running transaction working on the relation involved? Are you sure
> that you did not set up max_standby_streaming_delay to -1?
> --
> Michael
>

During the problem period only one query was running on the database (the
one listed in the initial report) and nothing more (and this query was in a
waiting state according to pg_stat_activity).
pg_locks shows that the query is waiting for AccessShareLock on relation
17987, while at the same time the startup process holds AccessExclusiveLock
on the same relation and is waiting for something. There is no other
activity on the replica.
And yes, max_standby_streaming_delay is set to -1; as a result the
replication process was stuck forever on a query from an external monitoring
tool until I killed that query, but the situation repeated again a few hours
later.
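
For anyone who wants to check the same thing, here is a minimal sketch (not
taken from the original report) of how rows read from pg_locks can be turned
into a "who blocks whom" map. It assumes each row is a dict with the pg_locks
columns pid, relation, mode and granted. The pids 1001 and 2002 are
hypothetical placeholders, and the conflict table is deliberately limited to
the two lock modes mentioned above. It only covers the lock wait visible in
pg_locks; whatever the startup process itself is waiting for is not modelled.

```python
from collections import defaultdict

# Conflicting lock-mode pairs, restricted to the two modes mentioned in this
# report (the full PostgreSQL conflict table has more modes).
CONFLICTS = {
    ("AccessShareLock", "AccessExclusiveLock"),
    ("AccessExclusiveLock", "AccessShareLock"),
    ("AccessExclusiveLock", "AccessExclusiveLock"),
}


def find_blockers(lock_rows):
    """Map each waiting pid to the set of pids that hold a conflicting,
    granted lock on the same relation."""
    # Group granted locks by relation so waiters can be matched against them.
    holders = defaultdict(list)
    for row in lock_rows:
        if row["granted"]:
            holders[row["relation"]].append(row)

    blocked = {}
    for row in lock_rows:
        if row["granted"]:
            continue  # only ungranted requests are waiting
        pids = {
            h["pid"]
            for h in holders[row["relation"]]
            if h["pid"] != row["pid"] and (row["mode"], h["mode"]) in CONFLICTS
        }
        if pids:
            blocked[row["pid"]] = pids
    return blocked


# Hypothetical pids: 1001 stands for the startup process, 2002 for the query.
rows = [
    {"pid": 1001, "relation": 17987, "mode": "AccessExclusiveLock", "granted": True},
    {"pid": 2002, "relation": 17987, "mode": "AccessShareLock", "granted": False},
]
print(find_blockers(rows))  # {2002: {1001}}
```

With the two rows above, the query (2002) shows up as blocked by the startup
process (1001), which matches what pg_locks showed on the replica.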

-- 
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/ <http://www.postgresql-consulting.com/>

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com
МойКруг: http://mboguk.moikrug.ru/

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."
