Re: [HACKERS] logical replication busy-waiting on a lock - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [HACKERS] logical replication busy-waiting on a lock
Msg-id 90A0E15D-4D52-4197-BFF7-A1814699A2E4@anarazel.de
In response to Re: [HACKERS] logical replication busy-waiting on a lock  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Responses Re: [HACKERS] logical replication busy-waiting on a lock  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
List pgsql-hackers

On May 29, 2017 12:25:35 PM PDT, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
>On 29/05/17 21:21, Petr Jelinek wrote:
>> On 29/05/17 20:59, Andres Freund wrote:
>>>
>>>
>>> On May 29, 2017 11:58:05 AM PDT, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
>>>> On 27/05/17 17:17, Andres Freund wrote:
>>>>>
>>>>>
>>>>> On May 27, 2017 9:48:22 AM EDT, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
>>>>>> Actually, I guess it's the pid 47457 (COPY process) who is actually
>>>>>> running the xid 73322726. In that case that's the same thing Masahiko
>>>>>> Sawada reported [1]. Which basically is result of snapshot builder
>>>>>> waiting for transaction to finish, that's normal if there is a long
>>>>>> transaction running when the snapshot is being created (and the COPY is
>>>>>> a long transaction).
>>>>>
>>>>> Hm.  I suspect the issue is that the exported snapshot needs an xid
>>>>> for some crosscheck, and that's what we're waiting for.  Could you
>>>>> check what happens if you don't assign one and just comment the error
>>>>> checks out?  Not at my computer, just theorizing.
>>>>>
>>>>
>>>> I don't think that's it, in my opinion it's the parallelization of table
>>>> data copy where we create snapshot for one process but then the next one
>>>> has to wait for the first one to finish. Before we fixed the
>>>> snapshotting, the second one would just use the ondisk snapshot so it
>>>> would work fine (except the snapshot was corrupted of course). I wonder
>>>> if we could somehow give it a hint to ignore the read-only txes, but
>>>> then we have no way to enforce the txes to stay read-only so it does not
>>>> seem safe.
>>>
>>> Read-only txs have no xid ...
>>>
>>
>> That's what I mean by hinting, normally they don't but building initial
>> snapshot in snapshot builder calls GetTopTransactionId() (see
>> SnapBuildInitialSnapshot()) which will assign it xid.
>>
>
>Looking at the code more, the xid is only used as parameter for
>SnapBuildBuildSnapshot() which never does anything with that parameter,
>I wonder if it's really needed then.

Not at a computer, but from memory that'll trigger the snapshot export routine to include it.  Import in turn requires
the xid to check if the source is still alive.  But there are better ways, e.g. using the virtual xactid.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
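
Editorial illustration (not part of the original message, and not PostgreSQL code): the exchange above turns on two facts stated in the thread, namely that a read-only transaction normally has no xid until something like GetTopTransactionId() assigns one, and that a virtual transaction id could be used instead to check whether the exporting transaction is still alive. The toy model below is a minimal sketch of that idea. The names ToyTransaction, xids_to_wait_for and source_alive_by_vxid are hypothetical, and the "backend/local" rendering of the virtual id is an assumption chosen to resemble how PostgreSQL displays it.

```python
from dataclasses import dataclass
from itertools import count
from typing import List, Optional

# Toy stand-in for a global xid allocator (hypothetical, not PostgreSQL's).
_next_xid = count(100)


@dataclass
class ToyTransaction:
    backend_id: int
    local_xid: int
    xid: Optional[int] = None   # a real xid is only assigned on demand
    finished: bool = False

    @property
    def vxid(self) -> str:
        """Virtual transaction id: always available, even without an xid."""
        return f"{self.backend_id}/{self.local_xid}"

    def get_top_transaction_id(self) -> int:
        """Lazily assign an xid, loosely modelled on GetTopTransactionId()."""
        if self.xid is None:
            self.xid = next(_next_xid)
        return self.xid


def xids_to_wait_for(transactions: List[ToyTransaction]) -> List[int]:
    """Xids of running transactions that a snapshot builder would wait out."""
    return sorted(t.xid for t in transactions
                  if t.xid is not None and not t.finished)


def source_alive_by_vxid(transactions: List[ToyTransaction], vxid: str) -> bool:
    """Liveness check for a snapshot's source that needs no real xid."""
    return any(t.vxid == vxid and not t.finished for t in transactions)
```

In this model a transaction that never calls get_top_transaction_id() never appears in xids_to_wait_for(), yet source_alive_by_vxid() can still tell whether it is running. That is the property the virtual xactid suggestion relies on; whether it carries over to the real export/import code is exactly what the thread leaves open.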
