Re: [HACKERS] snapbuild woes - Mailing list pgsql-hackers

From Petr Jelinek
Subject Re: [HACKERS] snapbuild woes
Date
Msg-id f41d64c7-1ae2-fac7-bf7f-f24dfe2d33bd@2ndquadrant.com
Whole thread Raw
In response to Re: [HACKERS] snapbuild woes  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Responses Re: [HACKERS] snapbuild woes  (Erik Rijkers <er@xs4all.nl>)
Re: [HACKERS] snapbuild woes  (Simon Riggs <simon@2ndquadrant.com>)
Re: [HACKERS] snapbuild woes  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On 26/02/17 01:43, Petr Jelinek wrote:
> On 24/02/17 22:56, Petr Jelinek wrote:
>> On 22/02/17 03:05, Petr Jelinek wrote:
>>> On 13/12/16 00:38, Petr Jelinek wrote:
>>>> On 12/12/16 23:33, Andres Freund wrote:
>>>>> On 2016-12-12 23:27:30 +0100, Petr Jelinek wrote:
>>>>>> On 12/12/16 22:42, Andres Freund wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 2016-12-10 23:10:19 +0100, Petr Jelinek wrote:
>>>>>>>> Hi,
>>>>>>>> First one is outright bug, which has to do with how we track running
>>>>>>>> transactions. What snapbuild basically does while doing initial snapshot
>>>>>>>> is read the xl_running_xacts record, store the list of running txes and
>>>>>>>> then wait until they all finish. The problem with this is that
>>>>>>>> xl_running_xacts does not ensure that it only logs transactions that are
>>>>>>>> actually still running (to avoid locking PGPROC) so there might be xids
>>>>>>>> in xl_running_xacts that already committed before it was logged.
>>>>>>>
>>>>>>> I don't think that's actually true?  Notice how LogStandbySnapshot()
>>>>>>> only releases the lock *after* the LogCurrentRunningXacts() iff
>>>>>>> wal_level >= WAL_LEVEL_LOGICAL.  So the explanation for the problem you
>>>>>>> observed must actually be a bit more complex :(
>>>>>>>
>>>>>>
>>>>>> Hmm, interesting, I did see the transaction commit in the WAL before the
>>>>>> xl_running_xacts that contained the xid as running. I only seen it on
>>>>>> production system though, didn't really manage to easily reproduce it
>>>>>> locally.
>>>>>
>>>>> I suspect the reason for that is that RecordTransactionCommit() doesn't
>>>>> conflict with ProcArrayLock in the first place - only
>>>>> ProcArrayEndTransaction() does.  So they're still running in the PGPROC
>>>>> sense, just not the crash-recovery sense...
>>>>>
>>>>
>>>> That looks like reasonable explanation. BTW I realized my patch needs
>>>> bit more work, currently it will break the actual snapshot as it behaves
>>>> same as if the xl_running_xacts was empty which is not correct AFAICS.
>>>>
>>>
>>> Hi,
>>>
>>> I got to work on this again. Unfortunately I haven't found solution that
>>> I would be very happy with. What I did is in case we read
>>> xl_running_xacts which has all transactions we track finished, we start
>>> tracking from that new xl_running_xacts again with the difference that
>>> we clean up the running transactions based on previously seen committed
>>> ones. That means that on busy server we may wait for multiple
>>> xl_running_xacts rather than just one, but at least we have chance to
>>> finish unlike with current coding which basically waits for empty
>>> xl_running_xacts. I also removed the additional locking for logical
>>> wal_level in LogStandbySnapshot() since it does not work.
>>
>> Not hearing any opposition to this idea so I decided to polish this and
>> also optimize it a bit.
>>
>> That being said, thanks to testing from Erik Rijkers I've identified one
>> more bug in how we do the initial snapshot. Apparently we don't reserve
>> the global xmin when we start building the initial exported snapshot for
>> a slot (we only reserver catalog_xmin which is fine for logical decoding
>> but not for the exported snapshot) so the VACUUM and heap pruning will
>> happily delete old versions of rows that are still needed by anybody
>> trying to use that exported snapshot.
>>
> 
> Aaand I found one more bug in snapbuild. Apparently we don't protect the
> snapshot builder xmin from going backwards which can yet again result in
> corrupted exported snapshot.
> 
> Summary of attached patches:
> 0001 - Fixes the above mentioned global xmin tracking issues. Needs to
> be backported all the way to 9.4
> 
> 0002 - Removes use of the logical decoding saved snapshots for initial
> exported snapshot since those snapshots only work for catalogs and not
> user data. Also needs to be backported all the way to 9.4.

I've been bit overzealous about this one (removed the use of the saved
snapshots completely). So here is a fix for that (and rebase on top of
current HEAD).

> 
> 0003 - Makes sure snapshot builder xmin is not moved backwards by
> xl_running_xacts (which can otherwise happen during initial snapshot
> building). Also should be backported to 9.4.
> 
> 0004 - Changes handling of the xl_running_xacts in initial snapshot
> build to what I wrote above and removes the extra locking from
> LogStandbySnapshot introduced by logical decoding.
> 
> 0005 - Improves performance of initial snapshot building by skipping
> catalog snapshot build for transactions that don't do catalog changes.
> 


-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: [HACKERS] brin autosummarization -- autovacuum "work items"
Next
From: Erik Rijkers
Date:
Subject: Re: [HACKERS] snapbuild woes