Re: [HACKERS] Logical replication - TRAP: FailedAssertion in pgstat.c - Mailing list pgsql-hackers

From Erik Rijkers
Subject Re: [HACKERS] Logical replication - TRAP: FailedAssertion in pgstat.c
Msg-id 81b4087df137adee5745bb8f987c1fad@xs4all.nl
In response to Re: [HACKERS] Logical replication - TRAP: FailedAssertion in pgstat.c  (Stas Kelvich <s.kelvich@postgrespro.ru>)
List pgsql-hackers
On 2017-04-17 15:59, Stas Kelvich wrote:
>> On 17 Apr 2017, at 10:30, Erik Rijkers <er@xs4all.nl> wrote:
>> 
>> On 2017-04-16 20:41, Andres Freund wrote:
>>> On 2017-04-16 10:46:21 +0200, Erik Rijkers wrote:
>>>> On 2017-04-15 04:47, Erik Rijkers wrote:
>>>> >
>>>> > 0001-Reserve-global-xmin-for-create-slot-snasphot-export.patch +
>>>> > 0002-Don-t-use-on-disk-snapshots-for-snapshot-export-in-l.patch+
>>>> > 0003-Prevent-snapshot-builder-xmin-from-going-backwards.patch  +
>>>> > 0004-Fix-xl_running_xacts-usage-in-snapshot-builder.patch      +
>>>> > 0005-Skip-unnecessary-snapshot-builds.patch
>>>> I am now using these newer patches:
>>>> https://www.postgresql.org/message-id/30242bc6-eca4-b7bb-670e-8d0458753a8c%402ndquadrant.com
>>>> > It builds fine, but when I run the old pgbench-over-logical-replication
>>>> > test I get:
>>>> >
>>>> > TRAP: FailedAssertion("!(entry->trans == ((void *)0))", File:
>>>> > "pgstat.c", Line: 828)
>>>> To get that error:
>>> I presume this is the fault of
>>> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=139eb9673cb84c76f493af7e68301ae204199746
>>> if you git revert that individual commit, do things work again?
>> 
>> Yes, compiled from 67c2def11d4 with the above 4 patches, it runs 
>> flawlessly again. (flawlessly= a few hours without any error)
>> 
> 
> I’ve reproduced failure, this happens under tablesync worker and
> putting pgstat_report_stat() under the previous condition block
> should help.
> 
> However for me it took about an hour of running this script to catch
> original assert.
> 
> Can you check with that patch applied?


Your patch on top of the 5 patches above seems to solve the matter too: 
no problems after running for 2 hours (previously it failed within half 
a minute).
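
For readers following the thread, here is a toy model of the invariant that the TRAP appears to guard: stats for a table may only be reported while no transaction-level state is attached to its entry (entry->trans is NULL). This is only a sketch under that assumption. ToyTableStats, toy_report_stat() and toy_worker_step() are made-up names for illustration; they are not the real pgstat.c structures and not the patch under test.

```c
#include <assert.h>   /* used only by the usage checks described below */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-table stats entry (not PgStat_TableStatus). */
typedef struct ToyTableStats
{
    void *trans;     /* non-NULL while an open transaction owns pending counts */
    long  pending;   /* counts accumulated since the last report */
    long  reported;  /* counts already sent */
} ToyTableStats;

/* Models the checked invariant: reporting is only legal when trans == NULL. */
static bool
toy_report_stat(ToyTableStats *entry)
{
    if (entry->trans != NULL)
        return false;            /* an assert-enabled build would TRAP here */
    entry->reported += entry->pending;
    entry->pending = 0;
    return true;
}

/* Guarded call site: report only when the worker is outside a transaction. */
static bool
toy_worker_step(ToyTableStats *entry, bool in_transaction)
{
    if (in_transaction)
        return false;            /* skip now; report once the transaction ends */
    return toy_report_stat(entry);
}
```

Calling toy_worker_step() with in_transaction set to true leaves the pending counts untouched; calling it again after the transaction has ended (trans back to NULL) flushes them. That is the shape of "only report under the condition that we are not in a transaction", as far as I understand the suggestion.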



Erik Rijkers




