Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions - Mailing list pgsql-hackers

From Erik Rijkers
Subject Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Msg-id a98691f0d50701efc492e41b2e102eca@xs4all.nl
In response to Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On 2017-12-23 21:06, Tomas Vondra wrote:
> On 12/23/2017 03:03 PM, Erikjan Rijkers wrote:
>> On 2017-12-23 05:57, Tomas Vondra wrote:
>>> Hi all,
>>> 
>>> Attached is a patch series that implements two features to the logical
>>> replication - ability to define a memory limit for the reorderbuffer
>>> (responsible for building the decoded transactions), and ability to
>>> stream large in-progress transactions (exceeding the memory limit).
>>> 
>> 
>> logical replication of 2 instances is OK but 3 and up fail with:
>> 
>> TRAP: FailedAssertion("!(last_lsn < change->lsn)", File:
>> "reorderbuffer.c", Line: 1773)
>> 
>> I can cobble up a script but I hope you have enough from the assertion
>> to see what's going wrong...
> 
> The assertion says that the iterator produces changes in order that does
> not correlate with LSN. But I have a hard time understanding how that
> could happen, particularly because according to the line number this
> happens in ReorderBufferCommit(), i.e. the current (non-streaming) case.
> 
> So instructions to reproduce the issue would be very helpful.
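
For readers following along, the invariant behind that assertion can be
sketched roughly as below. This is a standalone illustration only, not
the code from reorderbuffer.c: the names lsn_t, Change,
first_out_of_order and assert_lsn_order are hypothetical, and the strict
"<" comparison is taken from the assertion text quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL position; not PostgreSQL's XLogRecPtr. */
typedef uint64_t lsn_t;

/* Hypothetical, minimal stand-in for a decoded change. */
typedef struct Change
{
    lsn_t lsn;
} Change;

/*
 * Return the index of the first change whose LSN is not strictly greater
 * than its predecessor's, or -1 if the sequence is strictly increasing.
 */
static long
first_out_of_order(const Change *changes, size_t n)
{
    for (size_t i = 1; i < n; i++)
    {
        /* Same shape as the failed condition: last_lsn < change->lsn */
        if (!(changes[i - 1].lsn < changes[i].lsn))
            return (long) i;
    }
    return -1;
}

/* Abort (in assert-enabled builds) if the changes are not in LSN order. */
static void
assert_lsn_order(const Change *changes, size_t n)
{
    assert(first_out_of_order(changes, n) == -1);
}
```

Under these assumptions, an iterator that emitted changes with LSNs 10,
30, 20 would trip the check at the third change, which is the kind of
ordering violation the TRAP above reports.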

Using:

0001-Introduce-logical_work_mem-to-limit-ReorderBuffer-v2.patch
0002-Issue-XLOG_XACT_ASSIGNMENT-with-wal_level-logical-v2.patch
0003-Issue-individual-invalidations-with-wal_level-log-v2.patch
0004-Extend-the-output-plugin-API-with-stream-methods-v2.patch
0005-Implement-streaming-mode-in-ReorderBuffer-v2.patch
0006-Add-support-for-streaming-to-built-in-replication-v2.patch

As you expected, the problem is the same with these new patches.

I have now tested more and seen that it does not always fail.  I
estimate that it fails about 3 times out of 4 here.  But the laptop I'm
using at the moment is old and slow, which may well be a factor, as
we've seen before [1].

Attached is the bash script that I put together.  I tested with 
NUM_INSTANCES=2, which succeeds, and NUM_INSTANCES=3, which often 
fails.  The same script run against HEAD never seems to fail (I tried 
a few dozen times).

thanks,

Erik Rijkers


[1] https://www.postgresql.org/message-id/3897361c7010c4ac03f358173adbcd60%40xs4all.nl

