Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions - Mailing list pgsql-hackers

From Noah Misch
Subject Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date
Msg-id 20201209092625.GA1685466@rfd.leadboat.com
In response to Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
List pgsql-hackers
On Wed, Dec 02, 2020 at 01:50:25PM +0530, Amit Kapila wrote:
> On Wed, Dec 2, 2020 at 1:20 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > On Mon, Nov 30, 2020 at 6:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > On Mon, Nov 30, 2020 at 3:14 AM Noah Misch <noah@leadboat.com> wrote:
> > > > > On Mon, Sep 07, 2020 at 12:00:41PM +0530, Amit Kapila wrote:
> > > > > > Thanks, I have pushed the last patch. Let's wait for a day or so to
> > > > > > see the buildfarm reports
> > > > >
> > > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2020-09-08%2006%3A24%3A14
> > > > > failed the new 015_stream.pl test with the subscriber looping like this:
> > > > >
> > > > > 2020-09-08 11:22:49.848 UTC [13959252:1] LOG:  logical replication apply worker for subscription "tap_sub" has started
> > > > > 2020-09-08 11:22:54.045 UTC [13959252:2] ERROR:  could not open temporary file "16393-510.changes.0" from BufFile "16393-510.changes": No such file or directory
> > > > > 2020-09-08 11:22:54.055 UTC [7602182:1] LOG:  logical replication apply worker for subscription "tap_sub" has started
> > > > > 2020-09-08 11:22:54.101 UTC [31785284:4] LOG:  background worker "logical replication worker" (PID 13959252) exited with exit code 1
> > > > > 2020-09-08 11:23:01.142 UTC [7602182:2] ERROR:  could not open temporary file "16393-510.changes.0" from BufFile "16393-510.changes": No such file or directory
> > > > > ...

> > > > The above kind of error can happen due to the following reasons: (a)
> > > > the first time we sent the stream and created the file and that got
> > > > removed before the second stream reached the subscriber. (b) from the
> > > > publisher-side, we never sent the indication that it is the first
> > > > stream and the subscriber directly tries to open the file thinking it
> > > > is already there.

Further testing showed it was a file location problem, not a deletion problem.
The worker tried to open
base/pgsql_tmp/pgsql_tmp9896408.1.sharedfileset/16393-510.changes.0, but these
were the files that actually existed:

[nm@power-aix 0:2 2020-12-08T13:56:35 64gcc 0]$ ls -la $(find src/test/subscription/tmp_check -name '*sharedfileset*')
src/test/subscription/tmp_check/t_015_stream_subscriber_data/pgdata/base/pgsql_tmp/pgsql_tmp9896408.0.sharedfileset:
total 408
drwx------    2 nm       usr             256 Dec 08 03:20 .
drwx------    4 nm       usr             256 Dec 08 03:20 ..
-rw-------    1 nm       usr          207806 Dec 08 03:20 16393-510.changes.0

src/test/subscription/tmp_check/t_015_stream_subscriber_data/pgdata/base/pgsql_tmp/pgsql_tmp9896408.1.sharedfileset:
total 0
drwx------    2 nm       usr             256 Dec 08 03:20 .
drwx------    4 nm       usr             256 Dec 08 03:20 ..
-rw-------    1 nm       usr               0 Dec 08 03:20 16393-511.changes.0

> > I have executed "make check" in a loop with only this file.  I have
> > repeated it 5000 times with no failure.  I am wondering whether we
> > should try running it in a loop on the machine where it failed once?
> 
> Yes, that might help. Noah, would it be possible for you to try that

The problem is xidhash using strcmp() to compare keys; it needs memcmp().  For
this to matter, xidhash must contain more than one element.  Existing tests
rarely exercise the multi-element scenario.  Under heavy load, on this system,
the test publisher can have two active transactions at once, in which case it
does exercise multi-element xidhash.  (The publisher is sensitive to timing,
but the subscriber is not; once WAL contains interleaved records of two XIDs,
the subscriber fails every time.)  This would be much harder to reproduce on a
little-endian system, where strcmp(&xid, &xid_plus_one)!=0.  On big-endian,
every small XID has zero in the first octet; they all look like empty strings.

The attached patch has the one-line fix and some test suite changes that make
this reproduce frequently on any big-endian system.  I'm currently planning to
drop the test suite changes from the commit, but I could keep them if folks
like them.  (They'd need more comments and timeout handling.)

