Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: Adding REPACK [concurrently]
Msg-id 29157.1774029970@localhost
In response to Re: Adding REPACK [concurrently]  (Antonin Houska <ah@cybertec.at>)
List pgsql-hackers
Antonin Houska <ah@cybertec.at> wrote:

> Antonin Houska <ah@cybertec.at> wrote:
>
> > Srinath Reddy Sadipiralla <srinath2133@gmail.com> wrote:
> >
> > > The concurrency test failed once. I tried to reproduce the scenario below,
> > > but had no luck. I think the assert failure happened because, after a
> > > speculative insert, there might be no spec CONFIRM or ABORT. Thoughts?
> >
> > Perhaps, I'll try. I'm not sure the REPACK decoding worker does anything
> > special regarding decoding. If you happen to see the problem again, please try
> > to preserve the related WAL segments - if this is a bug in the PG executor,
> > pg_waldump might reveal that.
>
> I could not reproduce the failure, and have no idea how speculative insert can
> stay w/o CONFIRM / ABORT record. The only problem I could imagine is that
> change_useless_for_repack() filters out the CONFIRM / ABORT record
> accidentally, but neither code review nor debugger proves that
> theory. (Actually if this was the problem, the test failure probably wouldn't
> be that rare.)

I confirm that I was able to reproduce the crash using a debugger and your more
recent diagnosis [1]. Indeed, filtering was the problem.
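
To illustrate the failure mode, here is a toy model in Python. It is not the
PostgreSQL code: the Change class, the change kinds and both filter functions
are hypothetical, and the assumption that the CONFIRM / ABORT record carries no
table reference is made up purely for the example. It only shows how a filter
that drops the resolving record can leave a speculative insert pending when
the next change arrives.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical change kinds; they only loosely mirror logical decoding.
SPEC_INSERT = "spec_insert"
SPEC_CONFIRM = "spec_confirm"
SPEC_ABORT = "spec_abort"
INSERT = "insert"


@dataclass
class Change:
    kind: str
    table: Optional[str]  # assumption: a confirm/abort record may have no table


def buggy_filter(changes, target):
    """Keep only changes on the target table; this also drops confirm/abort
    records that have no table reference."""
    return [c for c in changes if c.table == target]


def fixed_filter(changes, target):
    """Same as above, but never drop a record that resolves a speculative insert."""
    return [
        c for c in changes
        if c.table == target or c.kind in (SPEC_CONFIRM, SPEC_ABORT)
    ]


def apply_changes(changes):
    """Consume changes, requiring each speculative insert to be confirmed or
    aborted before any other change is processed."""
    pending = None
    applied = []
    for c in changes:
        if c.kind == SPEC_INSERT:
            if pending is not None:
                raise RuntimeError("speculative insert left unresolved")
            pending = c
        elif c.kind == SPEC_CONFIRM:
            if pending is not None:
                applied.append(pending)
                pending = None
        elif c.kind == SPEC_ABORT:
            pending = None
        else:
            if pending is not None:
                raise RuntimeError("speculative insert left unresolved")
            applied.append(c)
    if pending is not None:
        raise RuntimeError("speculative insert left unresolved")
    return applied
```

With a stream consisting of a speculative insert on the target table, a
confirm without a table reference, and an ordinary insert, apply_changes()
succeeds after fixed_filter() but raises after buggy_filter(), which is the
toy equivalent of the assertion failure.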

Unfortunately, I wasn't able to make the crash easily reproducible using the
isolation tester. The problem is that the logical decoding is performed by a
background worker. When the backend executing REPACK waits for the background
worker, which in turn waits on an injection point, the isolation tester does
not recognize that it is effectively the backend that is waiting on the
injection point. Therefore the isolation tester does not proceed to the next
step.

Anyway, thanks again for your testing!

[1] https://www.postgresql.org/message-id/CAFC%2Bb6qk3-DQTi43QMqvVLP%2BsudPV4vsLQm5iHfcCeObrNaVyA%40mail.gmail.com

--
Antonin Houska
Web: https://www.cybertec-postgresql.com


