Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data - Mailing list pgsql-bugs

From Andrey Borodin
Subject Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data
Date
Msg-id 64835CF9-D9AA-49BF-A685-01C23B1023C1@yandex-team.ru
In response to Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data  (Noah Misch <noah@leadboat.com>)
Responses conchuela timeouts since 2021-10-09 system upgrade
List pgsql-bugs

> On 24 Oct 2021, at 08:00, Noah Misch <noah@leadboat.com> wrote:
>
> On Mon, Oct 18, 2021 at 08:02:12PM -0700, Noah Misch wrote:
>> On Mon, Oct 18, 2021 at 06:23:05PM +0500, Andrey Borodin wrote:
>>>> On 17 Oct 2021, at 20:12, Noah Misch <noah@leadboat.com> wrote:
>>>> I think the attached version is ready for commit.  Notable differences
>>>> vs. v14:
>
> Pushed.
Wow, that's great! Thank you!


>  Buildfarm member conchuela (DragonFly BSD 6.0) has gotten multiple
> "IPC::Run: timeout on timer" in the new tests.  No other animal has.
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=conchuela&dt=2021-10-24%2003%3A05%3A09
> is an example run.  The pgbench queries finished quickly, but the
> $pgbench_h->finish() apparently timed out after 180s.  I guess this would be
> consistent with pgbench blocking in write(), waiting for something to empty a
> pipe buffer so it can write more.  I thought finish() will drain any incoming
> I/O, though.  This phenomenon has been appearing regularly via
> src/test/recovery/t/017_shm.pl[1], so this thread doesn't have a duty to
> resolve it.  A stack trace of the stuck pgbench should be informative, though.

Some thoughts:
0. I doubt that psql/pgbench is actually stuck in these failures.
1. All the similar failures observed so far seem to be related to the finish() sub of the IPC::Run harness.
2. finish() must pump any pending data from the process [0], but it can hang if the process is waiting for something.
3. There is a reported bug in finish() [1], but its description is slightly different.
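
To illustrate the hypothesis quoted above (a child blocked in write() on a full pipe while the parent waits for it to exit), here is a minimal sketch. It uses Python's subprocess module rather than Perl and IPC::Run, so it only shows the general mechanism and says nothing about how finish() itself behaves. The 1 MiB payload, the timeouts, and the function names are assumptions chosen for illustration.

```python
import subprocess
import sys

# Hypothetical child: writes 1 MiB to stdout, more than a typical pipe buffer holds.
CHILD = "import sys; sys.stdout.write('x' * (1 << 20)); sys.stdout.flush()"


def wait_without_draining(timeout=1.0):
    """Wait for the child to exit without ever reading its stdout pipe."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
    try:
        proc.wait(timeout=timeout)
        return "exited"
    except subprocess.TimeoutExpired:
        # The child is blocked in write() because nobody empties the pipe.
        return "timed out"
    finally:
        # Clean up so the sketch does not leave a stuck process behind.
        proc.kill()
        proc.stdout.close()
        proc.wait()


def wait_with_draining(timeout=30.0):
    """Read the child's stdout while waiting, so write() can make progress."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
    out, _ = proc.communicate(timeout=timeout)
    return len(out)


if __name__ == "__main__":
    print(wait_without_draining())
    print(wait_with_draining())
```

The first function is expected to time out because nothing reads the pipe, while the second should return once the output has been read. If finish() drains pending output as [0] suggests, a full pipe alone would not explain the timeout.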


>
> Compared to my last post, the push included two more test changes.  I removed
> sleeps from a test.  They could add significant time on a system with coarse
> sleep granularity.  This did not change test sensitivity on my system.
> Second, I changed background_pgbench to include stderr lines in $stdout, as it
> had documented.  This becomes important during the back-patch to v11, where
> server errors don't cause a nonzero pgbench exit status.  background_psql
> still has the same bug, and I can fix it later.  (The background_psql version
> of the bug is not affecting current usage.)
>
> FYI, the non-2PC test is less sensitive in older branches.  It reproduces
> master's bug in 25-50% of runs, but it took about six minutes on v11 and v12.
It seems like loading the relation descriptor into the relcache has become more expensive?

>>>> One thing not done here is to change the tests to use CREATE INDEX
>>>> CONCURRENTLY instead of REINDEX CONCURRENTLY, so they're back-patchable to v11
>>>> and earlier.  I may do that before pushing, or I may just omit the tests from
>>>> older branches.
>>>
>>> The tests refactors PostgresNode.pm and some tests. Back-patching this would be quite invasive.
>>
>> That's fine with me.  Back-patching a fix without its tests is riskier than
>> back-patching test infrastructure changes.
>
> Back-patching the tests did end up tricky, for other reasons.  Before v12
> (d3c09b9), a TAP suite in a pgxs module wouldn't run during check-world.
> Before v11 (7f563c0), amcheck lacks the heapallindexed feature that the tests
> rely on.  Hence, for v11, v10, and v9.6, I used a plpgsql implementation of
> the heapallindexed check, and I moved the tests to src/bin/pgbench.
Cool!

Thanks!

Best regards, Andrey Borodin.

[0] https://metacpan.org/dist/IPC-Run/source/lib/IPC/Run.pm#L3481
[1] https://github.com/toddr/IPC-Run/issues/57



