Re: AIO v2.5 - Mailing list pgsql-hackers

From Noah Misch
Subject Re: AIO v2.5
Date
Msg-id 20250402001324.e4.nmisch@google.com
Whole thread Raw
In response to Re: AIO v2.5  (Andres Freund <andres@anarazel.de>)
Responses Re: AIO v2.5
List pgsql-hackers
On Tue, Apr 01, 2025 at 06:25:28PM -0400, Andres Freund wrote:
> On 2025-04-01 17:47:51 -0400, Andres Freund wrote:
> > 3) Some subtests fail if RELCACHE_FORCE_RELEASE and CATCACHE_FORCE_RELEASE are defined:
> > 
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2025-04-01%2019%3A23%3A07
> > 
> > # +++ tap check in src/test/modules/test_aio +++
> > 
> > #   Failed test 'worker: batch_start() leak & cleanup in implicit xact: expected stderr'
> > #   at t/001_aio.pl line 318.
> > #                   'psql:<stdin>:4: ERROR:  starting batch while batch already in progress'
> > #     doesn't match '(?^:open AIO batch at end)'
> > 
> > 
> > The problem is basically that the test intentionally forgets to exit batchmode
> > - normally that would trigger an error at the end of the transaction, which
> > the test verifies.  However, with RELCACHE_FORCE_RELEASE and
> > CATCACHE_FORCE_RELEASE defined, we get other code entering batchmode and
> > erroring out because batchmode isn't allowed to be entered recursively.

> > I don't really have a good idea how to deal with that yet.
> 
> Hm. Making the query something like
> 
> SELECT * FROM (VALUES (NULL), (batch_start()));
> 
> avoids the wrong output, because the type lookup happens for the first row
> already. But that's pretty magical and probably fragile.

Hmm.  Some options:

a. VALUES() trick above.  For test code, it's hard to argue with something
   that seems to solve it in practice.

b. Encapsulate the test in a PROCEDURE, so perhaps less happens between the
   batch_start() and the procedure-managed COMMIT.  Maybe less fragile than
   (a), maybe more fragile.

c. Move RELCACHE_FORCE_RELEASE and CATCACHE_FORCE_RELEASE to be
   GUC-controlled, like how CLOBBER_CACHE_ALWAYS changed into the
   debug_discard_caches GUC.  Then disable them for relevant parts of
   test_aio.  This feels best long-term, but it's bigger.  I also wanted this
   in syscache-update-pruned.spec[1].

d. Have test_aio deduce whether these are set, probably by observing memory
   contexts or DEBUG messages.  Maybe have every postmaster startup print a
   DEBUG message about these settings being enabled.  Skip relevant parts of
   test_aio.  This sounds messy.

Each of those feels defensible to me.  I'd probably do (a) or (b) to start.


[1] For that spec, an alternative expected output sufficed.  Incidentally,
I'll soon fix that spec flaking on valgrind/skink.



pgsql-hackers by date:

Previous
From: Junwang Zhao
Date:
Subject: Re: general purpose array_sort
Next
From: Kwangwon Seo
Date:
Subject: Covering the comparison between date and timestamp, tz, type