
From: Andres Freund <andres@anarazel.de>
Subject: Re: Performance degradation on concurrent COPY into a single relation in PG16.
Date: 2023-10-13
Msg-id: 20231013173910.ns5jcnctajfxtj5y@awork3.anarazel.de
In response to: Re: Performance degradation on concurrent COPY into a single relation in PG16. (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers

Hi,

On 2023-10-12 09:24:19 -0700, Andres Freund wrote:
> On 2023-10-12 11:44:09 -0400, Tom Lane wrote:
> > Andres Freund <andres@anarazel.de> writes:
> > >> On 2023-09-25 15:42:26 -0400, Tom Lane wrote:
> > >>> I just did a git bisect run to discover when the failure documented
> > >>> in bug #18130 [1] started.  And the answer is commit 82a4edabd.
> > 
> > > Uh, huh.  The problem is that COPY uses a single BulkInsertState for multiple
> > > partitions. Which to me seems to run counter to the following comment:
> > >  *    The caller can also provide a BulkInsertState object to optimize many
> > >  *    insertions into the same relation.  This keeps a pin on the current
> > >  *    insertion target page (to save pin/unpin cycles) and also passes a
> > >  *    BULKWRITE buffer selection strategy object to the buffer manager.
> > >  *    Passing NULL for bistate selects the default behavior.
> > 
> > > The reason this doesn't cause straight up corruption due to reusing a pin from
> > > another relation is that b1ecb9b3fcfb added ReleaseBulkInsertStatePin() and a
> > > call to it. But I didn't make ReleaseBulkInsertStatePin() reset the bulk
> > > insertion state, which is what leads to the errors from the bug report.
> > 
> > > Resetting the relevant BulkInsertState fields fixes the problem. But I'm not
> > > sure that's the right fix. ISTM that independent of whether we fix this via
> > > ReleaseBulkInsertStatePin() resetting the fields or via not reusing
> > > BulkInsertState, we should add assertions defending against future issues like
> > > this (e.g. by adding a relation field to BulkInsertState in cassert builds,
> > > and asserting that the relation is the same as in prior calls unless
> > > ReleaseBulkInsertStatePin() has been called).
> > 
> > Ping?  We really ought to have a fix for this committed in time for
> > 16.1.
> 
> I kind of had hoped somebody would comment on the approach.  Given that nobody
> has, I'll push the minimal fix of resetting the state in
> ReleaseBulkInsertStatePin(), even though I think architecturally that's not
> great.

I spent some time working on a test that shows the problem more cheaply than
the case upthread. I think it'd be desirable to have a test that's likely to
catch an issue like this fairly quickly. We've had other problems in this
area before - at the moment only a single test fails if I remove the
ReleaseBulkInsertStatePin() call, and I don't think that coverage is
guaranteed at all.

I'm a bit on the fence about how large to make the relation. For me the bug
triggers when filling both partitions up to the 3rd block, but how many rows
that takes depends somewhat on per-page space utilization and the like.

Right now the test uses data/desc.data and ends up with 328kB and 312kB in
the two partitions. Alternatively, I could make the test generate a new file
to load with COPY that has fewer rows than data/desc.data - I didn't see
another existing data file that works conveniently and has fewer rows. The
COPY is reasonably fast even under something as expensive as rr (~60ms), so
I'm inclined to just go with that?

Greetings,

Andres Freund


