Re: should INSERT SELECT use a BulkInsertState? - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: should INSERT SELECT use a BulkInsertState?
Date
Msg-id CANP8+j+Cenz5mmmoEEUNkBBKLKPCUw2ESRnOw1B7QQMdcp2k+A@mail.gmail.com
In response to Re: should INSERT SELECT use a BulkInsertState?  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Thu, 4 Jun 2020 at 18:31, Andres Freund <andres@anarazel.de> wrote:

> On 2020-05-08 02:25:45 -0500, Justin Pryzby wrote:
> > Seems to me it should, at least conditionally.  At least if there's a function
> > scan or a relation or ..
>
> Well, the problem is that this can cause very very significant
> regressions. As in 10x slower or more. The ringbuffer can cause constant
> XLogFlush() calls (due to the lsn interlock), and the eviction from
> shared_buffers (regardless of actual available) will mean future vacuums
> etc will be much slower.  I think this is likely to cause pretty
> widespread regressions on upgrades.
>
> Now, it sucks that we have this problem in the general facility that's
> supposed to be used for this kind of bulk operation. But I don't really
> see it realistic as expanding use of bulk insert strategies unless we
> have some more fundamental fixes.

Are you saying that *anything* that uses the BulkInsertState is
generally broken? We use it for VACUUM and COPY writes, so are you
saying those are broken too?

When we put that in, the use of the ringbuffer for writes required a
much larger number of blocks to smooth out the extra XLogFlush()
calls, but overall it was a clear win in those earlier tests. Perhaps
the ring buffer needs to be increased, or made configurable. The
eviction behavior was/is deliberate, to avoid large data loads
spoiling cache - perhaps that could also be configurable for the case
where data fits in shared buffers.
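
To make the ring-size point concrete, here is a toy model of the LSN
interlock. It is only a sketch under simplifying assumptions, not
PostgreSQL code: every page written produces exactly one WAL record,
ring slots are reused strictly round-robin, and a flush always advances
to the current insert position. The function name and the ring sizes in
the usage note are made up for illustration.

```python
def count_wal_flushes(total_pages: int, ring_size: int) -> int:
    """Count forced WAL flushes while writing total_pages through a ring.

    Toy model: before a dirty ring slot can be reused, its page must be
    written out, and that requires WAL to be flushed at least up to the
    page's LSN. If it is not, we pay for a flush.
    """
    ring = [None] * ring_size  # LSN of the dirty page in each slot, None if unused
    insert_lsn = 0             # LSN of the most recently inserted WAL record
    flushed_lsn = 0            # WAL is durable up to this LSN
    flushes = 0

    for page in range(total_pages):
        slot = page % ring_size
        page_lsn = ring[slot]
        # Reusing a dirty slot whose WAL is not yet durable forces a flush.
        if page_lsn is not None and page_lsn > flushed_lsn:
            flushed_lsn = insert_lsn  # flush everything inserted so far
            flushes += 1
        # Write the new page into the slot; it carries a fresh LSN.
        insert_lsn += 1
        ring[slot] = insert_lsn

    return flushes


if __name__ == "__main__":
    for size in (32, 2048):  # illustrative ring sizes, in pages
        print(size, count_wal_flushes(10_000, size))
```

In this model a flush is forced once per trip around the ring, so the
flush count falls roughly in proportion to the ring size. It says
nothing about how expensive each flush is or about the cache eviction
side of the question.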

Anyway, if we can discuss what you see as broken, we can fix that and
then extend the usage to other cases, such as INSERT SELECT.

-- 
Simon Riggs                http://www.EnterpriseDB.com/


