Thread: ALTER TABLE uses a bistate but not for toast tables

ALTER TABLE uses a bistate but not for toast tables

From
Justin Pryzby
Date:
ATRewriteTable() calls table_tuple_insert() with a bistate, to avoid clobbering
and polluting the buffers.

But heap_insert() then calls 
heap_prepare_insert() >
heap_toast_insert_or_update >
toast_tuple_externalize >
toast_save_datum >
heap_insert(toastrel, toasttup, mycid, options, NULL /* without bistate:( */);

I came up with this patch.  I'm not sure but maybe it should be implemented at
the tableam layer and not inside heap.  Maybe the BulkInsertState should have a
2nd strategy buffer for toast tables.

CREATE TABLE t(i int, a text, b text, c text,d text,e text,f text,g text);
INSERT INTO t SELECT 0, array_agg(a),array_agg(a),array_agg(a),array_agg(a),array_agg(a),array_agg(a) FROM
generate_series(1,999)n,repeat(n::text,99)a,generate_series(1,99)bGROUP BY b;
 
INSERT INTO t SELECT * FROM t;
INSERT INTO t SELECT * FROM t;
INSERT INTO t SELECT * FROM t;
INSERT INTO t SELECT * FROM t;

ALTER TABLE t ALTER i TYPE smallint;
SELECT COUNT(1), relname, COUNT(1) FILTER(WHERE isdirty) FROM pg_buffercache b JOIN pg_class c ON c.oid=b.relfilenode
GROUPBY 2 ORDER BY 1 DESC LIMIT 9;
 

Without this patch:
postgres=# SELECT COUNT(1), relname, COUNT(1) FILTER(WHERE isdirty) FROM pg_buffercache b JOIN pg_class c ON
c.oid=b.relfilenodeGROUP BY 2 ORDER BY 1 DESC LIMIT 9;
 
 10283 | pg_toast_55759                  |  8967

With this patch:
  1418 | pg_toast_16597                  |  1418

-- 
Justin

Attachment

Re: ALTER TABLE uses a bistate but not for toast tables

From
"Drouvot, Bertrand"
Date:

Hi,

On 6/22/22 4:38 PM, Justin Pryzby wrote:
ATRewriteTable() calls table_tuple_insert() with a bistate, to avoid clobbering
and polluting the buffers.

But heap_insert() then calls
heap_prepare_insert() >
heap_toast_insert_or_update >
toast_tuple_externalize >
toast_save_datum >
heap_insert(toastrel, toasttup, mycid, options, NULL /* without bistate:( */);

Good catch!

I came up with this patch.  

+       /* Release pin after main table, before switching to write to toast table */
+       if (bistate)
+               ReleaseBulkInsertStatePin(bistate);

I'm not sure we should release and reuse here the bistate of the main table: it looks like that with the patch ReadBufferBI() on the main relation wont have the desired block already pinned (then would need to perform a read).

What do you think about creating earlier a new dedicated bistate for the toast table?

+       if (bistate)
+       {
+               table_finish_bulk_insert(toastrel, options); // XXX

I think it's too early, as it looks to me that at this stage we may have not finished the whole bulk insert yet.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: ALTER TABLE uses a bistate but not for toast tables

From
Michael Paquier
Date:
On Wed, Sep 07, 2022 at 10:48:39AM +0200, Drouvot, Bertrand wrote:
> +       if (bistate)
> +       {
> +               table_finish_bulk_insert(toastrel, options); // XXX
>
> I think it's too early, as it looks to me that at this stage we may have not
> finished the whole bulk insert yet.

Yeah, that feels fishy.  Not sure what's the idea behind the XXX
comment, either.  I have marked this patch as RwF, following the lack
of reply.
--
Michael

Attachment

Re: ALTER TABLE uses a bistate but not for toast tables

From
Justin Pryzby
Date:
On Wed, Sep 07, 2022 at 10:48:39AM +0200, Drouvot, Bertrand wrote:
> Hi,
> 
> On 6/22/22 4:38 PM, Justin Pryzby wrote:
> > ATRewriteTable() calls table_tuple_insert() with a bistate, to avoid clobbering
> > and polluting the buffers.
> > 
> > But heap_insert() then calls
> > heap_prepare_insert() >
> > heap_toast_insert_or_update >
> > toast_tuple_externalize >
> > toast_save_datum >
> > heap_insert(toastrel, toasttup, mycid, options, NULL /* without bistate:( */);
> 
> What do you think about creating earlier a new dedicated bistate for the
> toast table?

Yes, but I needed to think about what data structure to put it in...

Here, I created a 2nd bistate for toast whenever creating a bistate for
heap.  That avoids the need to add arguments to tableam's
table_tuple_insert(), in addition to the 6 other functions in the call
stack.

I also updated rewriteheap.c to handle the same problem in CLUSTER:

postgres=# DROP TABLE t; CREATE TABLE t AS SELECT i, repeat((5555555+i)::text, 123456)t FROM generate_series(1,9999)i;
postgres=# VACUUM FULL VERBOSE t ; SELECT COUNT(1), datname, coalesce(c.relname,b.relfilenode::text), d.relname FROM
pg_buffercacheb LEFT JOIN pg_class c ON b.relfilenode=pg_relation_filenode(c.oid) LEFT JOIN pg_class d ON
d.reltoastrelid=c.oidLEFT JOIN pg_database db ON db.oid=b.reldatabase GROUP BY 2,3,4 ORDER BY 1 DESC LIMIT 22;
 

Unpatched:
  5000 | postgres | pg_toast_96188840       | t
  => 40MB of shared buffers

Patched:
  2048 | postgres | pg_toast_17097      | t

Note that a similar problem seems to exist in COPY ... but I can't see
how to fix that one.

-- 
Justin

Attachment

Re: ALTER TABLE uses a bistate but not for toast tables

From
Nikita Malakhov
Date:
Hi!

Found this discussion for our experiments with TOAST, I'd have to check it under [1].
I'm not sure, what behavior is expected when the main table is unpinned, bulk insert
to the TOAST table is in progress, and the second query with a heavy bulk insert to
the same TOAST table comes in?

Thank you!


On Sun, Nov 27, 2022 at 11:15 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Wed, Sep 07, 2022 at 10:48:39AM +0200, Drouvot, Bertrand wrote:
> Hi,
>
> On 6/22/22 4:38 PM, Justin Pryzby wrote:
> > ATRewriteTable() calls table_tuple_insert() with a bistate, to avoid clobbering
> > and polluting the buffers.
> >
> > But heap_insert() then calls
> > heap_prepare_insert() >
> > heap_toast_insert_or_update >
> > toast_tuple_externalize >
> > toast_save_datum >
> > heap_insert(toastrel, toasttup, mycid, options, NULL /* without bistate:( */);
>
> What do you think about creating earlier a new dedicated bistate for the
> toast table?

Yes, but I needed to think about what data structure to put it in...

Here, I created a 2nd bistate for toast whenever creating a bistate for
heap.  That avoids the need to add arguments to tableam's
table_tuple_insert(), in addition to the 6 other functions in the call
stack.

I also updated rewriteheap.c to handle the same problem in CLUSTER:

postgres=# DROP TABLE t; CREATE TABLE t AS SELECT i, repeat((5555555+i)::text, 123456)t FROM generate_series(1,9999)i;
postgres=# VACUUM FULL VERBOSE t ; SELECT COUNT(1), datname, coalesce(c.relname,b.relfilenode::text), d.relname FROM pg_buffercache b LEFT JOIN pg_class c ON b.relfilenode=pg_relation_filenode(c.oid) LEFT JOIN pg_class d ON d.reltoastrelid=c.oid LEFT JOIN pg_database db ON db.oid=b.reldatabase GROUP BY 2,3,4 ORDER BY 1 DESC LIMIT 22;

Unpatched:
  5000 | postgres | pg_toast_96188840       | t
  => 40MB of shared buffers

Patched:
  2048 | postgres | pg_toast_17097      | t

Note that a similar problem seems to exist in COPY ... but I can't see
how to fix that one.

--
Justin


--
Regards,
Nikita Malakhov
Postgres Professional 

Re: ALTER TABLE uses a bistate but not for toast tables

From
Matthias van de Meent
Date:
Hi Justin,

This patch has gone stale quite some time ago; CFbot does not seem to
have any history of a successful apply attemps, nor do we have any
succesful build history (which was introduced some time ago already).

Are you planning on rebasing this patch?

Kind regards,

Matthias van de Meent



Re: ALTER TABLE uses a bistate but not for toast tables

From
Justin Pryzby
Date:
@cfbot: rebased

Attachment