Thread: ALTER TABLE uses a bistate but not for toast tables
ATRewriteTable() calls table_tuple_insert() with a bistate, to avoid clobbering
and polluting the buffers.

But heap_insert() then calls
heap_prepare_insert() >
heap_toast_insert_or_update >
toast_tuple_externalize >
toast_save_datum >
heap_insert(toastrel, toasttup, mycid, options, NULL /* without bistate:( */);

I came up with this patch.  I'm not sure but maybe it should be implemented at
the tableam layer and not inside heap.  Maybe the BulkInsertState should have a
2nd strategy buffer for toast tables.

CREATE TABLE t(i int, a text, b text, c text,d text,e text,f text,g text);
INSERT INTO t SELECT 0, array_agg(a),array_agg(a),array_agg(a),array_agg(a),array_agg(a),array_agg(a)
  FROM generate_series(1,999)n, repeat(n::text,99)a, generate_series(1,99)b GROUP BY b;
INSERT INTO t SELECT * FROM t;
INSERT INTO t SELECT * FROM t;
INSERT INTO t SELECT * FROM t;
INSERT INTO t SELECT * FROM t;
ALTER TABLE t ALTER i TYPE smallint;
SELECT COUNT(1), relname, COUNT(1) FILTER(WHERE isdirty) FROM pg_buffercache b
  JOIN pg_class c ON c.oid=b.relfilenode GROUP BY 2 ORDER BY 1 DESC LIMIT 9;

Without this patch:

postgres=# SELECT COUNT(1), relname, COUNT(1) FILTER(WHERE isdirty) FROM pg_buffercache b JOIN pg_class c ON c.oid=b.relfilenode GROUP BY 2 ORDER BY 1 DESC LIMIT 9;
 10283 | pg_toast_55759 | 8967

With this patch:
  1418 | pg_toast_16597 | 1418

--
Justin
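For context, the rewrite side drives the bistate roughly like this (a simplified
sketch of the ATRewriteTable() flow, names abbreviated, not the literal
tablecmds.c code):

/* Simplified sketch of how ATRewriteTable() uses the bulk-insert state. */
BulkInsertState bistate = GetBulkInsertState();

while (table_scan_getnextslot(scan, ForwardScanDirection, oldslot))
{
    /* ... evaluate the transform expressions into newslot ... */

    /* The main relation's pages go through the BAS_BULKWRITE ring buffer,
     * and the bistate keeps the current target page pinned ... */
    table_tuple_insert(newrel, newslot, mycid, ti_options, bistate);

    /* ... but when the tuple needs toasting, the nested heap_insert() on
     * the toast relation is called with bistate = NULL, so toast pages are
     * read and dirtied through shared buffers with no ring buffer at all. */
}

FreeBulkInsertState(bistate);
table_finish_bulk_insert(newrel, ti_options);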
Hi,
On 6/22/22 4:38 PM, Justin Pryzby wrote:
ATRewriteTable() calls table_tuple_insert() with a bistate, to avoid clobbering
and polluting the buffers.

But heap_insert() then calls
heap_prepare_insert() >
heap_toast_insert_or_update >
toast_tuple_externalize >
toast_save_datum >
heap_insert(toastrel, toasttup, mycid, options, NULL /* without bistate:( */);
Good catch!
I came up with this patch.
+ /* Release pin after main table, before switching to write to toast table */
+ if (bistate)
+ ReleaseBulkInsertStatePin(bistate);
I'm not sure we should release and reuse the main table's bistate here: it looks
like, with the patch, ReadBufferBI() on the main relation won't have the desired
block already pinned anymore (and so would need to perform another read).
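For reference, ReadBufferBI()'s fast path looks roughly like this (a simplified
sketch of the hio.c logic, not the exact code); releasing the pin between every
main-table and toast insertion means the main relation always falls through to
the slower path at the bottom:

/* Simplified sketch of ReadBufferBI() (hio.c): reuse the pinned buffer
 * kept in the bistate when it already holds the target block. */
if (bistate->current_buf != InvalidBuffer)
{
    if (BufferGetBlockNumber(bistate->current_buf) == targetBlock)
    {
        /* Still pinned from the previous insertion: no lookup, no read. */
        IncrBufferRefCount(bistate->current_buf);
        return bistate->current_buf;
    }
    /* Different block: drop the old pin first. */
    ReleaseBuffer(bistate->current_buf);
    bistate->current_buf = InvalidBuffer;
}

/* Slow path: look the block up (and possibly read it) via the ring buffer. */
buffer = ReadBufferExtended(relation, MAIN_FORKNUM, targetBlock,
                            RBM_NORMAL, bistate->strategy);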
What do you think about creating earlier a new dedicated bistate for the toast table?
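i.e. roughly the following shape, so the main relation never gives up its pinned
page (a rough sketch of the idea only, not patch code; how the second bistate
would reach toast_save_datum() is left open):

/* Rough sketch of the idea: give toast its own bulk-insert state. */
BulkInsertState main_bistate  = GetBulkInsertState();
BulkInsertState toast_bistate = GetBulkInsertState();

/* Main-heap insertions keep their own pinned target page ... */
heap_insert(rel, tup, mycid, options, main_bistate);

/* ... while toast chunks get a separate ring buffer and pinned page,
 * instead of re-using (and un-pinning) the main table's bistate. */
heap_insert(toastrel, toasttup, mycid, options, toast_bistate);

FreeBulkInsertState(toast_bistate);
FreeBulkInsertState(main_bistate);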
+ if (bistate)
+ {
+ table_finish_bulk_insert(toastrel, options); // XXX
I think it's too early, as it looks to me that at this stage we may not have
finished the whole bulk insert yet.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Sep 07, 2022 at 10:48:39AM +0200, Drouvot, Bertrand wrote:
> + if (bistate)
> + {
> + table_finish_bulk_insert(toastrel, options); // XXX
>
> I think it's too early, as it looks to me that at this stage we may not
> have finished the whole bulk insert yet.

Yeah, that feels fishy.  Not sure what's the idea behind the XXX comment,
either.

I have marked this patch as RwF, following the lack of reply.
--
Michael
On Wed, Sep 07, 2022 at 10:48:39AM +0200, Drouvot, Bertrand wrote:
> Hi,
>
> On 6/22/22 4:38 PM, Justin Pryzby wrote:
> > ATRewriteTable() calls table_tuple_insert() with a bistate, to avoid clobbering
> > and polluting the buffers.
> >
> > But heap_insert() then calls
> > heap_prepare_insert() >
> > heap_toast_insert_or_update >
> > toast_tuple_externalize >
> > toast_save_datum >
> > heap_insert(toastrel, toasttup, mycid, options, NULL /* without bistate:( */);
>
> What do you think about creating earlier a new dedicated bistate for the
> toast table?

Yes, but I needed to think about what data structure to put it in...

Here, I created a 2nd bistate for toast whenever creating a bistate for
heap.  That avoids the need to add arguments to tableam's
table_tuple_insert(), in addition to the 6 other functions in the call
stack.

I also updated rewriteheap.c to handle the same problem in CLUSTER:

postgres=# DROP TABLE t; CREATE TABLE t AS SELECT i, repeat((5555555+i)::text, 123456)t FROM generate_series(1,9999)i;
postgres=# VACUUM FULL VERBOSE t ; SELECT COUNT(1), datname, coalesce(c.relname,b.relfilenode::text), d.relname FROM pg_buffercache b LEFT JOIN pg_class c ON b.relfilenode=pg_relation_filenode(c.oid) LEFT JOIN pg_class d ON d.reltoastrelid=c.oid LEFT JOIN pg_database db ON db.oid=b.reldatabase GROUP BY 2,3,4 ORDER BY 1 DESC LIMIT 22;

Unpatched:
 5000 | postgres | pg_toast_96188840 | t
=> 40MB of shared buffers

Patched:
 2048 | postgres | pg_toast_17097 | t

Note that a similar problem seems to exist in COPY ... but I can't see
how to fix that one.

--
Justin
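For illustration, "a 2nd bistate for toast whenever creating a bistate for heap"
could be pictured along these lines (a hypothetical sketch only; the
toast_bistate field and GetToastBulkInsertState() helper are invented names,
and how the state reaches toast_save_datum() is glossed over):

/* Hypothetical sketch -- invented names, not the actual patch. */
typedef struct BulkInsertStateData
{
    BufferAccessStrategy strategy;   /* BAS_BULKWRITE ring for the main rel */
    Buffer      current_buf;         /* pinned insertion target page */
    /* ... other existing fields ... */

    struct BulkInsertStateData *toast_bistate;  /* lazily-created state used
                                                 * when writing toast chunks */
} BulkInsertStateData;

static BulkInsertState
GetToastBulkInsertState(BulkInsertState bistate)
{
    if (bistate == NULL)
        return NULL;                 /* non-bulk insert: unchanged behavior */
    if (bistate->toast_bistate == NULL)
        bistate->toast_bistate = GetBulkInsertState();
    return bistate->toast_bistate;
}

/* ... so the toast insertion could become: */
heap_insert(toastrel, toasttup, mycid, options,
            GetToastBulkInsertState(bistate));

/* GetBulkInsertState() would initialize toast_bistate to NULL, and
 * FreeBulkInsertState() would also have to release it. */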
Hi!
Found this discussion through our experiments with TOAST; I'd have to check
it under [1].

I'm not sure what behavior is expected when the main table is unpinned, a
bulk insert into the TOAST table is in progress, and a second query with a
heavy bulk insert into the same TOAST table comes in?
Thank you!
On Sun, Nov 27, 2022 at 11:15 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Wed, Sep 07, 2022 at 10:48:39AM +0200, Drouvot, Bertrand wrote:
> Hi,
>
> On 6/22/22 4:38 PM, Justin Pryzby wrote:
> > ATRewriteTable() calls table_tuple_insert() with a bistate, to avoid clobbering
> > and polluting the buffers.
> >
> > But heap_insert() then calls
> > heap_prepare_insert() >
> > heap_toast_insert_or_update >
> > toast_tuple_externalize >
> > toast_save_datum >
> > heap_insert(toastrel, toasttup, mycid, options, NULL /* without bistate:( */);
>
> What do you think about creating earlier a new dedicated bistate for the
> toast table?
Yes, but I needed to think about what data structure to put it in...
Here, I created a 2nd bistate for toast whenever creating a bistate for
heap. That avoids the need to add arguments to tableam's
table_tuple_insert(), in addition to the 6 other functions in the call
stack.
I also updated rewriteheap.c to handle the same problem in CLUSTER:
postgres=# DROP TABLE t; CREATE TABLE t AS SELECT i, repeat((5555555+i)::text, 123456)t FROM generate_series(1,9999)i;
postgres=# VACUUM FULL VERBOSE t ; SELECT COUNT(1), datname, coalesce(c.relname,b.relfilenode::text), d.relname FROM pg_buffercache b LEFT JOIN pg_class c ON b.relfilenode=pg_relation_filenode(c.oid) LEFT JOIN pg_class d ON d.reltoastrelid=c.oid LEFT JOIN pg_database db ON db.oid=b.reldatabase GROUP BY 2,3,4 ORDER BY 1 DESC LIMIT 22;
Unpatched:
5000 | postgres | pg_toast_96188840 | t
=> 40MB of shared buffers
Patched:
2048 | postgres | pg_toast_17097 | t
Note that a similar problem seems to exist in COPY ... but I can't see
how to fix that one.
--
Justin
Hi Justin,

This patch went stale quite some time ago; CFbot does not seem to have any
history of a successful apply attempt, nor do we have any successful build
history (which was introduced some time ago already).

Are you planning on rebasing this patch?

Kind regards,

Matthias van de Meent
@cfbot: rebased
@cfbot: rebased
> On Mon, Jul 15, 2024 at 03:43:24PM GMT, Justin Pryzby wrote:
> @cfbot: rebased

Hey Justin,

Thanks for rebasing.  To help with review, could you also describe the
current status of the patch?  I have to admit, currently the commit
message doesn't tell much, and looks more like notes for the future you.

The patch numbering is somewhat confusing as well, should it be v5 now?

From what I understand, the new patch does address the review feedback,
but you want to do more, something with copy to / copy from?

Since it's in the performance category, I'm also curious how much
overhead this shaves off.  I mean, I get that the bulk insert strategy
helps with buffer usage, as you've implied in the thread -- but what
does it look like in benchmark numbers?
On Tue, Nov 19, 2024 at 03:45:19PM +0100, Dmitry Dolgov wrote:
> > On Mon, Jul 15, 2024 at 03:43:24PM GMT, Justin Pryzby wrote:
> > @cfbot: rebased
>
> Thanks for rebasing.  To help with review, could you also describe the
> current status of the patch?  I have to admit, currently the commit
> message doesn't tell much, and looks more like notes for the future you.

The patch does what it aims to do and AFAIK in a reasonable way.  I'm not
aware of any issue with it.  It's, uh, waiting for review.

I'm happy to expand on the message to describe something like design
choices, but the goal here is really simple: why should wide column values
escape the intention of the ring buffer?  AFAICT it's fixing an omission.
If you have a question, please ask; that would help to indicate what needs
to be explained.

> The patch numbering is somewhat confusing as well, should it be v5 now?

The filename was 0001-WIP-use-BulkInsertState-for-toast-tuples-too.patch.
I guess you're referring to the previous filename: v4-*.  That shouldn't be
so confusing -- I just didn't specify a version, either by choice or by
omission.

> From what I understand, the new patch does address the review feedback,
> but you want to do more, something with copy to / copy from?

If I were to do more, it'd be for a future patch, if the current patch
were to ever progress.

> Since it's in the performance category, I'm also curious how much
> overhead this shaves off.  I mean, I get that the bulk insert strategy
> helps with buffer usage, as you've implied in the thread -- but what
> does it look like in benchmark numbers?

The intent of using a bistate isn't to help the performance of the process
using the bistate.  Rather, the intent is to avoid harming the performance
of other processes.  If anything, I expect it could slow down the process
using bistate -- the same as for non-toast data.

https://www.postgresql.org/message-id/CA%2BTgmobC6RD2N8kbPPTvATpUY1kisY2wJLh2jsg%3DHGoCp2RiXw%40mail.gmail.com

--
Justin
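For anyone skimming: the "ring buffer" here is what a bistate sets up, roughly
(a simplified sketch of GetBulkInsertState() from heapam.c; newer fields
elided):

/* Simplified sketch of GetBulkInsertState() (heapam.c); some fields elided.
 * Inserts done through a bistate use a small BAS_BULKWRITE buffer ring
 * instead of competing for all of shared_buffers -- which is why the
 * benefit shows up as less cache pollution for other processes rather
 * than as a speedup for the inserting process itself. */
BulkInsertState
GetBulkInsertState(void)
{
    BulkInsertState bistate;

    bistate = (BulkInsertState) palloc(sizeof(BulkInsertStateData));
    bistate->strategy = GetAccessStrategy(BAS_BULKWRITE);  /* small ring of buffers */
    bistate->current_buf = InvalidBuffer;                  /* pinned target page */
    /* ... newer fields related to relation extension elided ... */
    return bistate;
}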
> On Wed, Nov 20, 2024 at 06:43:58AM -0600, Justin Pryzby wrote:
>
> > Thanks for rebasing.  To help with review, could you also describe the
> > current status of the patch?  I have to admit, currently the commit
> > message doesn't tell much, and looks more like notes for the future you.
>
> The patch does what it aims to do and AFAIK in a reasonable way.  I'm
> not aware of any issue with it.  It's, uh, waiting for review.
>
> I'm happy to expand on the message to describe something like design
> choices, but the goal here is really simple: why should wide column
> values escape the intention of the ring buffer?  AFAICT it's fixing an
> omission.  If you have a question, please ask; that would help to
> indicate what needs to be explained.

Here is what I see in the commit message:

DONE: ALTER, CLUSTER
TODO: copyto, copyfrom?

slot_getsomeattrs
slot_deform_heap_tuple
fetchatt
heap_getnextslot => heapgettup => heapgetpage => ReadBufferExtended
initscan
table_beginscan
table_scan_getnextslot
RelationCopyStorageUsingBuffer
ReadBufferWithoutRelcache

(gdb) bt
#0  table_open (relationId=relationId@entry=16390, lockmode=lockmode@entry=1) at table.c:40
#1  0x000056444cb23d3c in toast_fetch_datum (attr=attr@entry=0x7f67933cc6cc) at detoast.c:372
#2  0x000056444cb24217 in detoast_attr (attr=attr@entry=0x7f67933cc6cc) at detoast.c:123
#3  0x000056444d07a4c8 in pg_detoast_datum_packed (datum=datum@entry=0x7f67933cc6cc) at fmgr.c:1743
#4  0x000056444d042c8d in text_to_cstring (t=0x7f67933cc6cc) at varlena.c:224
#5  0x000056444d0434f9 in textout (fcinfo=<optimized out>) at varlena.c:573
#6  0x000056444d078f10 in FunctionCall1Coll (flinfo=flinfo@entry=0x56444e4706b0, collation=collation@entry=0, arg1=arg1@entry=140082828592844) at fmgr.c:1124
#7  0x000056444d079d7f in OutputFunctionCall (flinfo=flinfo@entry=0x56444e4706b0, val=val@entry=140082828592844) at fmgr.c:1561
#8  0x000056444ccb1665 in CopyOneRowTo (cstate=cstate@entry=0x56444e470898, slot=slot@entry=0x56444e396d20) at copyto.c:975
#9  0x000056444ccb2c7d in DoCopyTo (cstate=cstate@entry=0x56444e470898) at copyto.c:891
#10 0x000056444ccab4c2 in DoCopy (pstate=pstate@entry=0x56444e396bb0, stmt=stmt@entry=0x56444e3759b0, stmt_location=0, stmt_len=48, processed=processed@entry=0x7ffc212a6310) at copy.c:308

cluster:
heapam_relation_copy_for_cluster
reform_and_rewrite_tuple
rewrite_heap_tuple
raw_heap_insert

This gave me the impression that the patch is deeply WIP, and that it
doesn't make any sense to review it.  I can imagine chances are good that
I'm not alone in getting such an impression, and that you lose potential
reviewers.  Thus, shaping up a meaningful commit message might be helpful.

> > Since it's in the performance category, I'm also curious how much
> > overhead this shaves off.  I mean, I get that the bulk insert strategy
> > helps with buffer usage, as you've implied in the thread -- but what
> > does it look like in benchmark numbers?
>
> The intent of using a bistate isn't to help the performance of the
> process using the bistate.  Rather, the intent is to avoid harming the
> performance of other processes.  If anything, I expect it could slow
> down the process using bistate -- the same as for non-toast data.
>
> https://www.postgresql.org/message-id/CA%2BTgmobC6RD2N8kbPPTvATpUY1kisY2wJLh2jsg%3DHGoCp2RiXw%40mail.gmail.com

Right, but the question is still there: how much does it bring?  My point
is that if you demonstrate "under this and that load, we get so and so
many percent boost", this will hopefully attract more attention to the
patch.