TOAST condition for column size - Mailing list pgsql-hackers

From torikoshia
Subject TOAST condition for column size
Date
Msg-id 6a621b00e44babb19d04ac1c00ccaab7@oss.nttdata.com
Whole thread Raw
Responses Re: TOAST condition for column size
Re: TOAST condition for column size
List pgsql-hackers
Hi,

When I created a table consisting of 400 VARCHAR columns and tried
to INSERT a record which rows were all the same size, there were
cases where I got an error due to exceeding the size limit per
row.

   =# -- create a table consisting of 400 VARCHAR columns
   =# CREATE TABLE t1 (c1 VARCHAR(100),
                       c2 VARCHAR(100),
                       ...
                       c400 VARCHAR(100));

   =# -- insert one record which rows are all 20 bytes
   =# INSERT INTO t1 VALUES (repeat('a', 20),
                             repeat('a', 20),
                             ...
                             repeat('a', 20));
     ERROR:  row is too big: size 8424, maximum size 8160

What is interesting is that it failed only when the size of each
column was 20~23 bytes, as shown below.

   size of each column  |  result
   -------------------------------
           18 bytes     |  success
           19 bytes     |  success
           20 bytes     |  failure
           21 bytes     |  failure
           22 bytes     |  failure
           23 bytes     |  failure
           24 bytes     |  success
           25 bytes     |  success


When the size of each column was 19 bytes or less, it succeeds
because the row size is within a page size.
When the size of each column was 24 bytes or more, it also
succeeds because columns are TOASTed and the row size is reduced
to less than one page size.
OTOH, when it's more than 19 bytes and less than 24 bytes,
columns aren't TOASTed because it doesn't meet the condition of
the following if statement.

  --src/backend/access/table/toast_helper.c

    toast_tuple_find_biggest_attribute(ToastTupleContext *ttc,
                          bool for_compression, bool check_main)
        ...(snip)...
        int32        biggest_size = MAXALIGN(TOAST_POINTER_SIZE);
        ...(snip)...
        if (ttc->ttc_attr[i].tai_size > biggest_size) // <- here
        {
            biggest_attno = i;
            biggest_size = ttc->ttc_attr[i].tai_size;
        }


Since TOAST_POINTER_SIZE is 18 bytes but
MAXALIGN(TOAST_POINTER_SIZE) is 24 bytes, columns are not TOASTed
until its size becomes larger than 24 bytes.

I confirmed these sizes in my environment but AFAIU they would be
the same size in any environment.

So, as a result of adjusting the alignment, 20~23 bytes seems to
fail.

I wonder if it might be better not to adjust the alignment here
as an attached patch because it succeeded in inserting 20~23
bytes records.
Or is there reasons to add the alignment here?

I understand that TOAST is not effective for small data and it's
not recommended to create a table containing hundreds of columns,
but I think cases that can be successful should be successful.

Any thoughts?


Regards,

--
Atsushi Torikoshi
Attachment

pgsql-hackers by date:

Previous
From: Zhihong Yu
Date:
Subject: Re: Transactions involving multiple postgres foreign servers, take 2
Next
From: Erik Rijkers
Date:
Subject: Re: proposal: schema variables