Re: TOAST condition for column size - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: TOAST condition for column size
Date
Msg-id CAFiTN-t4rZ2zjoM17QPTVDo46e+Epac-+2nQ3hDOKmwTXDbokA@mail.gmail.com
In response to TOAST condition for column size  (torikoshia <torikoshia@oss.nttdata.com>)
Responses Re: TOAST condition for column size  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Mon, Jan 18, 2021 at 7:53 PM torikoshia <torikoshia@oss.nttdata.com> wrote:
>
> Hi,
>
> When I created a table consisting of 400 VARCHAR columns and tried
> to INSERT a record whose columns were all the same size, there were
> cases where I got an error due to exceeding the size limit per
> row.
>
>    =# -- create a table consisting of 400 VARCHAR columns
>    =# CREATE TABLE t1 (c1 VARCHAR(100),
>                        c2 VARCHAR(100),
>                        ...
>                        c400 VARCHAR(100));
>
>    =# -- insert one record whose columns are all 20 bytes
>    =# INSERT INTO t1 VALUES (repeat('a', 20),
>                              repeat('a', 20),
>                              ...
>                              repeat('a', 20));
>      ERROR:  row is too big: size 8424, maximum size 8160
>
> What is interesting is that it failed only when the size of each
> column was 20~23 bytes, as shown below.
>
>    size of each column  |  result
>    -------------------------------
>            18 bytes     |  success
>            19 bytes     |  success
>            20 bytes     |  failure
>            21 bytes     |  failure
>            22 bytes     |  failure
>            23 bytes     |  failure
>            24 bytes     |  success
>            25 bytes     |  success
>
>
> When the size of each column was 19 bytes or less, the INSERT
> succeeded because the row size was within a page.
> When the size of each column was 24 bytes or more, it also
> succeeded because the columns were TOASTed and the row size was
> reduced to less than a page.
> OTOH, when it was more than 19 bytes and less than 24 bytes, the
> columns weren't TOASTed because they didn't meet the condition of
> the following if statement.
>
>   --src/backend/access/table/toast_helper.c
>
>     toast_tuple_find_biggest_attribute(ToastTupleContext *ttc,
>                           bool for_compression, bool check_main)
>         ...(snip)...
>         int32        biggest_size = MAXALIGN(TOAST_POINTER_SIZE);
>         ...(snip)...
>         if (ttc->ttc_attr[i].tai_size > biggest_size) // <- here
>         {
>             biggest_attno = i;
>             biggest_size = ttc->ttc_attr[i].tai_size;
>         }
>
>
> Since TOAST_POINTER_SIZE is 18 bytes but
> MAXALIGN(TOAST_POINTER_SIZE) is 24 bytes, a column is not TOASTed
> until its size becomes larger than 24 bytes.
>
> I confirmed these sizes in my environment but AFAIU they would be
> the same size in any environment.
>
> So, as a result of the alignment adjustment, inserts of 20~23
> bytes per column seem to fail.
>
> I wonder if it might be better not to apply the alignment here,
> as in the attached patch, because with it, inserting records of
> 20~23 bytes per column succeeded.
> Or are there reasons to apply the alignment here?
>
> I understand that TOAST is not effective for small data and it's
> not recommended to create a table containing hundreds of columns,
> but I think cases that can be successful should be successful.
>
> Any thoughts?

How can this be correct? While forming the tuple you might need the
alignment.  So if we compute the size without considering alignment,
but then have to align the data while actually forming the tuple, it
seems like this could corrupt the tuple.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


