Thread: Removed extra memory allocations from create_list_bounds

Removed extra memory allocations from create_list_bounds

From
Nitin Jadhav
Date:
Hi,

While working on [1], I observed that extra memory is allocated in the
'create_list_bounds' function, which can be avoided. The attached patch
removes the extra memory allocations done inside 'create_list_bounds' and
also removes the unused variable 'cell'.

In the existing code, create_list_bounds():
  1. Iterates through all the partitions, and for each partition:
    • Iterates through the list of datums named 'listdatums'.
      • For each non-null value in 'listdatums', allocates memory for 'list_value' (of type 'PartitionListValue') and stores the value and index information in it.
      • Appends 'list_value' to a list named 'non_null_values'.
  2. Allocates memory for the 'all_values' array, which holds the list bounds of all the partitions. The number of elements allocated is the total number of non-null values counted in step 1.
  3. Iterates through each item of the 'non_null_values' list.
    • Allocates memory for 'all_values[i]' (of type 'PartitionListValue') and copies the information from 'list_value'.
 The above logic is changed to the following (a rough sketch follows this list):
  1. Call 'get_non_null_count_list_bounds()', which iterates through all the partitions and, for each partition, iterates through its list of datums and counts the non-null bound values.
  2. Allocate memory for the 'all_values' array, which holds the list bounds of all the partitions. The number of elements allocated is the total number of non-null values counted in step 1.
  3. Iterate through all the partitions, and for each partition:
    • Iterate through the list of datums named 'listdatums'.
      • For each non-null value in 'listdatums', allocate memory for 'all_values[i]' (of type 'PartitionListValue') and store the value and index information directly.
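The shape of the change, as a simplified sketch only (this is not the actual
partbounds.c code; 'part_index' and 'val' stand for the current partition
index and the current datum, and the other names follow the description above):

    /* Existing code: two pallocs per non-null bound value. */
    PartitionListValue *list_value = palloc0(sizeof(PartitionListValue)); /* 1st palloc */
    list_value->index = part_index;
    list_value->value = val->constvalue;
    non_null_values = lappend(non_null_values, list_value);
    /* ... later, after counting list_length(non_null_values) ... */
    all_values[i] = palloc(sizeof(PartitionListValue));                   /* 2nd palloc */
    *all_values[i] = *list_value;

    /* Patched code: count first, then a single palloc per value, filled in
     * directly while walking 'listdatums' a second time. */
    ndatums = get_non_null_count_list_bounds(boundspecs, nparts);
    all_values = (PartitionListValue **)
        palloc(ndatums * sizeof(PartitionListValue *));
    all_values[i] = palloc(sizeof(PartitionListValue));                   /* only palloc */
    all_values[i]->index = part_index;
    all_values[i]->value = val->constvalue;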
This change removes the extra memory allocations. Let's consider an example.
Suppose there are 10 partitions and each partition has 11 bounds, including a NULL value.

Parameter                                    Existing code           With patch
Memory allocations of 'PartitionListValue'   100 + 100 = 200 times   100 times
Total number of iterations                   110 + 100 = 210         110 + 110 = 220

As we can see in the above data, the total number of iterations increases slightly
(only when there are NULL values; otherwise there is no change), but the number of
memory allocations improves. As memory allocations are costly operations, I feel we
should consider changing the existing code.

Please share your thoughts.


Thanks & Regards,
Nitin Jadhav

Attachment

Re: Removed extra memory allocations from create_list_bounds

From
Justin Pryzby
Date:
On Sat, May 15, 2021 at 02:40:45PM +0530, Nitin Jadhav wrote:
> While working on [1], I observed that extra memory is allocated in
> [1] https://mail.google.com/mail/u/2/#search/multi+column+list/KtbxLxgZZTjRxNrBWvmHzDTHXCHLssSprg?compose=CllgCHrjDqKgWCBNMmLqhzKhmrvHhSRlRVZxPCVcLkLmFQwrccpTpqLNgbWqKkTkTFCHMtZjWnV

This is a link to your gmail, not to anything public.

If it's worth counting list elements in advance, then you can also allocate the
PartitionListValue as a single chunk, rather than palloc in a loop.
This may help large partition hierarchies.
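Roughly like this (sketch only, reusing the names from your description):

    /* one palloc for the whole array, rather than one per element */
    all_values = (PartitionListValue *)
        palloc(ndatums * sizeof(PartitionListValue));
    ...
    /* each element is then filled in place */
    all_values[j].index = part_index;
    all_values[j].value = val->constvalue;
    j++;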

And the same thing in create_hash_bounds with hbounds.

create_range_bounds() already doesn't call palloc in a loop.  However,
there's still an asymmetry in create_range_bounds, which still takes a
double-indirect pointer.

I'm not able to detect that this is saving more than about ~1% less RAM, to
create or select from 1000 partitions, probably because other data structures
are using much more, and savings here are relatively small.

I'm going to add to the next CF.  You can add yourself as an author, and watch
that the patch passes tests in cfbot.
https://commitfest.postgresql.org/
http://cfbot.cputube.org/

Thanks,
-- 
Justin

Attachment

Re: Removed extra memory allocations from create_list_bounds

From
Zhihong Yu
Date:


On Sun, May 16, 2021 at 10:00 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Sat, May 15, 2021 at 02:40:45PM +0530, Nitin Jadhav wrote:
> While working on [1], I observed that extra memory is allocated in
> [1] https://mail.google.com/mail/u/2/#search/multi+column+list/KtbxLxgZZTjRxNrBWvmHzDTHXCHLssSprg?compose=CllgCHrjDqKgWCBNMmLqhzKhmrvHhSRlRVZxPCVcLkLmFQwrccpTpqLNgbWqKkTkTFCHMtZjWnV

This is a link to your gmail, not to anything public.

If it's worth counting list elements in advance, then you can also allocate the
PartitionListValue as a single chunk, rather than palloc in a loop.
This may help large partition heirarchies.

And the same thing in create_hash_bounds with hbounds.

create_range_bounds() already doesn't call palloc in a loop.  However, then
there's an asymmetry in create_range_bounds, which is still takes a
double-indirect pointer.

I'm not able to detect that this is saving more than about ~1% less RAM, to
create or select from 1000 partitions, probably because other data structures
are using much more, and savings here are relatively small.

I'm going to add to the next CF.  You can add yourself as an author, and watch
that the patch passes tests in cfbot.
https://commitfest.postgresql.org/
http://cfbot.cputube.org/

Thanks,
--
Justin
Hi,
For 0001-Removed-extra-memory-allocations-from-create_list_bo.patch :

+static int
+get_non_null_count_list_bounds(PartitionBoundSpec **boundspecs, int nparts)

Since the function returns the total number of non null bound values, should it be named get_non_null_list_bounds_count ?
It can also be named get_count_of_... but that's longer.

+   all_values = (PartitionListValue **)
+       palloc(ndatums * sizeof(PartitionListValue *));

The palloc() call would take place even if ndatums is 0. I think in that case, palloc() doesn't need to be called.

Cheers

Re: Removed extra memory allocations from create_list_bounds

From
Nitin Jadhav
Date:
> > While working on [1], I observed that extra memory is allocated in
> > [1] https://mail.google.com/mail/u/2/#search/multi+column+list/KtbxLxgZZTjRxNrBWvmHzDTHXCHLssSprg?compose=CllgCHrjDqKgWCBNMmLqhzKhmrvHhSRlRVZxPCVcLkLmFQwrccpTpqLNgbWqKkTkTFCHMtZjWnV

I am really sorry for this. I wanted to point to the thread with the
subject 'Multi-Column List Partitioning'.

> If it's worth counting list elements in advance, then you can also allocate the
> PartitionListValue as a single chunk, rather than palloc in a loop.
> This may help large partition heirarchies.
>
> And the same thing in create_hash_bounds with hbounds.

I agree and thanks for creating those patches. I am not able to apply
the patch on the latest HEAD. Kindly check and upload the modified
patches.

> I'm not able to detect that this is saving more than about ~1% less RAM, to
> create or select from 1000 partitions, probably because other data structures
> are using much more, and savings here are relatively small.

Yes it does not save huge memory but it's an improvement.

> I'm going to add to the next CF.  You can add yourself as an author, and watch
> that the patch passes tests in cfbot.
> https://commitfest.postgresql.org/
> http://cfbot.cputube.org/

Thanks for creating the commitfest entry.

> Since the function returns the total number of non null bound values, should it be named
> get_non_null_list_bounds_count?
> It can also be named get_count_of_... but that's longer.

Changed it to 'get_non_null_list_bounds_count'.

> The palloc() call would take place even if ndatums is 0. I think in that case, palloc() doesn't need to be called.

I don't think there is any case where 'ndatums' is 0 because, as we can
see below, there is an assert in the 'partition_bounds_create' function,
from which 'create_list_bounds' is called. Kindly point out such a case
if I am wrong.

PartitionBoundInfo
partition_bounds_create(PartitionBoundSpec **boundspecs, int nparts,
                        PartitionKey key, int **mapping)
{
    int         i;

    Assert(nparts > 0);

Thanks & Regards,
Nitin Jadhav
On Sun, May 16, 2021 at 10:52 PM Zhihong Yu <zyu@yugabyte.com> wrote:
>
>
>
> On Sun, May 16, 2021 at 10:00 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
>>
>> On Sat, May 15, 2021 at 02:40:45PM +0530, Nitin Jadhav wrote:
>> > While working on [1], I observed that extra memory is allocated in
>> > [1] https://mail.google.com/mail/u/2/#search/multi+column+list/KtbxLxgZZTjRxNrBWvmHzDTHXCHLssSprg?compose=CllgCHrjDqKgWCBNMmLqhzKhmrvHhSRlRVZxPCVcLkLmFQwrccpTpqLNgbWqKkTkTFCHMtZjWnV
>>
>> This is a link to your gmail, not to anything public.
>>
>> If it's worth counting list elements in advance, then you can also allocate the
>> PartitionListValue as a single chunk, rather than palloc in a loop.
>> This may help large partition heirarchies.
>>
>> And the same thing in create_hash_bounds with hbounds.
>>
>> create_range_bounds() already doesn't call palloc in a loop.  However, then
>> there's an asymmetry in create_range_bounds, which is still takes a
>> double-indirect pointer.
>>
>> I'm not able to detect that this is saving more than about ~1% less RAM, to
>> create or select from 1000 partitions, probably because other data structures
>> are using much more, and savings here are relatively small.
>>
>> I'm going to add to the next CF.  You can add yourself as an author, and watch
>> that the patch passes tests in cfbot.
>> https://commitfest.postgresql.org/
>> http://cfbot.cputube.org/
>>
>> Thanks,
>> --
>> Justin
>
> Hi,
> For 0001-Removed-extra-memory-allocations-from-create_list_bo.patch :
>
> +static int
> +get_non_null_count_list_bounds(PartitionBoundSpec **boundspecs, int nparts)
>
> Since the function returns the total number of non null bound values, should it be named
> get_non_null_list_bounds_count?
> It can also be named get_count_of_... but that's longer.
>
> +   all_values = (PartitionListValue **)
> +       palloc(ndatums * sizeof(PartitionListValue *));
>
> The palloc() call would take place even if ndatums is 0. I think in that case, palloc() doesn't need to be called.
>
> Cheers
>

Attachment

Re: Removed extra memory allocations from create_list_bounds

From
Zhihong Yu
Date:


On Mon, May 17, 2021 at 7:52 AM Nitin Jadhav <nitinjadhavpostgres@gmail.com> wrote:
> > While working on [1], I observed that extra memory is allocated in
> > [1] https://mail.google.com/mail/u/2/#search/multi+column+list/KtbxLxgZZTjRxNrBWvmHzDTHXCHLssSprg?compose=CllgCHrjDqKgWCBNMmLqhzKhmrvHhSRlRVZxPCVcLkLmFQwrccpTpqLNgbWqKkTkTFCHMtZjWnV

I am really sorry for this. I wanted to point to the thread subjected
to 'Multi-Column List Partitioning'.

> If it's worth counting list elements in advance, then you can also allocate the
> PartitionListValue as a single chunk, rather than palloc in a loop.
> This may help large partition heirarchies.
>
> And the same thing in create_hash_bounds with hbounds.

I agree and thanks for creating those patches. I am not able to apply
the patch on the latest HEAD. Kindly check and upload the modified
patches.

> I'm not able to detect that this is saving more than about ~1% less RAM, to
> create or select from 1000 partitions, probably because other data structures
> are using much more, and savings here are relatively small.

Yes it does not save huge memory but it's an improvement.

> I'm going to add to the next CF.  You can add yourself as an author, and watch
> that the patch passes tests in cfbot.
> https://commitfest.postgresql.org/
> http://cfbot.cputube.org/

Thanks for creating the commitfest entry.

> Since the function returns the total number of non null bound values, should it be named get_non_null_list_bounds_count ?
> It can also be named get_count_of_... but that's longer.

Changed it to 'get_non_null_list_bounds_count'.

> The palloc() call would take place even if ndatums is 0. I think in that case, palloc() doesn't need to be called.

I feel there is no such case where the 'ndatums' is 0 because as we
can see below, there is an assert in the 'partition_bounds_create'
function from where we call the 'create_list_bounds' function. Kindly
provide such a case if I am wrong.

PartitionBoundInfo
partition_bounds_create(PartitionBoundSpec **boundspecs, int nparts,
                        PartitionKey key, int **mapping)
{
    int         i;

    Assert(nparts > 0);

Hi,
Thanks for pointing out the assertion.
My corresponding comment can be dropped.

Cheers
 

Thanks & Regards,
Nitin Jadhav
On Sun, May 16, 2021 at 10:52 PM Zhihong Yu <zyu@yugabyte.com> wrote:
>
>
>
> On Sun, May 16, 2021 at 10:00 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
>>
>> On Sat, May 15, 2021 at 02:40:45PM +0530, Nitin Jadhav wrote:
>> > While working on [1], I observed that extra memory is allocated in
>> > [1] https://mail.google.com/mail/u/2/#search/multi+column+list/KtbxLxgZZTjRxNrBWvmHzDTHXCHLssSprg?compose=CllgCHrjDqKgWCBNMmLqhzKhmrvHhSRlRVZxPCVcLkLmFQwrccpTpqLNgbWqKkTkTFCHMtZjWnV
>>
>> This is a link to your gmail, not to anything public.
>>
>> If it's worth counting list elements in advance, then you can also allocate the
>> PartitionListValue as a single chunk, rather than palloc in a loop.
>> This may help large partition heirarchies.
>>
>> And the same thing in create_hash_bounds with hbounds.
>>
>> create_range_bounds() already doesn't call palloc in a loop.  However, then
>> there's an asymmetry in create_range_bounds, which is still takes a
>> double-indirect pointer.
>>
>> I'm not able to detect that this is saving more than about ~1% less RAM, to
>> create or select from 1000 partitions, probably because other data structures
>> are using much more, and savings here are relatively small.
>>
>> I'm going to add to the next CF.  You can add yourself as an author, and watch
>> that the patch passes tests in cfbot.
>> https://commitfest.postgresql.org/
>> http://cfbot.cputube.org/
>>
>> Thanks,
>> --
>> Justin
>
> Hi,
> For 0001-Removed-extra-memory-allocations-from-create_list_bo.patch :
>
> +static int
> +get_non_null_count_list_bounds(PartitionBoundSpec **boundspecs, int nparts)
>
> Since the function returns the total number of non null bound values, should it be named get_non_null_list_bounds_count ?
> It can also be named get_count_of_... but that's longer.
>
> +   all_values = (PartitionListValue **)
> +       palloc(ndatums * sizeof(PartitionListValue *));
>
> The palloc() call would take place even if ndatums is 0. I think in that case, palloc() doesn't need to be called.
>
> Cheers
>

Re: Removed extra memory allocations from create_list_bounds

From
Justin Pryzby
Date:
On Mon, May 17, 2021 at 08:22:25PM +0530, Nitin Jadhav wrote:
> I agree and thanks for creating those patches. I am not able to apply
> the patch on the latest HEAD. Kindly check and upload the modified
> patches.

The CFBOT had no issues with the patches, so I suspect an issue on your side.
http://cfbot.cputube.org/nitin-jadhav.html

-- 
Justin



Re: Removed extra memory allocations from create_list_bounds

From
Nitin Jadhav
Date:
> The CFBOT had no issues with the patches, so I suspect an issue on your side.
> http://cfbot.cputube.org/nitin-jadhav.html

I am getting the following error when I try to apply in my machine.

$ git apply ../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch
../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:18:
trailing whitespace.
/*
../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:19:
trailing whitespace.
 * get_non_null_count_list_bounds
../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:20:
trailing whitespace.
 * Calculates the total number of non null bound values of
../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:21:
trailing whitespace.
 * all the partitions.
../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:22:
trailing whitespace.
 */
error: patch failed: src/backend/partitioning/partbounds.c:432
error: src/backend/partitioning/partbounds.c: patch does not apply

However I was able to apply it by adding '--reject --whitespace=fix'.

$ git apply --reject --whitespace=fix
../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch
../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:18:
trailing whitespace.
/*
../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:19:
trailing whitespace.
 * get_non_null_count_list_bounds
../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:20:
trailing whitespace.
 * Calculates the total number of non null bound values of
../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:21:
trailing whitespace.
 * all the partitions.
../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:22:
trailing whitespace.
 */
Checking patch src/backend/partitioning/partbounds.c...
Applied patch src/backend/partitioning/partbounds.c cleanly.
warning: squelched 30 whitespace errors
warning: 35 lines add whitespace errors.

I have rebased all the patches on top of
'v2_0001-removed_extra_mem_alloc_from_create_list_bounds.patch'.
Attaching all the patches here.

--
Thanks & Regards,
Nitin Jadhav

On Mon, May 17, 2021 at 8:33 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Mon, May 17, 2021 at 08:22:25PM +0530, Nitin Jadhav wrote:
> > I agree and thanks for creating those patches. I am not able to apply
> > the patch on the latest HEAD. Kindly check and upload the modified
> > patches.
>
> The CFBOT had no issues with the patches, so I suspect an issue on your side.
> http://cfbot.cputube.org/nitin-jadhav.html
>
> --
> Justin

Attachment

Re: Removed extra memory allocations from create_list_bounds

From
Robert Haas
Date:
On Tue, May 18, 2021 at 1:29 PM Nitin Jadhav
<nitinjadhavpostgres@gmail.com> wrote:
> > The CFBOT had no issues with the patches, so I suspect an issue on your side.
> > http://cfbot.cputube.org/nitin-jadhav.html
>
> I am getting the following error when I try to apply in my machine.
>
> $ git apply ../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch
> ../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:18:
> trailing whitespace.

'git apply' is very picky. Use 'patch -p1' to apply your patches instead.

Also, use 'git diff --check' or 'git log --check' before generating
patches to send, and fix any whitespace errors before submitting.

I see that you have made a theoretical argument for why this should be
good for performance, but it would be better to have some test results
that show that it works out in practice. Sometimes things seem like
they ought to be more efficient but turn out to be less efficient when
they are actually tried.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Removed extra memory allocations from create_list_bounds

From
Nitin Jadhav
Date:
> 'git apply' is very picky. Use 'patch -p1' to apply your patches instead.
>
> Also, use 'git diff --check' or 'git log --check' before generating
> patches to send, and fix any whitespace errors before submitting.

Thanks for the suggestion. I will follow these.

> I see that you have made a theoretical argument for why this should be
> good for performance, but it would be better to have some test results
> that show that it works out in practice. Sometimes things seem like
> they ought to be more efficient but turn out to be less efficient when
> they are actually tried.

Created a table with one column of type 'int' and partitioned by that
column. Created 1 million partitions using the following queries.

create table t(a int) partition by list(a);
select 'create table t_' || i || ' partition of t for
values in (' || i || ');'
from generate_series(1, 10000) i
\gexec

After this, I tried to create 10 more partitions and observed the time
taken to execute each. Here is the data for 'with patch'.

postgres@34077=#select 'create table t_' || i || ' partition of t for
postgres'# values in (' || i || ');'
postgres-# from generate_series(10001, 10010) i
postgres-# \gexec
CREATE TABLE
Time: 36.863 ms
CREATE TABLE
Time: 46.645 ms
CREATE TABLE
Time: 44.915 ms
CREATE TABLE
Time: 39.660 ms
CREATE TABLE
Time: 42.188 ms
CREATE TABLE
Time: 43.163 ms
CREATE TABLE
Time: 44.374 ms
CREATE TABLE
Time: 45.117 ms
CREATE TABLE
Time: 40.340 ms
CREATE TABLE
Time: 38.604 ms

The data for 'without patch' looks like this.

postgres@31718=#select 'create table t_' || i || ' partition of t for
postgres'# values in (' || i || ');'
postgres-# from generate_series(10001, 10010) i
postgres-# \gexec
CREATE TABLE
Time: 45.917 ms
CREATE TABLE
Time: 46.815 ms
CREATE TABLE
Time: 44.180 ms
CREATE TABLE
Time: 48.163 ms
CREATE TABLE
Time: 45.884 ms
CREATE TABLE
Time: 48.330 ms
CREATE TABLE
Time: 48.614 ms
CREATE TABLE
Time: 48.376 ms
CREATE TABLE
Time: 46.380 ms
CREATE TABLE
Time: 48.233 ms

If we observe the above data, we can see the improvement with the patch.
The average time taken to create the last 10 partitions is:
With patch - 42.1869 ms
Without patch - 47.0892 ms

With respect to memory usage, AFAIK the allocated memory gets cleaned up
when the containing memory context is deallocated. So at the end of the
query we may see no difference in memory usage, but during query execution
less memory is requested than before. In some worst-case scenario with
little memory available, we might see 'out of memory' errors without the
patch while it still works with the patch. I have not done experiments
from this angle; I am happy to do so if required.
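To illustrate what I mean (a generic sketch of the memory-context behaviour,
not code from the patch):

    MemoryContext cxt = AllocSetContextCreate(CurrentMemoryContext,
                                               "bounds example",
                                               ALLOCSET_DEFAULT_SIZES);
    MemoryContext oldcxt = MemoryContextSwitchTo(cxt);
    PartitionListValue **values = palloc(ndatums * sizeof(PartitionListValue *));

    for (int i = 0; i < ndatums; i++)
        values[i] = palloc(sizeof(PartitionListValue)); /* peak usage grows here */

    MemoryContextSwitchTo(oldcxt);
    MemoryContextDelete(cxt);   /* everything above is freed in one step, so
                                 * end-of-query numbers look the same either way */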

Please share your thoughts.

--
Thanks & Regards,
Nitin Jadhav

On Tue, May 18, 2021 at 11:19 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, May 18, 2021 at 1:29 PM Nitin Jadhav
> <nitinjadhavpostgres@gmail.com> wrote:
> > > The CFBOT had no issues with the patches, so I suspect an issue on your side.
> > > http://cfbot.cputube.org/nitin-jadhav.html
> >
> > I am getting the following error when I try to apply in my machine.
> >
> > $ git apply ../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch
> > ../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:18:
> > trailing whitespace.
>
> 'git apply' is very picky. Use 'patch -p1' to apply your patches instead.
>
> Also, use 'git diff --check' or 'git log --check' before generating
> patches to send, and fix any whitespace errors before submitting.
>
> I see that you have made a theoretical argument for why this should be
> good for performance, but it would be better to have some test results
> that show that it works out in practice. Sometimes things seem like
> they ought to be more efficient but turn out to be less efficient when
> they are actually tried.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com



Re: Removed extra memory allocations from create_list_bounds

From
Nitin Jadhav
Date:
> Created a table with one column of type 'int' and partitioned by that
> column. Created 1 million partitions using following queries.

Sorry. It's not 1 million. It's 10,000 partitions.

--
Thanks & Regards,
Nitin Jadhav

On Thu, May 20, 2021 at 12:21 AM Nitin Jadhav
<nitinjadhavpostgres@gmail.com> wrote:
>
> > 'git apply' is very picky. Use 'patch -p1' to apply your patches instead.
> >
> > Also, use 'git diff --check' or 'git log --check' before generating
> > patches to send, and fix any whitespace errors before submitting.
>
> Thanks for the suggestion. I will follow these.
>
> > I see that you have made a theoretical argument for why this should be
> > good for performance, but it would be better to have some test results
> > that show that it works out in practice. Sometimes things seem like
> > they ought to be more efficient but turn out to be less efficient when
> > they are actually tried.
>
> Created a table with one column of type 'int' and partitioned by that
> column. Created 1 million partitions using following queries.
>
> create table t(a int) partition by list(a);
> select 'create table t_' || i || ' partition of t for
> values in (' || i || ');'
> from generate_series(1, 10000) i
> \gexec
>
> After this I tried to create 10 partitions and observed the time taken
> to execute. Here is the data for 'with patch'.
>
> postgres@34077=#select 'create table t_' || i || ' partition of t for
> postgres'# values in (' || i || ');'
> postgres-# from generate_series(10001, 10010) i
> postgres-# \gexec
> CREATE TABLE
> Time: 36.863 ms
> CREATE TABLE
> Time: 46.645 ms
> CREATE TABLE
> Time: 44.915 ms
> CREATE TABLE
> Time: 39.660 ms
> CREATE TABLE
> Time: 42.188 ms
> CREATE TABLE
> Time: 43.163 ms
> CREATE TABLE
> Time: 44.374 ms
> CREATE TABLE
> Time: 45.117 ms
> CREATE TABLE
> Time: 40.340 ms
> CREATE TABLE
> Time: 38.604 ms
>
> The data for 'without patch' looks like this.
>
> postgres@31718=#select 'create table t_' || i || ' partition of t for
> postgres'# values in (' || i || ');'
> postgres-# from generate_series(10001, 10010) i
> postgres-# \gexec
> CREATE TABLE
> Time: 45.917 ms
> CREATE TABLE
> Time: 46.815 ms
> CREATE TABLE
> Time: 44.180 ms
> CREATE TABLE
> Time: 48.163 ms
> CREATE TABLE
> Time: 45.884 ms
> CREATE TABLE
> Time: 48.330 ms
> CREATE TABLE
> Time: 48.614 ms
> CREATE TABLE
> Time: 48.376 ms
> CREATE TABLE
> Time: 46.380 ms
> CREATE TABLE
> Time: 48.233 ms
>
> If we observe above data, we can see the improvement with the patch.
> The average time taken to execute for the last 10 partitions is.
> With patch - 42.1869 ms
> Without patch - 47.0892 ms.
>
> With respect to memory usage, AFAIK the allocated memory gets cleaned
> during deallocation of the memory used by the memory context. So at
> the end of the query, we may see no difference in the memory usage but
> during the query execution it tries to get less memory than before.
> Maybe during some worst case scenario, if there is less memory
> available, we may see 'out of memory' errors without the patch but it
> may work with the patch. I have not done experiments in this angle. I
> am happy to do it if required.
>
> Please share your thoughts.
>
> --
> Thanks & Regards,
> Nitin Jadhav
>
> On Tue, May 18, 2021 at 11:19 PM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Tue, May 18, 2021 at 1:29 PM Nitin Jadhav
> > <nitinjadhavpostgres@gmail.com> wrote:
> > > > The CFBOT had no issues with the patches, so I suspect an issue on your side.
> > > > http://cfbot.cputube.org/nitin-jadhav.html
> > >
> > > I am getting the following error when I try to apply in my machine.
> > >
> > > $ git apply ../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch
> > > ../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:18:
> > > trailing whitespace.
> >
> > 'git apply' is very picky. Use 'patch -p1' to apply your patches instead.
> >
> > Also, use 'git diff --check' or 'git log --check' before generating
> > patches to send, and fix any whitespace errors before submitting.
> >
> > I see that you have made a theoretical argument for why this should be
> > good for performance, but it would be better to have some test results
> > that show that it works out in practice. Sometimes things seem like
> > they ought to be more efficient but turn out to be less efficient when
> > they are actually tried.
> >
> > --
> > Robert Haas
> > EDB: http://www.enterprisedb.com



Re: Removed extra memory allocations from create_list_bounds

From
Nitin Jadhav
Date:
> I see this as a code cleanup more than an performance optimization.

I agree with this. It is primarily a code cleanup, but it also improves
performance. I have done the performance testing just to confirm that it
really does improve performance.

> I think some of my patches could *increase* memory use, due to power-of-two
> allocation logic.  I think it's still a good idea, since it doesn't seem to be
> the dominant memory allocation.

I don't think that it will increase performance rather it adds to the improvement.

> I think you should be sure to do this within a transaction, without cassert,
> and maybe with fsync=off full_page_writes=off

Thanks for sharing this. I have applied the above settings and collected
the data below.

> It'd be interesting to know which patch(es) contributed to the improvement.
> It's possible that (say) patch 0001 alone gives all the gain, or that 0002
> diminishes the gains.
>
> I think there'll be an interest in committing the smallest possible patch to
> realize the gains, to minimize code churn an unrelated changes.

In order to answer the above points, I have divided the patches into 2 sets:
1. Only 0001 and 0002 - these are related to list partitioning and do not contain
changes related to the power-of-two allocation logic.
2. All 5 patches.

I have used the same testing procedure as explained in the previous mail.
Please find below the timing information for the last 10 partitioned
table creations.

Without patch   With 0001 and 0002   With all patch
17.105          14.722               13.878
15.897          14.427               13.493
15.991          15.424               14.751
17.965          16.487               19.491
19.704          19.042               21.278
18.98           18.949               18.123
18.986          21.713               17.585
21.273          20.596               19.623
18.839          18.521               17.605
20.724          18.774               19.242
18.5464         17.8655              17.5069   (averages)

As we can see in the above data, there is an improvement with both of the
patch sets.

> You can check MAXRSS (at least on linux) if you enable log_executor_stats,
> like:
>
> \set QUIET \\ SET client_min_messages=debug; SET log_executor_stats=on; DROP TABLE p; CREATE TABLE p(i int) PARTITION BY LIST(i); CREATE TABLE pd PARTITION OF p DEFAULT;
> SELECT format('CREATE TABLE p%s PARTITION OF p FOR VALUES IN (%s)', a,a) FROM generate_series(1,999)a;\gexec \\ SELECT;

Thanks a lot for sharing this information. It was really helpful.
I have collected the stat information for creation of 1000 
partitions. Please find the stat information for the 'without patch'
case below.

LOG:  EXECUTOR STATISTICS
DETAIL:  ! system usage stats:
! 0.000012 s user, 0.000000 s system, 0.000011 s elapsed
! [71.599426 s user, 4.362552 s system total]
! 63872 kB max resident size
! 0/0 [0/231096] filesystem blocks in/out
! 0/0 [0/2388074] page faults/reclaims, 0 [0] swaps
! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
! 0/0 [9982/6709] voluntary/involuntary context switches

Please find the stat information for 'with patch (all 5 patches)' case below.

LOG:  EXECUTOR STATISTICS
DETAIL:  ! system usage stats:
! 0.000018 s user, 0.000002 s system, 0.000013 s elapsed
! [73.529715 s user, 4.219172 s system total]
! 63152 kB max resident size
! 0/0 [0/204840] filesystem blocks in/out
! 0/0 [0/2066377] page faults/reclaims, 0 [0] swaps
! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
! 0/0 [9956/4129] voluntary/involuntary context switches

Please share your thoughts.

--
Thanks & Regards,
Nitin Jadhav



On Thu, May 20, 2021 at 12:47 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Tue, May 18, 2021 at 01:49:12PM -0400, Robert Haas wrote:
> I see that you have made a theoretical argument for why this should be
> good for performance, but it would be better to have some test results
> that show that it works out in practice. Sometimes things seem like
> they ought to be more efficient but turn out to be less efficient when
> they are actually tried.

I see this as a code cleanup more than a performance optimization.
I couldn't see a measurable difference in my tests, involving CREATE TABLE and
SELECT.

I think some of my patches could *increase* memory use, due to power-of-two
allocation logic.  I think it's still a good idea, since it doesn't seem to be
the dominant memory allocation.

On Thu, May 20, 2021 at 12:21:19AM +0530, Nitin Jadhav wrote:
> > I see that you have made a theoretical argument for why this should be
> > good for performance, but it would be better to have some test results
> > that show that it works out in practice. Sometimes things seem like
> > they ought to be more efficient but turn out to be less efficient when
> > they are actually tried.
>
> After this I tried to create 10 partitions and observed the time taken
> to execute. Here is the data for 'with patch'.
>
> postgres@34077=#select 'create table t_' || i || ' partition of t for
> postgres'# values in (' || i || ');'
> postgres-# from generate_series(10001, 10010) i
> postgres-# \gexec

I think you should be sure to do this within a transaction, without cassert,
and maybe with fsync=off full_page_writes=off.

> If we observe above data, we can see the improvement with the patch.
> The average time taken to execute for the last 10 partitions is.

It'd be interesting to know which patch(es) contributed to the improvement.
It's possible that (say) patch 0001 alone gives all the gain, or that 0002
diminishes the gains.

I think there'll be an interest in committing the smallest possible patch to
realize the gains, to minimize code churn and unrelated changes.

LIST and RANGE might need to be checked separately..

> With respect to memory usage, AFAIK the allocated memory gets cleaned
> during deallocation of the memory used by the memory context. So at
> the end of the query, we may see no difference in the memory usage but
> during the query execution it tries to get less memory than before.

You can check MAXRSS (at least on linux) if you enable log_executor_stats,
like:

\set QUIET \\ SET client_min_messages=debug; SET log_executor_stats=on; DROP TABLE p; CREATE TABLE p(i int) PARTITION BY LIST(i); CREATE TABLE pd PARTITION OF p DEFAULT;
SELECT format('CREATE TABLE p%s PARTITION OF p FOR VALUES IN (%s)', a,a) FROM generate_series(1,999)a;\gexec \\ SELECT;

--
Justin

Re: Removed extra memory allocations from create_list_bounds

From
Nitin Jadhav
Date:
> > I think some of my patches could *increase* memory use, due to power-of-two
> > allocation logic.  I think it's still a good idea, since it doesn't seem to be
> > the dominant memory allocation.
>
> I don't think that it will increase performance rather it adds to the improvement.

Sorry. Kindly ignore the above comment. I had misunderstood the statement.

Thanks & Regards,
Nitin Jadhav

On Sun, May 23, 2021 at 10:40 PM Nitin Jadhav <nitinjadhavpostgres@gmail.com> wrote:
> I see this as a code cleanup more than an performance optimization.

I agree with this. This is like a code cleanup but it also improves performance.
I have done the performance testing, Just to confirm whether it really improves 
performance.

> I think some of my patches could *increase* memory use, due to power-of-two
> allocation logic.  I think it's still a good idea, since it doesn't seem to be
> the dominant memory allocation.

I don't think that it will increase performance rather it adds to the improvement.

> I think you should be sure to do this within a transaction, without cassert,
> and maybe with fsync=off full_page_writes=off

Thanks for sharing this. I have done the above settings and collected the
below data.

> It'd be interesting to know which patch(es) contributed to the improvement.
> It's possible that (say) patch 0001 alone gives all the gain, or that 0002
> diminishes the gains.
>
> I think there'll be an interest in committing the smallest possible patch to
> realize the gains, to minimize code churn an unrelated changes.

In order to answer the above points, I have divided the patches into 2 sets.
1. Only 0001 and 0002 - These are related to list partitioning and do not contain 
changes related power-of-two allocation logic.
2. This contains all the 5 patches.

I have used the same testing procedure as explained in the previous mail.
Please find the timing information of the last 10 creation of partitioned
tables as given below.

Without patch   With 0001 and 0002   With all patch
17.105          14.722               13.878
15.897          14.427               13.493
15.991          15.424               14.751
17.965          16.487               19.491
19.704          19.042               21.278
18.98           18.949               18.123
18.986          21.713               17.585
21.273          20.596               19.623
18.839          18.521               17.605
20.724          18.774               19.242
18.5464         17.8655              17.5069   (averages)

As we can see in the above data, there is an improvement with both of the
patch sets.

> You can check MAXRSS (at least on linux) if you enable log_executor_stats,
> like:
>
> \set QUIET \\ SET client_min_messages=debug; SET log_executor_stats=on; DROP TABLE p; CREATE TABLE p(i int) PARTITION BY LIST(i); CREATE TABLE pd PARTITION OF p DEFAULT;
> SELECT format('CREATE TABLE p%s PARTITION OF p FOR VALUES IN (%s)', a,a) FROM generate_series(1,999)a;\gexec \\ SELECT;

Thanks a lot for sharing this information. It was really helpful.
I have collected the stat information for creation of 1000 
partitions. Please find the stat information for the 'without patch'
case below.

LOG:  EXECUTOR STATISTICS
DETAIL:  ! system usage stats:
! 0.000012 s user, 0.000000 s system, 0.000011 s elapsed
! [71.599426 s user, 4.362552 s system total]
! 63872 kB max resident size
! 0/0 [0/231096] filesystem blocks in/out
! 0/0 [0/2388074] page faults/reclaims, 0 [0] swaps
! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
! 0/0 [9982/6709] voluntary/involuntary context switches

Please find the stat information for 'with patch (all 5 patches)' case below.

LOG:  EXECUTOR STATISTICS
DETAIL:  ! system usage stats:
! 0.000018 s user, 0.000002 s system, 0.000013 s elapsed
! [73.529715 s user, 4.219172 s system total]
! 63152 kB max resident size
! 0/0 [0/204840] filesystem blocks in/out
! 0/0 [0/2066377] page faults/reclaims, 0 [0] swaps
! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
! 0/0 [9956/4129] voluntary/involuntary context switches

Please share your thoughts.

--
Thanks & Regards,
Nitin Jadhav



On Thu, May 20, 2021 at 12:47 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Tue, May 18, 2021 at 01:49:12PM -0400, Robert Haas wrote:
> I see that you have made a theoretical argument for why this should be
> good for performance, but it would be better to have some test results
> that show that it works out in practice. Sometimes things seem like
> they ought to be more efficient but turn out to be less efficient when
> they are actually tried.

I see this as a code cleanup more than an performance optimization.
I couldn't see a measurable difference in my tests, involving CREATE TABLE and
SELECT.

I think some of my patches could *increase* memory use, due to power-of-two
allocation logic.  I think it's still a good idea, since it doesn't seem to be
the dominant memory allocation.

On Thu, May 20, 2021 at 12:21:19AM +0530, Nitin Jadhav wrote:
> > I see that you have made a theoretical argument for why this should be
> > good for performance, but it would be better to have some test results
> > that show that it works out in practice. Sometimes things seem like
> > they ought to be more efficient but turn out to be less efficient when
> > they are actually tried.
>
> After this I tried to create 10 partitions and observed the time taken
> to execute. Here is the data for 'with patch'.
>
> postgres@34077=#select 'create table t_' || i || ' partition of t for
> postgres'# values in (' || i || ');'
> postgres-# from generate_series(10001, 10010) i
> postgres-# \gexec

I think you should be sure to do this within a transaction, without cassert,
and maybe with fsync=off full_page_writes=off.

> If we observe above data, we can see the improvement with the patch.
> The average time taken to execute for the last 10 partitions is.

It'd be interesting to know which patch(es) contributed to the improvement.
It's possible that (say) patch 0001 alone gives all the gain, or that 0002
diminishes the gains.

I think there'll be an interest in committing the smallest possible patch to
realize the gains, to minimize code churn an unrelated changes.

LIST and RANGE might need to be checked separately..

> With respect to memory usage, AFAIK the allocated memory gets cleaned
> during deallocation of the memory used by the memory context. So at
> the end of the query, we may see no difference in the memory usage but
> during the query execution it tries to get less memory than before.

You can check MAXRSS (at least on linux) if you enable log_executor_stats,
like:

\set QUIET \\ SET client_min_messages=debug; SET log_executor_stats=on; DROP TABLE p; CREATE TABLE p(i int) PARTITION BY LIST(i); CREATE TABLE pd PARTITION OF p DEFAULT;
SELECT format('CREATE TABLE p%s PARTITION OF p FOR VALUES IN (%s)', a,a) FROM generate_series(1,999)a;\gexec \\ SELECT;

--
Justin

Re: Removed extra memory allocations from create_list_bounds

From
Nitin Jadhav
Date:
> You checked LIST but not HASH (patches 3-4) or RANGE (patch 4-5), right?

Yes. I did not check the HASH and RANGE partitioning related patches,
as the changes are mostly similar to the list partitioning related
changes.

> Another test is to show the time/memory used by SELECT.  That's far more
> important than DDL, but I think the same results would apply here, so I think
> it's not needed to test each of LIST/RANGE/HASH, nor to test every combination
> of patches.

Yes. I also feel that the same result would apply there as well.

> Note that for the MAXRSS test, you must use a different postgres backend process
> for each of the tests (or else each test would never show a lower number than
> the previous test).

I have used different backend processes for each of the tests.

> Mostly it's nice to see if the memory use is more visibly
> different, or if there's an impressive improvement for this case.

I did not get this point. Kindly explain for which scenario the memory
usage test has to be done.

Thanks & Regards,
Nitin Jadhav


On Sun, May 23, 2021 at 11:16 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Sun, May 23, 2021 at 10:40:16PM +0530, Nitin Jadhav wrote:
> > I have used the same testing procedure as explained in the previous mail.
> > Please find the timing information of the last 10 creation of partitioned
> > tables as given below.
>
> > Without patch With 0001 and 0002 With all patch
> ...
> > 18.5464 17.8655 17.5069
>
> For anyone reading non-HTML email, the last line shows the averages of the
> previous 10 lines.
>
> >> LIST and RANGE might need to be checked separately..
>
> You checked LIST but not HASH (patches 3-4) or RANGE (patch 4-5), right?
>
> Another test is to show the time/memory used by SELECT.  That's far more
> important than DDL, but I think the same results would apply here, so I think
> it's not needed to test each of LIST/RANGE/HASH, nor to test every combination
> of patches.  Mostly it's nice to see if the memory use is more visibly
> different, or if there's an impressive improvement for this case.
>
> Note that for the MAXRSS test, you must use a different postgres backend process
> for each of the tests (or else each test would never show a lower number than
> the previous test).
>
> Thanks,
> --
> Justin



Re: Removed extra memory allocations from create_list_bounds

From
David Rowley
Date:
On Wed, 19 May 2021 at 05:28, Nitin Jadhav
<nitinjadhavpostgres@gmail.com> wrote:
> I have rebased all the patches on top of
> 'v2_0001-removed_extra_mem_alloc_from_create_list_bounds.patch'.
> Attaching all the patches here.

I had a look over these and I think what's being done here is fine.

I think this will help speed up building the partition bound.
Unfortunately, it won't help any for speeding up things like finding
the correct partition during SELECT  or DML on partitioned tables.
The reason for this is that RelationBuildPartitionDesc first builds
the partition bound using the functions being modified here, but it
then copies it into the relcache into a memory context for the
partition using partition_bounds_copy().  It looks like
partition_bounds_copy() suffers from the same palloc explosion type
problem as is being fixed in each of the create_*_bounds() functions
here.   The good news is, we can just give partition_bounds_copy() the
same treatment. 0004 does that.

I tried to see if the better cache locality of the
PartitionBoundInfo's datum array would help speed up inserts into a
partitioned table. I figured a fairly small binary search in a LIST
partitioned table of 10 partitions might have all the Datums it visits on
the same cache line. However, I was unable to see any performance
gains. I think the other work being done is likely just going to drown
out any gains in cache efficiency in the binary search.  COPY into a
partitioned table might have more chance of becoming a little faster,
but I didn't try.

I've attached another set of patches. I squashed all the changes to
each create_*_bounds function into a patch of their own. Perhaps 0002
and 0003 which handle range and hash partitioning can be one patch
since Justin seemed to write that one.  I kept 0001 separate since
that's Nitin's patch plus Justin's extra parts. It seems easier to
credit properly having the patches broken out like this.  I think it's
excessive to break down 0001 into Nitin and Justin's individual parts.

I did make a few adjustments to the patches, renaming a variable or two,
and I changed how we assign the boundinfo->datums[i] pointers to take
the address of the Nth element rather than incrementing the variable
pointing to the array for each item.  I personally like p = &array[i];
more than p = array; array++; others may not feel the same.
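For clarity, the two styles are roughly (a generic illustration with made-up
variable names, not the exact partbounds.c lines):

    /* style 1: keep a cursor pointer and advance it for each item */
    boundinfo->datums[i] = datum_cursor;
    datum_cursor += natts_per_bound;

    /* style 2: index the allocated chunk directly (used in the attached patches) */
    boundinfo->datums[i] = &bound_datums[i * natts_per_bound];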

Nitin and Justin, are you both able to have another look over these
and let me know what you think?  If all is well, I'd like to push all 4
patches.

David

Attachment

Re: Removed extra memory allocations from create_list_bounds

From
Justin Pryzby
Date:
On Tue, Jul 06, 2021 at 01:48:52AM +1200, David Rowley wrote:
> On Wed, 19 May 2021 at 05:28, Nitin Jadhav <nitinjadhavpostgres@gmail.com> wrote:
> > I have rebased all the patches on top of
> > 'v2_0001-removed_extra_mem_alloc_from_create_list_bounds.patch'.
> > Attaching all the patches here.
> 
> I had a look over these and I think what's being done here is fine.

Thanks for looking.

0001 is missing a newline before create_list_bounds()

0003 is missing pfree(all_bounds), which I had as 0005.
It 1) allocates all_bounds; 2) allocates rbounds; 3) copies all_bounds into
rbounds; 4) allocates boundDatums; 5) copies rbounds into boundDatums; 6) frees
rbounds; 7) returns boundInfo with boundinfo->datums.

> The good news is, we can just give partition_bounds_copy() the same
> treatment. 0004 does that.

+1

> I've attached another set of patches. I squashed all the changes to
> each create_*_bounds function into a patch of their own. Perhaps 0002
> and 0003 which handle range and hash partitioning can be one patch
> since Justin seemed to write that one.  I kept 0001 separate since
> that's Nitin's patch plus Justin's extra parts. It seems easier to
> credit properly having the patches broken out like this.  I think it's
> excessive to break down 0001 into Nitin and Justin's individual parts.

If you wanted to further squish the patches together, I don't mind being a
co-author.

Cheers,
-- 
Justin



Re: Removed extra memory allocations from create_list_bounds

From
Justin Pryzby
Date:
Also, if you're going to remove the initializations here, maybe you'd also
change i and j to C99 "for" declarations like "for (int i=0, j=0; ...)"

-       PartitionListValue **all_values = NULL;
-       ListCell   *cell;
-       int                     i = 0;
-       int                     ndatums = 0;
+       PartitionListValue *all_values;
+       int                     i;
+       int                     j;
+       int                     ndatums;

Same in get_non_null_list_datum_count()
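That is, something like (illustrative only):

    /* C99 style: declare the loop counters in the "for" header */
    for (int i = 0, j = 0; i < nparts; i++)
    {
        ...
    }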

-- 
Justin



Re: Removed extra memory allocations from create_list_bounds

From
David Rowley
Date:
On Tue, 6 Jul 2021 at 04:45, Justin Pryzby <pryzby@telsasoft.com> wrote:
> If you wanted to further squish the patches together, I don't mind being a
> co-author.

Thanks for looking at the patches.

I fixed the couple of things that you mentioned and pushed all 4
patches as a single commit (53d86957e)

David



Re: Removed extra memory allocations from create_list_bounds

From
David Rowley
Date:
On Tue, 6 Jul 2021 at 05:03, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> Also, if you're going to remove the initializations here, maybe you'd also
> change i and j to C99 "for" declarations like "for (int i=0, j=0; ...)"
>
> -       PartitionListValue **all_values = NULL;
> -       ListCell   *cell;
> -       int                     i = 0;
> -       int                     ndatums = 0;
> +       PartitionListValue *all_values;
> +       int                     i;
> +       int                     j;
> +       int                     ndatums;
>
> Same in get_non_null_list_datum_count()

I tend to only get motivated to use that for new code that does not
exist in back-branches.  I'll maybe stop doing that when we no longer
have to support the pre-C99 versions of the code.

David