Thread: reorganizing partitioning code (was: Re: [HACKERS] path toward fasterpartition pruning)

Re-posting my earlier email to start a new thread.

On 2018/02/09 2:58, Alvaro Herrera wrote:> Robert Haas wrote:
>> On Wed, Feb 7, 2018 at 3:42 AM, Ashutosh Bapat
>> <ashutosh.bapat@enterprisedb.com> wrote:
>
>>> partition.c seems to have two kinds of functions 1. that build and
>>> manage relcache, creates quals from bounds etc. which are metadata
>>> management kind 2. partition bound comparison functions, and other
>>> optimizer related functions. May be we should divide the file that
>>> way. The first category code remains in catalog/ as it is today. The
>>> second catagory functions move to optimizer/.
>>
>> It would be sensible to separate functions that build and manage data
>> in the relcache from other functions.  I think we should consider
>> moving the existing functions of that type from partition.c to
>> src/backend/utils/cache/partcache.c.
>
> FWIW I've been thinking that perhaps we need some other separation of
> code better than statu quo.  The current partition.c file includes stuff
> for several modules and ISTM all these new patches are making more and
> more of a mess.  So +1 to the general idea of splitting things up.
> Maybe partcache.c is not ambitious enough, but it seems a good first
> step.

Agree with the proposed reorganizing and adding a partcache.c, which I
tried to do in the attached patch.

* The new src/backend/utils/cache/partcache.c contains functions that
initialize relcache's partitioning related fields.  Various partition
bound comparison and search functions (and then some) that work off of the
cached information are moved.  Also, since we cache partition qual,
interface functions RelationGetPartitioQual(Relation) and
get_partition_qual_relid(Oid) are moved too.

* The new src/include/utils/partcache.h contains various struct
definitions that are moved from backend/catalog/partition.c,
include/catalog/partition.h, and include/utils/rel.h.  Also, declarations
of interface functions of partcache.c.

Thoughts?

Thanks,
Amit

Attachment
On 2018/02/13 23:08, Ashutosh Bapat wrote:
> On Tue, Feb 13, 2018 at 2:17 PM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>
>> Agree with the proposed reorganizing and adding a partcache.c, which I
>> tried to do in the attached patch.
>>
>> * The new src/backend/utils/cache/partcache.c contains functions that
>> initialize relcache's partitioning related fields.  Various partition
>> bound comparison and search functions (and then some) that work off of the
>> cached information are moved.
>
> Are you moving partition bound comparison functions to partcache.c?
> They will also used by optimizer, so may be leave them out of
> partcache.c?

Yes, I moved the partition bound comparison and search functions, because
I thought that they should live with the other code that manages the
cached information.  So, I moved not only the code that reads the catalogs
and builds the partition key, partition (bound) descriptor, and partition
qual (for individual partitions), but also the code that uses those
structures.

So, with the new arrangement, optimizer will include utils/partcache.h,
instead of catalog/partition.h.

Thanks,
Amit



On 2018/02/14 10:00, Amit Langote wrote:
> Agree with the proposed reorganizing and adding a partcache.c, which I
> tried to do in the attached patch.
> 
> * The new src/backend/utils/cache/partcache.c contains functions that
> initialize relcache's partitioning related fields.  Various partition
> bound comparison and search functions (and then some) that work off of the
> cached information are moved.  Also, since we cache partition qual,
> interface functions RelationGetPartitioQual(Relation) and
> get_partition_qual_relid(Oid) are moved too.
> 
> * The new src/include/utils/partcache.h contains various struct
> definitions that are moved from backend/catalog/partition.c,
> include/catalog/partition.h, and include/utils/rel.h.  Also, declarations
> of interface functions of partcache.c.

Attached updated version, where I removed #include "catalog/partition.h"
from even more places and also moved map_partition_varattnos() next to
map_variable_attnos() in rewriteManip.c.

Now, after applying the patch -

Files #including partition.h

  File                                Line
0 src/backend/catalog/heap.c          44 #include "catalog/partition.h"
1 src/backend/catalog/partition.c     22 #include "catalog/partition.h"
2 src/backend/commands/indexcmds.c    26 #include "catalog/partition.h"
3 src/backend/commands/tablecmds.c    33 #include "catalog/partition.h"
4 src/backend/utils/cache/partcache.c 23 #include "catalog/partition.h"


Files #including partcache.h

  File                                   Line
0 src/backend/optimizer/path/joinrels.c  24 #include "utils/partcache.h"
1 src/backend/optimizer/prep/prepunion.c 51 #include "utils/partcache.h"
2 src/backend/optimizer/util/relnode.c   30 #include "utils/partcache.h"
3 src/backend/utils/cache/partcache.c    37 #include "utils/partcache.h"
4 src/backend/utils/cache/relcache.c     83 #include "utils/partcache.h"
5 src/include/executor/execPartition.h   19 #include "utils/partcache.h"
6 src/include/utils/rel.h                27 #include "utils/partcache.h"

Thanks,
Amit

Attachment
This is looking attractive.

Please don't #include postgres.h in partcache.h.  That's per policy.

Why do you need to #include parsenodes.h in partcache.h?

I think rewriteManip.h can do with just relcache.h rather than rel.h
(probably partition.h can do likewise)

thanks

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


BTW may I suggest that

    git config diff.algorithm=histogram

results in better diffs?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: reorganizing partitioning code

From
Amit Langote
Date:
On 2018/02/15 5:30, Alvaro Herrera wrote:
> BTW may I suggest that
> 
>     git config diff.algorithm=histogram
> 
> results in better diffs?

Aha!  That's much better.

Thanks,
Amit



Re: reorganizing partitioning code

From
Amit Langote
Date:
Thanks for the review.

On 2018/02/15 5:25, Alvaro Herrera wrote:
> This is looking attractive.
> 
> Please don't #include postgres.h in partcache.h.  That's per policy.

Oops, fixed.

> Why do you need to #include parsenodes.h in partcache.h?

I thought it was needed because there was this:

extern void check_new_partition_bound(char *relname, Relation parent,
                          PartitionBoundSpec *spec);

in partcache.h and PartitionBoundSpec is defined in parsenodes.h.
Removing the #include turned out to be fine, that is, after I put it in
partition.h instead.

> I think rewriteManip.h can do with just relcache.h rather than rel.h
> (probably partition.h can do likewise)

Hmm.  map_partition_varattnos() that I moved to rewriteManip.c wants to
use RelationGetDescr(), but partition.h was fine.

Attached updated patch.

Thanks,
Amit

Attachment

Re: reorganizing partitioning code

From
Amit Langote
Date:
Added to CF here: https://commitfest.postgresql.org/17/1520/

Thanks,
Amit



Re: reorganizing partitioning code

From
Kyotaro HORIGUCHI
Date:
Hello. I'd like to make a humble comment.

At Thu, 15 Feb 2018 19:31:47 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<8906c861-ea47-caee-c860-eff8d7e1dbc0@lab.ntt.co.jp>
> Added to CF here: https://commitfest.postgresql.org/17/1520/

The reorganization adds/removes header files to/from .[ch]
files. That can easily leave useless includes and they're hardly
noticed. Following files touched in this patch have such useless
includes after this patch.

On the other hand, naive decision of this kind of cleanup can
lead to curruption. [1] So, I don't insisit that the all of the
following *should* amended, especially for rel.h.

[1] https://www.postgresql.org/message-id/6748.1518711125@sss.pgh.pa.us


==== nodeModifyTable.c:
> +#include "rewrite/rewriteManip.h"

rewriteManip.h is changed to include rel.h so rel.h is no longer
need to be included in the file. (It is also included in lmgr.h
so it was needless since before this patch, though.)

==== hba.c:
> +#include "catalog/objectaddress.h"

objectaddress.h includes acl.h so acl.h is no longer useful.

==== joinrels.c:
> +#include "utils/partcache.h"

partcache.h includes lsyscache.h.

==== prepunion.c:
> +#include "utils/partcache.h"

partcache.h includes lsyscache.h and partcache.h is included in
rel.h. So partcache.h and lsyscache.h are not required.

==== utility.c:
> +#include "utils/rel.h"

rel.h includes xlog.h (and xlog.h includes fd.h). The last two
are removable.

==== partcache.c:
parsenodes.h is included from some other files.
rel.h is included from rewriteManip.h.
partcache.h is included from rel.h
As the result, parsenodes.h, rel.h, partcache.h are not required.

==== relcache.c:
> +#include "utils/partcache.h"

lsyscache.h is included by partcache.h.

==== rel.h:
> +#include "utils/partcache.h"

partcache.h includes fmgr.h and relcache.h so the last two are
no longer useful.


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: reorganizing partitioning code

From
Amit Langote
Date:
Horiguchi-san,

On 2018/02/16 14:07, Kyotaro HORIGUCHI wrote:
> Hello. I'd like to make a humble comment.

Thanks a lot for taking a look.

> The reorganization adds/removes header files to/from .[ch]
> files. That can easily leave useless includes and they're hardly
> noticed. Following files touched in this patch have such useless
> includes after this patch.

Yes, I agree.

> On the other hand, naive decision of this kind of cleanup can
> lead to curruption. [1] So, I don't insisit that the all of the
> following *should* amended, especially for rel.h.
> 
> [1] https://www.postgresql.org/message-id/6748.1518711125@sss.pgh.pa.us

I was initially trying limit this patch's #include churn so that it only
touches .[ch] files that are somehow involved with partitioning, but ended
up with a lot of files touched.  To prevent confusion and concerns, I have
updated the patch to keep the churn to minimum as best as I could.

So, while the v3 patch looked like this:

24 files changed, 2373 insertions(+), 2312 deletions(-)
 create mode 100644 src/backend/utils/cache/partcache.c
 create mode 100644 src/include/utils/partcache.h

v4 looks like this:

15 files changed, 2314 insertions(+), 2263 deletions(-)
 create mode 100644 src/backend/utils/cache/partcache.c
 create mode 100644 src/include/utils/partcache.h

> ==== nodeModifyTable.c:
>> +#include "rewrite/rewriteManip.h"
> 
> rewriteManip.h is changed to include rel.h so rel.h is no longer
> need to be included in the file. (It is also included in lmgr.h
> so it was needless since before this patch, though.)

On second thought, I decided to hold back on moving
map_partition_varattnos() to rewriteManip.c, because its interface is
unlike any other functions in that file, requiring us to include rel.h in
rewriteManip.h.  So, I removed this #include from nodeModifyTable.c.

> ==== hba.c:
>> +#include "catalog/objectaddress.h"
> 
> objectaddress.h includes acl.h so acl.h is no longer useful.

This and a few others were necessitated by removing partition.h from
executor.h.  For now, I'm putting partition.h back into executor.h,
although it would be nice to remove it eventually.

> ==== joinrels.c:
>> +#include "utils/partcache.h"
> 
> partcache.h includes lsyscache.h.

Moved lsyscache.h from partcache.h to partcache.c.

> ==== prepunion.c:
>> +#include "utils/partcache.h"
> 
> partcache.h includes lsyscache.h and partcache.h is included in
> rel.h. So partcache.h and lsyscache.h are not required.

Oops, why did I even include partcache.h in prepunion.c!

Removed.

> ==== utility.c:
>> +#include "utils/rel.h"
> 
> rel.h includes xlog.h (and xlog.h includes fd.h). The last two
> are removable.

I've reverted the change that necessitated this and so this one.

> ==== partcache.c:
> parsenodes.h is included from some other files.
> rel.h is included from rewriteManip.h.
> partcache.h is included from rel.h
> As the result, parsenodes.h, rel.h, partcache.h are not required.

Removed.

> ==== relcache.c:
>> +#include "utils/partcache.h"
> 
> lsyscache.h is included by partcache.h.

lsyscache.h was moved from partcache.h per above, so keeping here.

> ==== rel.h:
>> +#include "utils/partcache.h"
> 
> partcache.h includes fmgr.h and relcache.h so the last two are
> no longer useful.

Removed both.

Attached updated version.

Thanks,
Amit

Attachment

Re: Re: reorganizing partitioning code

From
David Steele
Date:
Hi Amit,

On 2/16/18 3:36 AM, Amit Langote wrote:
> 
> Attached updated version.

This patch no longer applies and the conflicts do not appear to be trivial.

I'm a bit confused about your comment in [1]:

> I gave up on rebasing this patch yesterday as I couldn't finish it in
> 5 minutes, but maybe I will try later this month.  Gotta focus on
> thefaster pruning stuff for now...

How much later are we talking about?

Marked Waiting on Author.

-- 
-David
david@pgmasters.net

[1]
https://www.postgresql.org/message-id/33098109-9ef1-9594-e7d5-0977a50f8cfa%40lab.ntt.co.jp


Re: Re: reorganizing partitioning code

From
Amit Langote
Date:
Hi David.

On Fri, Mar 2, 2018 at 11:53 PM, David Steele <david@pgmasters.net> wrote:
> Hi Amit,
>
> On 2/16/18 3:36 AM, Amit Langote wrote:
>>
>> Attached updated version.
>
> This patch no longer applies and the conflicts do not appear to be trivial.
>
> I'm a bit confused about your comment in [1]:
>
>> I gave up on rebasing this patch yesterday as I couldn't finish it in
>> 5 minutes, but maybe I will try later this month.  Gotta focus on
>> thefaster pruning stuff for now...
>
> How much later are we talking about?

Sorry about the confusing comment.  It could be sometime later half of
the next week.

Thanks,
Amit


Re: Re: Re: reorganizing partitioning code

From
David Steele
Date:
Hi Amit,

On 3/2/18 11:17 AM, Amit Langote wrote:
> 
> On Fri, Mar 2, 2018 at 11:53 PM, David Steele <david@pgmasters.net> wrote:
>> Hi Amit,
>>
>> On 2/16/18 3:36 AM, Amit Langote wrote:
>>>
>>> Attached updated version.
>>
>> This patch no longer applies and the conflicts do not appear to be trivial.
>>
>> I'm a bit confused about your comment in [1]:
>>
>>> I gave up on rebasing this patch yesterday as I couldn't finish it in
>>> 5 minutes, but maybe I will try later this month.  Gotta focus on
>>> thefaster pruning stuff for now...
>>
>> How much later are we talking about?
> 
> Sorry about the confusing comment.  It could be sometime later half of
> the next week.

We are now three weeks into the CF with no new patch.

Are you planning to update this patch?  If not, I think it should be
marked as Returned with Feedback and submitted to the next CF once it
has been updated.

Regards,
-- 
-David
david@pgmasters.net


Re: Re: Re: reorganizing partitioning code

From
Alvaro Herrera
Date:
David Steele wrote:

> > Sorry about the confusing comment.  It could be sometime later half of
> > the next week.
> 
> We are now three weeks into the CF with no new patch.
> 
> Are you planning to update this patch?  If not, I think it should be
> marked as Returned with Feedback and submitted to the next CF once it
> has been updated.

This is no new development, only code movement.  I think it would be
worse to have three different branches of partitioning code, v10
"basic", v11 "powerful but not reorganized", v12 "reorganized".  I'd
rather have only v10 "basic" and v11+ "powerful".

Let's keep this entry open till the last minute.  

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: reorganizing partitioning code

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> David Steele wrote:
>> Are you planning to update this patch?  If not, I think it should be
>> marked as Returned with Feedback and submitted to the next CF once it
>> has been updated.

> This is no new development, only code movement.  I think it would be
> worse to have three different branches of partitioning code, v10
> "basic", v11 "powerful but not reorganized", v12 "reorganized".  I'd
> rather have only v10 "basic" and v11+ "powerful".

> Let's keep this entry open till the last minute.  

Nonetheless, it's March 21.  David's point is that it's time to get a
move on.

            regards, tom lane


Re: reorganizing partitioning code

From
Amit Langote
Date:
On 2018/03/22 2:33, Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>> David Steele wrote:
>>> Are you planning to update this patch?  If not, I think it should be
>>> marked as Returned with Feedback and submitted to the next CF once it
>>> has been updated.
> 
>> This is no new development, only code movement.  I think it would be
>> worse to have three different branches of partitioning code, v10
>> "basic", v11 "powerful but not reorganized", v12 "reorganized".  I'd
>> rather have only v10 "basic" and v11+ "powerful".
> 
>> Let's keep this entry open till the last minute.  
> 
> Nonetheless, it's March 21.  David's point is that it's time to get a
> move on.

I'm sorry it took me a while to reply on this thread.

Due to quite a few changes to the partitioning-related code recently and
also considering some pending patches which might touch the code moved
around by this patch, I'd been putting off rebasing this patch.  Although,
I should have said that before without waiting until today to do so.  Sorry.

FWIW, I did manage to rebase it this morning and posting it here.

Thanks,
Amit

Attachment

Re: Re: Re: reorganizing partitioning code

From
Robert Haas
Date:
On Wed, Mar 21, 2018 at 10:20 AM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:
> Let's keep this entry open till the last minute.

Ugh, I don't like keeping things open till the last minute all that
much, especially if they're not being updated.  But since this has
been updated I guess it's somewhat moot.

Are you going to pick this up RSN?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Re: Re: reorganizing partitioning code

From
Alvaro Herrera
Date:
Robert Haas wrote:
> On Wed, Mar 21, 2018 at 10:20 AM, Alvaro Herrera
> <alvherre@alvh.no-ip.org> wrote:
> > Let's keep this entry open till the last minute.
> 
> Ugh, I don't like keeping things open till the last minute all that
> much, especially if they're not being updated.  But since this has
> been updated I guess it's somewhat moot.
> 
> Are you going to pick this up RSN?

If during next week qualifies as RSN, then yes.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: reorganizing partitioning code

From
Amit Langote
Date:
On 2018/03/22 11:45, Amit Langote wrote:
> FWIW, I did manage to rebase it this morning and posting it here.

Rebased again.

I started wondering if we should separate out stuff related to partition
bounds.  That is create a utils/partbound.h and put PartitionBoundInfo and
related comparison and search functions into a utils/adt/partbound.c.  I
had started thinking about that when I looked at the code added by the
patch submitted on the "advanced partition matching algorithm for
partition-wise join" thread [1].  I haven't done anything about that though.

Thanks,
Amit

[1] https://commitfest.postgresql.org/17/1553/

Attachment

Re: reorganizing partitioning code

From
Robert Haas
Date:
On Wed, Mar 28, 2018 at 12:07 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2018/03/22 11:45, Amit Langote wrote:
>> FWIW, I did manage to rebase it this morning and posting it here.
>
> Rebased again.
>
> I started wondering if we should separate out stuff related to partition
> bounds.  That is create a utils/partbound.h and put PartitionBoundInfo and
> related comparison and search functions into a utils/adt/partbound.c.  I
> had started thinking about that when I looked at the code added by the
> patch submitted on the "advanced partition matching algorithm for
> partition-wise join" thread [1].  I haven't done anything about that though.

adt = Abstract Data Type, which I think we've interpreted up until now
to mean an SQL-visible data type, so that seems like an odd choice.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company