Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
Date
Msg-id CA+Tgmoa9o2YmZwJgiW_a73mQFg36j+6ndaexOmfvEPHkFD5JWQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
List pgsql-hackers
So I am looking at this part of 0008:

+       /*
+        * Do not copy parent_rinfo and child_rinfos because 1. they create a
+        * circular dependency between child and parent RestrictInfo 2. dropping
+        * those links just means that we loose some memory
optimizations. 3. There
+        * is a possibility that the child and parent RestrictInfots
themselves may
+        * have got copied and thus the old links may no longer be valid. The
+        * caller may set up those links itself, if needed.
+        */

I don't think that it's very clear whether or not this is safe.  I
experimented with making _copyRestrictInfo PANIC, which,
interestingly, does not affect the core regression tests at all, but
does trip on this bit from the postgres_fdw tests:

-- subquery using stable function (can't be sent to remote)
PREPARE st2(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN
(SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c4) =
'1970-01-17'::date) ORDER BY c1;
EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st2(10, 20);

I'm not sure why this particular case is affected when so many others
are not, and the comment doesn't help me very much in figuring it out.

Why do we need this cache in the RestrictInfo, anyway?  Aside from the
comment above, I looked at the comment in the RestrictInfo struct, and
I looked at the comment in build_child_restrictinfo, and I looked at
the comment in build_child_clauses, and I looked at the place where
build_child_clauses is called in set_append_rel_size, and none of
those places explain why we need this cache.  I would assume we'd need
a separate translation of the RestrictInfo for every separate
child-join, so how does the cache help?

Maybe the answer is that build_child_clauses() is also called from
try_partition_wise_join() and add_paths_to_child_joinrel(), and those
three call sights all end up producing the same set of translated
RestrictInfos.  But if that's the case, somehow it seems like we ought
to be producing these in one place where we can get convenient access
to them from each child join, rather than having to search through
this cache to find it.  It's a pretty inefficient cache: it takes O(n)
time to search it, I think, where n is the number of partitions.  And
you do O(n) searches.  So it's an O(n^2) algorithm, which is a little
unfortunate.  Can't we affix the translated RestrictInfos someplace
where they can be found more efficiently?

Yet another thing that the comments don't explain is why the existing
adjust_appendrel_attrs call needs to be replaced with
build_child_clauses.

So I feel, overall, that the point of all of this is not explained well at all.

...Robert



pgsql-hackers by date:

Previous
From: Haribabu Kommi
Date:
Subject: Re: [HACKERS] pg_stat_wal_write statistics view
Next
From: David Rowley
Date:
Subject: Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock