Re: [HACKERS] Adding support for Default partition in partitioning - Mailing list pgsql-hackers

From Jeevan Ladhe
Subject Re: [HACKERS] Adding support for Default partition in partitioning
Date
Msg-id CAOgcT0N0UnhvsrTmjENHF2hVTtSgY-NYNHxDrimxJuRZF6rx4A@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Adding support for Default partition in partitioning  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Adding support for Default partition in partitioning
List pgsql-hackers
Hi Robert,

I have tried to address your comments in the V22 set of patches[1].
Please find my feedback inlined on your comments.


On Thu, Jun 15, 2017 at 1:21 AM, Robert Haas <robertmhaas@gmail.com> wrote:
- Needs to be rebased over b08df9cab777427fdafe633ca7b8abf29817aa55.

Rebased on master latest commit: ca793c59a51e94cedf8cbea5c29668bf8fa298f3
 
- Still no documentation.
Yes, this is long pending, and I will make this is a priority to get it included
in next set of my patches.

- Should probably be merged with the patch to add default partitioning
for ranges.

Beena is already rebasing her patch on my latest patches, so I think getting
them merged here won't be an issue, mostly will be just like one more patch
on top my patches.
  
Other stuff I noticed:

- The regression tests don't seem to check that the scan-skipping
logic works as expected.  We have regression tests for that case for
attaching regular partitions, and it seems like it would be worth
testing the default-partition case as well.

Added a test case for default in alter_table.sql.

- check_default_allows_bound() assumes that if
canSkipPartConstraintValidation() fails for the default partition, it
will also fail for every subpartition of the default partition.  That
is, once we commit to scanning one child partition, we're committed to
scanning them all.  In practice, that's probably not a huge
limitation, but if it's not too much code, we could keep the top-level
check but also check each partitioning individually as we reach it,
and skip the scan for any individual partitions for which the
constraint can be proven.  For example, suppose the top-level table is
list-partitioned with a partition for each of the most common values,
and then we range-partition the default partition.
 
I have tried to address this in patch 0007, please let me know your views on
that patch.

- The changes to the regression test results in 0004 make the error
messages slightly worse.  The old message names the default partition,
whereas the new one does not.  Maybe that's worth avoiding.
 
The only way for this, I can think of to achieve this is to store the name of
the default relation in AlteredTableInfo, currently I am using a flag for
realizing if the scanned table is a default partition to throw specific error.
But, IMO storing a string in AlteredTableInfo just for error purpose might be
overkill. Your suggestions? 
 
Specific comments:

+ * Also, invalidate the parent's and a sibling default partition's relcache,
+ * so that the next rebuild will load the new partition's info into parent's
+ * partition descriptor and default partition constraints(which are dependent
+ * on other partition bounds) are built anew.

I find this a bit unclear, and it also repeats the comment further
down.  Maybe something like: Also, invalidate the parent's relcache
entry, so that the next rebuild will load he new partition's info into
its partition descriptor.  If there is a default partition, we must
invalidate its relcache entry as well.

Done.
 
+    /*
+     * The default partition constraints depend upon the partition bounds of
+     * other partitions. Adding a new(or even removing existing) partition
+     * would invalidate the default partition constraints. Invalidate the
+     * default partition's relcache so that the constraints are built anew and
+     * any plans dependent on those constraints are invalidated as well.
+     */

Here, I'd write: The partition constraint for the default partition
depends on the partition bounds of every other partition, so we must
invalidate the relcache entry for that partition every time a
partition is added or removed.

Done.
 
+                    /*
+                     * Default partition cannot be added if there already
+                     * exists one.
+                     */
+                    if (spec->is_default)
+                    {
+                        overlap = partition_bound_has_default(boundinfo);
+                        with = boundinfo->default_index;
+                        break;
+                    }

To support default partitioning for range, this is going to have to be
moved above the switch rather than done inside of it.  And there's
really no downside to putting it there.

Done.
 
+ * constraint, by *proving* that the existing constraints of the table
+ * *imply* the given constraints.  We include the table's check constraints and

Both the comma and the asterisks are unnecessary.

Done.
 
+ * Check whether all rows in the given table (scanRel) obey given partition

obey the given

I think the larger comment block could be tightened up a bit, like
this:  Check whether all rows in the given table obey the given
partition constraint; if so, it can be attached as a partition.  We do
this by scanning the table (or all of its leaf partitions) row by row,
except when the existing constraints are sufficient to prove that the
new partitioning constraint must already hold.

Done.
 
+    /* Check if we can do away with having to scan the table being attached. */

If possible, skip the validation scan.

Fixed.
 
+     * Set up to have the table be scanned to validate the partition
+     * constraint If it's a partitioned table, we instead schedule its leaf
+     * partitions to be scanned.

I suggest: Prepare to scan the default partition (or, if it is itself
partitioned, all of its leaf partitions).

Done.
 
+    int         default_index;  /* Index of the default partition if any; -1
+                                 * if there isn't one */

"if any" is a bit redundant with "if there isn't one"; note the
phrasing of the preceding entry.

Done.
 
+        /*
+         * Skip if it's a partitioned table. Only RELKIND_RELATION relations
+         * (ie, leaf partitions) need to be scanned.
+         */
+        if (part_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE ||
+            part_rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)

The comment talks about what must be included in our list of things to
scan, but the code tests for the things that can be excluded.  I
suspect the comment has the right idea and the code should be adjusted
to match, but anyway they should be consistent.  Also, the correct way
to punctuate i.e. is like this: (i.e. leaf partitions) You should have
a period after each letter, but no following comma.

Done.
 
+     * The default partition must be already having an AccessExclusiveLock.

I think we should instead change DefineRelation to open (rather than
just lock) the default partition and pass the Relation as an argument
here so that we need not reopen it.

I have fixed this as a part of patch 0006.
 
+            /* Construct const from datum */
+            val = makeConst(key->parttypid[0],
+                            key->parttypmod[0],
+                            key->parttypcoll[0],
+                            key->parttyplen[0],
+                            *boundinfo->datums[i],
+                            false,      /* isnull */
+                            key->parttypbyval[0] /* byval */ );

The /* byval */ comment looks a bit redundant, but I think this could
use a comment along the lines of: /* Only single-column list
partitioning is supported, so we only need to worry about the
partition key with index 0. */  And I'd also add an Assert() verifying
the the partition key has exactly 1 column, so that this breaks a bit
more obviously if someone removes that restriction in the future.
 
Removed the /* byval */ comment.
The assert is taken care as part of commit 5efccc1cb43005a832776ed9158d2704fd976f8f. 


+         * Handle NULL partition key here if there's a null-accepting list
+         * partition, else later it will be routed to the default partition if
+         * one exists.

This isn't a great update of the existing comment -- it's drifted from
explaining the code to which it is immediately attached to a more
general discussion of NULL handling.  I'd just say something like: If
this is a NULL, send it to the null-accepting partition.  Otherwise,
route by searching the array of partition bounds.

Done.
 
+                if (tab->is_default_partition)
+                    ereport(ERROR,
+                            (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+                             errmsg("updated partition constraint for
default partition would be violated by some row")));
+                else
+                    ereport(ERROR,
+                            (errcode(ERRCODE_CHECK_VIOLATION),

While there's room for debate about the correct error code here, it's
hard for me to believe that it's correct to use one error code for the
is_default_partition case and a different error code the rest of the
time.

Per discussion here[2], I had changed this error code, but as of now I have
restored this to ERRCODE_CHECK_VIOLATION to be consistent with the error when
non-default partition being attached has some existing row that violates
partition constraints. Similarly, for consistency I have changed this in
check_default_allows_bound() too.
I agree that there is still a room for debate here after this change too, and
also this change reverts the suggestion by Ashutosh.

+         * previously cached default partition constraints; those constraints
+         * won't stand correct after addition(or even removal) of a partition.

won't be correct after addition or removal

Done.
 

+         * allow any row that qualifies for this new partition. So, check if
+         * the existing data in the default partition satisfies this *would be*
+         * default partition constraint.

check that the existing data in the default partition satisfies the
constraint as it will exist after adding this partition

Done.
 

+     * Need to take a lock on the default partition, refer comment for locking
+     * the default partition in DefineRelation().

I'd say: We must also lock the default partition, for the same reasons
explained in DefineRelation().

And similarly in the other places that refer to that same comment.

Done.
 

+    /*
+     * In case of the default partition, the constraint is of the form
+     * "!(result)" i.e. one of the following two forms:
+     * 1. NOT ((keycol IS NULL) OR (keycol = ANY (arr)))
+     * 2. NOT ((keycol IS NOT NULL) AND (keycol = ANY (arr))), where arr is an
+     * array of datums in boundinfo->datums.
+     */

Does this survive pgindent?  You might need to surround the comment
with dashes to preserve formatting.

Yes, this din't survive pg_indent, but even adding dashes '--' did not make
the deal(may be I misunderstood the workaround), I have instead added
blank line in the bullets.

I think it would be worth adding a little more text this comment,
something like this: Note that, in general, applying NOT to a
constraint expression doesn't necessarily invert the set of rows it
accepts, because NOT NULL is NULL.  However, the partition constraints
we construct here never evaluate to NULL, so applying NOT works as
intended.

Added.
 
+     * Check whether default partition has a row that would fit the partition
+     * being attached by negating the partition constraint derived from the
+     * bounds. Since default partition is already part of the partitioned
+     * table, we don't need to validate the constraints on the partitioned
+     * table.

Here again, I'd add to the end of the first sentence a parenthetical
note, like this: ...from the bounds (the partition constraint never
evaluates to NULL, so negating it like this is safe).

Done.
 
I don't understand the second sentence.  It seems to contradict the first one.

Fixed, I removed the second sentence.
 
+extern List *get_default_part_validation_constraint(List *new_part_constaints);
 #endif   /* PARTITION_H */

There should be a blank line after the last prototype and before the #endif.

+-- default partition table when it is being used in cahced plan.

Typo.
 
Fixed.

Thanks,
Jeevan Ladhe
 

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: [HACKERS] [WIP] Zipfian distribution in pgbench
Next
From: Alexander Korotkov
Date:
Subject: Re: [HACKERS] Double shared memory allocation for SLRU LWLocks