Re: Declarative partitioning - another take - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: Declarative partitioning - another take |
Date | |
Msg-id | 1b9d028b-26fe-a91a-7d6d-234f4665dee9@lab.ntt.co.jp Whole thread Raw |
In response to | Re: Declarative partitioning - another take (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Declarative partitioning - another take
Re: Declarative partitioning - another take |
List | pgsql-hackers |
On 2016/11/02 2:34, Robert Haas wrote: > On Fri, Oct 28, 2016 at 3:53 AM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> [ new patches ] > > Reviewing 0006: Thanks for the review! > This patch seems scary. I sort of assumed from the title -- "Teach a > few places to use partition check quals." -- that this was an optional > thing, some kind of optimization from which we could reap further > advantage once the basic infrastructure was in place. But it's not > that at all. It's absolutely necessary that we do this, or data > integrity is fundamentally compromised. How do we know that we've > found all of the places that need to be taught about these new, > uncatalogued constraints? Making this a separate commit from 0003 was essentially to avoid this getting lost among all of its other changes. In fact, it was to bring to notice for closer scrutiny whether all the sites in the backend code that are critical for data integrity in face of the implicit partition constraints are being informed about those constraints. > I'm feeling fairly strongly like you should rewind and make the > partitioning constraints normal catalogued constraints. That's got a > number of advantages, most notably that we can be sure they will be > properly enforced by the entire system (modulo existing bugs, of > course). Also, they'll show up automatically in tools like psql's \d > output, pgAdmin, and anything else that is accustomed to being able to > find constraints in the catalog. We do need to make sure that those > constraints can't be dropped (or altered?) inappropriately, but that's > a relatively small problem. If we stick with the design you've got > here, every client tool in the world needs to be updated, and I'm not > seeing nearly enough advantage in this system to justify that kind of > upheaval. As for which parts of the system need to know about these implicit partition constraints to *enforce* them for data integrity, we could say that it's really just one site - ExecConstraints() called from ExecInsert()/ExecUpdate(). Admittedly, the current error message style as in this patch exposes the implicit constraint approach to a certain criticism: "ERROR: new row violates the partition boundary specification of \"%s\"". It would say the following if it were a named constraint: "ERROR: new row for relation \"%s\" violates check constraint \"%s\"" For constraint exclusion optimization, we teach get_relation_constraints() to look at these constraints. Although greatly useful, it's not the case of being absolutely critical. Beside the above two cases, there is bunch of code (relcache, DDL) that looks at regular constraints, but IMHO, we need not let any of that code concern itself with the implicit partition constraints. Especially, I wonder why the client tools should want the implicit partitioning constraint to be shown as a CHECK constraint? As the proposed patch 0004 (psql) currently does, isn't it better to instead show the partition bounds as follows? +CREATE TABLE part_b PARTITION OF parted ( + b WITH OPTIONS NOT NULL DEFAULT 1 CHECK (b >= 0), + CONSTRAINT check_a CHECK (length(a) > 0) +) FOR VALUES IN ('b'); +\d part_b + Table "public.part_b" + Column | Type | Modifiers +--------+---------+-------------------- + a | text | + b | integer | not null default 1 +Partition of: parted FOR VALUES IN ('b') +Check constraints: + "check_a" CHECK (length(a) > 0) + "part_b_b_check" CHECK (b >= 0) Needless to say, that could save a lot of trouble thinking about generating collision-free names of these constraints, their dependency handling, unintended altering of these constraints, pg_dump, etc. > In fact, as far as I can see, the only advantage of this approach is > that when the insert arrives through the parent and is routed to the > child by whatever tuple-routing code we end up with (I guess that's > what 0008 does), we get to skip checking the constraint, saving CPU > cycles. That's probably an important optimization, but I don't think > that putting the partitioning constraint in the catalog in any way > rules out the possibility of performing that optimization. It's just > that instead of having the partitioning excluded-by-default and then > sometimes choosing to include it, you'll have it included-by-default > and then sometimes choose to exclude it. Hmm, doesn't it seem like we would be making *more* modifications to the existing code (backend or otherwise) to teach it about excluding the implicit partition constraints than the other way around? The other way around being to modify the existing code to include the implicit constraints which is what this patch is about. Having said all that, I am open to switching to the catalogued partition constraints if the arguments I make above in favor of this patch are not all that sound. Thanks, Amit
pgsql-hackers by date: