Re: [HACKERS] [POC] hash partitioning - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] [POC] hash partitioning
Date
Msg-id CA+Tgmob7RsN5A=ehgYbLPx--c5CmptrK-dB=Y-v--o+TKyfteA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [POC] hash partitioning  (Yugo Nagata <nagata@sraoss.co.jp>)
Responses Re: [HACKERS] [POC] hash partitioning  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Apr 17, 2017 at 10:50 AM, Yugo Nagata <nagata@sraoss.co.jp> wrote:
> I thought that the partition constraint could be decided every
> time a new partition is created or attached, and that it woule be
> needed to relocate records automatically when the partition configuration
> changes. However, I have come to think that the automatic relocation
> might not be needed at this point.

Great!  I am glad that we are in agreement about this point.  However,
actually I think the problem is worse than you are supposing.  If
you're restoring from a database dump created by pg_dump, then we will
try to load data into each individual partition using COPY.  Direct
insertions into individual partitions are not subject to tuple routing
-- that only affects inserts into the parent table.  So if the
partition constraint is not correct immediately after creating the
table, the COPY which tries to repopulate that partition will probably
fail with an ERROR, because there will likely be at least one row
(probably many) which match the "final" partition constraint but not
the "interim" partition constraint that we'd have after recreating
some but not all of the hash partitions.  For example, if we had
created 2 partitions so far out of a total of 3, we'd think the
constraint ought to be (hashvalue % 2) == 1 rather than (hashvalue %
3) == 1, which obviously will likely lead to the dump failing to
restore properly.

So, I think we really need something like the syntax in Amul's patch
in order for this to work at all.  Of course, the details can be
changed according to what seems best but I think the overall picture
is about right.

There is another point that I think also needs thought; not sure if
either your patch or Amit's patch handles it: constraint exclusion
will not work for hash partitioning.  For example, if the partitioning
constraint for each partition is of the form (hash(partcol) % 6) ==
SOME_VALUE_BETWEEN_0_AND_5, and the query contains the predicate
partcol == 37, constraint exclusion will not be able to prove anything
about which partitions need to be scanned.  Amit Langote has noted a
few times that partitioning relies on constraint exclusion *for now*,
which implies, I think, that he's thought about changing it to work
differently.  I think that would be a good idea.  For range
partitioning or list partitioning, a special-purpose mechanism for
partitioning could be much faster than constraint exclusion, since it
knows that partcol == 37 can only be true for one partition and can
reuse the tuple-routing infrastructure to figure out which one it is.
And that approach can also work for hash partitioning, where
constraint exclusion is useless.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] OK, so culicidae is *still* broken