Re: [HACKERS] Hash Functions - Mailing list pgsql-hackers
From | Jeff Davis |
---|---|
Subject | Re: [HACKERS] Hash Functions |
Date | |
Msg-id | CAMp0ubcQ3VYdU1kNUCOmpj225U4hk6ZEoaUVeReP8h60p+mv1Q@mail.gmail.com |
In response to | Re: [HACKERS] Hash Functions (David Fetter <david@fetter.org>) |
Responses | Re: [HACKERS] Hash Functions (Robert Haas <robertmhaas@gmail.com>) |
 | Re: [HACKERS] Hash Functions (David Fetter <david@fetter.org>) |
 | Re: [HACKERS] Hash Functions (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>) |
 | Re: [HACKERS] Hash Functions (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>) |
List | pgsql-hackers |

On Mon, May 15, 2017 at 1:04 PM, David Fetter <david@fetter.org> wrote:
> As the discussion has devolved here, it appears that there are, at
> least conceptually, two fundamentally different classes of partition:
> public, which is to say meaningful to DB clients, and "private", used
> for optimizations, but otherwise opaque to DB clients.
>
> Mashing those two cases together appears to cause more problems than
> it solves.

I concur at this point. I originally thought hash functions might be
made portable, but I think Tom and Andres showed that to be too
problematic -- the issue with different encodings is the real killer.

But I also believe hash partitioning is important and we shouldn't
give up on it yet.

That means we need to have a concept of hash partitions that's
different from range/list partitioning. The terminology
"public"/"private" does not seem appropriate. Logical/physical or
external/internal might be better.

With hash partitioning:

* User only specifies number of partitions of the parent table; does
  not specify individual partition properties (modulus, etc.)
* Dump/reload goes through the parent table (though we may provide
  options so pg_dump/restore can optimize this)
* We could provide syntax to adjust the number of partitions, which
  would be expensive but still useful sometimes.
* All DDL should be on the parent table, including check constraints,
  FKs, unique constraints, exclusion constraints, indexes, etc.
  - Unique and exclusion constraints would only be permitted if the
    keys are a superset of the partition keys.
  - FKs would only be permitted if the two tables' partition schemes
    match and the keys are members of the same hash opfamily (this
    could be relaxed slightly, but it gets a little confusing if so)
* No attach/detach of partitions
* All partitions have the same permissions
* Individual partitions would only be individually-addressable for
  maintenance (like reindex and vacuum), but not for arbitrary queries
  - perhaps also COPY for bulk loading/dumping, in case we get clients
    smart enough to do their own hashing.

The only real downside is that it could surprise users -- why can I
add a CHECK constraint on my range-partitioned table but not the
hash-partitioned one? We should try to document this so users don't
find that out too far along. As long as they aren't surprised, I think
users will understand why these aren't quite the same concepts.

Regards,
Jeff Davis
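
To illustrate the encoding problem and the "dump/reload goes through the parent table" point above, here is a rough Python sketch. It is not PostgreSQL code: `hash_key`, `partition_for` and `reload_through_parent` are hypothetical names, and CRC-32 is only a stand-in for whatever hash function the server would use. The assumption is simply that a row's partition is its key's hash modulo the number of partitions, computed over the bytes as stored in the database encoding.

```python
import zlib


def hash_key(key: str, encoding: str) -> int:
    """Hash the bytes of `key` as stored under `encoding`.

    CRC-32 is a stand-in here, not the hash PostgreSQL uses.
    """
    return zlib.crc32(key.encode(encoding))


def partition_for(key: str, encoding: str, num_partitions: int) -> int:
    """Pick a partition as hash modulo the partition count (assumed scheme)."""
    return hash_key(key, encoding) % num_partitions


def reload_through_parent(rows, encoding, num_partitions):
    """Route every row again, as a reload through the parent table would."""
    partitions = [[] for _ in range(num_partitions)]
    for key in rows:
        partitions[partition_for(key, encoding, num_partitions)].append(key)
    return partitions


if __name__ == "__main__":
    rows = ["alice", "bob", "café", "naïve"]
    # Compare where each key lands when the same text is stored in two encodings.
    for key in rows:
        print(key,
              partition_for(key, "utf-8", 4),
              partition_for(key, "latin-1", 4))
    # Restoring into a database with another encoding (or partition count)
    # stays correct because each row is routed again via the parent.
    print(reload_through_parent(rows, "latin-1", 8))
```

Pure-ASCII keys encode to the same bytes under both encodings, so they hash identically. Keys such as "café" encode to different byte strings, so their hashes, and possibly their partitions, can differ. That is why restoring rows directly into their old partitions is unsafe, while routing them through the parent is not.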