Re: [HACKERS] Hash Functions - Mailing list pgsql-hackers

From David Fetter
Subject Re: [HACKERS] Hash Functions
Date
Msg-id 20170516165943.GB11076@fetter.org
Whole thread Raw
In response to Re: [HACKERS] Hash Functions  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Tue, May 16, 2017 at 08:10:39AM -0700, Jeff Davis wrote:
> On Mon, May 15, 2017 at 1:04 PM, David Fetter <david@fetter.org> wrote:
> > As the discussion has devolved here, it appears that there are, at
> > least conceptually, two fundamentally different classes of partition:
> > public, which is to say meaningful to DB clients, and "private", used
> > for optimizations, but otherwise opaque to DB clients.
> >
> > Mashing those two cases together appears to cause more problems than
> > it solves.
> 
> I concur at this point. I originally thought hash functions might be
> made portable, but I think Tom and Andres showed that to be too
> problematic -- the issue with different encodings is the real killer.
> 
> But I also believe hash partitioning is important and we shouldn't
> give up on it yet.
> 
> That means we need to have a concept of hash partitions that's
> different from range/list partitioning. The terminology
> "public"/"private" does not seem appropriate. Logical/physical or
> external/internal might be better.

I'm not attached to any particular terminology.

> With hash partitioning:
> * User only specifies number of partitions of the parent table; does
> not specify individual partition properties (modulus, etc.)

Maybe this is over-thinking it, but I'm picturing us ending up with
something along the lines of:
   PARTITION BY INTERNAL(EXPRESSION)   [ WITH ( [PARAMETERS] [, AS, APPROPRIATE] ) ]

i.e. it's not clear that we should wire in "number of partitions" as a
parameter.

In a not that distant future, ANALYZE and similar could have a say in
determining both the "how" and the "whether" of partitioning.

> * Dump/reload goes through the parent table (though we may provide
> options so pg_dump/restore can optimize this)

Would it be simplest to default to routing through the immediate
ancestor for now?

It occurs to me that with the opaque partition system we're designing
here, internal partitions would necessarily be leaves in the tree.

> * We could provide syntax to adjust the number of partitions, which
> would be expensive but still useful sometimes.

Yep.  I suspect that techniques for this are described in literature,
and possibly even in code bases.  Any pointers?

> * All DDL should be on the parent table, including check constraints,
> FKs, unique constraints, exclusion constraints, indexes, etc.

Necessarily.

>   - Unique and exclusion constraints would only be permitted if the
> keys are a superset of the partition keys.

"Includes either all of the partition expression or none of it,"
maybe?

>   - FKs would only be permitted if the two table's partition schemes
> match and the keys are members of the same hash opfamily (this could
> be relaxed slightly, but it gets a little confusing if so)

Relaxing sounds like a not-in-the-first-cut feature, and subtle.

> * No attach/detach of partitions

Since they're opaque, this is the only sane thing.

> * All partitions have the same permissions

Since they're opaque, this is the only sane thing.

> * Individual partitions would only be individually-addressable for
> maintenance (like reindex and vacuum), but not for arbitrary queries

Since they're opaque, this is the only sane thing.

>   - perhaps also COPY for bulk loading/dumping, in case we get clients
> smart enough to do their own hashing.

This is appealing from a resource allocation point of view in the
sense of deciding where the hash computing resources are spent.  Do we
want something like the NOT VALID/VALIDATE infrastructure to support
it?

> The only real downside is that it could surprise users -- why can I
> add a CHECK constraint on my range-partitioned table but not the
> hash-partitioned one? We should try to document this so users don't
> find that out too far along. As long as they aren't surprised, I think
> users will understand why these aren't quite the same concepts.

+1

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw
Next
From: Robert Haas
Date:
Subject: Re: [HACKERS] [POC] hash partitioning