Re: pg_dump versus hash partitioning - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: pg_dump versus hash partitioning
Date	February 1, 2023 23:38:44
Msg-id	CA+TgmoYfPuZKqvLtV_DP2fp+JotOPVUsG=ytEvhg9mq4iayF_w@mail.gmail.com
In response to	Re: pg_dump versus hash partitioning (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: pg_dump versus hash partitioning Re: pg_dump versus hash partitioning
List	pgsql-hackers

On Wed, Feb 1, 2023 at 1:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Well, that was what I thought too to start with, but I now think that
> it is far too narrow-minded a view of the problem.  The real issue
> is something I said that you trimmed:
>
> >> In general, we've never thought that hash values are
> >> required to be consistent across platforms.
>
> hashenum() is doing something that is perfectly fine in the context
> of hash indexes, which have traditionally been our standard of how
> reproducible hash values need to be.  Hash partitioning has moved
> those goalposts, and if there was any discussion of the consequences
> of that, I didn't see it.

Right, I don't think it was ever discussed, and I think that was a
mistake on my part.

> In the meantime, I think we need to recognize that hash values are
> not very portable.  I do not think we do our users a service by
> letting them discover the corner cases the hard way.

I think you're not really engaging with the argument that "not
completely portable" and "totally broken" are two different things,
and I still think that's an important point here. One idea that I had
is to add a flag somewhere to indicate whether a particular opclass or
opfamily is suitable for hash partitioning, or perhaps better, an
alternative to opcdefault that sets the default for partitioning,
which could be different from the default for indexing. Then we could
either prohibit this case, or make it work. Of course we would have to
define what "suitable for hash partitioning" means, but "would be
likely to survive a dump and reload on the same machine without any
software changes" is probably a reasonable minimum standard.
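To make the "survive a dump and reload" standard concrete, here is a minimal sketch in Python. It is a toy model, not PostgreSQL code. It assumes a hash that depends on the catalog OID assigned to an enum label (which is how I understand the hashenum() problem in this thread) and contrasts it with a hash of the label text. The OIDs, the modulus, and the use of CRC32 as the hash are all made up for illustration.

```python
import zlib

MODULUS = 4  # number of hash partitions in this toy example


def hash_by_oid(oid: int) -> int:
    """Hash the OID assigned to an enum label (CRC32 is only a stand-in)."""
    return zlib.crc32(oid.to_bytes(4, "little"))


def hash_by_label(label: str) -> int:
    """Hash the label text itself, which a dump reproduces verbatim."""
    return zlib.crc32(label.encode("utf-8"))


def route(labels_to_oids: dict[str, int], by_label: bool) -> dict[str, int]:
    """Map each enum label to a partition number (hash modulo MODULUS)."""
    return {
        label: (hash_by_label(label) if by_label else hash_by_oid(oid)) % MODULUS
        for label, oid in labels_to_oids.items()
    }


# Hypothetical OIDs: the same enum type as created before and after a reload.
before_dump = {"red": 16385, "green": 16387, "blue": 16389}
after_reload = {"red": 16402, "green": 16404, "blue": 16406}

# Label-based routing cannot change, because the labels are identical.
label_routes_stable = route(before_dump, True) == route(after_reload, True)

# OID-based routing may change for any label whose OID changed.
moved_by_oid = [
    label
    for label in before_dump
    if route(before_dump, False)[label] != route(after_reload, False)[label]
]

if __name__ == "__main__":
    print("label-based routing stable:", label_routes_stable)
    print("labels whose OID-based partition changed:", moved_by_oid)
```

Any label that lands in `moved_by_oid` stands for rows that would be rejected if restored directly into the partition they were dumped from. A hash that depends only on what the dump reproduces would meet the minimum standard suggested above.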

I don't think the fact that our *traditional* standard for how stable
a hash function needs to be has been XYZ carries any water. Needs
change over time, and we adapt the code to meet the new needs. Since
we have no system for type properties in PostgreSQL -- a design
decision I find questionable -- we tie all such properties to operator
classes. That's why, for example, we have HASHEXTENDED_PROC, even
though hash indexes don't need 64-bit hash values or a seed. We added
that for hash partitioning, and it's now used in other places too,
because 32 bits aren't enough for everything just because they're
enough for hash indexes, and seeds are handy. That's also why we have
BTINRANGE_PROC, which doesn't exist to support btree indexes, but
rather window frames. The fact that a certain feature requires us to
graft some additional stuff into the operator class/family mechanism,
or that it doesn't quite work with everything that's already part of
that mechanism, isn't an argument against the feature. That's just how
we do things around here. Indeed, if somebody, instead of implementing
hash partitioning by tying it into hash opfamilies, were to make up
some completely new hashing infrastructure that had exactly the
properties they wanted for partitioning, that would be *totally*
unacceptable and surely a reason for rejecting such a feature
outright. The fact that it tries to make use of the existing
infrastructure is a good thing about that feature, not a bad thing,
even though it is turning out that there are some problems.
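As a rough illustration of what a 64-bit, seeded hash adds over a plain 32-bit one, here is a small Python sketch. It is not how PostgreSQL implements HASHEXTENDED_PROC. BLAKE2b from the standard library is used purely as a stand-in hash, and the function names `hash32` and `hash64_seeded` are hypothetical.

```python
import hashlib


def hash32(data: bytes) -> int:
    """Unseeded 32-bit hash, roughly what a hash index needs."""
    digest = hashlib.blake2b(data, digest_size=4).digest()
    return int.from_bytes(digest, "little")


def hash64_seeded(data: bytes, seed: int) -> int:
    """Seeded 64-bit hash: a wider output, and a seed that selects
    an independent hash function over the same input."""
    key = seed.to_bytes(8, "little")  # seed must fit in 64 unsigned bits
    digest = hashlib.blake2b(data, digest_size=8, key=key).digest()
    return int.from_bytes(digest, "little")


value = b"customer-42"
h_narrow = hash32(value)
h_seed0 = hash64_seeded(value, 0)
h_seed1 = hash64_seeded(value, 1)

if __name__ == "__main__":
    print(f"32-bit:         {h_narrow:#010x}")
    print(f"64-bit, seed 0: {h_seed0:#018x}")
    print(f"64-bit, seed 1: {h_seed1:#018x}")
```

Changing the seed gives an unrelated hash value for the same input. That is handy whenever two consumers of the same data need hashes that are not correlated with each other.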

On the question of whether hash partitioning is a good feature in
general, I can only say that I disagree with what seems to be your
position, which as best as I can tell is "it sucks and we should kill
it with fire". I do think that partitioning in general leaves a lot to
be desired in PostgreSQL, and perhaps the issues are even
worse for hash partitioning than they are elsewhere. However, I think
that the solution to that is for people to keep putting more work into
making it better, not to give up and say "ah, partitioning (or hash
partitioning specifically) is a stupid feature that nobody wants". To
think that, you have to be living in a bubble. It's unfortunate that
with all the work that so many people have put into this area we don't
have better results to show for it, but AFAICS there's no help for
that but to keep hacking. Amit Langote's work on locking before
pruning, for example, will be hugely impactful for some kinds of
queries if it gets committed, and it's been a long time coming, partly
because so many other problems needed to be sorted out first. But you
can't run the simplest workload with any kind of partitioning, range,
hash, whatever, and not run into that problem immediately.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
