Re: When to use PARTITION BY HASH? - Mailing list pgsql-general

From David G. Johnston
Subject Re: When to use PARTITION BY HASH?
Date
Msg-id CAKFQuwaDAP=sN=ds0HoJeds=T23D5=6Gq1TK8v+F7h6ctGvTvA@mail.gmail.com
In response to When to use PARTITION BY HASH?  (Oleksandr Shulgin <oleksandr.shulgin@zalando.de>)
List pgsql-general
On Tue, Jun 2, 2020 at 10:17 AM Oleksandr Shulgin <oleksandr.shulgin@zalando.de> wrote:
> That *might* turn out to be the case with a small number of distinct values in the partitioning column(s), but then why rely on hash assignment instead of using PARTITION BY LIST in the first place?


Why the cross-posting? (-performance is oriented toward problem solving, not theory, so -general is the one and only PostgreSQL list this should have been sent to)

Anyway, quoting the documentation you linked to:

"When choosing how to partition your table, it's also important to consider what changes may occur in the future. For example, if you choose to have one partition per customer and you currently have a small number of large customers, consider the implications if in several years you instead find yourself with a large number of small customers. In this case, it may be better to choose to partition by HASH and choose a reasonable number of partitions rather than trying to partition by LIST and hoping that the number of customers does not increase beyond what it is practical to partition the data by."
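
To put rough numbers on the scenario in that paragraph, here is a minimal Python sketch. It is only an illustration, not how PostgreSQL works internally: the customer names are made up, and SHA-256 stands in for PostgreSQL's own hash functions. It compares how many partitions a one-partition-per-customer LIST scheme would need against a HASH scheme with a fixed number of partitions.

```python
import hashlib
from collections import Counter


def hash_partition(customer_id: str, modulus: int) -> int:
    """Map a customer id to a partition number in [0, modulus).

    SHA-256 is an assumption made for this sketch; PostgreSQL uses its
    own per-type hash functions.
    """
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % modulus


def partitions_needed_by_list(customer_ids) -> int:
    """One LIST partition per distinct customer."""
    return len(set(customer_ids))


def customers_per_hash_partition(customer_ids, modulus: int):
    """Count how many customers land in each of the `modulus` partitions."""
    counts = Counter(hash_partition(c, modulus) for c in customer_ids)
    return [counts.get(r, 0) for r in range(modulus)]


# Hypothetical growth from a few large customers to many small ones.
customers = [f"customer-{i}" for i in range(10_000)]

print(partitions_needed_by_list(customers))  # 10000
print(customers_per_hash_partition(customers, 8))  # always 8 buckets
```

The LIST count grows with the number of customers, while the HASH partition count stays at whatever modulus was chosen up front.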

Hashing does indeed preclude some of the benefits of partitioning, and it introduces others.

I suspect that having a hash function map each input to an output and checking for equality on that output would be better than trying to "OR" a list of values together in order to route multiple inputs to the same partition.
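
One way to picture that difference is the Python sketch below. It is a rough model under stated assumptions: the function names and sample values are hypothetical, and SHA-256 is a stand-in for PostgreSQL's real hash functions. The LIST check is a chain of equality tests "OR"ed together, while the HASH check is a single equality test on the hashed value, in the spirit of FOR VALUES WITH (MODULUS m, REMAINDER r).

```python
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic hash for the sketch (an assumption, not PostgreSQL's)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def in_list_partition(value: str, listed_values) -> bool:
    """LIST-style check: value = a OR value = b OR ..."""
    return any(value == v for v in listed_values)


def in_hash_partition(value: str, modulus: int, remainder: int) -> bool:
    """HASH-style check: one equality test on the hashed value."""
    return stable_hash(value) % modulus == remainder


MODULUS = 4
list_partition_values = ["alice", "bob", "carol"]  # hypothetical list
new_value = "dave"  # a value nobody planned for

# The LIST partition rejects the new value until its list is edited.
print(in_list_partition(new_value, list_partition_values))  # False

# Exactly one hash partition accepts it, with no definition changes.
matches = [r for r in range(MODULUS) if in_hash_partition(new_value, MODULUS, r)]
print(len(matches))  # 1
```

The cost of the hash approach is that there is no control over which values end up sharing a partition.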

David J.
