Thread: HASH partitioning not working properly
Hi,
After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding is wrong. Can somebody explain me what is happening?
postgres=# create table busbar_version (objectid int, ver_id int) partition by hash(ver_id);
CREATE TABLE
postgres=# CREATE TABLE busbar_version0 PARTITION OF busbar_version FOR VALUES WITH (MODULUS 10, REMAINDER 0);
CREATE TABLE
postgres=# CREATE TABLE busbar_version5 PARTITION OF busbar_version FOR VALUES WITH (MODULUS 10, REMAINDER 5);
CREATE TABLE
postgres=# CREATE TABLE busbar_version6 PARTITION OF busbar_version FOR VALUES WITH (MODULUS 10, REMAINDER 6);
CREATE TABLE
postgres=# CREATE TABLE busbar_version7 PARTITION OF busbar_version FOR VALUES WITH (MODULUS 10, REMAINDER 7);
CREATE TABLE
CREATE TABLE
postgres=# CREATE TABLE busbar_version0 PARTITION OF busbar_version FOR VALUES WITH (MODULUS 10, REMAINDER 0);
CREATE TABLE
postgres=# CREATE TABLE busbar_version5 PARTITION OF busbar_version FOR VALUES WITH (MODULUS 10, REMAINDER 5);
CREATE TABLE
postgres=# CREATE TABLE busbar_version6 PARTITION OF busbar_version FOR VALUES WITH (MODULUS 10, REMAINDER 6);
CREATE TABLE
postgres=# CREATE TABLE busbar_version7 PARTITION OF busbar_version FOR VALUES WITH (MODULUS 10, REMAINDER 7);
CREATE TABLE
I did insert using following:
postgres=# insert into busbar_version(objectid,ver_id) values (5,5);
INSERT 0 1
postgres=# insert into busbar_version(objectid,ver_id) values (6,6);
INSERT 0 1
INSERT 0 1
postgres=# insert into busbar_version(objectid,ver_id) values (6,6);
INSERT 0 1
I was of the opinion that the above rows were inserted into busbar_version5 and busbar_version6, but I'm wrong.
postgres=# select * from busbar_version;
objectid | ver_id
----------+--------
5 | 5
6 | 6
(2 rows)
postgres=# select * from busbar_version5;
objectid | ver_id
----------+--------
5 | 5
(1 row)
objectid | ver_id
----------+--------
5 | 5
6 | 6
(2 rows)
postgres=# select * from busbar_version5;
objectid | ver_id
----------+--------
5 | 5
(1 row)
postgres=# select * from busbar_version6;
objectid | ver_id
----------+--------
(0 rows)
objectid | ver_id
----------+--------
(0 rows)
postgres=# select * from busbar_version7;
objectid | ver_id
----------+--------
6 | 6
(1 row)
objectid | ver_id
----------+--------
6 | 6
(1 row)
Why second insert has gone to table busbar_version7 instead of busbar_version6?
If it helps, I am trying on the "psql (12.3 (Debian 12.3-1.pgdg100+1))" container.
Regards,
Seenu.
On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote: > After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding is wrong. Can somebody explain me what is happening? > postgres=# select * from busbar_version6; > objectid | ver_id > ----------+-------- > (0 rows) > > postgres=# select * from busbar_version7; > objectid | ver_id > ----------+-------- > 6 | 6 > (1 row) > > Why second insert has gone to table busbar_version7 instead of busbar_version6? It's your understanding that's not correct. The value of is passed through a hash function and the partition is selected based partition matching the remainder value after dividing the return value of the hash function by the largest modulus of any partition. That might surprise you, but how would you select which partition a varchar value should go into if you didn't use a hash function. David
On Fri, Jun 19, 2020 at 11:44 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote:
> After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding is wrong. Can somebody explain me what is happening?
> postgres=# select * from busbar_version6;
> objectid | ver_id
> ----------+--------
> (0 rows)
>
> postgres=# select * from busbar_version7;
> objectid | ver_id
> ----------+--------
> 6 | 6
> (1 row)
>
> Why second insert has gone to table busbar_version7 instead of busbar_version6?
It's your understanding that's not correct. The value of is passed
through a hash function and the partition is selected based partition
matching the remainder value after dividing the return value of the
hash function by the largest modulus of any partition.
That might surprise you, but how would you select which partition a
varchar value should go into if you didn't use a hash function.
David
How can I see the output of hash function that is used internally?
Regards,
Seenu.
On Fri, 2020-06-19 at 12:12 +0530, Srinivasa T N wrote: > On Fri, Jun 19, 2020 at 11:44 AM David Rowley <dgrowleyml@gmail.com> wrote: > > On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote: > > > After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding is wrong. Can somebody explain me what is happening? > > > > It's your understanding that's not correct. The value of is passed > > through a hash function and the partition is selected based partition > > matching the remainder value after dividing the return value of the > > hash function by the largest modulus of any partition. > > > > That might surprise you, but how would you select which partition a > > varchar value should go into if you didn't use a hash function. > > > > David > > How can I see the output of hash function that is used internally? In the case of "integer", the hash function is "pg_catalog"."hashint4". Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
On Fri, Jun 19, 2020 at 12:34 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Fri, 2020-06-19 at 12:12 +0530, Srinivasa T N wrote:
> On Fri, Jun 19, 2020 at 11:44 AM David Rowley <dgrowleyml@gmail.com> wrote:
> > On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote:
> > > After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding is wrong. Can somebody explain me what is happening?
> >
> > It's your understanding that's not correct. The value of is passed
> > through a hash function and the partition is selected based partition
> > matching the remainder value after dividing the return value of the
> > hash function by the largest modulus of any partition.
> >
> > That might surprise you, but how would you select which partition a
> > varchar value should go into if you didn't use a hash function.
> >
> > David
>
> How can I see the output of hash function that is used internally?
In the case of "integer", the hash function is "pg_catalog"."hashint4".
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
I guess output formatting is wrong, any help?
postgres=# select pg_catalog.hashint4(7);
hashint4 ------------
-978793473
(1 row)
Regards,
Seenu.
On Fri, Jun 19, 2020 at 1:28 PM Srinivasa T N <seenutn@gmail.com> wrote: > > On Fri, Jun 19, 2020 at 12:34 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote: >> >> On Fri, 2020-06-19 at 12:12 +0530, Srinivasa T N wrote: >> > On Fri, Jun 19, 2020 at 11:44 AM David Rowley <dgrowleyml@gmail.com> wrote: >> > > On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote: >> > > > After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding iswrong. Can somebody explain me what is happening? >> > > >> > > It's your understanding that's not correct. The value of is passed >> > > through a hash function and the partition is selected based partition >> > > matching the remainder value after dividing the return value of the >> > > hash function by the largest modulus of any partition. >> > > >> > > That might surprise you, but how would you select which partition a >> > > varchar value should go into if you didn't use a hash function. >> > > >> > > David >> > >> > How can I see the output of hash function that is used internally? >> >> In the case of "integer", the hash function is "pg_catalog"."hashint4". >> >> Yours, >> Laurenz Albe >> -- >> Cybertec | https://www.cybertec-postgresql.com >> > I guess output formatting is wrong, any help? > > postgres=# select pg_catalog.hashint4(7); > hashint4 > ------------ > -978793473 > (1 row) > Instead of direct hash function, the easiest way to use satisfies_hash_partition() what is used in defining hash partitioning constraint. You can see the partition constraint by description partition table i.e. use \d+ busbar_version5. Regards, Amul
On Fri, Jun 19, 2020 at 3:09 PM Amul Sul <sulamul@gmail.com> wrote:
On Fri, Jun 19, 2020 at 1:28 PM Srinivasa T N <seenutn@gmail.com> wrote:
>
> On Fri, Jun 19, 2020 at 12:34 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>>
>> On Fri, 2020-06-19 at 12:12 +0530, Srinivasa T N wrote:
>> > On Fri, Jun 19, 2020 at 11:44 AM David Rowley <dgrowleyml@gmail.com> wrote:
>> > > On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote:
>> > > > After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding is wrong. Can somebody explain me what is happening?
>> > >
>> > > It's your understanding that's not correct. The value of is passed
>> > > through a hash function and the partition is selected based partition
>> > > matching the remainder value after dividing the return value of the
>> > > hash function by the largest modulus of any partition.
>> > >
>> > > That might surprise you, but how would you select which partition a
>> > > varchar value should go into if you didn't use a hash function.
>> > >
>> > > David
>> >
>> > How can I see the output of hash function that is used internally?
>>
>> In the case of "integer", the hash function is "pg_catalog"."hashint4".
>>
>> Yours,
>> Laurenz Albe
>> --
>> Cybertec | https://www.cybertec-postgresql.com
>>
> I guess output formatting is wrong, any help?
>
> postgres=# select pg_catalog.hashint4(7);
> hashint4
> ------------
> -978793473
> (1 row)
>
Instead of direct hash function, the easiest way to use
satisfies_hash_partition() what is used in defining hash
partitioning constraint.
You can see the partition constraint by description partition table i.e.
use \d+ busbar_version5.
Regards,
Amul
Sorry, I did not get you.
My current \d+ is
Table "test.busbar_version6"
Column | Type | Collation | Nullable | Default | Storage | Stats target |
Description
----------+---------+-----------+----------+---------+---------+--------------+-
------------
objectid | integer | | | | plain | |
ver_id | integer | | | | plain | |
Partition of: busbar_version FOR VALUES WITH (modulus 10, remainder 6)
Partition constraint: satisfies_hash_partition('16397'::oid, 10, 6, ver_id)
Access method: heap
Regards,
Seenu.
On Fri, Jun 19, 2020 at 3:50 PM Srinivasa T N <seenutn@gmail.com> wrote: > > > > On Fri, Jun 19, 2020 at 3:09 PM Amul Sul <sulamul@gmail.com> wrote: >> >> On Fri, Jun 19, 2020 at 1:28 PM Srinivasa T N <seenutn@gmail.com> wrote: >> > >> > On Fri, Jun 19, 2020 at 12:34 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote: >> >> >> >> On Fri, 2020-06-19 at 12:12 +0530, Srinivasa T N wrote: >> >> > On Fri, Jun 19, 2020 at 11:44 AM David Rowley <dgrowleyml@gmail.com> wrote: >> >> > > On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote: >> >> > > > After seeing the below, I feel partitioning is not working properly or it maybe case that my understandingis wrong. Can somebody explain me what is happening? >> >> > > >> >> > > It's your understanding that's not correct. The value of is passed >> >> > > through a hash function and the partition is selected based partition >> >> > > matching the remainder value after dividing the return value of the >> >> > > hash function by the largest modulus of any partition. >> >> > > >> >> > > That might surprise you, but how would you select which partition a >> >> > > varchar value should go into if you didn't use a hash function. >> >> > > >> >> > > David >> >> > >> >> > How can I see the output of hash function that is used internally? >> >> >> >> In the case of "integer", the hash function is "pg_catalog"."hashint4". >> >> >> >> Yours, >> >> Laurenz Albe >> >> -- >> >> Cybertec | https://www.cybertec-postgresql.com >> >> >> > I guess output formatting is wrong, any help? >> > >> > postgres=# select pg_catalog.hashint4(7); >> > hashint4 >> > ------------ >> > -978793473 >> > (1 row) >> > >> Instead of direct hash function, the easiest way to use >> satisfies_hash_partition() what is used in defining hash >> partitioning constraint. >> >> You can see the partition constraint by description partition table i.e. >> use \d+ busbar_version5. >> >> Regards, >> Amul > > > Sorry, I did not get you. > > My current \d+ is > > postgres=# \d+ busbar_version6; > Table "test.busbar_version6" > Column | Type | Collation | Nullable | Default | Storage | Stats target | > Description > ----------+---------+-----------+----------+---------+---------+--------------+- > ------------ > objectid | integer | | | | plain | | > ver_id | integer | | | | plain | | > Partition of: busbar_version FOR VALUES WITH (modulus 10, remainder 6) > Partition constraint: satisfies_hash_partition('16397'::oid, 10, 6, ver_id) > Access method: heap > By executing "SELECT satisfies_hash_partition('16397'::oid, 10, 6, <VALUE>) " will tell you whether <VALUE> fits in the partition having modulus 10 and remainder 6 or not. Regards, Amul
On Fri, Jun 19, 2020, 5:45 PM Amul Sul <sulamul@gmail.com> wrote:
On Fri, Jun 19, 2020 at 3:50 PM Srinivasa T N <seenutn@gmail.com> wrote:
>
>
>
> On Fri, Jun 19, 2020 at 3:09 PM Amul Sul <sulamul@gmail.com> wrote:
>>
>> On Fri, Jun 19, 2020 at 1:28 PM Srinivasa T N <seenutn@gmail.com> wrote:
>> >
>> > On Fri, Jun 19, 2020 at 12:34 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>> >>
>> >> On Fri, 2020-06-19 at 12:12 +0530, Srinivasa T N wrote:
>> >> > On Fri, Jun 19, 2020 at 11:44 AM David Rowley <dgrowleyml@gmail.com> wrote:
>> >> > > On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote:
>> >> > > > After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding is wrong. Can somebody explain me what is happening?
>> >> > >
>> >> > > It's your understanding that's not correct. The value of is passed
>> >> > > through a hash function and the partition is selected based partition
>> >> > > matching the remainder value after dividing the return value of the
>> >> > > hash function by the largest modulus of any partition.
>> >> > >
>> >> > > That might surprise you, but how would you select which partition a
>> >> > > varchar value should go into if you didn't use a hash function.
>> >> > >
>> >> > > David
>> >> >
>> >> > How can I see the output of hash function that is used internally?
>> >>
>> >> In the case of "integer", the hash function is "pg_catalog"."hashint4".
>> >>
>> >> Yours,
>> >> Laurenz Albe
>> >> --
>> >> Cybertec | https://www.cybertec-postgresql.com
>> >>
>> > I guess output formatting is wrong, any help?
>> >
>> > postgres=# select pg_catalog.hashint4(7);
>> > hashint4
>> > ------------
>> > -978793473
>> > (1 row)
>> >
>> Instead of direct hash function, the easiest way to use
>> satisfies_hash_partition() what is used in defining hash
>> partitioning constraint.
>>
>> You can see the partition constraint by description partition table i.e.
>> use \d+ busbar_version5.
>>
>> Regards,
>> Amul
>
>
> Sorry, I did not get you.
>
> My current \d+ is
>
> postgres=# \d+ busbar_version6;
> Table "test.busbar_version6"
> Column | Type | Collation | Nullable | Default | Storage | Stats target |
> Description
> ----------+---------+-----------+----------+---------+---------+--------------+-
> ------------
> objectid | integer | | | | plain | |
> ver_id | integer | | | | plain | |
> Partition of: busbar_version FOR VALUES WITH (modulus 10, remainder 6)
> Partition constraint: satisfies_hash_partition('16397'::oid, 10, 6, ver_id)
> Access method: heap
>
By executing "SELECT satisfies_hash_partition('16397'::oid, 10, 6, <VALUE>) "
will tell you whether <VALUE> fits in the partition having modulus 10 and
remainder 6 or not.
Regards,
Amul
OK.. Thanks.
BTW, is it possible to have a custom hash function instead of predefined hash function?
Regards,
Seenu.
On Fri, 2020-06-19 at 13:27 +0530, Srinivasa T N wrote: > > > How can I see the output of hash function that is used internally? > > > > In the case of "integer", the hash function is "pg_catalog"."hashint4". > > I guess output formatting is wrong, any help? > > postgres=# select pg_catalog.hashint4(7); > hashint4 > ------------ > -978793473 > (1 row) No, that is fine. Just take the result mod 10 if that is how hash partitioning was defined: select pg_catalog.hashint4(7) - floor(pg_catalog.hashint4(7) / 10.0) * 10; ?column? ---------- 7 (1 row) So that should end up in the eighth partition. You have no choice which hash function to use for partitioning. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com