Thread: HASH partitioning not working properly

HASH partitioning not working properly

From
Srinivasa T N
Date:
Hi,
   After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding is wrong.  Can somebody explain me what is happening?

postgres=# create table busbar_version (objectid int, ver_id int) partition by hash(ver_id);
CREATE TABLE
postgres=# CREATE TABLE busbar_version0 PARTITION OF busbar_version FOR VALUES WITH (MODULUS 10, REMAINDER 0);
CREATE TABLE
postgres=# CREATE TABLE busbar_version5 PARTITION OF busbar_version FOR VALUES WITH (MODULUS 10, REMAINDER 5);
CREATE TABLE
postgres=# CREATE TABLE busbar_version6 PARTITION OF busbar_version FOR VALUES WITH (MODULUS 10, REMAINDER 6);
CREATE TABLE
postgres=# CREATE TABLE busbar_version7 PARTITION OF busbar_version FOR VALUES WITH (MODULUS 10, REMAINDER 7);
CREATE TABLE

I did insert using following:

postgres=# insert into busbar_version(objectid,ver_id) values (5,5);
INSERT 0 1
postgres=# insert into busbar_version(objectid,ver_id) values (6,6);
INSERT 0 1

I was of the opinion that the above rows were inserted into busbar_version5 and busbar_version6, but I'm wrong.

postgres=# select * from busbar_version;
 objectid | ver_id
----------+--------
        5 |      5
        6 |      6
(2 rows)

postgres=# select * from busbar_version5;
 objectid | ver_id
----------+--------
        5 |      5
(1 row)

postgres=# select * from busbar_version6;
 objectid | ver_id
----------+--------
(0 rows)

postgres=# select * from busbar_version7;
 objectid | ver_id
----------+--------
        6 |      6
(1 row)

   Why second insert has gone to table busbar_version7 instead of busbar_version6?  

If it helps, I am trying on the "psql (12.3 (Debian 12.3-1.pgdg100+1))" container.

Regards,
Seenu.

Re: HASH partitioning not working properly

From
David Rowley
Date:
On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote:
>    After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding is
wrong. Can somebody explain me what is happening?
 

> postgres=# select * from busbar_version6;
>  objectid | ver_id
> ----------+--------
> (0 rows)
>
> postgres=# select * from busbar_version7;
>  objectid | ver_id
> ----------+--------
>         6 |      6
> (1 row)
>
>    Why second insert has gone to table busbar_version7 instead of busbar_version6?

It's your understanding that's not correct.  The value of is passed
through a hash function and the partition is selected based partition
matching the remainder value after dividing the return value of the
hash function by the largest modulus of any partition.

That might surprise you, but how would you select which partition a
varchar value should go into if you didn't use a hash function.

David



Re: HASH partitioning not working properly

From
Srinivasa T N
Date:



On Fri, Jun 19, 2020 at 11:44 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote:
>    After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding is wrong.  Can somebody explain me what is happening?

> postgres=# select * from busbar_version6;
>  objectid | ver_id
> ----------+--------
> (0 rows)
>
> postgres=# select * from busbar_version7;
>  objectid | ver_id
> ----------+--------
>         6 |      6
> (1 row)
>
>    Why second insert has gone to table busbar_version7 instead of busbar_version6?

It's your understanding that's not correct.  The value of is passed
through a hash function and the partition is selected based partition
matching the remainder value after dividing the return value of the
hash function by the largest modulus of any partition.

That might surprise you, but how would you select which partition a
varchar value should go into if you didn't use a hash function.

David

How can I see the output of hash function that is used internally?

Regards,
Seenu. 

Re: HASH partitioning not working properly

From
Laurenz Albe
Date:
On Fri, 2020-06-19 at 12:12 +0530, Srinivasa T N wrote:
> On Fri, Jun 19, 2020 at 11:44 AM David Rowley <dgrowleyml@gmail.com> wrote:
> > On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote:
> > >    After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding is
wrong. Can somebody explain me what is happening?
 
> > 
> > It's your understanding that's not correct.  The value of is passed
> > through a hash function and the partition is selected based partition
> > matching the remainder value after dividing the return value of the
> > hash function by the largest modulus of any partition.
> > 
> > That might surprise you, but how would you select which partition a
> > varchar value should go into if you didn't use a hash function.
> > 
> > David
> 
> How can I see the output of hash function that is used internally?

In the case of "integer", the hash function is "pg_catalog"."hashint4".

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: HASH partitioning not working properly

From
Srinivasa T N
Date:
On Fri, Jun 19, 2020 at 12:34 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Fri, 2020-06-19 at 12:12 +0530, Srinivasa T N wrote:
> On Fri, Jun 19, 2020 at 11:44 AM David Rowley <dgrowleyml@gmail.com> wrote:
> > On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote:
> > >    After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding is wrong.  Can somebody explain me what is happening?
> >
> > It's your understanding that's not correct.  The value of is passed
> > through a hash function and the partition is selected based partition
> > matching the remainder value after dividing the return value of the
> > hash function by the largest modulus of any partition.
> >
> > That might surprise you, but how would you select which partition a
> > varchar value should go into if you didn't use a hash function.
> >
> > David
>
> How can I see the output of hash function that is used internally?

In the case of "integer", the hash function is "pg_catalog"."hashint4".

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

I guess output formatting is wrong, any help?

 postgres=# select pg_catalog.hashint4(7);
  hashint4  
------------
 -978793473
(1 row)

Regards,
Seenu.

Re: HASH partitioning not working properly

From
Amul Sul
Date:
On Fri, Jun 19, 2020 at 1:28 PM Srinivasa T N <seenutn@gmail.com> wrote:
>
> On Fri, Jun 19, 2020 at 12:34 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>>
>> On Fri, 2020-06-19 at 12:12 +0530, Srinivasa T N wrote:
>> > On Fri, Jun 19, 2020 at 11:44 AM David Rowley <dgrowleyml@gmail.com> wrote:
>> > > On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote:
>> > > >    After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding
iswrong.  Can somebody explain me what is happening?
 
>> > >
>> > > It's your understanding that's not correct.  The value of is passed
>> > > through a hash function and the partition is selected based partition
>> > > matching the remainder value after dividing the return value of the
>> > > hash function by the largest modulus of any partition.
>> > >
>> > > That might surprise you, but how would you select which partition a
>> > > varchar value should go into if you didn't use a hash function.
>> > >
>> > > David
>> >
>> > How can I see the output of hash function that is used internally?
>>
>> In the case of "integer", the hash function is "pg_catalog"."hashint4".
>>
>> Yours,
>> Laurenz Albe
>> --
>> Cybertec | https://www.cybertec-postgresql.com
>>
> I guess output formatting is wrong, any help?
>
>  postgres=# select pg_catalog.hashint4(7);
>   hashint4
> ------------
>  -978793473
> (1 row)
>
Instead of direct hash function, the easiest way to use
satisfies_hash_partition() what is used in defining hash
partitioning constraint.

You can see the partition constraint by description partition table i.e.
 use \d+ busbar_version5.

Regards,
Amul



Re: HASH partitioning not working properly

From
Srinivasa T N
Date:


On Fri, Jun 19, 2020 at 3:09 PM Amul Sul <sulamul@gmail.com> wrote:
On Fri, Jun 19, 2020 at 1:28 PM Srinivasa T N <seenutn@gmail.com> wrote:
>
> On Fri, Jun 19, 2020 at 12:34 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>>
>> On Fri, 2020-06-19 at 12:12 +0530, Srinivasa T N wrote:
>> > On Fri, Jun 19, 2020 at 11:44 AM David Rowley <dgrowleyml@gmail.com> wrote:
>> > > On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote:
>> > > >    After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding is wrong.  Can somebody explain me what is happening?
>> > >
>> > > It's your understanding that's not correct.  The value of is passed
>> > > through a hash function and the partition is selected based partition
>> > > matching the remainder value after dividing the return value of the
>> > > hash function by the largest modulus of any partition.
>> > >
>> > > That might surprise you, but how would you select which partition a
>> > > varchar value should go into if you didn't use a hash function.
>> > >
>> > > David
>> >
>> > How can I see the output of hash function that is used internally?
>>
>> In the case of "integer", the hash function is "pg_catalog"."hashint4".
>>
>> Yours,
>> Laurenz Albe
>> --
>> Cybertec | https://www.cybertec-postgresql.com
>>
> I guess output formatting is wrong, any help?
>
>  postgres=# select pg_catalog.hashint4(7);
>   hashint4
> ------------
>  -978793473
> (1 row)
>
Instead of direct hash function, the easiest way to use
satisfies_hash_partition() what is used in defining hash
partitioning constraint.

You can see the partition constraint by description partition table i.e.
 use \d+ busbar_version5.

Regards,
Amul

Sorry, I did not get you.

My current \d+ is

postgres=# \d+ busbar_version6;
                                Table "test.busbar_version6"
  Column  |  Type   | Collation | Nullable | Default | Storage | Stats target |
Description
----------+---------+-----------+----------+---------+---------+--------------+-
------------
 objectid | integer |           |          |         | plain   |              |
 ver_id   | integer |           |          |         | plain   |              |
Partition of: busbar_version FOR VALUES WITH (modulus 10, remainder 6)
Partition constraint: satisfies_hash_partition('16397'::oid, 10, 6, ver_id)
Access method: heap

Regards,
Seenu. 

Re: HASH partitioning not working properly

From
Amul Sul
Date:
On Fri, Jun 19, 2020 at 3:50 PM Srinivasa T N <seenutn@gmail.com> wrote:
>
>
>
> On Fri, Jun 19, 2020 at 3:09 PM Amul Sul <sulamul@gmail.com> wrote:
>>
>> On Fri, Jun 19, 2020 at 1:28 PM Srinivasa T N <seenutn@gmail.com> wrote:
>> >
>> > On Fri, Jun 19, 2020 at 12:34 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>> >>
>> >> On Fri, 2020-06-19 at 12:12 +0530, Srinivasa T N wrote:
>> >> > On Fri, Jun 19, 2020 at 11:44 AM David Rowley <dgrowleyml@gmail.com> wrote:
>> >> > > On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote:
>> >> > > >    After seeing the below, I feel partitioning is not working properly or it maybe case that my
understandingis wrong.  Can somebody explain me what is happening?
 
>> >> > >
>> >> > > It's your understanding that's not correct.  The value of is passed
>> >> > > through a hash function and the partition is selected based partition
>> >> > > matching the remainder value after dividing the return value of the
>> >> > > hash function by the largest modulus of any partition.
>> >> > >
>> >> > > That might surprise you, but how would you select which partition a
>> >> > > varchar value should go into if you didn't use a hash function.
>> >> > >
>> >> > > David
>> >> >
>> >> > How can I see the output of hash function that is used internally?
>> >>
>> >> In the case of "integer", the hash function is "pg_catalog"."hashint4".
>> >>
>> >> Yours,
>> >> Laurenz Albe
>> >> --
>> >> Cybertec | https://www.cybertec-postgresql.com
>> >>
>> > I guess output formatting is wrong, any help?
>> >
>> >  postgres=# select pg_catalog.hashint4(7);
>> >   hashint4
>> > ------------
>> >  -978793473
>> > (1 row)
>> >
>> Instead of direct hash function, the easiest way to use
>> satisfies_hash_partition() what is used in defining hash
>> partitioning constraint.
>>
>> You can see the partition constraint by description partition table i.e.
>>  use \d+ busbar_version5.
>>
>> Regards,
>> Amul
>
>
> Sorry, I did not get you.
>
> My current \d+ is
>
> postgres=# \d+ busbar_version6;
>                                 Table "test.busbar_version6"
>   Column  |  Type   | Collation | Nullable | Default | Storage | Stats target |
> Description
> ----------+---------+-----------+----------+---------+---------+--------------+-
> ------------
>  objectid | integer |           |          |         | plain   |              |
>  ver_id   | integer |           |          |         | plain   |              |
> Partition of: busbar_version FOR VALUES WITH (modulus 10, remainder 6)
> Partition constraint: satisfies_hash_partition('16397'::oid, 10, 6, ver_id)
> Access method: heap
>
By executing "SELECT satisfies_hash_partition('16397'::oid, 10, 6, <VALUE>) "
will tell you whether <VALUE> fits in the partition having modulus 10 and
remainder 6 or not.

Regards,
Amul



Re: HASH partitioning not working properly

From
Srinivasa T N
Date:


On Fri, Jun 19, 2020, 5:45 PM Amul Sul <sulamul@gmail.com> wrote:
On Fri, Jun 19, 2020 at 3:50 PM Srinivasa T N <seenutn@gmail.com> wrote:
>
>
>
> On Fri, Jun 19, 2020 at 3:09 PM Amul Sul <sulamul@gmail.com> wrote:
>>
>> On Fri, Jun 19, 2020 at 1:28 PM Srinivasa T N <seenutn@gmail.com> wrote:
>> >
>> > On Fri, Jun 19, 2020 at 12:34 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>> >>
>> >> On Fri, 2020-06-19 at 12:12 +0530, Srinivasa T N wrote:
>> >> > On Fri, Jun 19, 2020 at 11:44 AM David Rowley <dgrowleyml@gmail.com> wrote:
>> >> > > On Fri, 19 Jun 2020 at 17:42, Srinivasa T N <seenutn@gmail.com> wrote:
>> >> > > >    After seeing the below, I feel partitioning is not working properly or it maybe case that my understanding is wrong.  Can somebody explain me what is happening?
>> >> > >
>> >> > > It's your understanding that's not correct.  The value of is passed
>> >> > > through a hash function and the partition is selected based partition
>> >> > > matching the remainder value after dividing the return value of the
>> >> > > hash function by the largest modulus of any partition.
>> >> > >
>> >> > > That might surprise you, but how would you select which partition a
>> >> > > varchar value should go into if you didn't use a hash function.
>> >> > >
>> >> > > David
>> >> >
>> >> > How can I see the output of hash function that is used internally?
>> >>
>> >> In the case of "integer", the hash function is "pg_catalog"."hashint4".
>> >>
>> >> Yours,
>> >> Laurenz Albe
>> >> --
>> >> Cybertec | https://www.cybertec-postgresql.com
>> >>
>> > I guess output formatting is wrong, any help?
>> >
>> >  postgres=# select pg_catalog.hashint4(7);
>> >   hashint4
>> > ------------
>> >  -978793473
>> > (1 row)
>> >
>> Instead of direct hash function, the easiest way to use
>> satisfies_hash_partition() what is used in defining hash
>> partitioning constraint.
>>
>> You can see the partition constraint by description partition table i.e.
>>  use \d+ busbar_version5.
>>
>> Regards,
>> Amul
>
>
> Sorry, I did not get you.
>
> My current \d+ is
>
> postgres=# \d+ busbar_version6;
>                                 Table "test.busbar_version6"
>   Column  |  Type   | Collation | Nullable | Default | Storage | Stats target |
> Description
> ----------+---------+-----------+----------+---------+---------+--------------+-
> ------------
>  objectid | integer |           |          |         | plain   |              |
>  ver_id   | integer |           |          |         | plain   |              |
> Partition of: busbar_version FOR VALUES WITH (modulus 10, remainder 6)
> Partition constraint: satisfies_hash_partition('16397'::oid, 10, 6, ver_id)
> Access method: heap
>
By executing "SELECT satisfies_hash_partition('16397'::oid, 10, 6, <VALUE>) "
will tell you whether <VALUE> fits in the partition having modulus 10 and
remainder 6 or not.

Regards,
Amul

OK.. Thanks.

BTW, is it possible to have a custom hash function instead of predefined hash function?

Regards,
Seenu.

Re: HASH partitioning not working properly

From
Laurenz Albe
Date:
On Fri, 2020-06-19 at 13:27 +0530, Srinivasa T N wrote:
> > > How can I see the output of hash function that is used internally?
> > 
> > In the case of "integer", the hash function is "pg_catalog"."hashint4".
>
> I guess output formatting is wrong, any help?
>
>  postgres=# select pg_catalog.hashint4(7);
>   hashint4  
> ------------
>  -978793473
> (1 row)

No, that is fine.
Just take the result mod 10 if that is how hash partitioning was defined:

select pg_catalog.hashint4(7) - floor(pg_catalog.hashint4(7) / 10.0) * 10;
 ?column? 
----------
        7
(1 row)

So that should end up in the eighth partition.

You have no choice which hash function to use for partitioning.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com