Thread: Partitioning a table by integer value (preferably in place)
Hi all,

Linux Fedora 34, 1TB Samsung SSD, 4 CPUs, 2 cores.
PostgreSQL 12.7 (can upgrade if a better solution is to be found in 13 or even 14 beta2 - I'm currently testing a proposed solution, so by the time it's fully implemented, 14 should be GA and hey, I might even find a bug or two to help with the project!).

I have a 400GB joining table (one SMALLINT and the other INTEGER - Primary Keys on other tables) with 1000 fields on one side and 10M on the other, so 10,000M (or 10Bn) records all told.

What I would like to do is to partition by the SMALLINT (1 - 1000) value - which would give 1,000 tables of 400MB each. I wish to avoid having to do this manually 1,000 times - is there a way of telling PostgreSQL to partition by value without specifying the SMALLINT value each time?

I looked here:

https://www.enterprisedb.com/postgres-tutorials/how-use-table-partitioning-scale-postgresql

and there is LIST, RANGE and HASH partitioning. I think a RANGE of 1 would be what I want here?

So, questions:

Is 1,000 partitions reasonable? This:

https://elephas.io/is-there-a-limit-on-number-of-partitions-handled-by-postgres/

appears to suggest that it shouldn't be a problem?

Could I go with a RANGE of, say, 10 values per partition? If I have to explicitly code each partition, I'd prefer this for my test - at least it would save on the typing! :-) This would produce 100 tables of ~4GB each. Would I see much performance degradation with a 4GB table on an SSD?

Finally, the icing on the cake would be if this could be done in place - my SSD is 1TB and the output from df -h is:

test=# \! df -h
Filesystem                               Size  Used Avail Use% Mounted on
devtmpfs                                  16G     0   16G   0% /dev
tmpfs                                     16G  212K   16G   1% /dev/shm
tmpfs                                    6.3G  1.8M  6.3G   1% /run
/dev/mapper/fedora_localhost--live-root   69G   11G   55G  17% /
tmpfs                                     16G  284K   16G   1% /tmp
/dev/sda5                                976M  192M  718M  22% /boot
/dev/mapper/fedora_localhost--live-home  1.3T  898G  270G  77% /home
/dev/sda2                                 96M   52M   45M  54% /boot/efi
tmpfs                                    3.2G   96K  3.2G   1% /run/user/1000
test=#

So, I only have 270 GB left on disk - and it took > 12 hrs to fill it with indexes and Foreign Key constraints, so I'd like to be able to do it without having to go through that again. This:

https://www.2ndquadrant.com/en/blog/partitioning-a-large-table-without-a-long-running-lock/

appears to suggest that it can be done online. There will be no other activity on the table while any partitioning &c. will be ongoing. However, the article makes no mention of space considerations.

This is my first time considering partitioning, so I'd be grateful for any advice, pointers, references, URLs &c... and please let me know if I've left out any important information.

TIA and rgs,

Pól...

Some (relevant?) settings:

max_worker_processes = 4
max_parallel_workers_per_gather = 2
max_parallel_workers = 4
max_parallel_maintenance_workers = 2

as suggested by pgtune. Is pgtune a good bet for configuration suggestions?
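For concreteness, a minimal sketch of the sort of declarative LIST partitioning in question - the table and column names here are invented, not the real schema:

-- Hypothetical join table, partitioned by LIST on the SMALLINT key.
CREATE TABLE joining (
    small_id SMALLINT NOT NULL,   -- references the 1,000-row side
    big_id   INTEGER  NOT NULL    -- references the 10M-row side
) PARTITION BY LIST (small_id);

-- One partition per SMALLINT value, e.g. for value 1:
CREATE TABLE joining_1 PARTITION OF joining FOR VALUES IN (1);

Writing 1,000 of those CREATE TABLE statements by hand is exactly the tedium the rest of the thread is about avoiding.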
Hi again all,

Just a quick follow-up - could I script the creation of 1000 partitions using bash or PL/pgSQL? I think (rightly/wrongly?) that this may be the best solution?

I've found samples on the web, but they are for partitioning by date - a quick sample by integer would be gratefully received!

Is partman recommended by the community?

TIA and rgs,

Pól...
On 8/9/21 9:14 AM, Pól Ua Laoínecháin wrote:
> Hi again all,
>
> Just a quick follow-up - could I script the creation of 1000
> partitions using bash

Sure. That's what scripting languages are for.

> or PL/pgSQL? I think (rightly/wrongly?) that
> this may be the best solution?
>
> I've found samples on the web, but they are for partitioning by date -
> a quick sample by integer would be gratefully received!
>
> Is partman recommended by the community?

--
Angular momentum makes the world go 'round.
> I have a 400GB joining table (one SMALLINT and the other INTEGER -
> Primary Keys on other tables) with 1000 fields on one side and 10M on
> the other, so 10,000M (or 10Bn) records all told.
My queries:
Do you have any EXPLAIN (ANALYZE, BUFFERS) <query> results with the existing setup? Does it look problematic? (See the example after these questions.)
How would your table grow on either side of the join? Append-only, static data, too frequently updated, or dropped periodically, so that delete-based bloating can be skipped completely?
How is the data distributed across the smallint keys - equally or unequally?
What kind of queries would be run and what results returned? OLTP- or OLAP-like? Quick queries with few rows returned, or heavy queries with a lot of rows returned?
Partitioning has been ever improving, so the best option, if possible, would be to use the latest pg version.
Also, is there any scope for normalisation of that table? I mean, I know it is theoretically possible, but I have not seen any design with that wide a table (of 1000 cols), so it would be good to know.
Just asking - maybe partitioning would be the best option, but it would be good to know/see the benefit pre and post partitioning.
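For reference, a minimal sketch of that EXPLAIN (ANALYZE, BUFFERS) invocation - the query and table/column names here are only placeholders, not the real schema:

-- Placeholder query against a hypothetical join table.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM joining
WHERE small_id = 42;

The BUFFERS option adds shared/local/temp block hit and read counts to each plan node, which shows whether the big table is being served from cache or from disk.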
Thanks,
Vijay
Mumbai, India
Hi all - I resolved my issue - and took a 25 minute query down to 5 seconds.

> I have a 400GB joining table (one SMALLINT and the other INTEGER -

> What I would like to do is to partition by the SMALLINT (1 - 1000)
> value - which would give 1,000 tables of 400MB each.

I found this site very helpful for explaining the basics of partitioning (LIST, RANGE and HASH):

https://www.enterprisedb.com/postgres-tutorials/how-use-table-partitioning-scale-postgresql

I then found this absolute beaut of a site which was like manna from heaven:

https://www.depesz.com/2021/01/17/are-there-limits-to-partition-counts/

which explained (they had a different issue - I adapted the code) how to do exactly what I required, entirely from the psql client, without the need for bash or PL/pgSQL or anything else.

The "trick" here is to combine the FORMAT function with GENERATE_SERIES as follows (code from the site):

CREATE TABLE test_ranged (id serial PRIMARY KEY, payload TEXT) PARTITION BY RANGE (id);

SELECT FORMAT('CREATE TABLE %I PARTITION OF test_ranged FOR VALUES FROM (%s) TO (%s);', 'test_ranged_' || i, i, i+1)
FROM generate_series(1, 2) i \gexec

Output of this (having removed \gexec - another thing I learnt):

                                     format
---------------------------------------------------------------------------------
 CREATE TABLE test_ranged_1 partition OF test_ranged FOR VALUES FROM (1) to (2);
 CREATE TABLE test_ranged_2 partition OF test_ranged FOR VALUES FROM (2) to (3);

So, I generated the series for (1, 1000) with my own fields, using the LIST method with a single INTEGER value in each list. Ran the script - had my 1000 partitions in a matter of seconds. Loading them (750GB with indexes) was an overnight job, however - but that's not PostgreSQL's fault! :-)

I really love the way that PostgreSQL/psql is so flexible that it's possible to do heaps of stuff without having to resort to other tools.

I posted this answer to my own question in the hope that it may help others in my situation. If I haven't been clear, or there's something missing, please let me know - or add your own opinions/experience if there's an alternative which may or may not be as efficient. I'm trying to collect as many strings to my bow as possible!

Rgs,

Pól...
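For completeness, a sketch of what the adapted LIST version for 1,000 SMALLINT values might look like - the parent table and column names here are invented, since the real ones aren't shown in the thread:

-- Hypothetical parent table, partitioned by LIST on the SMALLINT key.
CREATE TABLE joining (
    small_id SMALLINT NOT NULL,
    big_id   INTEGER  NOT NULL
) PARTITION BY LIST (small_id);

-- Generate and immediately execute 1,000 CREATE TABLE statements from psql.
SELECT FORMAT(
    'CREATE TABLE %I PARTITION OF joining FOR VALUES IN (%s);',
    'joining_' || i, i
)
FROM generate_series(1, 1000) i \gexec

\gexec sends each row of the result back to the server as a statement, so all 1,000 partitions are created in a single pass.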
Hi Vijay, and thanks for replying,

>> I have a 400GB joining table (one SMALLINT and the other INTEGER -
>> Primary Keys on other tables) with 1000 fields on one side and 10M on
>> the other, so 10,000M (or 10Bn) records all told.

> My queries:
> <query> results with the existing setup? Does it look problematic?

> How would your table grow on either side of the join?

In this case uniformly! 1 -> 10

> Append only, static data, too frequently updated, or dropped periodically,

Append only in this case - not updated nor dropped.

> so that delete-based bloating can be skipped completely.

It can be skipped!

> How is the data distributed across the smallint keys - equally or unequally?

Totally uniform - see my own answer to my question - if it wasn't uniform, I might have considered RANGE based partitioning?

> What kind of queries would be run and what results returned? OLTP- or OLAP-like? Quick queries with few rows returned, or heavy queries with a lot of rows returned?

Pretty much OLAP like - summary queries. Point queries return in the sub-millisecond range when based on the PK!

> Partitioning has been ever improving, so the best option, if possible, would be to use the latest pg version.

> Also, is there any scope for normalisation of that table? I mean, I know it is theoretically possible, but I have not seen any design with that wide a table (of 1000 cols), so it would be good to know.

> Just asking - maybe partitioning would be the best option, but it would be good to know/see the benefit pre and post partitioning.

Thanks again for your questions - they gave me pause for thought and I will try to apply them in future partitioning scenarios. (Unfortunately :-) ) there is no magic number of partitions for, say, a given size of table - otherwise it would be the default and would be done automatically!

Rgs,

Pól...

> Vijay
Dear Ron,

> > Just a quick follow-up - could I script the creation of 1000
> > partitions using bash

> Sure. That's what scripting languages are for.

Thank you so much for your helpful and expansive answer. No wonder everyone talks about how friendly and welcoming the PostgreSQL community is!

Your friend always,

Pól...
On Fri, 13 Aug 2021 at 14:13, Pól Ua Laoínecháin <linehanp@tcd.ie> wrote:
> Thanks again for your questions - they gave me pause for thought and I
> will try to apply them in future partitioning scenarios.
> (Unfortunately :-) ) there is no magic number of partitions for, say,
> a given size of table - otherwise it would be the default and would be
> done automatically!
No worries - it seems you have a fair idea of how things would grow, so partitioning would be of help.

My biggest worry around partitioning is moving data between partitions / rebalancing partitions.
If the scheme is as simple as list/range, it requires making use of triggers, syncing the data,
and then attaching/detaching partitions, using a binary split for large partitions to avoid
large row movements.
But when it comes to hash, it requires rebuilding the entire setup, either via triggers if there
is significant space on the db, or by logically replicating to another server with a new hash
scheme. Because I do not have a lot of experience around sharding/partitioning, nor with how
other DBs claim to do automatic movement of data between partitions, I think I wore myself
out due to poor design.
Thanks,
Vijay
Mumbai, India
On Fri, Aug 13, 2021 at 3:02 AM Vijaykumar Jain <vijaykumarjain.github@gmail.com> wrote:
> ... use binary split for large partitions, to avoid large row movements.
Would you expound on this?
On Fri, 13 Aug 2021 at 21:07, Michael Lewis <mlewis@entrata.com> wrote:
> On Fri, Aug 13, 2021 at 3:02 AM Vijaykumar Jain <vijaykumarjain.github@gmail.com> wrote:
>> ... use binary split for large partitions, to avoid large row movements.
>
> Would you expound on this?
If we have range partitions based on year - 2018-2019, 2019-2020, etc. - and for
any one year we see uneven growth in data, we simply split that partition in two and move only 6 months' worth of data
to another new partition.
If we see similar issues not solved by the first split, we split each again by 2 and only move the remainder of the data to all four partitions.
So at any point in time, if the range partitioning that was planned earlier does not work well for some periods, we split the partitions;
and if we see the trend growing with increased volume, the new partitions are created quarterly, etc.
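For illustration, a rough sketch of one such binary split in plain SQL, using an invented measurements table range-partitioned by a date column (all names here are hypothetical):

-- Assumes: CREATE TABLE measurements (..., logdate date, ...) PARTITION BY RANGE (logdate);
-- with measurements_2019 currently covering 2019-01-01 .. 2020-01-01.

BEGIN;

-- Detach the oversized yearly partition; it keeps its rows as a standalone table.
ALTER TABLE measurements DETACH PARTITION measurements_2019;

-- New partition for the second half of the year.
CREATE TABLE measurements_2019_h2 PARTITION OF measurements
    FOR VALUES FROM ('2019-07-01') TO ('2020-01-01');

-- Move only the second half's rows (about 6 months' worth) into the new partition.
WITH moved AS (
    DELETE FROM measurements_2019
    WHERE logdate >= '2019-07-01'
    RETURNING *
)
INSERT INTO measurements SELECT * FROM moved;

-- Re-attach the old partition with the narrower, first-half range.
ALTER TABLE measurements ATTACH PARTITION measurements_2019
    FOR VALUES FROM ('2019-01-01') TO ('2019-07-01');

COMMIT;

The final ATTACH will scan measurements_2019 to validate the narrower bound unless a matching CHECK constraint is added first; either way, only the second half of the year's rows actually move.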
I have not done a hash partitioning rebuild, but I think this video explains how we could best do it.
Thanks,
Vijay
Mumbai, India