Re: Faster data load - Mailing list pgsql-general

From Muhammad Usman Khan
Subject Re: Faster data load
Date
Msg-id CAPnRvGswoifDeosThKM5zkFZzph5u86ozYkrETpGD=oig14pOA@mail.gmail.com
Whole thread Raw
In response to Faster data load  (Lok P <loknath.73@gmail.com>)
Responses Re: Faster data load
List pgsql-general
Hi,

You can use pg_partman. If your table is partitioned, you can manage partitions in parallel by distributing the load across partitions concurrently. Or you can use citus. It can be an excellent solution, especially for handling large data volumes and parallelizing data operations


On Fri, 6 Sept 2024 at 01:14, Lok P <loknath.73@gmail.com> wrote:
Hi, 

We are having a requirement to create approx 50 billion rows in a partition table(~1 billion rows per partition, 200+gb size daily partitions) for a performance test. We are currently using ' insert into <target table_partition> select.. From <source_table_partition> or <some transformed query>;' method . We have dropped all indexes and constraints First and then doing the load. Still it's taking 2-3 hours to populate one partition. Is there a faster way to achieve this? 

Few teammate suggesting to use copy command and use file load instead, which will be faster. So I wanted to understand, how different things it does behind the scenes as compared to insert as select command? As because it only deals with sql engine only. 

Additionally, when we were trying to create indexes post data load on one partition, it took 30+ minutes. Any possible way to make it faster? 

Is there any way to drive the above things in parallel by utilizing full database resources?

It's postgres 15.4

Regards
Lok

pgsql-general by date:

Previous
From: Jeff Ross
Date:
Subject: Re: Faster data load
Next
From: Lok P
Date:
Subject: Re: Faster data load