Re: Faster data load - Mailing list pgsql-general

From Jeff Ross
Subject Re: Faster data load
Date
Msg-id 640e21b1-a841-40ae-a5eb-e6ee548520a1@openvistas.net
In response to Faster data load  (Lok P <loknath.73@gmail.com>)
List pgsql-general
On 9/5/24 14:14, Lok P wrote:

> Hi,
>
> We have a requirement to create approximately 50 billion rows in a 
> partitioned table (~1 billion rows per partition, 200+ GB daily 
> partitions) for a performance test. We are currently using the 
> 'insert into <target table_partition> select ... from 
> <source_table_partition> or <some transformed query>;' method. We 
> dropped all indexes and constraints first and then did the load, but 
> it still takes 2-3 hours to populate one partition. Is there a 
> faster way to achieve this?
>
> A few teammates suggested using the COPY command with a file load 
> instead, which would be faster. I wanted to understand what COPY 
> does differently behind the scenes compared to an insert-as-select 
> command, given that the latter stays entirely within the SQL engine.
>
> Additionally, when we tried to create indexes on one partition after 
> the data load, it took 30+ minutes. Is there any way to make that 
> faster?
>
> Is there any way to run the above in parallel, utilizing the full 
> database resources?
>
> It's Postgres 15.4.
>
> Regards
> Lok
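
On the COPY question: COPY skips most of the per-row executor overhead of INSERT, but since your source rows already live in a table, exporting to a file and re-loading may not beat a plain insert-select. A minimal sketch, assuming a CSV staging file (the table and file names here are hypothetical):

```sql
-- Assumes the target partition exists with indexes/constraints dropped.
-- Server-side COPY (file must be readable by the server process):
COPY target_table_p20240905 FROM '/data/staging/p20240905.csv' WITH (FORMAT csv);

-- Or, from psql on a client machine, the client-side equivalent:
-- \copy target_table_p20240905 from '/data/staging/p20240905.csv' with (format csv)
```

If the table was created or truncated in the same transaction on an unlogged or wal_level=minimal setup, the load can also skip WAL, which is where most of the COPY speedup comes from.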

Try pg_bulkload to load the data--it takes a little setup but it is very 
fast.  Do pay attention to the caveats, though for a performance test 
they probably won't be relevant.
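On the slow index builds: Postgres 15 can parallelize a single btree build, and you can also build indexes on different partitions from separate sessions at the same time. A sketch with session-local settings (the values, table, and column names are illustrative, not tuned recommendations):

```sql
-- Give the build more sort memory and parallel workers for this session:
SET maintenance_work_mem = '8GB';
SET max_parallel_maintenance_workers = 8;

-- Build the index on one partition; repeat per partition,
-- optionally from concurrent sessions:
CREATE INDEX idx_p20240905_id ON target_table_p20240905 (id);
```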

https://github.com/ossc-db/pg_bulkload?tab=readme-ov-file
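
Roughly, the setup is a small control file plus one command; this is a sketch from the project README (paths, table, and database names are hypothetical, so check the docs for the exact keywords):

```
# p20240905.ctl -- pg_bulkload control file (illustrative)
OUTPUT = public.target_table_p20240905   # target table
INPUT = /data/staging/p20240905.csv      # input file
TYPE = CSV                               # input format
DELIMITER = ","                          # field delimiter
WRITER = DIRECT                          # bypass shared buffers/WAL (the big caveat)
```

Then run it against the database, e.g. `pg_bulkload -d mydb p20240905.ctl`. The DIRECT writer is what makes it fast, and also the source of the caveats: it skips WAL, so the loaded data is not crash-safe or replicable until a fresh base backup.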

Jeff



