Re: Faster data load - Mailing list pgsql-general

From Ron Johnson
Subject Re: Faster data load
Date
Msg-id CANzqJaCxizjVfdiDvxeGyrWUisefQ3y9Z3DVdA5rEU4PC5SLJw@mail.gmail.com
Whole thread Raw
In response to Faster data load  (Lok P <loknath.73@gmail.com>)
List pgsql-general
On Thu, Sep 5, 2024 at 4:14 PM Lok P <loknath.73@gmail.com> wrote:
Hi, 

We are having a requirement to create approx 50 billion rows in a partition table(~1 billion rows per partition, 200+gb size daily partitions) for a performance test. We are currently using ' insert into <target table_partition> select.. From <source_table_partition> or <some transformed query>;' method . We have dropped all indexes and constraints First and then doing the load. Still it's taking 2-3 hours to populate one partition.

At three hours, that's 92,593 records/second.  Seems pretty slow.

How much of that time is taken by <some transformed query>?
How big are the records?
How fast is the hardware?

Is there a faster way to achieve this? 

Testing is the only way to know for sure.
 
Few teammate suggesting to use copy command and use file load instead, which will be faster. So I wanted to understand, how different things it does behind the scenes as compared to insert as select command? As because it only deals with sql engine only. 

COPY is highly optimized for buffered operation.  INSERT... maybe not so much.

But if the source data is already in a table, that would require piping the data to stdout and then back into the database.

psql appdb -c "COPY (SELECT ...) TO STDOUT;" | psql appdb -c "COPY some_table FROM STDOUT;".  Use binary mode, so text conversion isn't required.

Maybe that's faster, maybe not.

Additionally, when we were trying to create indexes post data load on one partition, it took 30+ minutes. Any possible way to make it faster? 

Is there any way to drive the above things in parallel by utilizing full database resources?

Put the destination tables in a different tablespace on a different controller.
 
It's postgres 15.4

Why not 15.8?

--
Death to America, and butter sauce.
Iraq lobster!

pgsql-general by date:

Previous
From: Lok P
Date:
Subject: Faster data load
Next
From: Jeff Ross
Date:
Subject: Re: Faster data load