Re: Bulk DML performance - Mailing list pgsql-performance

From Laurenz Albe
Subject Re: Bulk DML performance
Msg-id d58bb88b06a40233aef8e07a39fd234458abc009.camel@cybertec.at
In response to Bulk DML performance (<bill.poole@ymail.com>)
List pgsql-performance
On Thu, 2025-03-13 at 18:13 +0800, bill.poole@ymail.com wrote:
>
> it is noteworthy that inserting 27 MB of data into a newly created table creates
> 191 MB of data including the index and 127 MB of data excluding the index.

PostgreSQL has a lot of overhead per row: each heap tuple carries a
23-byte header plus a 4-byte line pointer, and columns are padded to
their alignment boundaries, so narrow rows can easily double or triple
in size on disk.
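
You can see the effect with something like this (table and column
names are made up for illustration):

    -- create a small two-column table and fill it with 3 million rows
    CREATE TABLE t (id bigint PRIMARY KEY, val bigint);
    INSERT INTO t SELECT g, g FROM generate_series(1, 3000000) AS g;

    -- compare heap size, index size and total on-disk footprint
    SELECT pg_size_pretty(pg_relation_size('t'))       AS heap_size,
           pg_size_pretty(pg_indexes_size('t'))        AS index_size,
           pg_size_pretty(pg_total_relation_size('t')) AS total_size;

The raw column payload is only 16 bytes per row (about 46 MB in
total), but the heap will come out much larger than that because of
the per-tuple overhead.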

>
> Can you help me understand why performing 3 million lookups on a b-tree index
> with all pages cached in memory takes so long?

It is probably not the lookup, but the *modification* of the index that is slow.
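
If you want to verify that, a quick sketch (again with invented
names) is to load the same data twice, once without any index and
once with a primary key, so the timing difference is pure index
maintenance:

    \timing on
    -- no index: inserts only have to extend the heap
    CREATE TABLE t_noidx (id bigint, val bigint);
    INSERT INTO t_noidx SELECT g, g FROM generate_series(1, 3000000) AS g;
    -- with a primary key: every insert also descends and updates the b-tree
    CREATE TABLE t_pk (id bigint PRIMARY KEY, val bigint);
    INSERT INTO t_pk SELECT g, g FROM generate_series(1, 3000000) AS g;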

>
> It seems like deleting 3 million rows identified by 3 million IDs should be
> faster than updating 3 million rows (also identified by 3 million IDs).

It should be, yes.
To find out where the time is spent, use EXPLAIN (ANALYZE, BUFFERS) on the
statement.
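For example (table and ID list are placeholders for your real ones):

    -- EXPLAIN ANALYZE really executes the DELETE, so wrap it in a
    -- transaction and roll back if you want to keep the rows
    BEGIN;
    EXPLAIN (ANALYZE, BUFFERS)
    DELETE FROM t WHERE id = ANY ('{1,2,3}'::bigint[]);
    ROLLBACK;

The "Buffers" lines will show how many pages were read or dirtied in
the heap and in the index, which tells you where the time goes.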

> > To get better performance, the best I can think of is to parallelize loading
> > the data until you saturate CPU, disk or hit internal contention in the database.
>
> Sadly, I cannot do that because I need all rows to be inserted in a single
> database transaction, which I cannot do over multiple database connections.

Then the best you can do is to use COPY rather than INSERT.
It will perform better (but not vastly better).
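COPY is a single statement, so it also satisfies your requirement
that all rows be inserted in one transaction. A minimal sketch (the
file path and column list are assumptions):

    BEGIN;
    -- server-side COPY; use psql's \copy instead if the file
    -- lives on the client machine
    COPY t (id, val) FROM '/tmp/rows.csv' WITH (FORMAT csv);
    COMMIT;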

Yours,
Laurenz Albe


