RE: [EXTERNAL] Re: Inserts and bad performance - Mailing list pgsql-general

From Godfrin, Philippe E
Subject RE: [EXTERNAL] Re: Inserts and bad performance
Date
Msg-id SA0PR15MB3933EA7C7372814AAEFC74DF82619@SA0PR15MB3933.namprd15.prod.outlook.com
Whole thread Raw
In response to Re: Inserts and bad performance  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [EXTERNAL] Re: Inserts and bad performance  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-general

Hi Tom. Good point about the index paging out of the buffer. I did that and no change. I do have the shared buffers at 40GB, so there’s a good bit there, but I also did all those things on the page you referred, except for using copy. At this point the data has not been scrubbed, so I’m trapping data errors and duplicates. I am curios though, as sidebar, why copy is considered faster than inserts. I was unable to get COPY faster than around 25K inserts a second (pretty fast anyway). Frankly, initially I was running 3 concurrent insert jobs and getting 90K ins/sec ! but after a certain number of records, the speed just dropped off.

 

pg

 

From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Wednesday, November 24, 2021 1:32 PM
To: Godfrin, Philippe E <Philippe.Godfrin@nov.com>
Cc: pgsql-general@lists.postgresql.org
Subject: [EXTERNAL] Re: Inserts and bad performance

 

"Godfrin, Philippe E" <Philippe.Godfrin@nov.com> writes:
> I am inserting a large number of rows, 5,10, 15 million. The python code commits every 5000 inserts. The table has partitioned children.
> At first, when there were a low number of rows inserted, the inserts would run at a good clip - 30 - 50K inserts per second. Now, after inserting oh say 1.5 Billion rows, the insert rate has dropped to around 5000 inserts per second. I dropped the unique index , rebuilt the other indexes and no change. The instance is 16 vcpu and 64GB ram.

Can you drop the indexes and not rebuild them till after the bulk load is
done? Once the indexes exceed available RAM, insert performance is going
to fall off a cliff, except maybe for indexes that are receiving purely
sequential inserts (so that only the right end of the index gets touched).

Also see

https://www.postgresql.org/docs/current/populate.html

regards, tom lane

pgsql-general by date:

Previous
From: Gavin Roy
Date:
Subject: Re: Inserts and bad performance
Next
From: "Godfrin, Philippe E"
Date:
Subject: RE: [EXTERNAL] Re: Inserts and bad performance