Thread: 700K Inserts in transaction

700K Inserts in transaction

From: Asmir Mustafic
Date:
Hi everybody!
I'm new to the mailing list, and I have a little question.


The tables are:
postalcodes (place_id, code),  PK(place_id, code), 600K rows
places (id, name),  PK(id), INDEX(name), 3M rows

I have to insert another 600K rows into the postalcodes table, in a single
transaction, omitting duplicates.

The insert query is a prepared statement like this:

INSERT INTO postalcodes (place_id, code)
SELECT places.id, :code
FROM places
LEFT JOIN postalcodes ON (postalcodes.place_id = places.id AND
postalcodes.code = :code)
WHERE places.name = :name AND postalcodes.place_id IS NULL

Inserting rows works well (3000 queries per second), but when I reach
30K executed statements, the insert rate slows down to 500-1000
queries per second.

If I commit every 20K inserts, the insert rate remains at 3000 queries
per second.

Is there a limit on the number of inserts in a transaction?
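[Editor's note: the duplicate check above can also be written with NOT EXISTS, which expresses the same anti-join; a sketch assuming the table definitions given earlier in the message:]

```sql
INSERT INTO postalcodes (place_id, code)
SELECT p.id, :code
FROM places p
WHERE p.name = :name
  AND NOT EXISTS (
      -- skip places that already have this postal code
      SELECT 1
      FROM postalcodes pc
      WHERE pc.place_id = p.id
        AND pc.code = :code
  );
```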





Re: 700K Inserts in transaction

From: nik9000@gmail.com
Date:
Are the duplicates evenly distributed?  You might have started on a big chunk of dupes.

I'd go about this by loading my new data into a new table, removing the dupes, then inserting all the new data into the
old table. That way you have more granular information about the process. And you can do the initial load with COPY if
you need it. And you can remove the dupes outside of a transaction.
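[Editor's note: the staging-table approach described above might look like the following sketch; the staging table name, the CSV path, and the use of COPY are assumptions, not part of the original thread.]

```sql
-- Load the new data into a staging table first (path is hypothetical)
CREATE TEMP TABLE postalcodes_stage (place_id integer, code text);
COPY postalcodes_stage (place_id, code)
    FROM '/tmp/new_postalcodes.csv' WITH (FORMAT csv);

-- Remove rows that already exist in the target table
DELETE FROM postalcodes_stage s
USING postalcodes p
WHERE p.place_id = s.place_id
  AND p.code = s.code;

-- Insert what remains; DISTINCT also drops duplicates within the new data
INSERT INTO postalcodes (place_id, code)
SELECT DISTINCT place_id, code
FROM postalcodes_stage;
```

The dedupe steps can run outside the final transaction, as the reply suggests, so only the last INSERT needs to be transactional.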

Nik

Sent from my iPhone

On Feb 14, 2013, at 5:28 AM, Asmir Mustafic <goetas@lignano.it> wrote:

> Hi everybody!
> I'm new in mailing list, and i have a little question.
>
>
> The tables are:
> postalcodes (place_id, code),  PK(place_id, code), 600K rows
> places (id, name),  PK(id), INDEX(name) 3M of rows
>
> I've to insert another 600k of rows into postalcodes table, in a single transaction, omitting duplicates.
>
> The insert query is a prepared statement like this:
>
> INSERT INTO postalcodes (place_id, code)
> SELECT places.id, :code
> FROM places
> LEFT JOIN postalcodes ON (postalcodes.place_id = places.id AND postalcodes.code = :code)
> WHERE places.name = :name AND postalcodes.place_id IS NULL
>
> Inserting rows works well (3000 queries per second), but when i reach 30K of executed statements, the insert rate
> slows down to 500-1000 queries per second.
>
> Doing a commit every 20K of inserts, the insert rate remain 3000 queries per second.
>
> There is a limit of inserts in a transaction?
>
>
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance