Re: Slow Inserts on 1 table? - Mailing list pgsql-general

From Richard Huxton
Subject Re: Slow Inserts on 1 table?
Date
Msg-id 42EF8D49.1060107@archonet.com
Whole thread Raw
In response to Re: Slow Inserts on 1 table?  (Dan Armbrust <daniel.armbrust.list@gmail.com>)
Responses Re: Slow Inserts on 1 table?  (Dan Armbrust <daniel.armbrust.list@gmail.com>)
List pgsql-general
Dan Armbrust wrote:
>
>>
>> What, ALWAYS faster, even for the first FK check when there's only one
>> row in the target table and that's cached?
>>
>> If you're really in a hurry doing your bulk loads:
>>  1. Use COPY.
>>  2. Drop/restore the foreign-key constraints before/after.
>> That will be hugely faster than INSERTs, although it's not always an
>> applicable solution.
>> --
>>   Richard Huxton
>>   Archonet Ltd
>>
> It seems like the query planner goes to great lengths to avoid using
> indexes because it might take 5 ms longer to execute an index lookup on
> a table with one row.
> But then, when the table has 1 million rows, and a full scan takes 3
> minutes, and the index scan takes 3 seconds, it has no problem picking
> the 3 minute route.
> I'll gladly give up the 5 ms in turn for not having to wait 3 minutes,
> which is why I disabled the sequential scans.  If I have a small table,
> where indexes won't speed things up, I wont build an index on it.
>
> The other factor, is that most of my tables have at least thousands, and
> usually millions of rows.  Sequential scans will never be faster for the
> queries that I am doing - like I said, that is why I created the indexes.

The issue is nothing to do with special "small table" handling code.
It's all to do with not having up-to-date stats. Of course, once you've
analysed your table the system knows your index is good.

> My loading is done programatically, from another format, so COPY is not
> an option.

Why not? A lot of my bulk-loads are generated from other systems and I
go through a temporary-file/pipe via COPY when I can. When I don't I
block inserts into groups of e.g. 1000 and stick in an analyse/etc as
required.

 > Neither is removing foreign keys, as they are required to
> guarantee valid data.

Ah, but you can still guarantee your data. You can wrap the whole
drop-FK, bulk-load, recreate-FK in a single transaction, and it can
still be faster. Obviously doing this on a high-activity table won't win
though, you'll have to block everyone else doing updates.

> I don't really have a problem with the insert
> speed when it is working properly - it is on par with other DBs that I
> have on the same hardware.  The problem is when it stops using the
> indexes, for no good reason.
>
> Example, last night, I kicked off a load process - this morning, it had
> only managed to make it through about 600,000 rows (split across several
> tables).  After restarting it this morning, it made it through the same
> data in 30 minutes.
> If thats not bad and buggy behavior, I don't know what is....

So run ANALYSE in parallel with your load, or break the bulk-load into
blocks and analyse in-line. I'm not sure ripping out PG's cost-based
query analyser will be a popular solution just to address bulk-loads.

--
   Richard Huxton
   Archonet Ltd

pgsql-general by date:

Previous
From: Oliver Siegmar
Date:
Subject: Re: Problem with dropping a tablespace
Next
From: Alvaro Herrera
Date:
Subject: Re: Slow Inserts on 1 table?