Re: Postgres insert performance and storage requirement compared to Oracle - Mailing list pgsql-performance

From: Scott Carey
Subject: Re: Postgres insert performance and storage requirement compared to Oracle
Date:
Msg-id: E2FA14BE-A47B-4673-9DE1-851EB74009F2@richrelevance.com
In response to: Re: Postgres insert performance and storage requirement compared to Oracle (Mladen Gogala <mladen.gogala@vmsinfo.com>)
Responses: Re: Postgres insert performance and storage requirement compared to Oracle
List: pgsql-performance
On Oct 26, 2010, at 2:54 PM, Mladen Gogala wrote:

> On 10/26/2010 5:27 PM, Jon Nelson wrote:
>> start loop:
>>   populate rows in temporary table
>>   insert from temporary table into permanent table
>>   truncate temporary table
>>   loop
>>
>> I do something similar, where I COPY data to a temporary table, do
>> lots of manipulations, and then perform a series of INSERTS from the
>> temporary table into a permanent table.
>>
>
> 1) It's definitely not faster because you have to insert into the
> temporary table, in addition to inserting into the permanent table.

It is almost always significantly faster than a direct bulk load into a table.
* The temp table has no indexes; the final table usually does, and bulk operations on indexes are faster than per-row operations.
* The final table might require both updates and inserts, and doing these in bulk from a temp stage table is far faster than per row (as sketched below).
* You don't even have to commit after the merge from the temp table, and can loop until it's all done, then commit -- though this can have table/index bloat implications if doing updates.
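
A minimal sketch of that staging pattern in plain SQL (the table names, columns, and file path are all hypothetical; the "merge" is a bulk UPDATE followed by a bulk INSERT, which assumes no concurrent writers to the final table):

create temporary table stage (id int, val text);   -- no indexes, so loading is cheap

copy stage from '/path/to/data.csv' csv;           -- hypothetical input file

begin;
-- bulk update of rows that already exist in the final table
update target t set val = s.val from stage s where t.id = s.id;
-- bulk insert of the rows that do not exist yet
insert into target (id, val)
    select s.id, s.val from stage s
    where not exists (select 1 from target t where t.id = s.id);
truncate stage;
commit;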

> 2) This is what I had in mind:
>
> mgogala=# create table a(c1 int);
> CREATE TABLE
> mgogala=# create temporary table t1(c1 int) on commit delete rows;
> CREATE TABLE
> mgogala=# begin;
> BEGIN
> mgogala=# insert into t1 select generate_series(1,1000);
> INSERT 0 1000
> mgogala=# insert into a select * from t1;
> INSERT 0 1000
> mgogala=# commit;
> COMMIT
> mgogala=# select count(*) from a;
>  count
> -------
>   1000
> (1 row)
>
> mgogala=# select count(*) from t1;
>  count
> -------
>      0
> (1 row)
>
> The table is created with the "on commit delete rows" option, which means
> that there is no need to do "truncate". The "truncate" command is
> heavy artillery. Truncating a temporary table is like shooting ducks in
> a duck pond with a howitzer.

???  Test it.  DELETE is slow, but TRUNCATE is nearly instantaneous for normal tables, and for temp tables it's the
same thing.  Maybe in Oracle TRUNCATE is a howitzer; in Postgres it's lightweight.  Your loop above requires a commit
after every 1000 rows.  What if you require that all rows are seen at once or not at all?  What if you fail part way
through?  One big transaction is often a better idea and/or required.  Especially in Postgres, with no undo log, bulk
inserts in one large transaction work out very well -- usually better than multiple smaller transactions.
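
A sketch of that one-big-transaction variant of the loop (reusing the tables "a" and "t1" from the example above, but as a plain temp table with no ON COMMIT clause; the loop itself would be driven from application code):

create temporary table t1 (c1 int);              -- plain temp table, no ON COMMIT clause

begin;
-- one pass of the application-driven loop:
insert into t1 select generate_series(1,1000);   -- stand-in for the real bulk load
insert into a select * from t1;
truncate t1;                                     -- nearly free in Postgres, and fine mid-transaction
-- ... repeat for each batch ...
commit;                                          -- all rows become visible at once, or not at all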
>
> --
>
> Mladen Gogala
> Sr. Oracle DBA
> 1500 Broadway
> New York, NY 10036
> (212) 329-5251
> http://www.vmsinfo.com
> The Leader in Integrated Media Intelligence Solutions

