Thread: insert vs select into performance
Hi

I was doing some testing on "insert" compared to "select into". I inserted 100 000 rows (with 8 column values) into a table, which took 14 seconds, compared to a select into, which took 0.8 seconds. (fyi, the inserts were batched, autocommit was turned off and it all happened on the local machine)

Now I am wondering why the select into is that much faster? Does the select into translate into a specially optimised function in C that can cut corners which an insert can not do (e.g. lazy copying), or is it some other reason?

The reason I am asking is that select into shows that a number of rows can be inserted into a table quite a lot faster than one would think was possible with ordinary SQL. If that is the case, it means that if I write my insert function in C instead of SQL or PL/pgSQL, I can make my db perform an order of magnitude faster.

Any comments?

regards

thomas
On Jul 17, 2007, at 14:38, Thomas Finneid wrote:

> I was doing some testing on "insert" compared to "select into". I inserted 100 000 rows (with 8 column values) into a table, which took 14 seconds, compared to a select into, which took 0.8 seconds.
> (fyi, the inserts were batched, autocommit was turned off and it all happened on the local machine)
>
> Now I am wondering why the select into is that much faster?

It would be helpful if you included the actual queries you're using, as there are a number of variables:

1) If there are any constraints on the original table, the INSERT will be checking those constraints. AIUI, SELECT INTO does not generate any table constraints.

2a) Are you using INSERT INTO foo (foo1, foo2, foo3) SELECT foo1, foo2, foo3 FROM pre_foo or individual inserts for each row? The former would be faster than the latter.

2b) If you are doing individual inserts, are you wrapping them in a transaction? Doing so would be faster than committing each insert separately.

Michael Glaesemann
grzm seespotcode net
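For illustration, a minimal sketch of the two approaches Michael describes; the tables foo and pre_foo and their columns are hypothetical names taken from his example:

-- Set-based insert: one statement, all rows copied server-side
INSERT INTO foo (foo1, foo2, foo3)
SELECT foo1, foo2, foo3 FROM pre_foo;

-- Individual inserts: wrap them in one transaction so they share a single commit
BEGIN;
INSERT INTO foo (foo1, foo2, foo3) VALUES (1, 2, 3);
INSERT INTO foo (foo1, foo2, foo3) VALUES (4, 5, 6);
-- ... one INSERT per row ...
COMMIT;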
Have you also tried the COPY statement? Afaik select into is similar to what happens in there.

Best regards,

Arjen

On 17-7-2007 21:38 Thomas Finneid wrote:
> Hi
>
> I was doing some testing on "insert" compared to "select into". I inserted 100 000 rows (with 8 column values) into a table, which took 14 seconds, compared to a select into, which took 0.8 seconds.
> (fyi, the inserts were batched, autocommit was turned off and it all happened on the local machine)
>
> Now I am wondering why the select into is that much faster?
> Does the select into translate into a specially optimised function in C that can cut corners which an insert can not do (e.g. lazy copying), or is it some other reason?
>
> The reason I am asking is that select into shows that a number of rows can be inserted into a table quite a lot faster than one would think was possible with ordinary SQL. If that is the case, it means that if I write my insert function in C instead of SQL or PL/pgSQL, I can make my db perform an order of magnitude faster.
>
> Any comments?
>
> regards
>
> thomas
Michael Glaesemann <grzm@seespotcode.net> writes:
> It would be helpful if you included the actual queries you're using, as there are a number of variables:

Not to mention which PG version he's testing. Since (I think) 8.1, SELECT INTO knows that it can substitute one fsync for WAL-logging the individual row inserts, since if there's a crash the new table will disappear anyway.

regards, tom lane
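For reference, a minimal sketch of the statement being discussed; SELECT INTO creates the target table and fills it in one step, and CREATE TABLE ... AS is the equivalent spelling (the table names here are hypothetical):

-- Both create new_table from scratch and populate it entirely server-side.
-- Because the table did not exist before the transaction, a crash makes the
-- whole table disappear anyway, which is why per-row WAL records can be skipped.
SELECT * INTO new_table FROM source_table;

CREATE TABLE new_table_2 AS
SELECT * FROM source_table;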
Michael Glaesemann wrote:
>
> On Jul 17, 2007, at 14:38, Thomas Finneid wrote:
>
>> I was doing some testing on "insert" compared to "select into". I inserted 100 000 rows (with 8 column values) into a table, which took 14 seconds, compared to a select into, which took 0.8 seconds.
>> (fyi, the inserts were batched, autocommit was turned off and it all happened on the local machine)
>>
>> Now I am wondering why the select into is that much faster?
>
> It would be helpful if you included the actual queries you're using, as there are a number of variables:

create table ciu_data_type
(
    id        integer,
    loc_id    integer,
    value1    integer,
    value2    real,
    value3    integer,
    value4    real,
    value5    real,
    value6    char(2),
    value7    char(3),
    value8    bigint,
    value9    bigint,
    value10   real,
    value11   bigint,
    value12   smallint,
    value13   double precision,
    value14   real,
    value15   real,
    value16   char(1),
    value17   varchar(18),
    value18   bigint,
    value19   char(4)
);

Performed with JDBC:

insert into ciu_data_type (id, loc_id, value3, value5, value8, value9, value10, value11)
values (?,?,?,?,?,?,?,?)

select * into ciu_data_type_copy from ciu_data_type

> 1) If there are any constraints on the original table, the INSERT will be checking those constraints. AIUI, SELECT INTO does not generate any table constraints.

No constraints in this test.

> 2a) Are you using INSERT INTO foo (foo1, foo2, foo3) SELECT foo1, foo2, foo3 FROM pre_foo or individual inserts for each row? The former would be faster than the latter.
>
> 2b) If you are doing individual inserts, are you wrapping them in a transaction? Doing so would be faster than committing each insert separately.

Disabling autocommit, but nothing more than that.

I haven't done this test in a stored function yet, nor have I tried it with a C client so far, so there is the chance that it is java/jdbc that makes the insert so slow. I'll get to that test soon, if there is any chance my theory makes sense.

regards

thomas
Tom Lane wrote:
> Michael Glaesemann <grzm@seespotcode.net> writes:
>> It would be helpful if you included the actual queries you're using, as there are a number of variables:
>
> Not to mention which PG version he's testing.

It's pg 8.1, for now. I'll be upgrading to a compile-optimised 8.2 when I do the real test on the real server. (It's on Kubuntu 6.10 running on a Thinkpad T60 with a dual core, 1.5 GB RAM and a 100 GB SATA disk, just in case anybody feels that is of any interest.)

> Since (I think) 8.1, SELECT INTO knows that it can substitute one fsync for WAL-logging the individual row inserts, since if there's a crash the new table will disappear anyway.

I am not sure I understand you correctly here: are you saying that SELECT INTO in 8.1 disables WAL logging and uses just a single fsync at the end? In that case it means that I could disable WAL as well and achieve the same performance, does it not?

regards

thomas
On Tue, Jul 17, 2007 at 10:50:22PM +0200, Thomas Finneid wrote:
> I haven't done this test in a stored function yet, nor have I tried it with a C client so far, so there is the chance that it is java/jdbc that makes the insert so slow. I'll get to that test soon, if there is any chance my theory makes sense.

What you're seeing is perfectly normal. Switch to COPY for fast inserts. (When you use inserts you need to wait for a round-trip for each row, instead of sending data to the server as fast as possible.)

Mike Stone
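For illustration, a minimal sketch of the COPY approach Mike suggests, assuming the ciu_data_type table from earlier in the thread and a hypothetical tab-separated data file:

-- Server-side COPY: the file is read directly on the database server
COPY ciu_data_type (id, loc_id, value3, value5, value8, value9, value10, value11)
FROM '/tmp/ciu_data.tsv';

-- Client-side equivalent in psql (the file is read by the client):
-- \copy ciu_data_type (id, loc_id, value3, value5, value8, value9, value10, value11) from 'ciu_data.tsv'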
If you're performing via JDBC, are you using addBatch/executeBatch, or are you directly executing each insert? If you directly execute each insert, then your code will wait for a server round-trip between each insert.

That still won't get you to the speed of select into, but it should help. You could also look at the pgsql-jdbc archives for the JDBC driver patches which allow you to use COPY-style bulk loading, which should get you to the performance level of COPY, which should be reasonably close to the performance of select into.

-- Mark Lewis

On Tue, 2007-07-17 at 22:50 +0200, Thomas Finneid wrote:
> Michael Glaesemann wrote:
> >
> > On Jul 17, 2007, at 14:38, Thomas Finneid wrote:
> >
> >> I was doing some testing on "insert" compared to "select into". I inserted 100 000 rows (with 8 column values) into a table, which took 14 seconds, compared to a select into, which took 0.8 seconds.
> >> (fyi, the inserts were batched, autocommit was turned off and it all happened on the local machine)
> >>
> >> Now I am wondering why the select into is that much faster?
> >
> > It would be helpful if you included the actual queries you're using, as there are a number of variables:
>
> create table ciu_data_type
> (
>     id        integer,
>     loc_id    integer,
>     value1    integer,
>     value2    real,
>     value3    integer,
>     value4    real,
>     value5    real,
>     value6    char(2),
>     value7    char(3),
>     value8    bigint,
>     value9    bigint,
>     value10   real,
>     value11   bigint,
>     value12   smallint,
>     value13   double precision,
>     value14   real,
>     value15   real,
>     value16   char(1),
>     value17   varchar(18),
>     value18   bigint,
>     value19   char(4)
> );
>
> Performed with JDBC:
>
> insert into ciu_data_type (id, loc_id, value3, value5, value8, value9, value10, value11)
> values (?,?,?,?,?,?,?,?)
>
> select * into ciu_data_type_copy from ciu_data_type
>
> > 1) If there are any constraints on the original table, the INSERT will be checking those constraints. AIUI, SELECT INTO does not generate any table constraints.
>
> No constraints in this test.
>
> > 2a) Are you using INSERT INTO foo (foo1, foo2, foo3) SELECT foo1, foo2, foo3 FROM pre_foo or individual inserts for each row? The former would be faster than the latter.
> >
> > 2b) If you are doing individual inserts, are you wrapping them in a transaction? Doing so would be faster than committing each insert separately.
>
> Disabling autocommit, but nothing more than that.
>
> I haven't done this test in a stored function yet, nor have I tried it with a C client so far, so there is the chance that it is java/jdbc that makes the insert so slow. I'll get to that test soon, if there is any chance my theory makes sense.
>
> regards
>
> thomas
Arjen van der Meijden wrote:
> Have you also tried the COPY statement? Afaik select into is similar to what happens in there.

No, because COPY only works from file to db or vice versa, not from table to table.

regards

thomas
On Jul 17, 2007, at 15:50, Thomas Finneid wrote:

> Michael Glaesemann wrote:
>> 2a) Are you using INSERT INTO foo (foo1, foo2, foo3) SELECT foo1, foo2, foo3 FROM pre_foo or individual inserts for each row? The former would be faster than the latter.

> Performed with JDBC:
>
> insert into ciu_data_type (id, loc_id, value3, value5, value8, value9, value10, value11) values (?,?,?,?,?,?,?,?)

As they're individual inserts, I think what you're seeing is overhead from calling this statement 100,000 times, not just on the server but also the overhead through JDBC. For comparison, try

CREATE TABLE ciu_data_type_copy (LIKE ciu_data_type);

INSERT INTO ciu_data_type_copy (id, loc_id, value3, value5, value8, value9, value10, value11)
SELECT id, loc_id, value3, value5, value8, value9, value10, value11
FROM ciu_data_type;

I think this would be more comparable to what you're seeing.

> I haven't done this test in a stored function yet, nor have I tried it with a C client so far, so there is the chance that it is java/jdbc that makes the insert so slow. I'll get to that test soon, if there is any chance my theory makes sense.

Just testing in psql with \timing should be fairly easy.

Michael Glaesemann
grzm seespotcode net
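A minimal psql session for the \timing suggestion (timing output is elided rather than invented):

\timing
Timing is on.

CREATE TABLE ciu_data_type_copy (LIKE ciu_data_type);

INSERT INTO ciu_data_type_copy (id, loc_id, value3, value5, value8, value9, value10, value11)
SELECT id, loc_id, value3, value5, value8, value9, value10, value11
FROM ciu_data_type;
-- psql prints a "Time: ... ms" line after each statement while \timing is on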
Mark Lewis wrote:
> If you're performing via JDBC, are you using addBatch/executeBatch, or are you directly executing each insert? If you directly execute each insert, then your code will wait for a server round-trip between each insert.

I tested both and I found almost no difference in the time it took to perform it. Mind you, this was on a local machine, but I still thought that it was a bit strange.

> That still won't get you to the speed of select into, but it should help. You could also look at the pgsql-jdbc archives for the JDBC driver patches which allow you to use COPY-style bulk loading, which should get you to the performance level of COPY, which should be reasonably close to the performance of select into.

Yes, someone else on the list suggested this a couple of weeks ago. I haven't had a chance to test it yet, but I am hopeful that I can use it.

The only issue I have is that the tests I have done are rather simplistic, because they are just speed trials. The real system will probably use 5-10 tables, with up to 100 columns across all tables, which means I need a stored function that goes through all the bulk-loaded data and reinserts it into the real tables. I am worried that this might hurt performance so much that almost the entire bulk copy advantage disappears. This is why I am wondering about the details of SELECT INTO, C functions, etc.

regards

thomas
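As a sketch of the staging approach being discussed here, one could bulk-load into a single staging table and then redistribute with set-based inserts. The table and column names below are hypothetical, and only a handful of columns are shown:

-- Hypothetical staging table that receives the bulk load (via COPY or batched inserts)
CREATE TABLE staging_bulk (
    id        integer,
    loc_id    integer,
    value3    integer,
    value8    bigint
    -- ... remaining columns for all target tables ...
);

-- Redistribute into the real tables, one set-based statement per target table
INSERT INTO target_table_a (id, loc_id, value3)
SELECT id, loc_id, value3 FROM staging_bulk;

INSERT INTO target_table_b (id, value8)
SELECT id, value8 FROM staging_bulk;

TRUNCATE staging_bulk;  -- ready for the next batch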
> I was doing some testing on "insert" compared to "select into". I inserted 100 000 rows (with 8 column values) into a table, which took 14 seconds, compared to a select into, which took 0.8 seconds.
> (fyi, the inserts were batched, autocommit was turned off and it all happened on the local machine)

Did you use prepared statements?
Did you use INSERT INTO ... VALUES () with a long list of values, or just 100K insert statements?

It's the time to parse statements, plan, execute, roundtrips with the client, context switches, time for your client library to escape the data and encode it and for postgres to decode it, etc. In a word: OVERHEAD.

By the way, which language and client library are you using?

FYI 14 s / 100K = 140 microseconds per individual SQL query. That ain't slow at all.

> Does the select into translate into a specially optimised function in C that can cut corners which an insert can not do (e.g. lazy copying), or is it some other reason?

Yeah: instead of your client having to encode 100K * 8 values, send them over a socket, and postgres decoding them, INSERT INTO ... SELECT just takes the data and writes the data. Same thing as writing a file a byte at a time versus using a big buffer.

> The reason I am asking is that select into shows that a number of rows can be inserted into a table quite a lot faster than one would think was possible with ordinary SQL. If that is the case, it means that if I write my insert function in C instead of SQL or PL/pgSQL, I can make my db perform an order of magnitude faster.

Fortunately this is already done for you: there is the PREPARE statement, which will remove the parsing overhead. If you must insert many rows, use VALUES (),(),()...
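A minimal sketch of the two features PFC mentions, using the ciu_data_type table and columns from earlier in the thread; note that multi-row VALUES lists in INSERT require PostgreSQL 8.2 or later, and the data values below are made up:

-- PREPARE parses and plans the statement once; each EXECUTE reuses the plan
PREPARE ins (integer, integer, integer, real, bigint, bigint, real, bigint) AS
    INSERT INTO ciu_data_type (id, loc_id, value3, value5, value8, value9, value10, value11)
    VALUES ($1, $2, $3, $4, $5, $6, $7, $8);

EXECUTE ins (1, 10, 3, 0.5, 8, 9, 1.0, 11);
EXECUTE ins (2, 20, 3, 0.5, 8, 9, 1.0, 11);

-- Multi-row VALUES (8.2+): many rows per statement, one round-trip
INSERT INTO ciu_data_type (id, loc_id, value3, value5, value8, value9, value10, value11)
VALUES (3, 30, 3, 0.5, 8, 9, 1.0, 11),
       (4, 40, 3, 0.5, 8, 9, 1.0, 11);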
PFC wrote:
>
>> I was doing some testing on "insert" compared to "select into". I inserted 100 000 rows (with 8 column values) into a table, which took 14 seconds, compared to a select into, which took 0.8 seconds.
>> (fyi, the inserts were batched, autocommit was turned off and it all happened on the local machine)
>
> Did you use prepared statements?
> Did you use INSERT INTO ... VALUES () with a long list of values, or just 100K insert statements?

It was prepared statements, and I tried it both batched and non-batched (not much difference on a local machine).

> It's the time to parse statements, plan, execute, roundtrips with the client, context switches, time for your client library to escape the data and encode it and for postgres to decode it, etc. In a word: OVERHEAD.

I know there is some overhead, but that much when running it batched...?

> By the way, which language and client library are you using?
>
> FYI 14 s / 100K = 140 microseconds per individual SQL query. That ain't slow at all.

Unfortunately it's not fast enough; it needs to be done in no more than 1-2 seconds (and in production it will be maybe 20-50 columns of data, perhaps divided over 5-10 tables). Additionally it needs to scale to perhaps three times as many columns, and perhaps 2-3 times as many rows in some situations, within 1 second. Further on, it needs to allow for about 20-50 clients reading much of that data before the next batch of data arrives.

I know the computer is going to be a much faster one than the one I am testing with, but I need to make sure the solution scales well.

regards

thomas
>> It's the time to parse statements, plan, execute, roundtrips with the client, context switches, time for your client library to escape the data and encode it and for postgres to decode it, etc. In a word: OVERHEAD.
>
> I know there is some overhead, but that much when running it batched...?

Well, yeah ;)

> Unfortunately it's not fast enough; it needs to be done in no more than 1-2 seconds (and in production it will be maybe 20-50 columns of data, perhaps divided over 5-10 tables).
> Additionally it needs to scale to perhaps three times as many columns, and perhaps 2-3 times as many rows in some situations, within 1 second.
> Further on, it needs to allow for about 20-50 clients reading much of that data before the next batch of data arrives.

Wow. What is the application?

Test run on a desktop PC, Athlon 64 3200+, 2 IDE disks in RAID1 (pretty slow):

test=> CREATE TABLE test (a INT, b INT, c INT, d INT, e INT, f INT);
CREATE TABLE
Time: 11.463 ms

test=> INSERT INTO test SELECT 1,2,3,4,5,a FROM generate_series( 1, 100000 ) as a;
INSERT 0 100000
Time: 721.579 ms

OK, so you see, insert speed is pretty fast. With a better CPU and faster disks, you can get a lot more.

test=> TRUNCATE TABLE test;
TRUNCATE TABLE
Time: 30.010 ms

test=> ALTER TABLE test ADD PRIMARY KEY (f);
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "test_pkey" for table "test"
ALTER TABLE
Time: 100.577 ms

test=> INSERT INTO test SELECT 1,2,3,4,5,a FROM generate_series( 1, 100000 ) as a;
INSERT 0 100000
Time: 1915.928 ms

This includes the time to update the index.

test=> DROP TABLE test;
DROP TABLE
Time: 28.804 ms

test=> CREATE TABLE test (a INT, b INT, c INT, d INT, e INT, f INT);
CREATE TABLE
Time: 1.626 ms

test=> CREATE OR REPLACE FUNCTION test_insert( )
RETURNS VOID
LANGUAGE plpgsql
AS
$$
DECLARE
    _i INTEGER;
BEGIN
    FOR _i IN 0..100000 LOOP
        INSERT INTO test (a,b,c,d,e,f) VALUES (1,2,3,4,5, _i);
    END LOOP;
END;
$$;
CREATE FUNCTION
Time: 51.948 ms

test=> SELECT test_insert();
 test_insert
-------------

(1 row)

Time: 1885.382 ms

Now you see, performing 100K individual inserts inside a plpgsql function is also fast. The postgres engine is pretty damn fast; it's the communication overhead that you feel, especially switching between client and server processes.

Another example:

=> INSERT INTO test (a,b,c,d,e,f) VALUES (... 100000 integer tuples)
INSERT 0 100000
Time: 1836.458 ms

VALUES is actually pretty fast. Here, there is no context switch, everything is done in one INSERT.

However COPY is much faster because the parsing overhead and de-escaping of data is faster. COPY is optimized for throughput.

So, advice: for optimum throughput, have your application build chunks of data into text files and use COPY. Or if your client lib supports the copy interface, use it.

You will need a fast disk system with xlog and data on separate disks, several CPU cores (1 insert thread will max out 1 core, use the others for selects), lots of RAM so index updates don't need to seek, and tuning of bgwriter and checkpoints to avoid load spikes.
On Tue, Jul 17, 2007 at 10:58:01PM +0200, Thomas Finneid wrote:
> I am not sure I understand you correctly here: are you saying that SELECT INTO in 8.1 disables WAL logging and uses just a single fsync at the end? In that case it means that I could disable WAL as well and achieve the same performance, does it not?

Yes. The difference is that the select into optimization just means that if the system crashes the data you're inserting is invalid (and is properly cleaned up), whereas disabling the WAL means that if the system crashes everything is invalid (and can't be cleaned up).

Mike Stone
On Tue, Jul 17, 2007 at 11:01:15PM +0200, Thomas Finneid wrote:
> Arjen van der Meijden wrote:
>> Have you also tried the COPY statement? Afaik select into is similar to what happens in there.
>
> No, because COPY only works from file to db or vice versa, not from table to table.

I don't understand how the insert you described is table to table?

Mike Stone
PFC wrote:
>> Unfortunately it's not fast enough; it needs to be done in no more than 1-2 seconds (and in production it will be maybe 20-50 columns of data, perhaps divided over 5-10 tables).
>> Additionally it needs to scale to perhaps three times as many columns, and perhaps 2-3 times as many rows in some situations, within 1 second.
>> Further on, it needs to allow for about 20-50 clients reading much of that data before the next batch of data arrives.
>
> Wow. What is the application?

Geological surveys, where they perform realtime geo/hydro-phone shots of areas of the size of 10x10 km every 3-15 seconds.

> test=> CREATE OR REPLACE FUNCTION test_insert( )
> RETURNS VOID
> LANGUAGE plpgsql
> AS
> $$
> DECLARE
>     _i INTEGER;
> BEGIN
>     FOR _i IN 0..100000 LOOP
>         INSERT INTO test (a,b,c,d,e,f) VALUES (1,2,3,4,5, _i);
>     END LOOP;
> END;
> $$;
> CREATE FUNCTION
> Time: 51.948 ms
>
> test=> SELECT test_insert();
>  test_insert
> -------------
>
> (1 row)
>
> Time: 1885.382 ms

I tested this one and it took 4 seconds, compared to the jdbc insert which took 14 seconds, so it's a lot faster, but not as fast as the SELECT INTO. I also tested an INSERT INTO ... SELECT FROM, which took 1.8 seconds; now we are starting to talk about real performance.

> However COPY is much faster because the parsing overhead and de-escaping of data is faster. COPY is optimized for throughput.
>
> So, advice: for optimum throughput, have your application build chunks of data into text files and use COPY. Or if your client lib supports the copy interface, use it.

I did test COPY, i.e. the jdbc COPY patch for pg 8.1; it performs at approx 1.8 seconds :) The test was done with text input; I am going to test it with binary input, which I expect will increase the performance by 20-50%.

All these tests have been performed on a laptop with a Kubuntu 6.10 version of pg 8.1 without any special pg performance tuning. So I expect that compiling the latest pg, doing some tuning on it and testing it on a representative server will give it an additional boost in performance.

The key here is that with an abundance of performance, I can experiment with the solution in a completely different way than if I had any "artificial" restrictions.

> You will need a fast disk system with xlog and data on separate disks, several CPU cores (1 insert thread will max out 1 core, use the others for selects), lots of RAM so index updates don't need to seek, and tuning of bgwriter and checkpoints to avoid load spikes.

Will have a look at it.

regards

thomas
Michael Glaesemann wrote:
>
> As they're individual inserts, I think what you're seeing is overhead from calling this statement 100,000 times, not just on the server but also the overhead through JDBC. For comparison, try
>
> CREATE TABLE ciu_data_type_copy (LIKE ciu_data_type);
>
> INSERT INTO ciu_data_type_copy (id, loc_id, value3, value5, value8, value9, value10, value11)
> SELECT id, loc_id, value3, value5, value8, value9, value10, value11
> FROM ciu_data_type;
>
> I think this would be more comparable to what you're seeing.

This is much faster than my previous solution, but I also tested two other solutions:

- a stored function with array arguments, which performed 3 times better
- jdbc with the COPY patch, which performed 8.4 times faster with text input; I expect binary input to be even faster

regards

thomas
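Thomas doesn't show his array-based function, but as a rough sketch of what such a stored function could look like on 8.1, looping over parallel arrays (the function name, argument layout and looping style are assumptions, and only four of the columns are shown):

CREATE OR REPLACE FUNCTION insert_ciu_rows(
    p_ids integer[], p_loc_ids integer[], p_value3 integer[], p_value8 bigint[])
RETURNS void
LANGUAGE plpgsql AS
$$
DECLARE
    _i integer;
BEGIN
    -- One array element per row; all arrays are assumed to be the same length
    FOR _i IN array_lower(p_ids, 1) .. array_upper(p_ids, 1) LOOP
        INSERT INTO ciu_data_type (id, loc_id, value3, value8)
        VALUES (p_ids[_i], p_loc_ids[_i], p_value3[_i], p_value8[_i]);
    END LOOP;
END;
$$;

-- Example call with two rows' worth of data:
SELECT insert_ciu_rows(ARRAY[1,2], ARRAY[10,20], ARRAY[3,3], ARRAY[8,8]::bigint[]);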
Michael Stone wrote:
> On Tue, Jul 17, 2007 at 11:01:15PM +0200, Thomas Finneid wrote:
>> Arjen van der Meijden wrote:
>>> Have you also tried the COPY statement? Afaik select into is similar to what happens in there.
>>
>> No, because COPY only works from file to db or vice versa, not from table to table.
>
> I don't understand how the insert you described is table to table?

SELECT INTO is table to table; so is INSERT INTO ... SELECT FROM.

regards

thomas
Michael Stone wrote:
> On Tue, Jul 17, 2007 at 10:58:01PM +0200, Thomas Finneid wrote:
>> I am not sure I understand you correctly here: are you saying that SELECT INTO in 8.1 disables WAL logging and uses just a single fsync at the end? In that case it means that I could disable WAL as well and achieve the same performance, does it not?
>
> Yes. The difference is that the select into optimization just means that if the system crashes the data you're inserting is invalid (and is properly cleaned up), whereas disabling the WAL means that if the system crashes everything is invalid (and can't be cleaned up).

So, how does one (temporarily) disable WAL logging? Or, for example, disable WAL logging for a temporary table?

Regards,

Adriaan van Os
Adriaan van Os wrote:
> So, how does one (temporarily) disable WAL logging? Or, for example, disable WAL logging for a temporary table?

Operations on temporary tables are never WAL logged. Operations on other tables are, and there's no way to disable it.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
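For illustration, a minimal sketch of using a temporary table as an un-WAL-logged staging area, along the lines Heikki's answer permits; the staging table name is hypothetical, and note the final insert into the permanent table is still WAL-logged as usual:

-- The temporary table is private to the session and never WAL-logged
CREATE TEMPORARY TABLE staging_tmp (LIKE ciu_data_type);

-- bulk-load into staging_tmp here (COPY, batched inserts, ...), then:
INSERT INTO ciu_data_type SELECT * FROM staging_tmp;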
On Wed, Jul 18, 2007 at 09:13:14PM +0200, Thomas Finneid wrote:
> Michael Stone wrote:
>> I don't understand how the insert you described is table to table?
>
> SELECT INTO is table to table; so is INSERT INTO ... SELECT FROM.

I could have sworn that at least one of the examples you gave didn't have any select. Doesn't really matter.

Mike Stone