Home > mailing lists

Re: Select with qualified join condition / Batch inserts - Mailing list pgsql-performance

From	Michael Nonemacher
Subject	Re: Select with qualified join condition / Batch inserts
Date	October 15, 2004 18:17:15
Msg-id	E3A41572DB871B42AB6939873D95E8CA523DC0@auscorpex-1.austin.messageone.com Whole thread Raw
In response to	Select with qualified join condition / Batch inserts (Bernd <bernd_pg@genedata.com>)
List	pgsql-performance

Tree view

> 2/ Batch-inserts using jdbc (maybe this should go to the jdbc-mailing
list -
> but it is also performance related ...):
> Performing many inserts using a PreparedStatement and batch execution
makes a
> significant performance improvement in Oracle. In postgres, I did not
observe
> any performance improvement using batch execution. Are there any
special
> caveats when using batch execution with postgres?

When you call executeBatch(), it doesn't send all the queries in a
single round-trip; it just iterates through the batched queries and
executes them one by one.  In my own applications, I've done
simulated-batch queries like this:

insert into T (a, b, c)
  select 1,2,3 union all
  select 2,3,4 union all
  select 3,4,5

It's ugly, and you have to structure your code in such a way that the
query can't get too large, but it provides a similar performance benefit
to batching.  You probably don't save nearly as much parse time as using
a batched PreparedStatement, but you at least get rid of the network
roundtrips.

(Of course, it'd be much nicer if statement-batching worked.  There have
been rumblings about doing this, and some discussion on how to do it,
but I haven't heard about any progress.  Anyone?)

mike

-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Bernd
Sent: Friday, October 15, 2004 5:25 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Select with qualified join condition / Batch inserts


Hi,

we are working on a product which was originally developed against an
Oracle
database and which should be changed to also work with postgres.

Overall the changes we had to make are very small and we are very
pleased with
the good performance of postgres - but we also found queries which
execute
much faster on Oracle. Since I am not yet familiar with tuning queries
for
postgres, it would be great if someone could give me a hint on the
following
two issues. (We are using PG 8.0.0beta3 on Linux kernel 2.4.27):

1/ The following query takes about 5 sec. with postrgres whereas on
Oracle it
executes in about 30 ms (although both tables only contain 200 k records
in
the postgres version).

SQL:

SELECT cmp.WELL_INDEX, cmp.COMPOUND, con.CONCENTRATION
    FROM SCR_WELL_COMPOUND cmp, SCR_WELL_CONCENTRATION con
    WHERE cmp.BARCODE=con.BARCODE
        AND cmp.WELL_INDEX=con.WELL_INDEX
        AND cmp.MAT_ID=con.MAT_ID
        AND cmp.MAT_ID = 3
        AND cmp.BARCODE='910125864'
        AND cmp.ID_LEVEL = 1;

Table-def:
        Table "public.scr_well_compound"
   Column   |          Type          | Modifiers
------------+------------------------+-----------
 mat_id     | numeric(10,0)          | not null
 barcode    | character varying(240) | not null
 well_index | numeric(5,0)           | not null
 id_level   | numeric(3,0)           | not null
 compound   | character varying(240) | not null
Indexes:
    "scr_wcm_pk" PRIMARY KEY, btree (id_level, mat_id, barcode,
well_index) Foreign-key constraints:
    "scr_wcm_mat_fk" FOREIGN KEY (mat_id) REFERENCES
scr_mapping_table(mat_id)
ON DELETE CASCADE

       Table "public.scr_well_concentration"
    Column     |          Type          | Modifiers
---------------+------------------------+-----------
 mat_id        | numeric(10,0)          | not null
 barcode       | character varying(240) | not null
 well_index    | numeric(5,0)           | not null
 concentration | numeric(20,10)         | not null
Indexes:
    "scr_wco_pk" PRIMARY KEY, btree (mat_id, barcode, well_index)
Foreign-key constraints:
    "scr_wco_mat_fk" FOREIGN KEY (mat_id) REFERENCES
scr_mapping_table(mat_id)
ON DELETE CASCADE

I tried several variants of the query (including the SQL 92 JOIN ON
syntax)
but with no success. I have also rebuilt the underlying indices.

A strange observation is that the same query runs pretty fast without
the
restriction to a certain MAT_ID, i. e. omitting the MAT_ID=3 part.

Also fetching the data for both tables separately is pretty fast and a
possible fallback would be to do the actual join in the application
(which is
of course not as beautiful as doing it using SQL ;-)

2/ Batch-inserts using jdbc (maybe this should go to the jdbc-mailing
list -
but it is also performance related ...):
Performing many inserts using a PreparedStatement and batch execution
makes a
significant performance improvement in Oracle. In postgres, I did not
observe
any performance improvement using batch execution. Are there any special

caveats when using batch execution with postgres?

Thanks and regards

Bernd




---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

pgsql-performance by date:

From: Alan Stange
Date: 15 October 2004, 16:53:28
Subject: Re: First set of OSDL Shared Mem scalability results, some

From: Tom Lane
Date: 15 October 2004, 19:37:50
Subject: Re: [pgsql-hackers-win32] Performance on Win32 vs Cygwin

Re: Select with qualified join condition / Batch inserts - Mailing list pgsql-performance

Previous

Next