Performance question - Mailing list pgsql-general

From Jean-Christian Imbeault
Subject Performance question
Date
Msg-id 3F026A5B.5050202@mega-bucks.co.jp
Whole thread Raw
Responses Re: Performance question  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
I'm trying to convince another open-source project (phpOpenTracker) to
modify their current INSERT sql queries. Currently they just do an
INSERT into a table without first checking if their might be a record
with the same primary key.

The reason for this that they need fast inserts and most user I assume
are using MySQL which silently drops INSERT queries that violate primary
key constraints. But postgres on the other hand (and rightly so) issues
and logs an error.

I have suggested that their current INSERT INTO t VALUES() be changed to:

INSERT INTO
   T
SELECT 'v1', 'v2'
WHERE
   NOT EXISTS (
     SELECT NULL FROM t WHERE pk='v1'
   )

However one of the developers is worried that this would cause a
performance drop for MySQL users b/c of the extra SELECT in my version
of the INSERT query.

I had thought that the 'extra' SELECT isn't really extra at all since
*any* DB still has to check that there is a record with the primary key
that we are trying to insert. So whereas in my query the SELECT is
explicitly stated in the regular version of a simple INSERT, the select
is still there but implicit since there was a primary key defined on the
table. So there really shouldn't be much, if any of a performance drop.

Is there any truth to my logic in the previous paragraph? :)

I'd like to test my theory on a real MySQL database but I don't have
access to one right now, and I am not sure how to go about testing a
MySQL db or even what kind of testing. If I can get a spare machine I
will give it a go though.

Thanks,

Jean-Christian Imbeault


pgsql-general by date:

Previous
From: Ian Barwick
Date:
Subject: Re: Duplicate key insert question
Next
From: Tom Lane
Date:
Subject: Re: ERROR: Relation "pg_user" does not exist