Thread: Inserting into table only if the row does not already exist.

Inserting into table only if the row does not already exist.

From

"C. Bensend"

Date:

15 October 2004, 05:18:53

Hey folks,
  I am inserting data into a table with a three-column primary key (the
table is only three columns).  The rows I'm trying to insert may very
well be duplicates of ones already in the table, so I would like to have
PostgreSQL handle the insert and possible error resulting from dup data.
I can certainly do a SELECT first and then INSERT if it's not a duplicate,
ala:

* Do a SELECT against the three columns
* If there are no rows returned, then do the INSERT
  But I thought this would be possible with the following:

INSERT INTO table ( column1, column2, column3 )  SELECT column1, column2, column3  WHERE NOT EXISTS (     SELECT
column1,column2, column3 FROM table WHERE        column1 = $column1 AND        column2 = $column2 AND        column3 =
$column3)
 
  .. which gave me 'ERROR: column1 does not exist'.  Nuts.
  Is this possible to do in one statement?  I checked with a DBA friend
(he's an Oracle guy), Google, and the list archives, and either didn't
find anything helpful or simply didn't try the correct search terms.  Or
should I be doing this sort of thing in two separate queries?

Thanks for all the help you folks have given me,

Benny


-- 
"Even if a man chops off your hand with a sword, you still have two nice,
sharp bones to stick in his eyes."                                                     -- .sig on Slashdot

Re: Inserting into table only if the row does not already exist.

From

Greg Stark

Date:

15 October 2004, 06:11:02

"C. Bensend" <benny@bennyvision.com> writes:

> INSERT INTO table ( column1, column2, column3 )
>    SELECT column1, column2, column3
>    WHERE NOT EXISTS (
>       SELECT column1, column2, column3 FROM table WHERE
>          column1 = $column1 AND
>          column2 = $column2 AND
>          column3 = $column3 )
> 
>    .. which gave me 'ERROR: column1 does not exist'.  Nuts.

Well you're not selecting from any table so "column1" isn't going to exist.
You just have to put it in the select list as a constant. If you're feeling
generous to the next programmer to read it you could put "AS column1" after
each one, but the column name doesn't actually have to match the column you're
inserting into.
INSERT INTO table ( column1, column2, column3 )  (   SELECT $column1, $column2, $column3    WHERE NOT EXISTS (
SELECT1        FROM table        WHERE column1 = $column1         AND column2 = $column2         AND column3 = $column3
)  )

Note that this is going to have some concurrency issues. I think it will be
possible for a second query to execute before the first commits. In that case
it won't see the record the first query inserted and try to insert again.
You'll just get a primary key violation though which I guess you can just
ignore.

Which raises a question. Why not forgoe this complicated SQL and try to do the
insert. If you get a primary key violation, well there's your answer... If you
don't care about the failure just ignore it and move on. I would suggest
checking specifically for a primary key violation and still stopping execution
on unexpected errors though.

If you're doing this inside a bigger transaction that's a bit more of a pain.
Until 8.0 postgres can't handle ignoring an error on a query without aborting
the entire transaction. But if you're in autocommit mode then you can just
ignore the primary key violation and continue. 

Incidentally, if you're putting your parameters directly into your queries
using $column1 then you've got a potential security problem. Unless you're
quoting every variable everywhere religiously using postgres's quoting
functions an attacker can sneak extra SQL into your queries. Potentially
including whole new statements such as "DELETE FROM table"...

-- 
greg

Re: Inserting into table only if the row does not already

From

"C. Bensend"

Date:

15 October 2004, 14:15:27

> You just have to put it in the select list as a constant. If you're
> feeling
> generous to the next programmer to read it you could put "AS column1"
> after
> each one, but the column name doesn't actually have to match the column
> you're
> inserting into.

Sweet GOD, I hope no one ever has to read the stuff I'm working on!

This is just a personal curiousity project, not anything for work or
anything released publicly.  I'd be far too embarrassed to ever release
this, as I'm just fumbling along, learning.  :)

> Note that this is going to have some concurrency issues. I think it will
> be
> possible for a second query to execute before the first commits. In that
> case
> it won't see the record the first query inserted and try to insert again.
> You'll just get a primary key violation though which I guess you can just
> ignore.

Concurrency shouldn't be an issue - this is a perl script running from
cron.  The only concurrency that will ever happen is if I'm a bonehead
and I run the script manually right as cron kicks off another copy.

> Which raises a question. Why not forgoe this complicated SQL and try to do
> the
> insert. If you get a primary key violation, well there's your answer... If
> you
> don't care about the failure just ignore it and move on. I would suggest
> checking specifically for a primary key violation and still stopping
> execution
> on unexpected errors though.

Well, I was hoping to minimize the amount of perl needed to get this one
task done, but I think that's going to be the best way to do it.

> Incidentally, if you're putting your parameters directly into your queries
> using $column1 then you've got a potential security problem. Unless you're
> quoting every variable everywhere religiously using postgres's quoting
> functions an attacker can sneak extra SQL into your queries. Potentially
> including whole new statements such as "DELETE FROM table"...

Yes indeed, good catch.  I'll clean that up immediately.

Thanks, Greg!

Benny


-- 
"Even if a man chops off your hand with a sword, you still have two nice,
sharp bones to stick in his eyes."                                                     -- .sig on Slashdot

Re: Inserting into table only if the row does not already

From

Steven Klassen

Date:

17 October 2004, 01:49:35

* C. Bensend <benny@bennyvision.com> [2004-10-15 08:14:51 -0500]:

> Concurrency shouldn't be an issue - this is a perl script running
> from cron. The only concurrency that will ever happen is if I'm a
> bonehead and I run the script manually right as cron kicks off
> another copy.

Even cron'd scripts can run amok. Lock files are great as a CYA
measure even if you're relatively certain there will be no opportunity
for one copy to step on the next.

I once had a cron'd script mail upper management every hour I was on
vacation. You can imagine how long it was until I got another vacation
request approved. ;)

-- 
Steven Klassen - Lead Programmer
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Replication & Support Services, (503) 667-4564

Re: Inserting into table only if the row does not already

From

"C. Bensend"

Date:

17 October 2004, 02:10:27

> Even cron'd scripts can run amok. Lock files are great as a CYA
> measure even if you're relatively certain there will be no opportunity
> for one copy to step on the next.

Yes, you are absolutely correct, of course.  :)

I should be more specific - I'm working on a personal project, and I'm
not going to worry about that right now.  The risk of a cron gone wild
is acceptable to me at this moment.

Right now my priority is getting data flowing into the database instead
of piling up on the host like it is.  :)

Thanks much!

Benny


-- 
"Even if a man chops off your hand with a sword, you still have two nice,
sharp bones to stick in his eyes."                                                     -- .sig on Slashdot

Re: Inserting into table only if the row does not already

From

Greg Stark

Date:

17 October 2004, 06:22:28

"C. Bensend" <benny@bennyvision.com> writes:

> The risk of a cron gone wild is acceptable to me at this moment.

Gee, now I have images of late-night advertisements for bofh-porn video tapes
of Cron Jobs Gone Wild(tm) dancing through my head... thanks.

-- 
greg

Re: Inserting into table only if the row does not already

From

"C. Bensend"

Date:

17 October 2004, 07:05:14

> Gee, now I have images of late-night advertisements for bofh-porn video
> tapes
> of Cron Jobs Gone Wild(tm) dancing through my head... thanks.

Wow.  THERE'S something I didn't need before hitting the sack.  ;)

Benny


-- 
"Even if a man chops off your hand with a sword, you still have two nice,
sharp bones to stick in his eyes."                                                     -- .sig on Slashdot