race conditions, intersect in subqueries - Mailing list pgsql-general

From Cristóvão Dalla Costa
Subject race conditions, intersect in subqueries
Date
Msg-id 005c01c019e2$3e357d10$02ffa8c0@terrificus
Whole thread Raw
Responses Re: race conditions, intersect in subqueries  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
Re: race conditions, intersect in subqueries  (Doug Semig <dougslist@semig.com>)
List pgsql-general
I'm working with an application I wrote which does something along these
lines:

SELECT ID FROM ITEM WHERE URL='X' FOR UPDATE
IF (ROW RETURNED) {
    $ID = ITEM.ID
} ELSE {
    SELECT nextval ('item_id_seq')
    $ID = nextval
    INSERT INTO ITEM....
}
DO OTHER STUFF WITH $ID

So, I check if an item with a given url exists. If it does, I get its id and
use it later. If it doesn't, I insert a new item and proceed with the new
id. Everything happens inside a transaction. Now, there is a race condition
where the first line is executed simultaneously by two processes, looking
for the same url, and resulting in duplicate lines. So far, there are about
40 duplicates in a 80,000 row database, and short of manually correcting
them, I don't know what to do to fix the race condition.

Finally, it seems I cannot do INTERSECT on subqueries, sice the following
fails with a parse error "at or near INTERSECT", and the subquery by itself
works.

SELECT * FROM item WHERE id IN (SELECT item_id FROM item_words, words WHERE
words.id = words_id AND words.word='x' INTERSECT SELECT item_id FROM
item_words, words WHERE words.id = words_id AND words.word='y')

Basically, I'm using the above query to look for words in a reverse index,
sometimes with as many as 10 different words, causing a lot of rows to be
generated, to be later filtered by the intersects. Are there any better ways
to do that, performance-wise?

BTW, I'm not sure whether this is the appropriate mailing list to report
this, but the query optimizer should read
SELECT * FROM x WHERE id IN (SELECT x_id FROM Y)
as
SELECT * FROM x WHERE EXISTS (SELECT * FROM Y WHERE x_id = x.id)
when the tables are "large", and the necessary indexes exist.

Thanks for the help.

Cristovao.




pgsql-general by date:

Previous
From: Joseph Shraibman
Date:
Subject: log message
Next
From: Stephan Szabo
Date:
Subject: Re: race conditions, intersect in subqueries