Thread: plpgsql; execute query inside exists

plpgsql; execute query inside exists

From
jozsef.kurucz@invitel.hu
Date:
Hi there,

I would like to use EXISTS in a small plpgsql function but I always
get a "syntax error". How can I execute a query inside the
EXISTS function?



IF NOT EXISTS(EXECUTE 'SELECT * FROM '|| tmp_tbl)
   THEN
      CREATE TABLE tt();




ERROR:  syntax error at or near "EXECUTE"
LINE 1: SELECT  NOT EXISTS(EXECUTE 'SELECT * FROM '||  $1 )



Sorry for my lame question but I'm new in postgres.

Thanks!

Re: plpgsql; execute query inside exists

From
Merlin Moncure
Date:
On Mon, Oct 17, 2011 at 2:32 AM,  <jozsef.kurucz@invitel.hu> wrote:
> Hi there,
>
> I would like to use EXISTS in a small plpgsql function but I always
> get a "syntax error". How can I execute a query inside the
> EXISTS function?
>
>
>
> IF NOT EXISTS(EXECUTE 'SELECT * FROM '|| tmp_tbl)
>   THEN
>      CREATE TABLE tt();
>
>
>
>
> ERROR:  syntax error at or near "EXECUTE"
> LINE 1: SELECT  NOT EXISTS(EXECUTE 'SELECT * FROM '||  $1 )


EXECUTE is a top level statement -- you can't run it inside a query
like that.  Also, EXISTS is not a way to check to see if a table does
exist -- it is a clause for the presence of a row and returns true if
it finds one -- but if the table does not exist you would get an SQL
error.

A better way to do this is to query information_schema:

PERFORM 1 FROM information_schema.tables where schema_name = x and
table_name = y;

IF FOUND THEN
  CREATE TABLE ...
END IF;

(there is a race condition in the above code -- do you see it? if
concurrent access to this function is an issue, you have to LOCK an
object before running the PERFORM or perhaps use an advisory lock).

merlin

Re: plpgsql; execute query inside exists

From
Merlin Moncure
Date:
On Mon, Oct 17, 2011 at 8:20 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Mon, Oct 17, 2011 at 2:32 AM,  <jozsef.kurucz@invitel.hu> wrote:
>> Hi there,
>>
>> I would like to use EXISTS in a small plpgsql function but I always
>> get a "syntax error". How can I execute a query inside the
>> EXISTS function?
>>
>>
>>
>> IF NOT EXISTS(EXECUTE 'SELECT * FROM '|| tmp_tbl)
>>   THEN
>>      CREATE TABLE tt();
>>
>>
>>
>>
>> ERROR:  syntax error at or near "EXECUTE"
>> LINE 1: SELECT  NOT EXISTS(EXECUTE 'SELECT * FROM '||  $1 )
>
>
> EXECUTE is a top level statement -- you can't run it inside a query
> like that.  Also, EXISTS is not a way to check to see if a table does
> exist -- it is a clause for the presence of a row and returns true if
> it finds one -- but if the table does not exist you would get an SQL
> error.
>
> A better way to do this is to query information_schema:
>
> PERFORM 1 FROM information_schema.tables where schema_name = x and
> table_name = y;
>
> IF FOUND THEN
>  CREATE TABLE ...
> END IF;

oops.. meant to say IF NOT FOUND... :-).

merlin

Re: plpgsql; execute query inside exists

From
Alban Hertroys
Date:
On 17 October 2011 15:20, Merlin Moncure <mmoncure@gmail.com> wrote:
> A better way to do this is to query information_schema:
>
> PERFORM 1 FROM information_schema.tables where schema_name = x and
> table_name = y;
>
> IF FOUND THEN
>  CREATE TABLE ...
> END IF;
>
> (there is a race condition in the above code -- do you see it? if
> concurrent access to this function is an issue, you have to LOCK an
> object before running the PERFORM or perhaps use an advisory lock).

Is there? You'd think that with transactional DDL and the code running
in a single transaction (namely inside a stored function) it would be
concurrency-safe.

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.

Re: plpgsql; execute query inside exists

From
Merlin Moncure
Date:
On Mon, Oct 17, 2011 at 8:44 AM, Alban Hertroys <haramrae@gmail.com> wrote:
> On 17 October 2011 15:20, Merlin Moncure <mmoncure@gmail.com> wrote:
>> A better way to do this is to query information_schema:
>>
>> PERFORM 1 FROM information_schema.tables where schema_name = x and
>> table_name = y;
>>
>> IF FOUND THEN
>>  CREATE TABLE ...
>> END IF;
>>
>> (there is a race condition in the above code -- do you see it? if
>> concurrent access to this function is an issue, you have to LOCK an
>> object before running the PERFORM or perhaps use an advisory lock).
>
> Is there? You'd think that with transactional DDL and the code running
> in a single transaction (namely inside a stored function) it would be
> concurrency-safe.

Transactional DDL does not protect you from race conditions any more
than MVCC protects you from race conditions in regular DML.  What
transactional DDL does is roll back the changes in the event of an
error so you don't have half written schema changes in your database.
MVCC gives a rigorous definition of visibility rules and transactions
guarantee only a complete unit of work getting committed to the
database.  You still have to code defensively against multi-user
access however.  The good news is that multi user coding is about an
order of magnitude easier in sql (especially postgres variant) than in
any other development platform that I'm aware of.

The race condition here is basically the same problem that affects
'upsert' patterns:

test record(s) if found update if not found insert;

The problem comes that in between the test and the insert case someone
else can also test and get the insert in before you do.  You have two
general strategies to get around this: locking and retry.  I greatly
advise going the locking route unless your concurrency requirements
are very high.  It's much simpler, and since you're not invoking a
subtransaction, faster in the uncontested case.

merlin

Re: plpgsql; execute query inside exists

From
Alban Hertroys
Date:
On 17 October 2011 16:24, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Mon, Oct 17, 2011 at 8:44 AM, Alban Hertroys <haramrae@gmail.com> wrote:
>> On 17 October 2011 15:20, Merlin Moncure <mmoncure@gmail.com> wrote:
>>> A better way to do this is to query information_schema:
>>>
>>> PERFORM 1 FROM information_schema.tables where schema_name = x and
>>> table_name = y;
>>>
>>> IF FOUND THEN
>>>  CREATE TABLE ...
>>> END IF;
>>>
>>> (there is a race condition in the above code -- do you see it? if
>>> concurrent access to this function is an issue, you have to LOCK an
>>> object before running the PERFORM or perhaps use an advisory lock).
>>
>> Is there? You'd think that with transactional DDL and the code running
>> in a single transaction (namely inside a stored function) it would be
>> concurrency-safe.
>
> Transactional DDL does not protect you from race conditions any more
> than MVCC protects you from race conditions in regular DML.  What
> transactional DDL does is roll back the changes in the event of an
> error so you don't have half written schema changes in your database.
> MVCC gives a rigorous definition of visibility rules and transactions
> guarantee only a complete unit of work getting committed to the
> database.  You still have to code defensively against multi-user
> access however.  The good news is that multi user coding is about an
> order of magnitude easier in sql (especially postgres variant) than in
> any other development platform that I'm aware of.
>
> The race condition here is basically the same problem that affects
> 'upsert' patterns:
>
> test record(s) if found update if not found insert;
>
> The problem comes that in between the test and the insert case someone
> else can also test and get the insert in before you do.  You have two
> general strategies to get around this: locking and retry.  I greatly
> advise going the locking route unless your concurrency requirements
> are very high.  It's much simpler, and since you're not invoking a
> subtransaction, faster in the uncontested case.

So what would happen if you don't lock? I think it's this:

 Session A        | Session B
----------------+-----------------
 SELECT x        | SELECT x
 NOT FOUND        | NOT FOUND
   ...            | CREATE TABLE
 CREATE TABLE    | <commit>
 <error>        |
 <rollback>        |

If I understand correctly, if you don't mind the error and the
subsequent rollback in Session A, than there's not much need to lock,
or is there? It is important to be aware of the possible rollback of
such a transaction, of course.

And what would you lock? A record in information_schema.tables? That's
a read-only view. A table that doesn't exist yet? Can't do. A record
in the pg_ schema? Rather not...

I suppose you could work around that problem by keeping track of your
own tables that were generated using aforementioned plpgsql function.
Then you have a table (that you own) with records to lock.

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.

Re: plpgsql; execute query inside exists

From
Merlin Moncure
Date:
On Mon, Oct 17, 2011 at 10:28 AM, Alban Hertroys <haramrae@gmail.com> wrote:
> On 17 October 2011 16:24, Merlin Moncure <mmoncure@gmail.com> wrote:
>> On Mon, Oct 17, 2011 at 8:44 AM, Alban Hertroys <haramrae@gmail.com> wrote:
>>> On 17 October 2011 15:20, Merlin Moncure <mmoncure@gmail.com> wrote:
>>>> A better way to do this is to query information_schema:
>>>>
>>>> PERFORM 1 FROM information_schema.tables where schema_name = x and
>>>> table_name = y;
>>>>
>>>> IF FOUND THEN
>>>>  CREATE TABLE ...
>>>> END IF;
>>>>
>>>> (there is a race condition in the above code -- do you see it? if
>>>> concurrent access to this function is an issue, you have to LOCK an
>>>> object before running the PERFORM or perhaps use an advisory lock).
>>>
>>> Is there? You'd think that with transactional DDL and the code running
>>> in a single transaction (namely inside a stored function) it would be
>>> concurrency-safe.
>>
>> Transactional DDL does not protect you from race conditions any more
>> than MVCC protects you from race conditions in regular DML.  What
>> transactional DDL does is roll back the changes in the event of an
>> error so you don't have half written schema changes in your database.
>> MVCC gives a rigorous definition of visibility rules and transactions
>> guarantee only a complete unit of work getting committed to the
>> database.  You still have to code defensively against multi-user
>> access however.  The good news is that multi user coding is about an
>> order of magnitude easier in sql (especially postgres variant) than in
>> any other development platform that I'm aware of.
>>
>> The race condition here is basically the same problem that affects
>> 'upsert' patterns:
>>
>> test record(s) if found update if not found insert;
>>
>> The problem comes that in between the test and the insert case someone
>> else can also test and get the insert in before you do.  You have two
>> general strategies to get around this: locking and retry.  I greatly
>> advise going the locking route unless your concurrency requirements
>> are very high.  It's much simpler, and since you're not invoking a
>> subtransaction, faster in the uncontested case.
>
> So what would happen if you don't lock? I think it's this:
>
>  Session A              | Session B
> ----------------+-----------------
>  SELECT x               | SELECT x
>  NOT FOUND              | NOT FOUND
>   ...                  | CREATE TABLE
>  CREATE TABLE   | <commit>
>  <error>                |
>  <rollback>             |
>
> If I understand correctly, if you don't mind the error and the
> subsequent rollback in Session A, than there's not much need to lock,
> or is there? It is important to be aware of the possible rollback of
> such a transaction, of course.

right -- allowing for rollback and retrying (either from the client or
in the procedure via sub-transaction) is always an option.  I prefer
to lock -- it's faster (usually) and gives more regular behavior.

> And what would you lock? A record in information_schema.tables? That's
> a read-only view. A table that doesn't exist yet? Can't do. A record
> in the pg_ schema? Rather not...
> I suppose you could work around that problem by keeping track of your
> own tables that were generated using aforementioned plpgsql function.
> Then you have a table (that you own) with records to lock.

yeah -- you could use an advisory lock or a special table created for
that purpose, or a row of a table that does your tracking.  Agree that
locking system catalogs is *not* advisable.

merlin

Re: plpgsql; execute query inside exists

From
jozsef.kurucz@invitel.hu
Date:
Hi,

Thanks for the reply!
But I don't want to check if the table exists, I want to see the
result of the SELECT query, if a row presence or not.
The tmp_tbl is a dynamic generated table name, but when I write the
code without EXECUTE, I get syntax error too.
In this case how can I check if a SELECT has result or not?


SELECT INTO rndmd5 md5(random()::text);

tmp_tbl := 'tbl_tmp_' || rndmd5;


IF NOT EXISTS(SELECT * FROM tmp_tbl)
   THEN
     END IF;


ERROR:  syntax error at or near "$1"
LINE 1: SELECT  NOT EXISTS(SELECT * FROM  $1 )


Thanks!



2011/10/17 Merlin Moncure <mmoncure@gmail.com>:
> On Mon, Oct 17, 2011 at 8:20 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> On Mon, Oct 17, 2011 at 2:32 AM,  <jozsef.kurucz@invitel.hu> wrote:
>>> Hi there,
>>>
>>> I would like to use EXISTS in a small plpgsql function but I always
>>> get a "syntax error". How can I execute a query inside the
>>> EXISTS function?
>>>
>>>
>>>
>>> IF NOT EXISTS(EXECUTE 'SELECT * FROM '|| tmp_tbl)
>>>   THEN
>>>      CREATE TABLE tt();
>>>
>>>
>>>
>>>
>>> ERROR:  syntax error at or near "EXECUTE"
>>> LINE 1: SELECT  NOT EXISTS(EXECUTE 'SELECT * FROM '||  $1 )
>>
>>
>> EXECUTE is a top level statement -- you can't run it inside a query
>> like that.  Also, EXISTS is not a way to check to see if a table does
>> exist -- it is a clause for the presence of a row and returns true if
>> it finds one -- but if the table does not exist you would get an SQL
>> error.
>>
>> A better way to do this is to query information_schema:
>>
>> PERFORM 1 FROM information_schema.tables where schema_name = x and
>> table_name = y;
>>
>> IF FOUND THEN
>>  CREATE TABLE ...
>> END IF;
>
> oops.. meant to say IF NOT FOUND... :-).
>
> merlin
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Re: plpgsql; execute query inside exists

From
Alban Hertroys
Date:
On 18 October 2011 09:57,  <jozsef.kurucz@invitel.hu> wrote:
> Hi,
>
> Thanks for the reply!
> But I don't want to check if the table exists, I want to see the
> result of the SELECT query, if a row presence or not.

So you want to check that the table contains data? In that case it
makes no sense to create the table if it doesn't contain data. It may
very well exist already.

> The tmp_tbl is a dynamic generated table name, but when I write the
> code without EXECUTE, I get syntax error too.

They were explaining why you got the error, they were not telling you
to leave out EXECUTE for dynamic SQL.

> In this case how can I check if a SELECT has result or not?
>
> SELECT INTO rndmd5 md5(random()::text);
>
> tmp_tbl := 'tbl_tmp_' || rndmd5;
>
>
> IF NOT EXISTS(SELECT * FROM tmp_tbl)
>   THEN
>     END IF;

You really should at least limit the possible amount of results from
that SELECT statement. You're not interested in the results.

Anyway, the way to do this in plpgsql is:

EXECUTE SELECT * FROM tmp_tbl LIMIT 1;
IF NOT FOUND THEN
    ...
END IF;

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.