Thread: how to make functions multi-user safe?

how to make functions multi-user safe?

From
Robert Treat
Date:
Due to the limitation that SetQuerySnapshot() is not called from within plpgsql functions, we
are having trouble making a function safe for a multiuser environment.

Our goal: To have a multiuser-safe function that checks for the existence of a table.  If the table
exists, it should return the name of the table.  If it does not exist, it will call another
function that creates the table and then it should return the name of the table after it's been created.

If two users call this function at the same time, the second client should recognize that the first
client is in the process of creating the table, and it should wait for the first client to return,
and then it should return the name of the newly created table rather than also try to create it
(which results in an error, since the table already exists.)

I've included our best attempt below.  However, this fails because even in READ COMMITTED mode,
inside of a function you will not see updated data from other transactions that have committed
during the execution of the function (since SetQuerySnapshot() is not called within functions.
This means that in our case, the second client cannot see the table that the first client has
created, and thus attempts to create the table anyway.

Here are the functions and some comments that demonstrate this effect.  (Note in the 2nd function,
we have some queries which we know to take a while to execute in our db, to lengthen the execution
time of the function and help show the effect.  If you want to test these, you'll have to replace these
with similar code that makes sense in your db environment.)

-- the purpose of these functions is to illustrate that plpgsql functions run in READ COMMITTED
-- transaction isolation level do not update their snapshot during function execution.

--  The function tests whether a table, foo, exists, and if it
-- doesn't it acquires a lock on a dummy table available for locking, tests again whether it
-- exists, and then calls another function that creates the table.  To simulate a complex table
-- being created, the _create_tbl function wastes a few seconds running queries that are known
-- to take a while to run.  This helps highlight the effect of the second client waiting on the
-- lock to be released.

-- prerequisites: no table named foo exists, that you are in READ COMMITTED mode, and that a table
-- foo_lock exists

-- to execute the test, I opened two psql clients and then executed select check_for_tbl();
-- at the same time in both windows.

CREATE OR REPLACE FUNCTION check_for_tbl() RETURNS text
AS'
DECLARE
    intTableExists    INT;            -- holds result of query to determine if table exists
BEGIN

    -- check if table exists
    SELECT INTO intTableExists count(*) FROM pg_class WHERE relname=quote_ident(''foo'');
    IF intTableExists = 0 THEN
                -- Acquire lock and make sure no one else has created the table in the meantime
                EXECUTE ''LOCK TABLE foo_lock'';
        RAISE NOTICE ''Acquired lock'';
                -- if SetQuerySnapshot were called from within functions, the second client would now
                -- see that the first client has already created this table (since to get to this
                -- point the second client would have had to acquire the lock, meaning the first client
                -- finished running the function that creates the foo table)
             SELECT INTO intTableExists count(*) FROM pg_class WHERE relname=quote_ident(''foo'');
        IF intTableExists = 0 THEN
            PERFORM _create_tbl();
        END IF;
    END IF;

    RETURN ''foo'';
END;
' LANGUAGE 'plpgsql';

CREATE OR REPLACE FUNCTION _create_tbl() RETURNS bool
AS'
DECLARE
  sql TEXT;
BEGIN
  sql := ''CREATE TABLE foo (bar int, baz int)'';
  EXECUTE sql;
  -- some random queries that take a few seconds to run
  -- so that I can have time to call check_for_tbl() from
  -- a second window and force it to wait for the lock

  -- NOTE:  You will need to replace these PERFORMs with
  -- an expensive operation that makes sense in your db

  PERFORM * FROM msg230,msg218;
  PERFORM * FROM msg230,msg218;
  RETURN true;
END;
' LANGUAGE 'plpgsql';




Re: how to make functions multi-user safe?

From
Scott Lamb
Date:
Robert Treat wrote:
> Our goal: To have a multiuser-safe function that checks for the existence of a table.  If the table
> exists, it should return the name of the table.  If it does not exist, it will call another
> function that creates the table and then it should return the name of the table after it's been created.

Why? This seems strange to me. Is there some reason your application
can't require the entire schema be created beforehand? What are you
trying to accomplish by that?

Scott


Re: how to make functions multi-user safe?

From
Christoph Dalitz
Date:
> Date: 04 Dec 2002 15:53:07 -0500
> From: Robert Treat <xzilla@users.sourceforge.net>
>
> Our goal: To have a multiuser-safe function that checks for the existence of a table.  If the table
> exists, it should return the name of the table.  If it does not exist, it will call another
> function that creates the table and then it should return the name of the table after it's been created.
>
The most basic rule in database design is that applications must not change the
database schema but only the database contents. According to this rule a function
that changes the schema and is called by several application users parallel is a
bad idea.

It is generally a wrong assumption that DDL statements can be run "multi user safe"
at all: eg. in Oracle DDL statements are not run in transactions (sorry, no rollback
for a "drop table") and are not allowed in stored procedures.

So what is the reasoning behind your odd function?
It is most probably a design flaw.

Christoph Dalitz

Re: how to make functions multi-user safe?

From
Jakub Ouhrabka
Date:
hi,

> Due to the limitation that SetQuerySnapshot() is not called from within plpgsql functions, we
> are having trouble making a function safe for a multiuser environment.
>
> Our goal: To have a multiuser-safe function that checks for the existence of a table.  If the table
> exists, it should return the name of the table.  If it does not exist, it will call another
> function that creates the table and then it should return the name of the table after it's been created.
> ...

i'd suggest encapsulating the whole function in LOCK lock_table IN
EXCLUSIVE MODE; to avoid concurrent access to the function and using
UPDATE pg_class SET relname=relname WHERE relname=quote_ident(''foo'') for
checking for the table existence (if there are any affected rows...). i
think that for update and delete the snapshot is taken when executing the
statement not before the function start as in the case of select.

hth,

kuba