Thread: transactional swap of tables

From
Vincenzo Romano
Date:
Hi all
I'm making some experiments with table archiving and I'd like to
"replace" a full table F with an empty one E.
In order to do this I see only one way:

ALTER TABLE F RENAME TO T;
ALTER TABLE E RENAME TO F;
ALTER TABLE T RENAME TO E; -- optional

This implies there's a moment when the full table doesn't exist.
Would a transaction enclosure ensure that the table F will always
be available to all clients?
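
Something like this, I mean (a minimal sketch of the enclosure I have
in mind):

BEGIN;
ALTER TABLE F RENAME TO T;
ALTER TABLE E RENAME TO F;
ALTER TABLE T RENAME TO E; -- optional
COMMIT;
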
Thanks.

--
Vincenzo Romano - NotOrAnd.IT
Information Technologies
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: transactional swap of tables

From
Kevin Grittner
Date:
Vincenzo Romano <vincenzo.romano@notorand.it> wrote:

> I'd like to "replace" a full table F with an empty one E.
> In order to do this I see only one way:
>
> ALTER TABLE F RENAME TO T;
> ALTER TABLE E RENAME TO F;
> ALTER TABLE T RENAME TO E; -- optional
>
> This implies there's a moment when the full table doesn't exist.
> Would a transaction enclosure ensure that the table F will
> always be available to all clients?

Yes.  What you show is safe.  What has a race condition is dropping
the old table before all transactions which started with it have
completed.  If you're going to drop the old table, be sure to wait
long enough after the COMMIT for things to "settle".

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: transactional swap of tables

From
Vincenzo Romano
Date:
> Vincenzo Romano <vincenzo.romano@notorand.it> wrote:
>
>> I'd like to "replace" a full table F with an empty one E.
>> In order to do this I see only one way:
>>
>> ALTER TABLE F RENAME TO T;
>> ALTER TABLE E RENAME TO F;
>> ALTER TABLE T RENAME TO E; -- optional
>>
>> This implies there's a moment when the full table doesn't exist.
>> Would a transaction enclosure ensure that the table F will
>> always be available to all clients?
>
> Yes.  What you show is safe.  What has a race condition is dropping
> the old table before all transactions which started with it have
> completed.  If you're going to drop the old table, be sure to wait
> long enough after the COMMIT for things to "settle".

Would you please elaborate more on the "wait[ing] long enough after
the COMMIT"?


Re: transactional swap of tables

From
Kevin Grittner
Date:
Vincenzo Romano <vincenzo.romano@notorand.it> wrote:

> Would you please elaborate more on the "wait[ing] long enough
> after the COMMIT"?

You can note the time when you commit the transaction, and then
poll pg_stat_activity until there are no active transactions which
started before that.
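
For example (a sketch, assuming the 9.2 pg_stat_activity columns; the
timestamp literal is a placeholder for the time you noted at COMMIT):

-- poll until this returns 0, then it is safe to drop the old table
SELECT count(*)
  FROM pg_stat_activity
 WHERE state <> 'idle'
   AND xact_start < '2013-07-12 13:00:00'::timestamptz;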

You can sometimes simplify this a little.  In the case where I did
something like this, the only use of the table was by a web
application which timed out any query which didn't complete in 20
seconds.  To that web team, and those using that web application,
there is absolutely no difference between a query which takes more
than 20 seconds and one where PostgreSQL throws an error; the
resulting user-facing behavior is identical.  So if, after waiting
more than 20 seconds, a query blocked long enough to get an error
on trying to access the dropped table OID, that made no difference
to them -- the user would have already been given an error.  In
that situation I was comfortable just hard-coding a 60-second sleep
before dropping the old table.
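
In SQL terms that was essentially (a sketch; the table name follows
the swap example above, where the old data ends up in E):

SELECT pg_sleep(60);  -- let in-flight transactions settle
DROP TABLE e;         -- the swapped-out table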

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: transactional swap of tables

From
Merlin Moncure
Date:
On Fri, Jul 12, 2013 at 9:24 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
> Vincenzo Romano <vincenzo.romano@notorand.it> wrote:
>
>> I'd like to "replace" a full table F with an empty one E.
>> In order to do this I see only one way:
>>
>> ALTER TABLE F RENAME TO T;
>> ALTER TABLE E RENAME TO F;
>> ALTER TABLE T RENAME TO E; -- optional
>>
>> This implies there's a moment when the full table doesn't exist.
>> Would a transaction enclosure ensure that the table F will
>> always be available to all clients?
>
> Yes.  What you show is safe.  What has a race condition is dropping
> the old table before all transactions which started with it have
> completed.  If you're going to drop the old table, be sure to wait
> long enough after the COMMIT for things to "settle".

Be advised that when doing rename-based swaps, all dependent structures
(foreign keys, views, table-based composite types, etc.) will still be
pointing at the old table.
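
For instance (a minimal illustration):

CREATE TABLE f (x int);
CREATE VIEW v AS SELECT x FROM f;
ALTER TABLE f RENAME TO t;
SELECT * FROM v;  -- still reads the original table, now named t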

merlin


Re:

From
Luca Ferrari
Date:
On Fri, Jul 12, 2013 at 1:23 PM, Vincenzo Romano
<vincenzo.romano@notorand.it> wrote:
> Hi all
> I'm making some experiments with table archiving and I'd like to
> "replace" a full table F with an empty one E.
> In order to do this I see only one way:
>
> ALTER TABLE F RENAME TO T;
> ALTER TABLE E RENAME TO F;
> ALTER TABLE T RENAME TO E; -- optional
>
> This implies there's a moment when the full table doesn't exist.
> Would a transaction enclosure ensure that the table F will always
> be available to all clients?


If I get it right, using transaction boundaries around the DDL will
prevent clients from querying the F table until the transaction ends,
due to the lock taken by ALTER TABLE. In other words, a query
performed against the F table while the transaction is running will
simply block without generating any error.
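
For instance (a sketch of the expected behaviour):

-- session A
BEGIN;
ALTER TABLE f RENAME TO t;

-- session B: blocks here until session A commits or rolls back
SELECT * FROM f;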

Hope this helps.
Luca


Re:

From
Vincenzo Romano
Date:
2013/7/14 Luca Ferrari <fluca1978@infinito.it>:
> On Fri, Jul 12, 2013 at 1:23 PM, Vincenzo Romano
> <vincenzo.romano@notorand.it> wrote:
>> Hi all
>> I'm making some experiments with table archiving and I'd like to
>> "replace" a full table F with an empty one E.
>> In order to do this I see only one way:
>>
>> ALTER TABLE F RENAME TO T;
>> ALTER TABLE E RENAME TO F;
>> ALTER TABLE T RENAME TO E; -- optional
>>
>> This implies there's a moment when the full table doesn't exist.
>> Would a transaction enclosure ensure that the table F will always
>> be available to all clients?
>
>
> If I get it right, using transaction boundaries around the DDL will
> prevent clients from querying the F table until the transaction ends,
> due to the lock taken by ALTER TABLE. In other words, a query
> performed against the F table while the transaction is running will
> simply block without generating any error.
>
> Hope this helps.
> Luca

Thanks Luca. That sheds more light on DDL transactions.
The one I'm thinking of is the body of a PL/pgSQL function.
I am only concerned about how late the binding between a table name
and the actual OID happens for other functions, views and triggers.
Any idea?

Grazie.


Re:

From
Luca Ferrari
Date:
On Sun, Jul 14, 2013 at 8:36 PM, Vincenzo Romano
<vincenzo.romano@notorand.it> wrote:

> I am only concerned about how late the binding between a table name
> and the actual OID happens for other functions, views and triggers.


Well, it should work like this: once the parser decides that a query
looks good, it searches the catalogue to find out the names of all the
relations involved. Therefore, the binding you mention should happen as
late as possible. However, I'm not sure whether there is some caching
mechanism that would invalidate that statement. Someone with more
experience can explain it better.

Luca


Re:

From
Vincenzo Romano
Date:
2013/7/15 Luca Ferrari <fluca1978@infinito.it>:
> On Sun, Jul 14, 2013 at 8:36 PM, Vincenzo Romano
> <vincenzo.romano@notorand.it> wrote:
>
>> I am only concerned about how late the binding between a table name
>> and the actual OID happens for other functions, views and triggers.
>
>
> Well, it should work like this: once the parser decides that a query
> looks good, it searches the catalogue to find out the names of all the
> relations involved. Therefore, the binding you mention should happen as
> late as possible. However, I'm not sure whether there is some caching
> mechanism that would invalidate that statement. Someone with more
> experience can explain it better.
>
> Luca

That's exactly my fear about doing that transactional DDL.
And it's not just the table names, but also the structure, the
indexes, the statistics and so on.
Is there anyone on this ML who "owns" the truth? :-)

The alternative is to do things the "good ol' way" by DELETING+INSERTING
(http://tapoueh.org/blog/2013/07/05-archiving-data-fast.html),
where I'd fear longer LOCKs.
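
That is, something like this (a sketch; the archive table f_archive
and the ts column are made-up names):

WITH archived AS (
    DELETE FROM f
     WHERE ts < now() - interval '1 month'
 RETURNING *
)
INSERT INTO f_archive
SELECT * FROM archived;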


Re:

From
Luca Ferrari
Date:
On Mon, Jul 15, 2013 at 8:33 AM, Vincenzo Romano
<vincenzo.romano@notorand.it> wrote:

> The alternative is to do things the "good ol' way" by DELETING+INSERTING
> (http://tapoueh.org/blog/2013/07/05-archiving-data-fast.html),
> where I'd fear longer LOCKs.


I don't know if this is an option for your case, but you could
also exploit schemas to achieve the result: place the new table in
a new schema and change the search path (disallowing access to the
old schema). Of course this means you must be able to lock out your
clients during the migration, or you need some rule to redirect
queries.
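
Something along these lines (a sketch with made-up names; note that a
per-database search_path only affects new sessions):

CREATE SCHEMA v2;
CREATE TABLE v2.f (LIKE public.f INCLUDING ALL);
ALTER DATABASE mydb SET search_path = v2, public;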

Luca


Re:

From
Vincenzo Romano
Date:
2013/7/15 Luca Ferrari <fluca1978@infinito.it>:
> On Mon, Jul 15, 2013 at 8:33 AM, Vincenzo Romano
> <vincenzo.romano@notorand.it> wrote:
>
>> The alternative is to do things the "good ol' way" by DELETING+INSERTING
>> (http://tapoueh.org/blog/2013/07/05-archiving-data-fast.html),
>> where I'd fear longer LOCKs.
>
>
> I don't know if this is an option for your case, but you could
> also exploit schemas to achieve the result: place the new table in
> a new schema and change the search path (disallowing access to the
> old schema). Of course this means you must be able to lock out your
> clients during the migration, or you need some rule to redirect
> queries.
>
> Luca

I think the "schema trick" would just make things more complex.
The "late binding" and "binding cache" issues would pop up anyway.


Re: transactional swap of tables

From
Vincenzo Romano
Date:
I have done the following test on v9.2.4 with two concurrent sessions:

-- session no.1
tmp1=# create table t1 ( t text );
CREATE TABLE
Time: 37.351 ms
tmp1=# create table t2 ( t text );
CREATE TABLE
Time: 33.363 ms
tmp1=# create or replace function f1( out tx text )
tmp1-# language plpgsql
tmp1-# stable as $l0$
tmp1$# begin
tmp1$# select t into tx from t1 limit 1;
tmp1$# end;
tmp1$# $l0$;
CREATE FUNCTION
Time: 14.148 ms
tmp1=# create or replace function f2( out tx text )
tmp1-# language plpgsql
tmp1-# volatile as $l0$
tmp1$# begin
tmp1$# select t into tx from t1 limit 1;
tmp1$# end;
tmp1$# $l0$;
CREATE FUNCTION
Time: 12.712 ms
tmp1=# insert into t1 values ( 'ciao' );
INSERT 0 1
Time: 14.777 ms
tmp1=# insert into t2 values ( 'hello' );
INSERT 0 1
Time: 9.032 ms
tmp1=# select * from f1();
  tx
------
 ciao
(1 row)

Time: 0.600 ms
tmp1=# select * from f2();
  tx
------
 ciao
(1 row)

Time: 0.549 ms

-- session no.2
tmp1=# begin;
BEGIN
Time: 0.287 ms
tmp1=# alter table t1 rename to t3;
ALTER TABLE
Time: 1.023 ms
tmp1=# alter table t2 rename to t1;
ALTER TABLE
Time: 0.533 ms
tmp1=# alter table t3 rename to t2;
ALTER TABLE
Time: 0.449 ms

-- back to session no.1
tmp1=# select * from f1();
-- hangs here, blocked by the table lock held by session no.2

-- back to session no.2
tmp1=# commit;
COMMIT
Time: 10.986 ms

-- back to session no.1
  tx
-------
 hello
(1 row)

Time: 39946.137 ms

The result changes slightly if I call the function f1() just after
ALTERing t1: in that case f1() returns NULL after the COMMIT in
session no.2, and a subsequent call returns 'hello'.
From f2() I always get the right result.
This makes me think that the volatility specification in the function
declaration changes something in the caching of the catalog lookups.
The NULL remains a mystery to me.
Any hint? Any way to avoid such behaviour?


Re:

From
Ashish Karalkar
Date: