Thread: Data corruption/loss when altering tables (fwd)

Data corruption/loss when altering tables (fwd)

From
Nicola Pero
Date:
This is my third attempt at posting to the list ... hopefully it will make
it this time

---------- Forwarded message ----------
Date: Mon, 22 Nov 2004 04:34:22 +0000 (GMT)
From: Nicola Pero <n.pero@mi.flashnet.it>
To: pgsql-bugs@postgresql.org
Subject: Data corruption/loss when altering tables

I've been experiencing data corruption/loss in Postgresql 7.4.2.

I tracked this down to altering tables.  After I alter a table, I get data loss
(fields becoming NULL with no reason!) unless I restart immediately the
database server.  After the restart all seems OK (but the data which was lost,
is gone, I just don't loose any more).  I'm a bit scared though that this might
not be enough, so I'd appreciate comments.  Is this a known bug which has been
fixed in later releases ?

I managed to extract a simple sequence of SQL instructions which reliably
causes data corruption/loss on my 7.4.2 (fedora core 2 default install). I
found that all machines I've access to are 7.4.2, so I couldn't test it on
later releases.  Apologies if it has already been fixed, I didn't find
references to a bug fix for it.

Thanks

/*
  * Nicola Pero, November 2004
  *
  * Test which demonstrates data corruption in Postgres
  * when modifying tables then using plpgsql functions.
  */

/* We create a basic table.  */
CREATE TABLE MyTable (

  /* An index.  */
  ID SERIAL,
  PRIMARY KEY (ID),

  /* A value.  */
  CountA INT DEFAULT 0

);

/* We create a stored procedure to manipulate a record in the table.  */
CREATE FUNCTION DO_SOMETHING(integer,integer)
RETURNS INTEGER AS '
DECLARE
BEGIN
   UPDATE MyTable SET CountA = CountA + $1 WHERE ID = $2;
   RETURN 0;
END' LANGUAGE 'plpgsql';

/* We insert a single record in the table.  */
INSERT INTO MyTable (CountA) VALUES (1);

/* We try out that the stored procedure works.  */
SELECT DO_SOMETHING (1, 1);

/* Print out the table now.  */
/* On my system, I get:
  *  id | counta
  * ----+--------
  *   1 |      2
  * (1 row)
  */ SELECT * FROM MyTable;

/* OK.  Now we modify the table.  We add a column with value 0.  */
ALTER TABLE MyTable ADD COLUMN CountB INT;
ALTER TABLE MyTable ALTER COLUMN CountB SET DEFAULT 0;
UPDATE MyTable SET CountB = 0;

/* Now put a value in CountB.  */
UPDATE MyTable SET CountB = 12;

/* Print out the table with the new column added.  */
/* On my system, I get:
  *  id | counta | countb
  * ----+--------+--------
  *   1 |      2 |     12
  * (1 row)
  */
SELECT * FROM MyTable;

/* Alter the value in the CountA column using the stored procedure.  */
SELECT DO_SOMETHING (1, 1);

/* Print out the table again.  */
/* On my system, I get:
  *  id | counta | countb
  * ----+--------+--------
  *   1 |      3 |
  * (1 row)
  *
  * note how the value in the CountB column has been
  * destroyed/corrupted into NULL!!
  */
SELECT * FROM MyTable;

Re: Data corruption/loss when altering tables (fwd)

From
Ivan
Date:
Hello Nicola,

Monday, November 22, 2004, 9:00:30 PM, you wrote:

NP> I've been experiencing data corruption/loss in Postgresql 7.4.2.

Try to recompile (create or replace) your stored procedure
after you alter the table....

Hope this help...

--
Best regards,
 Ivan                            mailto:Ivan-Sun1@mail.ru

Re: Data corruption/loss when altering tables (fwd)

From
Tom Lane
Date:
Nicola Pero <n.pero@mi.flashnet.it> writes:
> I've been experiencing data corruption/loss in Postgresql 7.4.2.

> I tracked this down to altering tables.  After I alter a table, I get data loss
> (fields becoming NULL with no reason!) unless I restart immediately the
> database server.

The problem shown in your example is simply that the UPDATE in the
plpgsql function has cached a plan that applies to the original format
of the table.  You don't need to restart the server, but you do need to
start a fresh session to get rid of the obsolete cached plan.

There's a TODO item to cause cached plans to be flushed automatically
when a table changes, but I don't know when it will get done (not for
8.0, for sure).

            regards, tom lane

Re: Data corruption/loss when altering tables (fwd)

From
Nicola Pero
Date:
>> I've been experiencing data corruption/loss in Postgresql 7.4.2.
>
>> I tracked this down to altering tables.  After I alter a table, I get data loss
>> (fields becoming NULL with no reason!) unless I restart immediately the
>> database server.
>
> The problem shown in your example is simply that the UPDATE in the
> plpgsql function has cached a plan that applies to the original format
> of the table.  You don't need to restart the server, but you do need to
> start a fresh session to get rid of the obsolete cached plan.
>
> There's a TODO item to cause cached plans to be flushed automatically
> when a table changes, but I don't know when it will get done (not for
> 8.0, for sure).

Thanks - very good to know. :-)

In practice / real world, what's the best workaround ?  Is there any way
to control/prevent/flush the cached plans without restarting sessions or
restarting the server ?

I mean, the real world is a bit more complex than my example because you
have some processes accessing the database (web server / messaging
systems) through stored procedures while another process modifies the
table at random times.

How do I prevent the cached plans in the stored procedures used by the
other processes from messing things up when the other process modifies the
table ?

Is there a way to explictly flush cached plans ?  Is there a way to
prevent the caching of the plan in the stored procedure ?

Thanks

Re: Data corruption/loss when altering tables (fwd)

From
Michael Fuhr
Date:
On Mon, Nov 22, 2004 at 12:14:30PM -0500, Tom Lane wrote:

> The problem shown in your example is simply that the UPDATE in the
> plpgsql function has cached a plan that applies to the original format
> of the table.  You don't need to restart the server, but you do need to
> start a fresh session to get rid of the obsolete cached plan.

Would LOAD 'plpgsql' work?  Would that cause a fresh compile of the
function the next time it's called, resulting in a new cached plan?
It appears to work that way in simple tests -- are there situations
where that wouldn't work?

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Data corruption/loss when altering tables (fwd)

From
Michael Fuhr
Date:
On Mon, Nov 22, 2004 at 06:26:09PM +0000, Nicola Pero wrote:

> Is there a way to explictly flush cached plans ?  Is there a way to
> prevent the caching of the plan in the stored procedure ?

You can prevent plan caching in PL/pgSQL by using EXECUTE.  See the
"Executing Dynamic Commands" section in the PL/pgSQL chapter of the
documentation.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Data corruption/loss when altering tables (fwd)

From
Nicola Pero
Date:
>> Is there a way to explictly flush cached plans ?  Is there a way to
>> prevent the caching of the plan in the stored procedure ?
>
> You can prevent plan caching in PL/pgSQL by using EXECUTE.  See the
> "Executing Dynamic Commands" section in the PL/pgSQL chapter of the
> documentation.

Thanks!  I suppose that (rewriting all PL/pgSQL stored procedures which
might use tables which could be modified to do an EXECUTE instead of a
plain query) will do.

Thanks everyone for your help! :-)


NB: I'd recommend writing a large 'warning' somewhere in the PL/pgsql doc
about this because loosing data is really scary

Re: Data corruption/loss when altering tables (fwd)

From
Tom Lane
Date:
Michael Fuhr <mike@fuhr.org> writes:
> Would LOAD 'plpgsql' work?  Would that cause a fresh compile of the
> function the next time it's called, resulting in a new cached plan?

I think that would cause plpgsql to lose track of its entire function
table, which is a brute force way of doing that ... but it doesn't
really solve Nicola's problem, because the nasty part of this is
plans that are already cached by other backends.

            regards, tom lane

Re: Data corruption/loss when altering tables (fwd)

From
Michael Fuhr
Date:
On Mon, Nov 22, 2004 at 03:44:25PM -0500, Tom Lane wrote:
> Michael Fuhr <mike@fuhr.org> writes:
> > Would LOAD 'plpgsql' work?  Would that cause a fresh compile of the
> > function the next time it's called, resulting in a new cached plan?
>
> I think that would cause plpgsql to lose track of its entire function
> table, which is a brute force way of doing that ... but it doesn't
> really solve Nicola's problem, because the nasty part of this is
> plans that are already cached by other backends.

Yeah, I was just mentioning a way to avoid having to reconnect the
current session if you know you've altered a table.  In another
message I suggested using EXECUTE to prevent plans from being
cached -- is there a better way in the current implementation?

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/