Re: Patch: plan invalidation vs stored procedures - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Patch: plan invalidation vs stored procedures
Date
Msg-id 1219179833.7109.43.camel@huvostro
Whole thread Raw
In response to Re: Patch: plan invalidation vs stored procedures  ("Robert Haas" <robertmhaas@gmail.com>)
Responses Re: Patch: plan invalidation vs stored procedures  ("Robert Haas" <robertmhaas@gmail.com>)
Re: Patch: plan invalidation vs stored procedures  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, 2008-08-19 at 16:03 -0400, Robert Haas wrote:
> > Another thing I do not understand well is how people are expected to work in
> > 8.3 with a function based API, without hitting Skype problems. I'm having a
> 
> All database-driven applications have this problem.  Any time you have
> a database on the backend and interface code on the front-end, you
> need to keep in mind that it won't necessarily be possible to update
> the two of them simultaneously, especially if you have multiple
> back-ends and multiple front-ends, as you almost certainly do.  Even
> if PostgreSQL invalidated plans in the particular situation you're
> discussing, there would still be other problems.  You could add a new,
> non-NULLable column to a table before updating the code that insert
> into that table, or drop an old column that the code still counts on
> being able to access.
> 
> I handle these problems all the time by ordering the changes
> carefully. If I need to change a function API in an incompatible way,
> I change the NAME of the function as well as the type signature (eg.
> do_important_thing -> do_important_thing_v2).  Then I change the code.
>  Then I remove the old function once everything that relies on it is
> dead.

Not having plan invalidation forces you to have do_important_thing_v2
for do_important_thing even with no changes in source code, just for the
fact that do_part_of_important_thing() which it calls has changed.

An example -
you have functions

A) caller1() to callerN() which includes call to called1() 

B) one of these functions, say callerM() needs one more field returned
from called1(), so you either write a completely new function
called1_v2() with one more field and then update  callerM() to call
called1_v2()

C) now, to get rid of called1() you have to replace called1 with
called1_v2 also in all other functions caller1() to callerN() 

D) then you can drop called1()

if you missed one of callerx() functions (you can drop called1() even if
it is used, as postgreSQL does not check dependencies in functions) then
you have a non-functioning database, where even client reconnect won't
help, only putting called1() back.

If there is plan invalidation then you just change called1() to return
one more field and that's it - no juggling with C) and D) and generally
less things that can go wrong.

> Maybe in your particular environment plan invalidation for functions
> will solve most of the cases you care about, but I respectfully submit
> that there's no substitute for good release engineering. 

Nope, but the amount of release engineering (and deployment-time work)
you need to do depends a lot on fragility of the system.

The more arcane and fragile the system is, the more you need to rely on
external systems and procedures to keep it working.

Imagine how much harder it would be, if there were no transactions and
you had to ensure the right ordering of all changes by release process
only. You probably would end up doing several times more work and
temporary hacks and you would still be out of luck doing _any_
nontrivial updates while the systems are running 24/7.

> If you don't
> know exactly what functions are going to be created, modified, or
> dropped on your production servers during each release before you
> actually roll that release out...  

this is not about knowing this at all - this is about needing to change
less, about optimizing on work that does not need to be done if system
is smarter.

> you probably need to improve your internal documentation.

or improve the database system you use.

if you need to change less functions, you also need less documentation
about changes. if you can prove that "select a,b from f()" always
returns the same data as "select a,b from f_b2()" then you don't need to
write f_b2() at all, you just redefine f() and can also skip migrating
all callers of f() to f_v2() just to satisfy your databases quirks.

---------------
Hannu





pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: A smaller default postgresql.conf
Next
From: Joshua Drake
Date:
Subject: Re: A smaller default postgresql.conf