Thread: contrib/snapshot

contrib/snapshot

From
Joel Jacobson
Date:
Happy new year fellow pgsql-hackers!

This is the first alpha release of a new hopefully quite interesting little tool, named "snapshot".

Feedback welcomed.

-- 
Best regards,

Joel Jacobson
Glue Finance



URL




DESCRIPTION

Take a snapshot or rollback all your stored procedures in your PostgreSQL database.



RATIONALE

Before reading any further, ask yourselves the following questions.

1.  Have you ever,
    a)  modified stored procedures in your production database and
    b)  thought it went OK because all your tests passed and
    c)  later on realized "something is wrong" and
    d)  not being able to find nor fix the bug immediately
        leaving you no other option than to do a revert?
    If so, go to step 2.
    If not, go to step 4.

2.  During the minutes/hours while your malfunctional patch made a mess
    in the production database, was there any user activity causing important
    writes to the database?
    If so, go to step 3.
    If not, go to step 4.

3.  Did you enjoy the revert experience in step 1?
    If so, go to step 4.
    If not, go to step 5.

4. Are any of the following statements TRUE?
    a) your application is not very database centric.
    b) your users won't stop using your service if you lose their data.
    c) your application is read-only.
    d) your application does not have a lot of user traffic.
    If so, lucky you!
    If not, you probably have a good solution to my problem already,
            I would highly appreciate if you wanted to share it with me,
            please contact me at joel@gluefinance.com.

5.  This proposed solution might be interesting for you.
    I would highly appreciate your feedback on how to improve it,
    please contact me at joel@gluefinance.com.



INTRODUCTION

snapshot can take a snapshot of all your database functions and objects
depending on them, such as constraints and views using functions.

snapshot can rollback to a previous snapshot without modifying any of your
data or tables. It will only execute the minimum set of drop/create commands
to carry out the rollback.

snapshot depends on the pgcrypto contrib package.



TERMINOLOGY

object type     objects of the same type are created and dropped the same way,
                i.e. they use the same functions to build proper create and
                drop SQL-commands.

object          is of an object type and has a SHA1 hash of its content
                consisting of two SQL-commands, one to create and another to
                drop the object.

revision        has a timestamp when it was created and a list of objects

snapshot        has a timestamp when it was taken and has a revision

active snapshot the last snapshot taken

take snapshot   create a new revision of all objects currently live in the
                database and then create a new snapshot if the revision
                is different compared to the active snapshot.

rollback        restores a previously taken snapshot



SYNOPSIS

-- 1. Take a snapshot.

    postgres=# SELECT * FROM snapshot();
     _snapshotid | _revisionid 
    -------------+-------------
               1 |           1
    (1 row)


-- 2. Take a snapshot.

    postgres=# SELECT * FROM snapshot();
     _snapshotid | _revisionid 
    -------------+-------------
               1 |           1
    (1 row)


-- 3. We notice nothing changed between step 1 and 2.


-- 4. Modify your functions.

    postgres=# CREATE FUNCTION myfunc() RETURNS VOID AS $$ $$ LANGUAGE sql;
    CREATE FUNCTION
    glue=# \df myfunc
                             List of functions
     Schema |  Name  | Result data type | Argument data types |  Type  
    --------+--------+------------------+---------------------+--------
     public | myfunc | void             |                     | normal
    (1 row)


-- 5. Take a snapshot.

    postgres=# SELECT * FROM snapshot();
     _snapshotid | _revisionid 
    -------------+-------------
               2 |           2
    (1 row)


-- 4. Rollback to snapshot 1.

    postgres=# SELECT * FROM snapshot(1);
     _snapshotid | _revisionid 
    -------------+-------------
               3 |           1
    (1 row)


-- 5. We notice the function we created in step 4 has been dropped.

    postgres=# \df myfunc
                           List of functions
     Schema | Name | Result data type | Argument data types | Type 
    --------+------+------------------+---------------------+------
    (0 rows)


-- 6. Rollback to snapshot 2.

    postgres=# SELECT * FROM snapshot(2);
     _snapshotid | _revisionid 
    -------------+-------------
               4 |           2
    (1 row)


-- 7. We notice the function we created in step 4 has been created.

    postgres=# \df myfunc
                             List of functions
     Schema |  Name  | Result data type | Argument data types |  Type  
    --------+--------+------------------+---------------------+--------
     public | myfunc | void             |                     | normal
    (1 row)

Re: contrib/snapshot

From
"David E. Wheeler"
Date:
On Dec 31, 2010, at 5:00 AM, Joel Jacobson wrote:

> Happy new year fellow pgsql-hackers!
>
> This is the first alpha release of a new hopefully quite interesting little
> tool, named "snapshot".
>
> Feedback welcomed.

This looks awesome, Joel! One question: Why the dependence on pg_crypto? If it's just for SHA1 support, and you're just
usingit to to create hashes of function bodies, I suspect that you could also use the core MD5() function, yes? 

Looks really cool.

Best,

David



Re: contrib/snapshot

From
Joel Jacobson
Date:
2010/12/31 David E. Wheeler <david@kineticode.com>
This looks awesome, Joel! One question: Why the dependence on pg_crypto? If it's just for SHA1 support, and you're just using it to to create hashes of function bodies, I suspect that you could also use the core MD5() function, yes?

Thanks for fast reply. My guests hate me becuase I had to escape from the dinner party, but I simply couldn't wait a whole year fixing this bug.
Commit.

--
Best regards,

Joel Jacobson
Glue Finance

Re: contrib/snapshot

From
"David E. Wheeler"
Date:
On Dec 31, 2010, at 10:15 AM, Joel Jacobson wrote:

> 2010/12/31 David E. Wheeler <david@kineticode.com>
> This looks awesome, Joel! One question: Why the dependence on pg_crypto? If it's just for SHA1 support, and you're
justusing it to to create hashes of function bodies, I suspect that you could also use the core MD5() function, yes? 
>
> Thanks for fast reply. My guests hate me becuase I had to escape from the dinner party, but I simply couldn't wait a
wholeyear fixing this bug. 

lol! Go enjoy the party. There was no rush -- and no bug! :-)

Best,

David




Re: contrib/snapshot

From
Simon Riggs
Date:
On Fri, 2010-12-31 at 14:00 +0100, Joel Jacobson wrote:

> This is the first alpha release of a new hopefully quite interesting
> little tool, named "snapshot".

Please call it something other than "snapshot". There's already about 3
tools called something similar and a couple of different meanings of the
term in the world of Postgres.

-- Simon Riggs           http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services



Re: contrib/snapshot

From
Joel Jacobson
Date:
2010/12/31 Simon Riggs <simon@2ndquadrant.com>
Please call it something other than "snapshot". There's already about 3
tools called something similar and a couple of different meanings of the
term in the world of Postgres.


Thanks, good point.
Renamed to fsnapshot.
Commit.

--
Best regards,

Joel Jacobson
Glue Finance

Re: contrib/snapshot

From
pasman pasmański
Date:
Hi. Will be useful to add a column with timestamp of the revision and
a comment can you do it? not today in order that your friends dont
kill you ..

-- 
Sent from my mobile device

------------
pasman


Re: contrib/snapshot

From
Joel Jacobson
Date:
2010/12/31 Simon Riggs <simon@2ndquadrant.com>
Please call it something other than "snapshot". There's already about 3
tools called something similar and a couple of different meanings of the
term in the world of Postgres.


Renamed the entire github project as well: https://github.com/gluefinance/fsnapshot

--
Best regards,

Joel Jacobson
Glue Finance

Re: contrib/snapshot

From
Jim Nasby
Date:
On Dec 31, 2010, at 1:35 PM, Joel Jacobson wrote:
> 2010/12/31 Simon Riggs <simon@2ndquadrant.com>
> Please call it something other than "snapshot". There's already about 3
> tools called something similar and a couple of different meanings of the
> term in the world of Postgres.
>
>
> Thanks, good point.
> Renamed to fsnapshot.

Is it actually limited to functions? ISTM this concept would be valuable for anything that's not in pg_class (in other
words,anything that doesn't have user data in it). 

Also, I'm not sure why this needs to be in contrib vs pgFoundry.
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net




Re: contrib/snapshot

From
Joel Jacobson
Date:
2011/1/2 Jim Nasby <jim@nasby.net>
> Renamed to fsnapshot.

Is it actually limited to functions? ISTM this concept would be valuable for anything that's not in pg_class (in other words, anything that doesn't have user data in it).

My ambition is to primarily support functions. Support for other object types are merely a necessary side-effect of the function dependencies.

Is there a matrix of all possible object types dependencies?
If not, for functions, is the following list correct?
   Object types which may depend on functions: constraints, views, triggers, any more?
   Functions may depend on: language, any more?

Instead of limiting the support to functions, perhaps it would make more sense to limit it to all non-data objects?
Is there a term for the group of object types not carrying any user data?
Which object types do carry user data? I can only think of tables and sequences, any other?

 
Also, I'm not sure why this needs to be in contrib vs pgFoundry.

Good point. It's actually in neither of them right now, it's only at github.com :) I merely used the prefix contrib/ in the subject line to indicate it's not a patch to the "core".

I do hope though it's possible to get a place for it in contrib/ at some time in the future, I think there is a chance quite a lot of users would appreciate a quicker, less error-prone way of handling these things.

This tool must be made extremely reliable, otherwise you won't feel safe using it in a production environment for deployment and revert purposes, which is my company's requirement.

I hope to achieve this by keeping a "bare minimum" approach to features, and making sure it only fulfills the objective:
1. take a snapshot of all non-data objects
2. <deploy code, test new code, or let time pass while other people make a mess in your database>
3. revert to previous snapshot without affecting any of the new data, generated in step 2

I put my faith in the reliability on system functions, such as pg_get_functiondef(), pg_get_viewdef() etc, to build proper create/drop commands for each object.
Even nicer would be if the pg_catalog provided functions to generate SQL create/drop commands for all non-data object types,
and to make sure _everything_ is included in the command, ensuring the object is created exactly the same,
currently pg_get_functiondef() does not restore the ownership of the function, which I had to append manually.

--
Best regards,

Joel Jacobson
Glue Finance

Re: contrib/snapshot

From
Joel Jacobson
Date:
2011/1/3 Joel Jacobson <joel@gluefinance.com>
2011/1/2 Jim Nasby <jim@nasby.net>
Is it actually limited to functions? ISTM this concept would be valuable for anything that's not in pg_class (in other words, anything that doesn't have user data in it).

Instead of limiting the support to functions, perhaps it would make more sense to limit it to all non-data objects?
Is there a term for the group of object types not carrying any user data?


My bad, I see you already answered both my questions.
So, it does make sense, and the term for non-data object types is therefore non-pg_class, non-class or perhaps non-relation objects?

--
Best regards,

Joel Jacobson
Glue Finance

Re: contrib/snapshot

From
Andrew Dunstan
Date:

On 01/02/2011 07:44 PM, Joel Jacobson wrote:
>
>     Also, I'm not sure why this needs to be in contrib vs pgFoundry.
>
>
> Good point. It's actually in neither of them right now, it's only at 
> github.com <http://github.com> :) I merely used the prefix contrib/ in 
> the subject line to indicate it's not a patch to the "core".
>
>

"contrib" in PostgreSQL means "a module maintained by the backend 
developers".

But it's not clear to me that there is any particular reason why this 
should be in contrib.

cheers

andrew


Re: contrib/snapshot

From
Jim Nasby
Date:
On Jan 2, 2011, at 6:50 PM, Joel Jacobson wrote:
> 2011/1/3 Joel Jacobson <joel@gluefinance.com>
> 2011/1/2 Jim Nasby <jim@nasby.net>
> Is it actually limited to functions? ISTM this concept would be valuable for anything that's not in pg_class (in
otherwords, anything that doesn't have user data in it). 
>
> Instead of limiting the support to functions, perhaps it would make more sense to limit it to all non-data objects?
> Is there a term for the group of object types not carrying any user data?
>
>
> My bad, I see you already answered both my questions.
> So, it does make sense, and the term for non-data object types is therefore non-pg_class, non-class or perhaps
non-relationobjects? 

The generic term for objects that keep their metadata in pg_class is "relation".

Actually, now that I think about it, existence in pg_class isn't a good determining factor, because there's stuff like
typesin there. 

Aside from tables and sequences, you might also want to exclude indexes, or at least provide the option to, since
rebuildingthem could take a significant amount of time. 
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net




Re: contrib/snapshot

From
Joel Jacobson
Date:
2011/1/3 Andrew Dunstan <andrew@dunslane.net>
"contrib" in PostgreSQL means "a module maintained by the backend developers".
But it's not clear to me that there is any particular reason why this should be in contrib.

Then I definitively think contrib is the only possible place for this module.
If the module will not be maintained by the backend developers, noone (including myself) will trust the module to perform the sensistive tasks in a mission critical production database.
Since the module depends on pg_catalog system tables, it's must be updated if the they would change in future versions of PostgreSQL, and I wouldn't trust any other team than the backend developers to do it.

I'm happy to continue hacking on the module until it's 100% working, stable, thoroughly tested and accepted by the backend developers.
It's not working 100% yet, for instance, I'm currently working on making sure objects are created/dropped in an order not breaking any dependencies.

--
Best regards,

Joel Jacobson
Glue Finance