Thread: temporal support patch
Hi all,
As part of my master's thesis I have created a temporal support patch for PostgreSQL. It enables the creation of special temporal tables with entry versioning. Modifying operations (UPDATE, DELETE, TRUNCATE) on these tables don't cause permanent changes to entries, but create new versions of them. Thus the user can easily get at past states of the table.
Basic information on temporal databases can be found on http://en.wikipedia.org/wiki/Temporal_database
In the field of temporal databases, only proprietary solutions are available. During my analysis I found these:
- IBM DB2 10 for z/OS
- Oracle 11g Workspace Manager
- Teradata Database 13.10
The primary goal of my work was to create an open-source solution that is easy to use and backward compatible with existing applications, so that converting original tables to temporal ones does not require changes to the applications that work with them. This patch is built on the SQL/Temporal draft standard, with some minor modifications inspired by commercial temporal database systems. Currently it deals only with transaction time support.
Here is a simple description of how it works:
1. A user can create a transaction time table using a modified CREATE TABLE command:
CREATE TABLE person(name varchar(50)) AS TRANSACTIONTIME;
This command automatically creates all objects required for transaction time support:
List of relations
Schema | Name | Type | Owner
--------+----------------------+----------+----------
public | person | table | tester
public | person__entry_id_seq | sequence | tester
public | person_hist | table | postgres
Table "public.person"
Column | Type | Modifiers
------------+-----------------------------+------------------------------------------------------------------------------
name | character varying(50) |
_entry_id | bigint | not null default nextval('person__entry_id_seq'::regclass)
_sys_start | timestamp without time zone | not null default clock_timestamp()
_sys_end | timestamp without time zone | not null default '294276-12-31 23:59:59.999999'::timestamp without time zone
Indexes:
"person__entry_id_idx" btree (_entry_id)
"person__sys_start__sys_end_idx" btree (_sys_start, _sys_end)
Table "public.person_hist"
Column | Type | Modifiers
------------+-----------------------------+-----------
name | character varying(50) |
_entry_id | bigint | not null
_sys_start | timestamp without time zone | not null
_sys_end | timestamp without time zone | not null
Indexes:
"person_hist__entry_id_idx" btree (_entry_id)
"person_hist__sys_start__sys_end_idx" btree (_sys_start, _sys_end)
The person table stores the current versions of entries. Three additional columns are added:
_entry_id - ID of the entry; it groups together different versions of an entry.
_sys_start - beginning of the version validity period (version creation timestamp).
_sys_end - end of the version validity period.
The person_hist table stores historical versions of entries. It has the same structure and indexes as the person table, but without any constraints or default values.
2. Another way to create a transaction time table is to add transaction time support to an existing standard table using an ALTER command:
CREATE TABLE person(name varchar(50));
ALTER TABLE person ADD TRANSACTIONTIME;
3. INSERT entry
INSERT INTO person VALUES('Jack');
SELECT *, _entry_id, _sys_start, _sys_end FROM person;
name | _entry_id | _sys_start | _sys_end
------+-----------+----------------------------+------------------------------
Jack | 1 | 2012-05-16 22:11:39.856916 | 294276-12-31 23:59:59.999999
4. UPDATE entry
UPDATE person SET name = 'Tom';
SELECT *, _entry_id, _sys_start, _sys_end FROM person;
name | _entry_id | _sys_start | _sys_end
------+-----------+----------------------------+------------------------------
Tom | 1 | 2012-05-16 22:11:44.736195 | 294276-12-31 23:59:59.999999
SELECT * FROM person_hist;
name | _entry_id | _sys_start | _sys_end
------+-----------+----------------------------+----------------------------
Jack | 1 | 2012-05-16 22:11:39.856916 | 2012-05-16 22:11:44.736194
5. DELETE entry
DELETE FROM person;
SELECT *, _entry_id, _sys_start, _sys_end FROM person;
name | _entry_id | _sys_start | _sys_end
------+-----------+------------+----------
SELECT * FROM person_hist;
name | _entry_id | _sys_start | _sys_end
------+-----------+----------------------------+----------------------------
Jack | 1 | 2012-05-16 22:11:39.856916 | 2012-05-16 22:11:44.736194
Tom | 1 | 2012-05-16 22:11:44.736195 | 2012-05-16 22:14:33.875869
6. selecting entries
INSERT INTO person VALUES('Mike');
INSERT INTO person VALUES('Mike');
--standard SELECT - operates only with current versions of entries
SELECT * FROM person;
name
------
Mike
Mike
--special temporal SELECT which operates with all versions
NONSEQUENCED TRANSACTIONTIME SELECT *, _entry_id, _sys_start, _sys_end FROM person;
name | _entry_id | _sys_start | _sys_end
------+-----------+----------------------------+------------------------------
Mike | 3 | 2012-05-16 22:20:55.055671 | 294276-12-31 23:59:59.999999
Mike | 2 | 2012-05-16 22:20:51.619475 | 294276-12-31 23:59:59.999999
Jack | 1 | 2012-05-16 22:11:39.856916 | 2012-05-16 22:11:44.736194
Tom | 1 | 2012-05-16 22:11:44.736195 | 2012-05-16 22:14:33.875869
--special temporal SELECT which operates with versions valid in specified time
TRANSACTIONTIME AS OF '2012-05-16 22:11:39.856916' SELECT *, _entry_id, _sys_start, _sys_end FROM person;
name | _entry_id | _sys_start | _sys_end
------+-----------+----------------------------+----------------------------
Jack | 1 | 2012-05-16 22:11:39.856916 | 2012-05-16 22:11:44.736194
--it is also possible to set the timestamp globally for a session; all subsequent SELECTs without a temporal modifier will operate on versions valid at that time
SET history_timestamp TO '2012-05-16 22:11:39.856916';
SELECT * FROM person;
name
------
Jack
--to select only current versions while history_timestamp is set, CURRENT TRANSACTIONTIME has to be used with SELECT
CURRENT TRANSACTIONTIME SELECT * FROM person;
name
------
Mike
Mike
This is only an illustration of the main functionality. Later I can create a document about the design and implementation details, but first I need to know whether temporal features such as those described here could be added to future versions of PostgreSQL, once all the requirements for a new patch are met.
Regards
Miroslav Simulcik
Hello

what is the conformance of your solution with the temporal extension in ANSI SQL 2011?

http://www.slideshare.net/CraigBaumunk/temporal-extensions-tosql20112012010438

Regards

Pavel Stehule

2012/5/16 Miroslav Šimulčík <simulcik.miro@gmail.com>:
> Hi all,
> as a part of my master's thesis I have created temporal support patch for PostgreSQL. [...]
Hello.
Basically my implementation serves the same purpose as the table_log extension - entry versioning. It also uses a history table and triggers to store old versions of entries. After a quick review of the table_log extension, here is a comparison with my solution:
table_log advantages compared to my solution:
- no columns added to the original table
- more control over created objects
advantages of my solution compared to table_log:
- built-in syntax
- more user-friendly - everything is done automatically
- no data redundancy - in my extension, current versions of entries are stored only once, in the original table (in table_log, entries are inserted into both the original and the log table)
- no speed degradation for INSERT statement - no triggers called on INSERT
- faster triggers - triggers in my implementation are based on referential integrity triggers and use prepared statements
- alterations to the original table are automatically applied to the history table as well
- easier selection of data valid at a specific time - each row contains its start and end time
- easier restoration of a past state into a table - TRANSACTIONTIME AS OF <expression> SELECT * INTO restoration_table FROM original_table; (see the example below)
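For example, restoring the state of the person table as of a past moment might look like this (the timestamp and the target table name are illustrative):

TRANSACTIONTIME AS OF '2012-05-16 22:11:40' SELECT * INTO person_restored FROM person;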
Regards
Miroslav Simulcik
2012/5/17 A.M. <agentm@themactionfaction.com>
Hello!
On May 16, 2012, at 5:14 PM, Miroslav Šimulčík wrote:
> as a part of my master's thesis I have created temporal support patch for PostgreSQL. [...]
I currently use the table_log extension: http://pgfoundry.org/projects/tablelog/
Other than the built-in syntax, how does your implementation differ feature-wise?
Cheers,
M
Miroslav Šimulčík wrote:
> as a part of my master's thesis I have created temporal support patch for PostgreSQL. [...]

Most of that is something you could handle with triggers (not TRUNCATE, of course). What does your proposed feature do that a set of triggers wouldn't?

Yours,
Laurenz Albe
Hello.
The SQL 2011 standard wasn't available at the time I started this project, so I built my implementation on the older proposals TSQL2 and SQL/Temporal, which were the only ones available. None of these was accepted by the ANSI/ISO committees, however.
SQL 2011 uses a different syntax, which looks like the one IBM DB2 had been using even before the standard was published.
So my implementation differs in syntax, but the features are the same as those described in the "system versioned tables" part of the slideshow.
Regards
Miroslav Simulcik
2012/5/17 Pavel Stehule <pavel.stehule@gmail.com>
Hello
what is the conformance of your solution with the temporal extension in ANSI SQL 2011?
http://www.slideshare.net/CraigBaumunk/temporal-extensions-tosql20112012010438
Regards
Pavel Stehule
2012/5/16 Miroslav Šimulčík <simulcik.miro@gmail.com>:
> Hi all,
> as a part of my master's thesis I have created temporal support patch for PostgreSQL. [...]
Hi,
Yes, you are right: storing old versions of entries in a history table can be handled with triggers, and my solution also uses triggers to do that. The advantage of my implementation is that the user doesn't need to create all the necessary objects (triggers, history table, begin/end columns, ...) manually. They just need to use the AS TRANSACTIONTIME clause in the CREATE TABLE command, and everything else is done automatically by the backend.
I focused on backward compatibility with existing applications, so that if versioning is added to an original table, applications can run correctly without any changes. For example, columns for the beginning and end timestamps are added to the original table. This requires modifying the expansion of the star symbol '*' during SELECT statement processing, because the newly added (internal) columns have to be implicitly hidden. The same goes for INSERT commands without a column list.
Simply said, my implementation makes working with temporal tables easier and minimizes limitations for existing applications. Users can work with a temporal table just like with any other table and don't need to bother with triggers, columns, history tables, etc. For example, if you want to add a column to a table with versioning, you can use a simple ALTER TABLE ... ADD COLUMN, and the column is automatically added to the history table too (see the sketch below).
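Continuing the person example (a sketch; the propagation to the history table is what the patch automates):

ALTER TABLE person ADD COLUMN age integer;
-- the backend adds the same column to person_hist automatically,
-- so old and new versions keep an identical row structure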
Regards
Miroslav Simulcik
2012/5/18 Albe Laurenz <laurenz.albe@wien.gv.at>
Miroslav Šimulčík wrote:
> as a part of my master's thesis I have created temporal support patch for PostgreSQL. [...]

Most of that is something you could handle with triggers (not TRUNCATE, of course).
What does your proposed feature do that a set of triggers wouldn't?
Yours,
Laurenz Albe
On Wed, May 16, 2012 at 4:14 PM, Miroslav Šimulčík <simulcik.miro@gmail.com> wrote:
> This is only an illustration of the main functionality. [...] first I need to
> know whether temporal features such as those described here could be added to
> future versions of PostgreSQL [...]

How does this affect users *not* using this feature? I mean, does it affect performance? Probably not, but...

If you already have the patch (I understand from what you said that you do), add it to the first commitfest (https://commitfest.postgresql.org/action/commitfest_view?id=14), which will start on June 15. It is difficult to know whether something we haven't seen is acceptable or not (please add some description of the way you did it). If the patch is still not ready, then start by explaining the design you are going to use...

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: 24x7 support and training
Hello

2012/5/18 Miroslav Šimulčík <simulcik.miro@gmail.com>:
> SQL 2011 standard wasn't available at the time I started this project, so I built
> my implementation on the older proposals TSQL2 and SQL/Temporal [...]

I would like to see temporal functionality in pg, but only in SQL 2011 syntax. Using syntax from deprecated proposals makes no sense. I am not sure the history table concept is best from a performance point of view - it is simpler to implement, but you duplicate all indexes, and there will be a lot of redundant fields in the history table. An important question is the difference in cost between some non-trivial query on actual data and the same query on historic data.

Regards

Pavel Stehule
On 5/18/12 2:06 AM, Miroslav Šimulčík wrote:
> - no data redundancy - in my extension, current versions of entries are stored only once, in the original table (in table_log, entries are inserted into both the original and the log table)

That's not necessarily a benefit... it makes querying for both history *and* current data a lot more complex. Table inheritance might be an elegant solution to that, but I doubt you could just bolt that on top of what you've created.

The timestamp fields need to have timezone info. If you change the timezone for a connection you will get inconsistent results without it.

_sys_end should either be NULLable or, if it's going to have a magic value, that magic value should be "Infinity":

decibel@workbook.local=# create table t(t timestamptz);
CREATE TABLE
decibel@workbook.local=# insert into t values('infinity');
INSERT 0 1
decibel@workbook.local=# select * from t;
    t
----------
 infinity
(1 row)

--
Jim C. Nasby, Database Architect   jim@nasby.net
512.569.9461 (cell)                http://jim.nasby.net
On Wed, 2012-05-16 at 23:14 +0200, Miroslav Šimulčík wrote:
> as a part of my master's thesis I have created temporal support patch
> for PostgreSQL. [...]

I would be very interested to see this, thank you for working on it.

There are quite a few aspects to a temporal database system, and you are working on a system-maintained transaction-time historical table, right? Or are there other aspects to your proposal?

Some general comments:

* I'd very much like to see you make use of Range Types from 9.2; in particular, TSTZRANGE would be much better than holding two timestamps. If a standard requires you to display two timestamps in certain situations, perhaps you could use ranges internally and display the boundaries as timestamps when needed.

* There is other useful information that could be recorded, such as the user who inserted/updated/deleted the record.

* For some purposes, it's very useful to keep track of the columns that changed. For instance, a query like "show me any time a salary was changed over the last month" (or some other rare event) would be very slow to run if there was not some explicit annotation on the historical records (e.g. a "columns changed" bitmap or something).

* In general, I'm not fond of adorning queries with TRANSACTION TIME AS OF... kinds of things. Those constructs are redundant with a WHERE clause (on a range type, you'd use the "contains" operator). If a standard requires that, maybe it would be OK to allow such things as syntactic sugar.

* I do like having special DDL that creates the appropriate objects. That helps to guide users so they don't have to invent their own solution with triggers, etc.

* As Jim mentioned, it might make sense to use something resembling inheritance so that selecting from the historical table includes the current data (but with no upper bound for the range).

* It might make sense to hammer out as many of the details as we can with an extension. For instance, exactly what options will be available, what data types will be used, what objects will be created, the trigger code, etc. Then, it will be more obvious exactly what we need to add extra core support for (e.g. if we are going to use some inheritance-like mechanism), and what we need to add syntax sugar for.

I recommend that you start posting more detailed designs on http://wiki.postgresql.org

If you already have code, feel free to submit it for the next commitfest (http://commitfest.postgresql.org), but this is a relatively large project, so it will most likely take several commitfest cycles.

Regards,
Jeff Davis
It would be no problem to make my solution compatible with SQL 2011, but the standard is not freely available. Can anybody provide me with this standard?
2012/5/20 Pavel Stehule <pavel.stehule@gmail.com>
Hello
2012/5/18 Miroslav Šimulčík <simulcik.miro@gmail.com>:
> SQL 2011 standard wasn't available at the time I started this project, so I built
> my implementation on the older proposals TSQL2 and SQL/Temporal [...]

I would like to see temporal functionality in pg, but only in SQL 2011 syntax. Using syntax from deprecated proposals makes no sense. I am not sure the history table concept is best from a performance point of view - it is simpler to implement, but you duplicate all indexes, and there will be a lot of redundant fields in the history table. An important question is the difference in cost between some non-trivial query on actual data and the same query on historic data.
Regards
Pavel Stehule
2012/5/30 Jim Nasby <jim@nasby.net>
On 5/18/12 2:06 AM, Miroslav Šimulčík wrote:
> - no data redundancy - in my extension, current versions of entries are stored only once, in the original table (in table_log, entries are inserted into both the original and the log table)

That's not necessarily a benefit... it makes querying for both history *and* current data a lot more complex. Table inheritance might be an elegant solution to that, but I doubt you could just bolt that on top of what you've created.
Yes, querying for history data is more complex, but I focused on preserving the performance of queries on current data. That's the reason why I use a separate table for old versions.
Table inheritance is a very good idea, and it will not require much effort to use it in my solution. Currently, when a user queries the whole history of entries, the table reference in the FROM clause is replaced with a subselect that accesses data in both tables. For example, when a user executes the command:
NONSEQUENCED TRANSACTIONTIME SELECT * FROM person;
The command actually executed is:
SELECT * FROM (SELECT * FROM person UNION ALL SELECT * FROM person_hist) as person
Using table inheritance could make things simpler and more elegant, but I'm not sure how it affects performance. Would it yield a performance gain?
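For comparison, a minimal sketch of the inheritance variant (plain PostgreSQL inheritance, assuming nothing about the patch internals):

CREATE TABLE person_hist () INHERITS (person);

SELECT * FROM person;       -- scans person plus person_hist: all versions
SELECT * FROM ONLY person;  -- scans just the parent: current versions only

Note that this inverts the patch's defaults: an unqualified SELECT would return all versions, so keeping plain SELECTs current-only would still need a rewrite (to ONLY) in place of the UNION ALL.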
> The timestamp fields need to have timezone info. If you change the timezone for a connection you will get inconsistent results without it.
> _sys_end should either be NULLable or, if it's going to have a magic value, that magic value should be "Infinity".
Good point. I will use timestamp with time zone and the value 'infinity' instead of the max timestamp value.
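The internal columns would then presumably become something like this (a sketch, not actual patch output):

_sys_start timestamptz NOT NULL DEFAULT clock_timestamp(),
_sys_end   timestamptz NOT NULL DEFAULT 'infinity'

-- 'infinity' sorts after every finite timestamp, so validity checks such as
-- _sys_start <= t AND t < _sys_end keep working unchanged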
> * I'd very much like to see you make use of Range Types from 9.2; in
> particular, TSTZRANGE would be much better than holding two timestamps.
> If a standard requires you to display two timestamps in certain
> situations, perhaps you could use ranges internally and display the
> boundaries as timestamps when needed.
I agree, the new range types will be ideal for this.
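A sketch of what that could look like (tstzrange and the @> containment operator are 9.2 range-type features; the column name and timestamp are illustrative):

-- one range column instead of two timestamps:
_sys_period tstzrange NOT NULL DEFAULT tstzrange(clock_timestamp(), NULL)

-- "versions valid at time t" becomes a containment test:
SELECT * FROM person_hist WHERE _sys_period @> TIMESTAMPTZ '2012-05-16 22:11:40';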
> * There is other useful information that could be recorded, such as the
> user who inserted/updated/deleted the record.
Yes, I considered adding user ID and transaction ID columns, because they can be useful in some cases (for example, to find all changes made by a transaction). However, they weren't necessary, so I omitted them. They can easily be added.
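One possible shape for such columns (my assumption, not part of the current patch; the xid in the query is illustrative):

_sys_user name   NOT NULL DEFAULT current_user,   -- role that made the change
_sys_xid  bigint NOT NULL DEFAULT txid_current()  -- transaction that made it

-- e.g. all changes made by one transaction:
SELECT * FROM person_hist WHERE _sys_xid = 12345;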
> * For some purposes, it's very useful to keep track of the columns that
> changed. For instance, a query like "show me any time a salary was
> changed over the last month" (or some other rare event) would be very
> slow to run if there was not some explicit annotation on the historical
> records (e.g. a "columns changed" bitmap or something).
Another useful feature. I can take a look at it.
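For illustration only - an array-based annotation rather than a bitmap (the salary column is hypothetical, borrowed from Jeff's example):

-- extra history column naming the columns each new version changed:
_cols_changed text[]

-- "show me any time a salary was changed over the last month":
SELECT * FROM person_hist
WHERE 'salary' = ANY (_cols_changed)
  AND _sys_start > now() - interval '1 month';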
> * As Jim mentioned, it might make sense to use something resembling
> inheritance so that selecting from the historical table includes the
> current data (but with no upper bound for the range).
See reply to Jim's post.
> * It might make sense to hammer out as many of the details as we can
> with an extension. For instance, exactly what options will be available,
> what data types will be used, what objects will be created, the trigger
> code, etc. Then, it will be more obvious exactly what we need to add
> extra core support for (e.g. if we are going to use some inheritance-like
> mechanism), and what we need to add syntax sugar for.
>
> I recommend that you start posting more detailed designs on
> http://wiki.postgresql.org
In which section of the wiki can I post the detailed design of my solution?
> If you already have code, feel free to submit it for the next commitfest
> ( http://commitfest.postgresql.org ), but this is a relatively large
> project, so it will most likely take several commitfest cycles.
I have a working patch for PostgreSQL version 9.0.4, but it needs refactoring before I can submit it, because some parts don't meet the formatting requirements yet. And yes, the changes are large, so it will be better to discuss the design first and then deal with the code. Do you insist on compatibility with the SQL 2011 standard, as Pavel wrote?
On Wed, Jun 13, 2012 at 4:10 PM, Miroslav Šimulčík <simulcik.miro@gmail.com> wrote:
> I have a working patch for PostgreSQL version 9.0.4, but it needs refactoring
> before I can submit it [...] Do you insist on compatibility with the SQL 2011
> standard, as Pavel wrote?

Standards compliance is always going to make things easier in terms of gaining community acceptance if you're targeting in-core adoption. At the very least it will remove one barrier, although you might be in for a slog for other reasons.

You may not have known this, but Postgres had a time travel feature waaay back in the day (see: http://www.postgresql.org/docs/6.3/static/c0503.htm). It was removed for performance reasons, and the first thing I'm wondering is how your stuff performs in various scenarios, among various other interesting things.

Also, +1 on use of range types.

Anyways, thanks for submitting and good luck!

merlin
On Wed, 2012-06-13 at 23:10 +0200, Miroslav Šimulčík wrote:
> I have a working patch for PostgreSQL version 9.0.4, but it needs refactoring
> before I can submit it [...] Do you insist on compatibility with the SQL 2011
> standard, as Pavel wrote?

Try to work on solving the problem and identify the use cases. I don't think the standard will cause a major problem; we should be able to make the relevant parts of your patch match the standard.

That's one reason to work on it as an extension first: we can get a better sense of the problem space and the various use cases without worrying about violating any standard. Then, as you need specific backend support (e.g. special syntax), we can take the standards more seriously.

Regards,
Jeff Davis
On 06/15/2012 03:59 PM, Jeff Davis wrote:
> Try to work on solving the problem and identify the use cases. I don't
> think the standard will cause a major problem; we should be able to make
> the relevant parts of your patch match the standard. [...]

What's wrong with the SPI/timetravel extension for system versioning?
http://www.postgresql.org/docs/9.1/static/contrib-spi.html

We are heavily using system-versioned and application-time period tables in our enterprise products (most of them are bi-temporal). However, our implementation is based on triggers and views and is therefore not very convenient to use. There are also some locking issues with foreign keys to application-time period tables. It would be great if the new temporal SQL features were included in the PostgreSQL core with SQL 2011 syntax support. It is especially important for bi-temporal tables because of the complex internal logic of UPDATE/DELETE and the huge SELECT queries for such tables.
On Mon, 2012-06-18 at 19:34 +0900, Vlad Arkhipov wrote:
> What's wrong with the SPI/timetravel extension for system versioning?
> http://www.postgresql.org/docs/9.1/static/contrib-spi.html [...]

I've already pointed out some missing features in this thread, but the big ones in my mind are:

1. It doesn't use 9.2 Range Types, which would help in a lot of ways (like making the SELECT queries a lot simpler and faster).

2. It's missing a lot of options, like storing the user that modified a row or the changed columns.

Regards,
Jeff Davis
On 05/31/2012 11:52 AM, Jeff Davis wrote:
> On Wed, 2012-05-16 at 23:14 +0200, Miroslav Šimulčík wrote:
>> Hi all,
>>
>> as a part of my master's thesis I have created temporal support patch
>> for PostgreSQL. It enables the creation of special temporal tables
>> with entries versioning. Modifying operations (UPDATE, DELETE,
>> TRUNCATE) on these tables don't cause permanent changes to entries,
>> but create new versions of them. Thus user can easily get to the past
>> states of the table.
>
> I would be very interested to see this, thank you for working on it.
>
> There are quite a few aspects to a temporal database system, and you are
> working on a system-maintained transaction-time historical table, right?
> Or are there other aspects to your proposal?
>
> Some general comments:
>
> * I'd very much like to see you make use of Range Types from 9.2; in
> particular, TSTZRANGE would be much better than holding two timestamps.
> If a standard requires you to display two timestamps in certain
> situations, perhaps you could use ranges internally and display the
> boundaries as timestamps when needed.

It's not sufficient to store only a period of validity for a row. If two transactions that started at the same time change the same record, you have a problem with the TSTZRANGE type because it's normalized to an empty interval. The other issue is how to handle multiple changes of the same record within a transaction. Should they be stored or not? Also it's necessary to store some kind of operation type that was applied to the record (insert/update/delete). For example, there is a table with one record with validity period [0, ) and value 'A'.

First way:
1. Delete this record at time 1; now there is ([0, 1), A) in the history table.
2. Insert a new record at time 1; now there is ([0, 1), A) in the history table and ([1, ), B) in the current data table.

Second way:
1. Update this record at time 1; now there is ([0, 1), A) in the history table and ([1, ), B) in the current data table.

So you have the same data in the tables, but the actions that led to this configuration were different, and part of the history has been lost.

> * There is other useful information that could be recorded, such as the
> user who inserted/updated/deleted the record.

I'm not sure that the database user is the proper thing to store in the history table. Many applications connect to a database using some virtual user and have their own users/roles tables to handle privileges. There should be some way to substitute the application's user for the stored one in the history table. It's also helpful to store the transaction id that inserted/updated/deleted the record.

> * For some purposes, it's very useful to keep track of the columns that
> changed. For instance, a query like "show me any time a salary was
> changed over the last month" (or some other rare event) would be very
> slow to run if there was not some explicit annotation on the historical
> records (e.g. a "columns changed" bitmap or something).

It's a great proposal, but it seems impossible to implement with a triggers-only solution, doesn't it? Are there any hooks on ALTER TABLE ... in PostgreSQL to update the changed-columns bitmaps when the table structure changes?

> * In general, I'm not fond of adorning queries with TRANSACTION TIME AS
> OF... kinds of things. Those constructs are redundant with a WHERE
> clause (on a range type, you'd use the "contains" operator).
> If a standard requires that, maybe it would be OK to allow such things
> as syntactic sugar.

In SQL2011 there is only one table with all the data, historical and current. So it's not very convenient to specify a WHERE condition on system time everywhere and for all tables in the query. By default only the current data is selected with a query like SELECT * FROM table.

> * As Jim mentioned, it might make sense to use something resembling
> inheritance so that selecting from the historical table includes the
> current data (but with no upper bound for the range).

We have had good experience with inheritance in our trigger-based solution. It's completely transparent to existing applications and does not have any impact on performance.
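A sketch of that inheritance arrangement (table names are hypothetical): the current table inherits from the history table, so scanning the history table's name covers both, while the current table alone stays cheap:

CREATE TABLE foo_history (
    val        text,
    _sys_start timestamp NOT NULL,
    _sys_end   timestamp NOT NULL
);

CREATE TABLE foo () INHERITS (foo_history);

-- SELECT * FROM foo               -> current rows only
-- SELECT * FROM ONLY foo_history  -> historical rows only
-- SELECT * FROM foo_history       -> historical and current rows together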
On Mon, 2012-06-25 at 17:46 +0900, Vlad Arkhipov wrote:
> It's not sufficient to store only a period of validity for a row. If two
> transactions that started at the same time change the same record, you
> have a problem with the TSTZRANGE type because it's normalized to an
> empty interval.

That's an interesting point. Let's say you tried setting it to [T1, T2) where T1 is the time of the last transaction to update it and T2 is the time of the current transaction. If T2 <= T1, then TSTZRANGE will throw an error, not store the empty interval. And we don't want to store the empty interval, because it would be a lie. There could have been some transaction T3 that happened during T2 that saw the value from T1, so saying that there were no times where that was visible to the system is false.

Throwing an error allows you to retry T2, which should allow a microsecond or so to pass, and the problem should resolve itself (assuming your clock didn't move backwards, which is a different issue).

We could also argue about the start versus end times of transactions, and snapshot acquisition times, because that could cause confusion if there are long-running transactions. It might be a good reason to store the modifying transaction ID as well, but then you get into transaction wraparound problems.

> The other issue is how to handle multiple changes of the same record
> within a transaction. Should they be stored or not?

In a typical audit log, I don't see any reason to. The internals of a transaction should be implementation details; invisible to the outside, right?

> Also it's necessary to store some kind of operation type that was
> applied to the record (insert/update/delete). For example, there is a
> table with one record with validity period [0, ) and value 'A'.
>
> First way:
> 1. Delete this record at time 1; now there is ([0, 1), A) in the
> history table.
> 2. Insert a new record at time 1; now there is ([0, 1), A) in the
> history table and ([1, ), B) in the current data table.
>
> Second way:
> 1. Update this record at time 1; now there is ([0, 1), A) in the
> history table and ([1, ), B) in the current data table.
>
> So you have the same data in the tables, but the actions that led to
> this configuration were different, and part of the history has been
> lost.

Right. Those are yet more possible options that people might want for an audit log.

> > * There is other useful information that could be recorded, such as the
> > user who inserted/updated/deleted the record.
>
> I'm not sure that the database user is the proper thing to store in the
> history table. Many applications connect to a database using some
> virtual user and have their own users/roles tables to handle privileges.
> There should be some way to substitute the application's user for the
> stored one in the history table. It's also helpful to store the
> transaction id that inserted/updated/deleted the record.

If the system is recording it for audit purposes, then it better be sure that it's true. You can't allow the application to pick and choose what gets stored there. While it may be true that many applications just all use the same DB user, if you want an audit log that includes user information you have to let the DB do some authentication.

> It's a great proposal, but it seems impossible to implement with a
> triggers-only solution, doesn't it? Are there any hooks on ALTER
> TABLE ... in PostgreSQL to update the changed-columns bitmaps when the
> table structure changes?
Column numbers are never reused, so I think it would be stable. But if you do need to be notified of schema changes, the new event triggers mechanism may be able to do that.

> In SQL2011 there is only one table with all the data, historical and
> current. So it's not very convenient to specify a WHERE condition on
> system time everywhere and for all tables in the query. By default only
> the current data is selected with a query like SELECT * FROM table.

If there is some syntax that offers a convenient shorthand for WHERE, that's fine with me. Or using two tables, one called foo and one called foo_history, is also fine. But I don't want the DML syntax to introduce new mechanisms that aren't available without the fancy syntax (though new DDL arrangements might be fine).

Regards,
Jeff Davis
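A quick illustration of the TSTZRANGE behavior discussed above: the constructor rejects strictly reversed bounds outright rather than silently normalizing to an empty range (the timestamps here are made up):

SELECT tstzrange('2012-08-20 10:00:02+00', '2012-08-20 10:00:01+00');
-- ERROR:  range lower bound must be less than or equal to range upper bound

That error is what gives the retry opportunity described above: the failing transaction can simply try again a moment later with a larger T2.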
> If there is some syntax that offers a convenient shorthand for WHERE,
> that's fine with me. Or using two tables, one called foo and one called
> foo_history, is also fine. But I don't want the DML syntax to introduce
> new mechanisms that aren't available without the fancy syntax (though
> new DDL arrangements might be fine).

The foo_history convention is used often already, so it could be a significant problem for migration.

Regards
Pavel Stehule
On Sun, Aug 19, 2012 at 6:28 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>> The other issue is how to handle multiple changes of the same record
>> within a transaction. Should they be stored or not?
>
> In a typical audit log, I don't see any reason to. The internals of a
> transaction should be implementation details; invisible to the outside,
> right?

I'm not convinced.

>> I'm not sure that the database user is the proper thing to store in the
>> history table. Many applications connect to a database using some
>> virtual user and have their own users/roles tables to handle
>> privileges. There should be some way to substitute the application's
>> user for the stored one in the history table. It's also helpful to
>> store the transaction id that inserted/updated/deleted the record.
>
> If the system is recording it for audit purposes, then it better be sure
> that it's true. You can't allow the application to pick and choose what
> gets stored there.

That position would render this feature useless for every application for which I would otherwise have used it. I think it's just nonsense to talk about what we can or can't let the user do. The user is in charge, and our job is to allow him to do what he wants to do more easily, not to dictate what he must do.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> owner@postgresql.org] On Behalf Of Robert Haas
> Sent: Monday, August 20, 2012 5:04 PM
> To: Jeff Davis
> Cc: Vlad Arkhipov; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] temporal support patch
>
> On Sun, Aug 19, 2012 at 6:28 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> >> The other issue is how to handle multiple changes of the same record
> >> within a transaction. Should they be stored or not?
> >
> > In a typical audit log, I don't see any reason to. The internals of a
> > transaction should be implementation details; invisible to the
> > outside, right?
>
> I'm not convinced.

Ideally the decision of whether to do so would be a client decision. Not storing intra-transaction changes is easier than storing all changes. At worst you could stage up all changes and then simply fail to store the intermediate results within a given relation. In that case you gain nothing in execution performance, but you save both storage and interpretive resources. So the question becomes: is it worth doing without the ability to store intermediate results? If you were to ponder both, which setup would be the default? If the default is the harder one (all statements) to implement, then to avoid upgrade issues the syntax should specify that it is logging transactions only.

Random, somewhat related, thought: I do all my work on a temporary staging table and then, as my final action, insert the resultant records into a separate live table and drop the temporary table. Further changes to said record I perform by deleting the original and then inserting a new record (from staging again) with all the values changed. Obviously this has limitations with respect to foreign keys and such, but it is possible. What happens to the audit log if the PK changes, and if it does not change? Are there any other implications that need to be addressed, or is it like giving a loaded gun to someone and trusting them to use it responsibly?

> >> I'm not sure that the database user is the proper thing to store in
> >> the history table. Many applications connect to a database using some
> >> virtual user and have their own users/roles tables to handle
> >> privileges. There should be some way to substitute the application's
> >> user for the stored one in the history table. It's also helpful to
> >> store the transaction id that inserted/updated/deleted the record.
> >
> > If the system is recording it for audit purposes, then it better be
> > sure that it's true. You can't allow the application to pick and
> > choose what gets stored there.
>
> That position would render this feature useless for every application for
> which I would otherwise have used it. I think it's just nonsense to talk about
> what we can or can't let the user do. The user is in charge, and our job is to
> allow him to do what he wants to do more easily, not to dictate what he must
> do.

I see the "user" element as having two components:

"Client" - what device/channel/"user" was used to connect to the database - PostgreSQL Role
"User" - relative to that "client", which actual "user" performed the action - Application Specified

A PostgreSQL role would correspond to "client", whereas the application would be allowed to have full control of what "User" value is stored. This gets a little complicated with respect to "SET ROLE" but gets close to the truth.
The idea is that you look at the "client" to determine the "namespace" over which the "user" is defined and identified. So, a better way to phrase the position is: "You cannot allow the application to choose what is stored to identify itself (client)" - i.e., its credentials identify who it is, and those are stored without consulting the application. At that point you've basically shifted responsibility for the correctness of the audit log onto that application and away from the database. However, you do provide a place for the application to store an identifier that it is able to resolve to a user if necessary. This is an arbitrary two-layer hierarchy, and while conceptually anything with two layers may want more, I am not sure whether the extra complexity that would entail would be worth the effort. Depending on what kinds of information you allow to be stored for "User", it becomes something that can be modeled when desired and ignored otherwise.

The issue with adding the PostgreSQL role to the database in this way is that you can now never delete that role or reassign it to another entity. I guess with temporal support you could do so, and basically have the identity-role relationship defined over specific periods of time... I can imagine (and have imagined) a whole level of indirection and association to be able to reasonably handle assigning and storing permanent identities while allowing logon credentials to remain outside of permanent storage.

David J.
On 8/20/12 4:17 PM, David Johnston wrote:
> The issue with adding the PostgreSQL role to the database in this way is
> that you can now never delete that role or reassign it to another entity.
> I guess with temporal support you could do so, and basically have the
> identity-role relationship defined over specific periods of time... I can
> imagine (and have imagined) a whole level of indirection and association
> to be able to reasonably handle assigning and storing permanent identities
> while allowing logon credentials to remain outside of permanent storage.

This is sounding like a completely runaway spec on what should be a simple feature.

If you want something in core which will be useful to a lot of our users, it needs to be simple and flexible, not ornate with lots of dependencies. The first version of it should be as simple and minimalist as possible.

Personally, I would prefer a tool which just made it simpler to build my own triggers, and made it automatic for the history table to track changes in the live table. I think anything we build which controls what goes into the history table, etc., will only narrow the user base.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Josh Berkus <josh@agliodbs.com> wrote:
> This is sounding like a completely runaway spec on what should be
> a simple feature.

I hate to contribute to scope creep (or in this case scope screaming down the tracks at full steam), but I've been watching this with a queasy feeling about interaction with Serializable Snapshot Isolation (SSI). Under SSI the apparent order of execution is not always the transaction commit order, or the transaction start order. So a temporal database would be vulnerable to seeing anomalies like this one unless rw-conflicts (as tracked with predicate locks) are considered:

http://wiki.postgresql.org/wiki/SSI#Deposit_Report

This raises something I talked vaguely about in Ottawa this year, although it was pretty much at the hand-waving stage and I don't know how well I got the idea across. I've been thinking about the problems with all the various replication technologies being able to present data consistent with serializable transactions, and have the outlines of a technique I think might be more palatable to the community than those previously discussed. Basically, it would involve generating a list of committed XIDs in *apparent order of execution*, and creating snapshots on the replicas based on that instead of just the master's transaction commit order. I've been trying to work through the details to the point where I can present a coherent write-up on it.

I wouldn't want to hold up a feature like temporal queries on the basis that it didn't immediately play nice with SSI, but it seems like it would be a good thing if the view of the past wasn't too strictly tied to transaction commit sequence; a little bit of abstraction there might save a lot of pain in tying these features together. Maybe something along the lines of a transaction visibility sequence number, or *maybe* a timestamptz works, as long as that can be fudged to a time after the commit time for transactions involved in rw-conflicts with concurrent transactions. (I'm not sure microsecond resolution works for other reasons, but if it does...) I think either could work.

-Kevin
On Mon, 2012-08-20 at 17:04 -0400, Robert Haas wrote:
> On Sun, Aug 19, 2012 at 6:28 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> >> The other issue is how to handle multiple changes of the same record
> >> within a transaction. Should they be stored or not?
> >
> > In a typical audit log, I don't see any reason to. The internals of a
> > transaction should be implementation details; invisible to the outside,
> > right?
>
> I'm not convinced.

As I understand it, we are talking about recording data changes in one table to another table. Auditing of reads or the logging of raw statements seem like very different kinds of projects to me; but tell me if you think differently.

So if we are recording data changes, I don't see much point in recording uncommitted changes. Perhaps my imagination is failing, and someone else can fill me in on a use case. I'm also struggling with the semantics: if we record uncommitted changes, do we record them even if the transaction aborts? If so, what guarantees do we offer about the change actually being recorded?

> >> I'm not sure that the database user is the proper thing to store in
> >> the history table. Many applications connect to a database using some
> >> virtual user and have their own users/roles tables to handle
> >> privileges. There should be some way to substitute the application's
> >> user for the stored one in the history table. It's also helpful to
> >> store the transaction id that inserted/updated/deleted the record.
> >
> > If the system is recording it for audit purposes, then it better be sure
> > that it's true. You can't allow the application to pick and choose what
> > gets stored there.
>
> That position would render this feature useless for every application
> for which I would otherwise have used it.

We could offer a GUC like "audit_context" or "audit_app_context" that takes a text string, and the audit log would record the value stored in that GUC along with the data changes in question.

The main thing I object to is an implication that the system is vouching for some particular fact that is supplied by a userset GUC. Remember, there are guaranteed to be application-level problems that allow these GUCs to get set improperly for all kinds of reasons. We don't want bug reports along the lines of "security breach! PG allows application_name to be spoofed in the audit log!". Also, I'd prefer not to use existing GUCs, because there may be all kinds of other reasons that people set existing GUCs, and we want them to be able to handle the audit_context one more carefully and have a clear warning in the documentation.

> I think it's just nonsense
> to talk about what we can or can't let the user do. The user is in
> charge, and our job is to allow him to do what he wants to do more
> easily, not to dictate what he must do.

Remember that the users who depend on the veracity of the audit log are users, too. Let's try to serve both classes of user if we can.

Regards,
Jeff Davis
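A sketch of how that could look today (the setting name audit.context and all table/function names here are hypothetical; as of 9.2 any dotted setting name can be SET without prior declaration):

CREATE TABLE audit_log (
    table_name  text,
    op          text,
    changed_at  timestamptz,
    db_user     text,         -- vouched for by the database
    app_context text          -- supplied by the application, not verified
);

CREATE OR REPLACE FUNCTION audit_dml() RETURNS trigger AS $$
BEGIN
    INSERT INTO audit_log
    VALUES (TG_TABLE_NAME, TG_OP, clock_timestamp(), session_user,
            current_setting('audit.context'));  -- errors if unset; a real
                                                -- version would default it
    RETURN NULL;  -- AFTER trigger: return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER person_audit
    AFTER INSERT OR UPDATE OR DELETE ON person
    FOR EACH ROW EXECUTE PROCEDURE audit_dml();

-- the application tags its session before making changes:
SET audit.context = 'app_user=jsmith';

Keeping the system-supplied column (session_user) and the application-supplied column separate is what avoids implying that the database vouches for the latter.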
On Mon, 2012-08-20 at 19:17 -0400, David Johnston wrote:
> Ideally the decision of whether to do so would be a client decision.
> Not storing intra-transaction changes is easier than storing all
> changes. At worst you could stage up all changes and then simply fail
> to store the intermediate results within a given relation. In that case
> you gain nothing in execution performance, but you save both storage
> and interpretive resources. So the question becomes: is it worth doing
> without the ability to store intermediate results? If you were to
> ponder both, which setup would be the default? If the default is the
> harder one (all statements) to implement, then to avoid upgrade issues
> the syntax should specify that it is logging transactions only.

I think the biggest question here is what guarantees can be offered. What if the transaction aborts after having written some data - does the audit log still get updated?

> I see the "user" element as having two components:

I think this is essentially a good idea, although as I said in my other email, we should be careful how we label the application-supplied information in the audit log.

Regards,
Jeff Davis
On Mon, 2012-08-20 at 16:32 -0700, Josh Berkus wrote:
> This is sounding like a completely runaway spec on what should be a
> simple feature.

My feeling as well. However, we will eventually want to coalesce around some best practices and make it easy and robust for "typical" cases.

> Personally, I would prefer a tool which just made it simpler to build my
> own triggers, and made it automatic for the history table to track
> changes in the live table. I think anything we build which controls
> what goes into the history table, etc., will only narrow the user base.

That sounds like a good way to start. Actually, even before the tool, how about just some really good examples of triggers for specific kinds of audit logs, and some ways to run queries on them? I think that might settle a lot of these details.

Regards,
Jeff Davis
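As one such example, a minimal hand-rolled version-tracking trigger for the person/person_hist pair from Miroslav's original mail might look like this (a sketch only; it ignores TRUNCATE and makes no claim about how the patch itself does it):

CREATE OR REPLACE FUNCTION person_track_history() RETURNS trigger AS $$
BEGIN
    -- close out the old version into the history table
    INSERT INTO person_hist (name, _entry_id, _sys_start, _sys_end)
    VALUES (OLD.name, OLD._entry_id, OLD._sys_start, clock_timestamp());

    IF TG_OP = 'UPDATE' THEN
        NEW._sys_start := clock_timestamp();  -- new version starts now;
        RETURN NEW;                           -- _sys_end keeps its default
    END IF;
    RETURN OLD;  -- DELETE: let the current row go away
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER person_history
    BEFORE UPDATE OR DELETE ON person
    FOR EACH ROW EXECUTE PROCEDURE person_track_history();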
On Mon, 2012-08-20 at 19:32 -0500, Kevin Grittner wrote: > Josh Berkus <josh@agliodbs.com> wrote: > > > This is sounding like a completely runaway spec on what should be > > a simple feature. > > I hate to contribute to scope creep (or in this case scope screaming > down the tracks at full steam), but I've been watching this with a > queasy feeling about interaction with Serializable Snapshot > Isolation (SSI). There are all kinds of challenges here, and I'm glad you're thinking about them. I alluded to some problems here: http://archives.postgresql.org/message-id/1345415312.20987.56.camel@jdavis But those might be a subset of the problems you're talking about. It sounds like, at a high level, there are two problems: 1. capturing the apparent order of execution in the audit log 2. assigning meaningful times to the changes that are consistent with the apparent order of execution Regards,Jeff Davis
On 08/21/2012 12:52 PM, Jeff Davis wrote:
> On Mon, 2012-08-20 at 16:32 -0700, Josh Berkus wrote:
>> This is sounding like a completely runaway spec on what should be a
>> simple feature.
>
> My feeling as well. However, we will eventually want to coalesce around
> some best practices and make it easy and robust for "typical" cases.
>
>> Personally, I would prefer a tool which just made it simpler to build my
>> own triggers, and made it automatic for the history table to track
>> changes in the live table. I think anything we build which controls
>> what goes into the history table, etc., will only narrow the user base.

FWIW, I've found the ability to exclude columns from my history triggers to be important because of:

- "optimistic locking" fields used by some clients; and
- trigger-maintained summary fields

Without being able to apply some exclusions there's just too much churn in the history of some tables. Here's what I'm using at the moment:

http://wiki.postgresql.org/wiki/Audit_trigger_91plus

(I know storing both the relation oid and the text-form table and schema name is redundant. The text is handy if the table is dropped and recreated, though, and the oid is quicker & easier much of the time.)

I use both the per-query and per-row forms depending on the granularity I need.

-- Craig Ringer
On 08/21/2012 12:52 PM, Jeff Davis wrote:
> On Mon, 2012-08-20 at 16:32 -0700, Josh Berkus wrote:
>> Personally, I would prefer a tool which just made it simpler to build my
>> own triggers, and made it automatic for the history table to track
>> changes in the live table. I think anything we build which controls
>> what goes into the history table, etc., will only narrow the user base.
>
> That sounds like a good way to start. Actually, even before the tool,
> how about just some really good examples of triggers for specific kinds
> of audit logs

That reminds me: the single biggest improvement I can see for audit triggers would be to provide an _easy_ and _efficient_ way to test whether any fields have changed between OLD and NEW *except* for one or more ignored fields. Right now if I have a four-column table and I want to ignore UPDATEs to col2 for audit purposes, I have to write:

CREATE TRIGGER tablename_audit_insert_delete
AFTER INSERT OR DELETE ON sometable
FOR EACH ROW EXECUTE PROCEDURE audit_func();

CREATE TRIGGER tablename_audit_update_selective
AFTER UPDATE ON sometable
FOR EACH ROW
WHEN (OLD.col1 IS DISTINCT FROM NEW.col1
   OR OLD.col3 IS DISTINCT FROM NEW.col3
   OR OLD.col4 IS DISTINCT FROM NEW.col4)
EXECUTE PROCEDURE audit.if_modified_func();

... which is horrible for all sorts of reasons:

- If a column is added, the audit trigger also needs an update to test for it, otherwise it'll be ignored;
- It isn't explicit that "col2" is ignored; and
- I have to repeat my trigger definitions twice.

An alternative is to create hstores from OLD and NEW, delete the field of interest, and compare them. That's pretty slow, though, and may duplicate work done by the already-expensive audit trigger. What I'm imagining is something like:

row_equals_ignorecols(OLD, NEW, 'col2')

... which would solve half the problem, and is simple enough I could implement it with a little C function. A way to avoid splitting the trigger function definition and a built-in "compare rows except columns" would be great, though.

-- Craig Ringer
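The hstore variant mentioned above can at least be pushed into the trigger's WHEN clause, which keeps the single-ignored-column case declarative (this assumes the hstore extension and inherits the performance caveat; trigger and function names follow the example above):

CREATE EXTENSION hstore;

CREATE TRIGGER tablename_audit_update_selective
AFTER UPDATE ON sometable
FOR EACH ROW
WHEN ((hstore(OLD.*) - 'col2'::text)
      IS DISTINCT FROM (hstore(NEW.*) - 'col2'::text))
EXECUTE PROCEDURE audit.if_modified_func();

-- to ignore several columns, subtract a text[] of names instead:
--   WHEN ((hstore(OLD.*) - '{col2,mtime}'::text[])
--         IS DISTINCT FROM (hstore(NEW.*) - '{col2,mtime}'::text[]))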
On 08/21/2012 01:52 PM, Jeff Davis wrote:
> On Mon, 2012-08-20 at 16:32 -0700, Josh Berkus wrote:
>> Personally, I would prefer a tool which just made it simpler to build my
>> own triggers, and made it automatic for the history table to track
>> changes in the live table. I think anything we build which controls
>> what goes into the history table, etc., will only narrow the user base.
>
> That sounds like a good way to start. Actually, even before the tool,
> how about just some really good examples of triggers for specific kinds
> of audit logs, and some ways to run queries on them? I think that might
> settle a lot of these details.

Here is the example of triggers we use in our applications. This is the test implementation; the production one uses similar triggers written in C.

http://softus.org/?page_id=63

1. There are 3 tables: test contains only current data, test_history contains only historical data, and test_audit contains all data.
2. There must be a field system_time in an audited table for the period of validity of the row.
3. Optional fields are: txid_modified and user_modified for the txid/user that inserted or updated the row, and txid_deleted and user_deleted for the txid/user that deleted the row. There may be other information in the audit table that was omitted in the example (client IP, host name, etc.)
4. We do not use txid_current() as the transaction ID because backup/restore resets it.
5. The user is set by the application (audit.current_user() is just a dummy).
6. There is no exclusion constraint on (primary key, system_time) in the history table; integrity is maintained by triggers (however, the user can damage the historical data by modifying the test_history table).
7. It's important to understand that when audit triggers are enabled some modifications can fail, because the same row may be concurrently modified by another transaction whose CURRENT_TIMESTAMP is lower than or the same as the current one.
I have written one approach to audit tables, available from https://github.com/akaariai/pgsql_shadow_tables

The approach is that every table is backed by a similar audit table + some meta information. The tables and triggers to update the audit tables are managed by plpgsql procedures.

While the approach isn't likely that interesting in itself, there is one interesting aspect: views similar to the original tables are created automatically in the shadow schema. The views use a session variable for the wanted "snapshot" time. The reason is that one can use this to query the database at a wanted time:

set search_path = 'shadow_public, public';
set test_session_variable.view_time = 'wanted view timestamp'; -- for example '2012-05-06 22:08:00'

And now you can use exactly the same queries you use normally to retrieve data as of the wanted view timestamp. This is very useful if you happen to use an ORM.

In addition, the "known limitations" mentioned in the README are likely something the temporal support patch needs to tackle.

- Anssi
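The generated views boil down to something like this (table and column names here are hypothetical; the session variable is the one shown above):

CREATE VIEW shadow_public.person AS
SELECT name
FROM shadow_public.person_shadow
WHERE _sys_start <= current_setting('test_session_variable.view_time')::timestamptz
  AND _sys_end   >  current_setting('test_session_variable.view_time')::timestamptz;

-- after the two SETs above, unmodified application queries
-- (SELECT * FROM person ...) read the historical snapshot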
Jeff Davis <pgsql@j-davis.com> wrote:
> On Mon, 2012-08-20 at 19:32 -0500, Kevin Grittner wrote:
>> Josh Berkus <josh@agliodbs.com> wrote:
>>> This is sounding like a completely runaway spec on what should
>>> be a simple feature.
>>
>> I hate to contribute to scope creep (or in this case scope
>> screaming down the tracks at full steam), but I've been watching
>> this with a queasy feeling about interaction with Serializable
>> Snapshot Isolation (SSI).
>
> There are all kinds of challenges here, and I'm glad you're
> thinking about them. I alluded to some problems here:
>
> http://archives.postgresql.org/message-id/1345415312.20987.56.camel@jdavis
>
> But those might be a subset of the problems you're talking about.
>
> It sounds like, at a high level, there are two problems:
>
> 1. capturing the apparent order of execution in the audit log
> 2. assigning meaningful times to the changes that are consistent
> with the apparent order of execution

As far as I can see, transactions which execute DML at any transaction isolation level other than serializable can be considered to have occurred in commit order. Transactions which don't write to the database don't need to be considered as part of the history, at least in terms of viewing prior state. Same with transactions which roll back. (Now, failed transactions and reads might be of interest for some audit reports, but that seems to me like a different issue than a temporal database.)

The funny bit is for a serializable transaction (TN) which commits after writing to the database -- you can't know the apparent order of execution as long as there are any serializable transactions active which can't see the work of TN (i.e., the transactions overlap). If such a transaction (TX) executes a read which conflicts with a TN write, TX appears to have executed first, since it doesn't see the work of TN, so I think the sequence number or timestamp for TN has to follow that for TX even though TN committed first. On the other hand, TX might write something that conflicts with a TN read, in which case TN will appear to have executed first and must get a sequence number or timestamp before TX. If there is a cycle, SSI will cancel one of the transactions involved, so that can't occur anywhere in the time line.

So, if you want to allow serializable temporal queries, the timing of a read-write serializable transaction can't be locked down until all overlapping read-write serializable transactions complete; and the apparent order of execution must be based on read-write conflicts, which are tracked within SSI. I think that if we can generate a list of committed transactions in order based on this logic, it could feed into replication system -- hot standby as well as trigger-based systems. I think we could generate snapshots which exclude the transactions for which the order of execution has not yet been determined, and avoid the delays involved in other possible solutions.

There's a lot of detail missing here in terms of what the API would be, and how we handle the summarization that can occur within SSI so that it can continue to function within bounded memory even in pessimal circumstances, but that's the general outline of my concerns and suggested solution.

-Kevin
On Mon, Aug 20, 2012 at 09:33:45PM -0700, Jeff Davis wrote: > On Mon, 2012-08-20 at 19:17 -0400, David Johnston wrote: > > Ideally the decision of whether to do so could be a client > > decision. Not storing intra-transaction changes is easier than > > storing all changes. At worse you could stage up all changed then > > simply fail to store all intermediate results within a given > > relation. It that case you gain nothing in execution performance > > but safe both storage and interpretative resources. So the > > question becomes is it worth doing without the ability to store > > intermediate results? If you were to ponder both which setup > > would the default be? If the default is the harder one (all > > statements) to implement then to avoid upgrade issues the syntax > > should specify that it is logging transactions only. > > I think the biggest question here is what guarantees can be offered? > What if the transaction aborts after having written some data, does > the audit log still get updated? There are definitely use cases for this, but until we have autonomous transactions, a totally separate project, I don't think we should attempt them in the first version. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
<div class="moz-cite-prefix">On 22/08/12 02:16, Kevin Grittner wrote:<br /></div><blockquote cite="mid:503351DA0200002500049981@gw.wicourts.gov"type="cite"><pre wrap="">Jeff Davis <a class="moz-txt-link-rfc2396E" href="mailto:pgsql@j-davis.com"><pgsql@j-davis.com></a>wrote: </pre><blockquote type="cite"><pre wrap="">On Mon, 2012-08-20 at 19:32 -0500, Kevin Grittner wrote: </pre><blockquote type="cite"><pre wrap="">Josh Berkus <a class="moz-txt-link-rfc2396E" href="mailto:josh@agliodbs.com"><josh@agliodbs.com></a>wrote: </pre><blockquote type="cite"><pre wrap="">This is sounding like a completely runaway spec on what should be a simple feature. </pre></blockquote><pre wrap=""> I hate to contribute to scope creep (or in this case scope screaming down the tracks at full steam), but I've been watching this with a queasy feeling about interaction with Serializable Snapshot Isolation (SSI). </pre></blockquote><pre wrap=""> There are all kinds of challenges here, and I'm glad you're thinking about them. I alluded to some problems here: </pre></blockquote><pre wrap=""><a class="moz-txt-link-freetext" href="http://archives.postgresql.org/message-id/1345415312.20987.56.camel@jdavis">http://archives.postgresql.org/message-id/1345415312.20987.56.camel@jdavis</a> </pre><blockquote type="cite"><pre wrap=""> But those might be a subset of the problems you're talking about. It sounds like, at a high level, there are two problems: 1. capturing the apparent order of execution in the audit log 2. assigning meaningful times to the changes that are consistent with the apparent order of execution </pre></blockquote><pre wrap=""> As far as I can see, transactions which execute DML at any transaction isolation level other than serializable can be considered to have occurred in commit order. Transactions which don't write to the database don't need to be considered as part of the history, at least in terms of viewing prior state. Same with transactions which roll back. (Now, failed transactions and reads might be of interest for some audit reports, but that seems to me like a different issue than a temporal database.) The funny bit is for a serializable transaction (TN) which commits after writing to the database -- you can't know the apparent order of execution as long as there are any serializable transactions active which can't see the work of TN (i.e., the transactions overlap). If such a transaction (TX) executes a read which conflicts with a TN write, TX appears to have executed first, since it doesn't see the work of TN, so I think the sequence number or timestamp for TN has to follow that for TX even though TN committed first. On the other hand, TX might write something that conflicts with a TN read, in which case TN will appear to have executed first and must get a sequence number or timestamp before TX. If there is a cycle, SSI will cancel one of the transactions involved, so that can't occur anywhere in the time line. So, if you want to allow serializable temporal queries, the timing of a read-write serializable transaction can't be locked down until all overlapping read-write serializable transactions complete; and the apparent order of execution must be based on read-write conflicts, which are tracked within SSI. I think that if we can generate a list of committed transactions in order based on this logic, it could feed into replication system -- hot standby as well as trigger-based systems. 
I think we could generate snapshots which exclude the transactions for which the order of execution has not yet been determined, and avoid the delays involved in other possible solutions. There's a lot of detail missing here in terms of what the API would be, and how we handle the summarization that can occur within SSI so that it can continue to function within bounded memory even in pessimal circumstances, but that's the general outline of my concerns and suggested solution. -Kevin </pre></blockquote><p class="western" style="margin-bottom: 0cm">So if I understand correctly...<br /><br /> If there isa very long running transaction, say 1 hour, then all (or just some? - depending) transactions that nominally start andfinish within that time, can not have definitive start times until the very long running transaction finishes, even ifthey are successfully committed?<br /><br /> So if someone looks at the audit log they might not see all the transactionsthey expect to see.<br /><br /> So, if I had an automatic query A which updated statistics based on on transactionscommitted over the last 10 minutes, then many (all?) transactions starting and successfully completing duringthe time of the very long running transaction will never show up! Here I am envisioning a query fired off every tenminutes looking for audit records with timestamps within the previous ten minutes. However, if I ran a query B lookingat audit record numbers with in 10 minute intervals for a week, but kicked off 24 hours after the week finished –then I would see the records I did not see in query A.<p class="western" style="margin-bottom: 0cm">Hmm... if I am at allright, then probably best to have some suitably worded 'government health warning' prominent in the documentation!<br/><p class="western" style="margin-bottom: 0cm"><br /> Cheers,<br /> Gavin<br /><br /><br /><br /><br />
Excerpts from Gavin Flower's message of mar ago 21 16:51:57 -0400 2012: > On 22/08/12 02:16, Kevin Grittner wrote: > > So, if you want to allow serializable temporal queries, the timing > > of a read-write serializable transaction can't be locked down until > > all overlapping read-write serializable transactions complete; and > > the apparent order of execution must be based on read-write > > conflicts, which are tracked within SSI. I think that if we can > > generate a list of committed transactions in order based on this > > logic, it could feed into replication system -- hot standby as well > > as trigger-based systems. I think we could generate snapshots which > > exclude the transactions for which the order of execution has not > > yet been determined, and avoid the delays involved in other possible > > solutions. > If there is a very long running transaction, say 1 hour, then all (or > just some? - depending) transactions that nominally start and finish > within that time, can not have definitive start times until the very > long running transaction finishes, even if they are successfully committed? > > So if someone looks at the audit log they might not see all the > transactions they expect to see. I think there would need to be a way to also list transactions which are "in progress" -- this would include not only live transactions, but also all those transactions that have actually committed but are not yet listed as committed because their position in the commit sequence has not been determined. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:
> So if I understand correctly...
>
> If there is a very long running transaction, say 1 hour, then all
> (or just some? - depending) transactions that nominally start and
> finish within that time, can not have definitive start times until
> the very long running transaction finishes, even if they are
> successfully committed?

That's not correct. Any transaction which started can certainly have a start time. Any transaction which completed can certainly have a commit or rollback time. What they *can't* have is a known position in the apparent order of execution for serializable transactions, which might be different from the order of start and commit times. The fact that it has an unknown sequence number or timestamp for purposes of ordering visibility of transactions doesn't mean you can't show that it completed in an audit log. In other words, I think the needs for a temporal database are significantly different from the needs of an auditing system.

And keep in mind, we are only talking about seeing read-write serializable transactions which might yet conflict with other read-write serializable transactions when choosing to look at the prior state within a temporal serializable transaction. That's easy enough to avoid if you want to do so.

> So if someone looks at the audit log they might not see all the
> transactions they expect to see.

I would assume an audit log would have very different needs from tracking changes for a temporal database view. It even seems possible that you might want to see what people *looked* at, versus just changes. You might want to see transactions which were rolled back, which are irrelevant for a temporal view. If we're talking about an auditing system, we're talking about an almost completely different animal from a temporal view of the database.

> So, if I had an automatic query A which updated statistics based
> on transactions committed over the last 10 minutes, then many
> (all?) transactions starting and successfully completing during
> the time of the very long running transaction will never show up!

A statistics capture process like that doesn't seem like a place where you care about the apparent order of execution of serializable transactions.

> Here I am envisioning a query fired off every ten minutes looking
> for audit records with timestamps within the previous ten minutes.

Which timestamp would make sense for that?

> However, if I ran a query B looking at audit record numbers within
> 10 minute intervals for a week, but kicked off 24 hours after
> the week finished -- then I would see the records I did not see in
> query A.
>
> Hmm... if I am at all right, then probably best to have some
> suitably worded 'government health warning' prominent in the
> documentation!

We're clearly talking at cross purposes. I'm discussing what is needed to be able to see a past state of the database in a transaction which would only see states of the database which are consistent with some serial execution of serializable transactions which modified the database, and you're talking about an audit table. If we're actually talking about an audit system, I have a whole different set of concerns, and I would not be bringing this one up.
The whole point of my concern is that if you have a business rule enforced by database triggers that rows in a certain table contain some batch identifier and no rows can be added to a batch after some database change flags that batch as closed, then *without* what I suggest, you could view a closed batch and see one set of rows, and view the batch at a later point in time and magically see rows appear in violation of the enforced business rules. I'm talking about *preventing* surprising appearances of data "after the fact". You would need a big warning if you *don't* have what I suggest. Now, it is quite possible that one table (or set of tables) could do double-duty for both temporal queries and auditing, but the fact that something is not needed there for one purpose doesn't mean it isn't needed for the other. There are still some fuzzy areas around how things would look with a *mix* of serializable and other transactions updating the database; but I think in the long run we will find that people either want to do all of their modifications through SSI, or none of them. The guarantees get pretty weak if you don't know that all transactions were part of the review of "dangerous structures" which can cause anomalies. If anyone is still unclear about serializable transactions in PostgreSQL and wants to invest the time needed to read 12 pages (including footnotes and pretty colored performance graphs) to learn about it -- the paper which is going to be presented at the VLDB conference next week goes at it from a different angle than I usually approach it, and it may "click" with many people where my discussions have fallen short. http://vldb.org/pvldb/vol5/p1850_danrkports_vldb2012.pdf -Kevin
Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> I think there would need to be a way to also list transactions
> which are "in progress" -- this would include not only live
> transactions, but also all those transactions that have actually
> committed but are not yet listed as committed because their
> position in the commit sequence has not been determined.

That might allow the same data to serve both needs with one or two fewer timestamp (or similar) columns than what my post of a few minutes ago suggested.

-Kevin
First, note the change in topic.

This whole discussion has gone rather far afield from Miroslav's original submission, which was for temporal tables, which is NOT the same thing as audit logs, although the use cases overlap significantly. Miroslav, I know this has been hard to follow, but you're getting a lot of feedback because people are really interested in the feature and related features.

> That sounds like a good way to start. Actually, even before the tool,
> how about just some really good examples of triggers for specific kinds
> of audit logs, and some ways to run queries on them? I think that might
> settle a lot of these details.

Well, I'm not averse to solving some problems in the core:

1) That it's difficult/impossible to write a completely generic audit trigger which works with any table without utilizing an external SP language like Python.

2) That there's no obvious way to handle audit triggers and FK relationships intelligently.

3) That audit tables don't automatically track schema changes in the live table.

4) Checking which columns have changed (see Craig Ringer's email).

These seem like difficult enough challenges without getting more complicated.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Josh Berkus <josh@agliodbs.com> wrote: > First, note the change in topic. > > This whole discussion has gone rather far afield from Miroslav's > original submission, which was for temporal tables, which is NOT > the same thing as audit logs, although the use cases overlap > significantly. I don't think the concerns I raised about apparent order of execution for serializable transactions apply to audit logs. If we've moved entirely off the topic of the original subject, it is a complete non-issue. -Kevin
<div class="moz-cite-prefix">On 22/08/12 10:56, Kevin Grittner wrote:<br /></div><blockquote cite="mid:5033CBE002000025000499BD@gw.wicourts.gov"type="cite"><pre wrap="">Josh Berkus <a class="moz-txt-link-rfc2396E"href="mailto:josh@agliodbs.com"><josh@agliodbs.com></a> wrote: </pre><blockquote type="cite"><pre wrap="">First, note the change in topic. This whole discussion has gone rather far afield from Miroslav's original submission, which was for temporal tables, which is NOT the same thing as audit logs, although the use cases overlap significantly. </pre></blockquote><pre wrap=""> I don't think the concerns I raised about apparent order of execution for serializable transactions apply to audit logs. If we've moved entirely off the topic of the original subject, it is a complete non-issue. -Kevin </pre></blockquote><p class="western" style="margin-bottom: 0cm">Hmm...<br /><br /> I was simply using an audit log exampleas a more specific case to understand things!<p class="western" style="margin-bottom: 0cm">Right now, I am meant tobe working on a project I'm way behind on! Hopefully later, I will have the time to read more carefully the interestingposts following my mention of the audit log example and to reply as appropriate.<p class="western" style="margin-bottom:0cm">About 10 years ago, I implemented some temporal features in a database to cope with insurance quotesthat had to be valid for a specified number of days in the future that was invariant with respect to future changesin premiums with effective dates within the period of validity of the quote. If anyone is interested, I'll see ifI can find my notes and write it up (but in a different thread!).<p class="western" style="margin-bottom: 0cm"><br /><pclass="western" style="margin-bottom: 0cm">Cheers,<br /> Gavin
On 08/22/2012 08:34 AM, Gavin Flower wrote:
> About 10 years ago, I implemented some temporal features in a database
> to cope with insurance quotes that had to be valid for a specified
> number of days in the future that was invariant with respect to future
> changes in premiums with effective dates within the period of validity
> of the quote. If anyone is interested, I'll see if I can find my notes
> and write it up (but in a different thread!).

What you mean is not an audit log, it's business time. Pavel Stehule at the beginning of this thread gave a link to a description of the SQL2011 design of this feature. Audit logs are more related to system time. For example, IBM DB2 uses the following syntax for system time (which is mostly SQL2011-conformant):

CREATE TABLE policy (
  id INT PRIMARY KEY NOT NULL,
  vin VARCHAR(10),
  annual_mileage INT,
  rental_car CHAR(1),
  coverage_amt INT,

  sys_start TIMESTAMP(12) GENERATED ALWAYS AS ROW BEGIN NOT NULL,
  sys_end TIMESTAMP(12) GENERATED ALWAYS AS ROW END NOT NULL,
  trans_start TIMESTAMP(12) GENERATED ALWAYS AS TRANSACTION START ID IMPLICITLY HIDDEN,

  PERIOD SYSTEM_TIME (sys_start, sys_end)
);

CREATE TABLE policy_history LIKE policy;

ALTER TABLE policy ADD VERSIONING USE HISTORY TABLE policy_history;

And the following syntax for querying historical data:

SELECT coverage_amt
FROM policy FOR SYSTEM_TIME AS OF '2010-12-01'
WHERE id = 1111;

SELECT count(*)
FROM policy FOR SYSTEM_TIME FROM '2011-11-30' TO '9999-12-30'
WHERE vin = 'A1111';
2012/8/22 Vlad Arkhipov <arhipov@dc.baikal.ru>:
> On 08/22/2012 08:34 AM, Gavin Flower wrote:
>> About 10 years ago, I implemented some temporal features in a database
>> to cope with insurance quotes that had to be valid for a specified
>> number of days in the future that was invariant with respect to future
>> changes in premiums with effective dates within the period of validity
>> of the quote. If anyone is interested, I'll see if I can find my notes
>> and write it up (but in a different thread!).
>
> What you mean is not an audit log, it's business time. Pavel Stehule at
> the beginning of this thread gave a link to a description of the SQL2011
> design of this feature. Audit logs are more related to system time. For
> example, IBM DB2 uses the following syntax for system time (which is
> mostly SQL2011-conformant):
>
> CREATE TABLE policy (
>   id INT PRIMARY KEY NOT NULL,
>   vin VARCHAR(10),
>   annual_mileage INT,
>   rental_car CHAR(1),
>   coverage_amt INT,
>
>   sys_start TIMESTAMP(12) GENERATED ALWAYS AS ROW BEGIN NOT NULL,
>   sys_end TIMESTAMP(12) GENERATED ALWAYS AS ROW END NOT NULL,
>   trans_start TIMESTAMP(12) GENERATED ALWAYS AS TRANSACTION START ID
> IMPLICITLY HIDDEN,
>
>   PERIOD SYSTEM_TIME (sys_start, sys_end)
> );
>
> CREATE TABLE policy_history LIKE policy;
>
> ALTER TABLE policy ADD VERSIONING USE HISTORY TABLE policy_history;
>
> And the following syntax for querying historical data:
>
> SELECT coverage_amt
> FROM policy FOR SYSTEM_TIME AS OF '2010-12-01'
> WHERE id = 1111;
>
> SELECT count(*)
> FROM policy FOR SYSTEM_TIME FROM '2011-11-30' TO '9999-12-30'
> WHERE vin = 'A1111';

I like this design - it is simple, without extra objects.

Regards
Pavel
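For comparison, over a current/history table pair the first query above desugars to roughly this plain SQL (a sketch only; it assumes both tables carry the same sys_start/sys_end columns):

SELECT coverage_amt
FROM (
    SELECT id, coverage_amt, sys_start, sys_end FROM policy
    UNION ALL
    SELECT id, coverage_amt, sys_start, sys_end FROM policy_history
) p
WHERE id = 1111
  AND sys_start <= TIMESTAMP '2010-12-01 00:00:00'
  AND sys_end   >  TIMESTAMP '2010-12-01 00:00:00';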
> I don't think the concerns I raised about apparent order of
> execution for serializable transactions apply to audit logs. If
> we've moved entirely off the topic of the original subject, it is a
> complete non-issue.

That's true, your discussion is about Miroslav's original patch. But a lot of the other discussion on this thread is not.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Tue, 2012-08-21 at 17:56 -0500, Kevin Grittner wrote:
> I don't think the concerns I raised about apparent order of
> execution for serializable transactions apply to audit logs. If
> we've moved entirely off the topic of the original subject, it is a
> complete non-issue.

Now I'm confused. The serializability issues you were talking about only seem to matter with respect to system time (a.k.a. transaction time), right? If the user is supplying the time, then it's a non-issue. And audit logs are based on system time, so I thought that audit logs were the case you were talking about.

Regards,
Jeff Davis
On Tue, 2012-08-21 at 17:07 -0500, Kevin Grittner wrote:
> The fact that it has an unknown sequence number or timestamp for
> purposes of ordering visibility of transactions doesn't mean you
> can't show that it completed in an audit log. In other words, I
> think the needs for a temporal database are significantly different
> from the needs of an auditing system.

...

> I would assume an audit log would have very different needs from
> tracking changes for a temporal database view. It even seems
> possible that you might want to see what people *looked* at, versus
> just changes. You might want to see transactions which were rolled
> back, which are irrelevant for a temporal view. If we're talking
> about an auditing system, we're talking about an almost completely
> different animal from a temporal view of the database.

OK, I think I see what you're saying now. Basically, an audit log means different things to different people, so I think it confused the issue. But "temporal" is fairly vague, too. It also seems like there might be a lot of overlap, depending on how we define those terms.

I am most interested in the topic you brought up about serializability and system time (transaction time), because it would be a fundamental piece upon which we can build a lot of these other things (including what could be called an audit log).

Regards,
Jeff Davis
Jeff Davis <pgsql@j-davis.com> wrote:
> On Tue, 2012-08-21 at 17:07 -0500, Kevin Grittner wrote:
>> The fact that it has an unknown sequence number or timestamp for
>> purposes of ordering visibility of transactions doesn't mean you
>> can't show that it completed in an audit log. In other words, I
>> think the needs for a temporal database are significantly
>> different from the needs of an auditing system.
>
> ...
>
>> I would assume an audit log would have very different needs from
>> tracking changes for a temporal database view. It even seems
>> possible that you might want to see what people *looked* at,
>> versus just changes. You might want to see transactions which
>> were rolled back, which are irrelevant for a temporal view. If
>> we're talking about an auditing system, we're talking about an
>> almost completely different animal from a temporal view of the
>> database.
>
> OK, I think I see what you're saying now. Basically, an audit log
> means different things to different people, so I think it confused
> the issue.

Probably. When I think of an audit log, I tend to think of viewing "who
did what when", without necessarily caring a lot about viewing interim
visible database states.

> But "temporal" is fairly vague, too.

Yeah, but in this context I have taken it to mean that someone wants to
run a query such that it sees the database state "as of" some previous
point in time. Even with a read-only transaction, if you want to avoid
seeing states of the database which are inconsistent with business rules
enforced through serializable transactions, you need to deal with some
tricky problems.

> I am most interested in the topic you brought up about
> serializability and system time (transaction time), because it
> would be a fundamental piece upon which we can build a lot of
> these other things (including what could be called an audit log).

[brain dump follows -- remember, you *said* you were interested]

If you think it matters for what you are calling an audit log, then I
probably have an incomplete or inaccurate understanding of what you mean
by audit log. Perhaps you could sketch that out a bit? (Or point back to
where it was described, if I've missed or forgotten something that went
before.)

The reason it's tricky is that while SSI fully complies with the
requirement that the behavior of a set of concurrent serializable
transactions running in a database is consistent with some serial
(one-at-a-time) execution of those transactions, it does not share
certain properties with other types of serializable implementations, so
people may be assuming those additional properties where they don't
actually exist.

The two most common alternatives to SSI are S2PL and OCC. Under both of
these techniques, the apparent order of execution (the order in which
the transactions could have run to produce the same results as if they
were run one-at-a-time) is the commit order. In S2PL this is
accomplished by having reads block writes until commit time and writes
block everything until commit time. In OCC this is accomplished by
checking the read set of a transaction at commit time and rolling back
the transaction if there is a single write by another transaction which
conflicts with the predicate locks of the read set (i.e., there is a
single read-write conflict out from the transaction being committed).
SSI dodges the blocking and the high rollback rate, but the technique
has these characteristics which may be surprising:

- The apparent order of execution is not always the commit order. If
two transactions are concurrent, and T1 reads something which would
look different if it could see the work of T2 (but it *can't* because
the transactions are concurrent), then T1 *appears* to have executed
before T2. T2 might actually *start* first and *commit* first, but if
there was overlap and the rw-conflict, then T1 ran first *logically*.
SSI prevents cycles in this ordering by canceling a transaction when a
possible cycle is detected.

- A read-only transaction can cause an anomaly where there would
otherwise not be one. This is because a transaction which "appeared" to
commit after another transaction based on rw-conflicts may have
actually committed first, and would be visible to the read-only
transaction while the work of the "earlier" transaction would not show
up for it; if no transaction observes that state, then the problem goes
away when the "logically earlier" transaction later commits. If the
state is observed, even by a read-only transaction, then the "earlier"
transaction logically "can't have happened" -- so it must be rolled
back with a serialization failure. Within one database, this is tracked
and handled by SSI. My concern is that the transactions might both
commit, then a "time traveler" goes back and sees the state "that never
happened."

One of the features added with SSI was DEFERRABLE transactions. The
point of this is that when a snapshot is generated, it can often be
determined (either immediately or after other transactions have
completed) that the snapshot cannot see any such anomalous database
states. A SERIALIZABLE READ ONLY DEFERRABLE transaction waits until it
can acquire such a snapshot before running its first query. We have
talked about the possibility of supporting SERIALIZABLE transactions on
hot standby by including some minimal information in the WAL stream to
allow the standby to identify safe snapshots and use only those for
SERIALIZABLE transactions. I have a harder time seeing that working
with temporal views of a database or with audit logs.

What I've been on about lately is a half-baked idea about how to
possibly construct snapshots more recent than the last "safe" snapshot
according to existing rules, that will still give a coherent view of
the database from a read-only replica or in this temporal view of
things. There are unresolved issues with the idea, and I'm not really
sure that once those are ironed out, it will be different from what you
would get by just remembering the last safe snapshot and using it for
serializable transaction requests, but I still have hope of doing
better than that.

The problems come in when you consider a mix of SERIALIZABLE and other
transactions. If *all* DML transactions are SERIALIZABLE, an ordering
of transactions can be created, and you can pick any point in that list
and construct a view of the database which will be free of
serialization anomalies. The problem is how to interleave that list
with non-serializable transactions and still build a coherent snapshot.
And the serializable "apparent order of execution" probably doesn't
matter for non-serializable transactions reading the database, because
in real time they see modifications based on commit order. They can see
serialization anomalies, but by virtue of running at a less strict
isolation level, they are saying that they don't care about that.

-Kevin
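For reference, the safe-snapshot machinery Kevin describes is already
exposed in SQL; a read-only transaction can request an anomaly-free
snapshot like this (existing PostgreSQL syntax; the table name is
illustrative):

    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
    -- The first query blocks until a snapshot known to be free of
    -- serialization anomalies can be acquired.
    SELECT count(*) FROM policy;
    COMMIT;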
On Mon, Aug 20, 2012 at 7:17 PM, David Johnston <polobo@yahoo.com> wrote:
> Ideally the decision of whether to do so could be a client decision. Not
> storing intra-transaction changes is easier than storing all changes.

Not really. If you don't care about suppressing intra-transaction
changes, you can essentially just have a trigger that fires on every
update and adds information to the side table. If you do care about
suppressing them, you have to do something more complicated. Or so it
seems to me.

> I see the "user" element as having two components:
>
> "Client" - what device/channel/"user" was used to connect to the
> database - PostgreSQL Role
> "User" - relative to that "client", which actual "user" performed the
> action - Application Specified
>
> A PostgreSQL role would correspond to "client" whereas the application
> would be allowed to have full control of what "User" value is stored.
>
> This gets a little complicated with respect to "SET ROLE" but gets
> close to the truth. The idea is that you look at the "client" to
> determine the "namespace" over which the "user" is defined and
> identified.
>
> So, a better way to phrase the position is that:
>
> "You cannot allow the application to choose what is stored to identify
> itself (client)" - i.e., its credentials identify who it is and those
> are stored without consulting the application

I don't think we can violate the general principle that the database
super-user or table owner can do whatever they want. If one of those
folks wants to falsify their history, are we really going to tell them
"no"? To me that has "I'm sorry, Dave, I can't do that" written all over
it, and I think we'll get about the same reaction that Hal did. Now, if
user A is inserting into user B's table, and is not the super-user,
then, of course, we can and should ensure that no falsification is
possible.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
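A minimal sketch of the non-suppressing case Robert describes - one
history row per update or delete - using an illustrative accounts table
(none of these names come from the thread, and this is not the patch's
mechanism):

    CREATE TABLE accounts (owner text, balance numeric);

    -- Side table: one row per superseded row version.
    CREATE TABLE accounts_log (
        owner      text,
        balance    numeric,
        changed_at timestamptz NOT NULL,
        changed_by text NOT NULL
    );

    CREATE FUNCTION accounts_log_fn() RETURNS trigger
    LANGUAGE plpgsql AS $$
    BEGIN
        -- Record the outgoing row version, when it was superseded,
        -- and by whom.
        INSERT INTO accounts_log
            VALUES (OLD.owner, OLD.balance, clock_timestamp(), session_user);
        RETURN NULL;  -- the return value is ignored for AFTER triggers
    END;
    $$;

    CREATE TRIGGER accounts_log_trg
        AFTER UPDATE OR DELETE ON accounts
        FOR EACH ROW EXECUTE PROCEDURE accounts_log_fn();

Every intra-transaction update produces its own log row here;
suppressing those, as Robert notes, takes something more complicated.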
> -----Original Message-----
> From: Robert Haas [mailto:robertmhaas@gmail.com]
> Sent: Saturday, August 25, 2012 12:46 PM
> To: David Johnston
> Cc: Jeff Davis; Vlad Arkhipov; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] temporal support patch
>
> On Mon, Aug 20, 2012 at 7:17 PM, David Johnston <polobo@yahoo.com>
> wrote:
> > Ideally the decision of whether to do so could be a client decision.
> > Not storing intra-transaction changes is easier than storing all
> > changes.
>
> Not really. If you don't care about suppressing intra-transaction
> changes, you can essentially just have a trigger that fires on every
> update and adds information to the side table. If you do care about
> suppressing them, you have to do something more complicated. Or so it
> seems to me.

My internals knowledge is basically zero, but it would seem that if you
simply wanted the end-of-transaction result you could just record
nothing during the transaction and then copy whatever values are present
at commit to whatever logging mechanism you need. If you are recording
intra-transaction values, you could do so to a temporary storage area
and then, at commit, decide whether only the most recent value for a
given relation/attribute is going to be retained in the final log or
whether you end up persisting all of the intermediate values as well.

> > "You cannot allow the application to choose what is stored to
> > identify itself (client)" - i.e., its credentials identify who it is
> > and those are stored without consulting the application
>
> I don't think we can violate the general principle that the database
> super-user or table owner can do whatever they want. If one of those
> folks wants to falsify their history, are we really going to tell them
> "no"? To me that has "I'm sorry, Dave, I can't do that" written all
> over it, and I think we'll get about the same reaction that Hal did.
> Now, if user A is inserting into user B's table, and is not the
> super-user, then, of course, we can and should ensure that no
> falsification is possible.

With respect to the physical log file, there is currently no way for the
super-user to falsify (at time of statement execution) the user/role
that they are using. Even a "SET ROLE" doesn't change the session user
(I forget the exact mechanics, but I'm pretty sure on the general
point). I do not see how this is that much different. I agree that it is
pointless to even try to maintain true in-database auditing in the
presence of god-like super-users, so most of what I envision relates to
limited-permission users that are forced to rely upon the standard
mechanisms provided by the database. As a matter of principle, those
wanting a secure and auditable environment should not be using
ownership-level roles.

Since these temporal/audit tables are intended to be maintained by the
system, if you do not ask the users to identify themselves but instead
take the information directly from the environment, you never have to
give an "I'm sorry, Dave" response, because Dave is never given the
chance to submit a proposed value.

David J.
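One hedged way to implement David's "temporary storage area, decide at
commit" idea, reusing the illustrative accounts table from the sketch
above (again an assumption, not a design proposed in the thread): key
the staging rows on the current transaction id, so each transaction
keeps only its newest value per row, and a rolled-back transaction's
staging rows vanish with it:

    CREATE TABLE accounts_staging (
        txid    bigint NOT NULL,
        owner   text   NOT NULL,
        balance numeric,
        PRIMARY KEY (txid, owner)
    );

    CREATE FUNCTION accounts_stage_fn() RETURNS trigger
    LANGUAGE plpgsql AS $$
    BEGIN
        -- Keep only the latest intra-transaction value for this row
        -- (assumes owner identifies the row).
        DELETE FROM accounts_staging
         WHERE txid = txid_current() AND owner = NEW.owner;
        INSERT INTO accounts_staging
            VALUES (txid_current(), NEW.owner, NEW.balance);
        RETURN NULL;
    END;
    $$;

    CREATE TRIGGER accounts_stage_trg
        AFTER INSERT OR UPDATE ON accounts
        FOR EACH ROW EXECUTE PROCEDURE accounts_stage_fn();

A later pass can then copy rows whose txid is known to have committed
into the permanent log.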
On Sat, Aug 25, 2012 at 1:30 PM, David Johnston <polobo@yahoo.com> wrote:
> My internals knowledge is basically zero, but it would seem that if you
> simply wanted the end-of-transaction result you could just record
> nothing during the transaction and then copy whatever values are
> present at commit to whatever logging mechanism you need.

"Whatever values are present at commit" could be a terabyte of data. Or
it could be a kilobyte of changed data within a terabyte database. You'd
need some way to identify which data actually needs to be copied, since
you surely don't want to copy the whole database. And even if you can
identify it, going back and visiting all those blocks a second time will
be expensive.

> Since these temporal/audit tables are intended to be maintained by the
> system, if you do not ask the users to identify themselves but instead
> take the information directly from the environment, you never have to
> give an "I'm sorry, Dave" response, because Dave is never given the
> chance to submit a proposed value.

Well, the point is that I think many people have requirements that are
(1) different from each other and (2) more complicated than the simplest
case we can come up with. Some people will want to log the application
user (or some other piece of extra data); others won't. Some people will
want to record every change in a transaction; others won't. Some people
will want to log time stamps; others won't; others still may want a
"range" per row indicating the time that row version lived. Some people
will want to delete history before it fills up the disk; others will
want to keep it forever. Some people will want to clean up history
created by "accidental" changes; others will want to make sure that the
history is as tamper-proof as possible. That's why, of everything that's
been said on this topic, I mostly agree with what Josh Berkus said
upthread:

# If you want something in core which will be useful to a lot of our
# users, it needs to be simple and flexible. Not ornate with lots of
# dependencies. The first version of it should be as simple and
# minimalist as possible.
#
# Personally, I would prefer a tool which just made it simpler to build
# my own triggers, and made it automatic for the history table to track
# changes in the live table. I think anything we build which controls
# what goes into the history table, etc., will only narrow the user
# base.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
> Well, the point is that I think many people have requirements that are
> (1) different from each other and (2) more complicated than the
> simplest case we can come up with. Some people will want to log the
> application user (or some other piece of extra data); others won't.
> Some people will want to record every change in a transaction; others
> won't. Some people will want to log time stamps; others won't; others
> still may want a "range" per row indicating the time that row version
> lived. Some people will want to delete history before it fills up the
> disk; others will want to keep it forever. Some people will want to
> clean up history created by "accidental" changes; others will want to
> make sure that the history is as tamper-proof as possible. That's
> why, of everything that's been said on this topic, I mostly agree with
> what Josh Berkus said upthread:
>
> # If you want something in core which will be useful to a lot of our
> # users, it needs to be simple and flexible. Not ornate with lots of
> # dependencies. The first version of it should be as simple and
> # minimalist as possible.
> #
> # Personally, I would prefer a tool which just made it simpler to
> # build my own triggers, and made it automatic for the history table
> # to track changes in the live table. I think anything we build which
> # controls what goes into the history table, etc., will only narrow
> # the user base.

I can't agree - why do we need some simple solution based on tools that
are available now? I don't think we have to hurry to support our own
proprietary solutions when it isn't difficult to do it just with the
available tools today.

Regards

Pavel
On Mon, Aug 27, 2012 at 1:50 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> I can't agree - why do we need some simple solution based on tools
> that are available now? I don't think we have to hurry to support our
> own proprietary solutions when it isn't difficult to do it just with
> the available tools today.

Who said anything about proprietary solutions? I would agree that it is
POSSIBLE to do this with the tools that are available now. I am not sure
that I'd agree that it is easy.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 8/22/12 3:03 AM, Pavel Stehule wrote:
>> SELECT coverage_amt
>> FROM policy FOR SYSTEM_TIME AS OF '2010-12-01'
>> WHERE id = 1111;
>>
>> SELECT count(*)
>> FROM policy FOR SYSTEM_TIME FROM '2011-11-30' TO '9999-12-30'
>> WHERE vin = 'A1111';
>
> I like this design - it is simple and doesn't require extra objects

The thing I don't like about this is it assumes that time is the best
way to refer to when things changed in a system. Not only is that a bad
assumption, it also means that relating things to history becomes messy.

The concept that we promote at work is that if you're going to "version"
something (I don't like the term history because it implies you only
want a logfile), you should have an explicit way to refer to any given
version.

So if you want to track the versioning of a specific field on a table:

CREATE TABLE customer_status_versions (
      customer_status_version_id SERIAL
    , customer_id
    , previous_customer_status_version_id REFERENCES customer_status_versions
    , changed_at timestamptz
    , new_customer_status
);

That kind of structure makes it impossible to be ambiguous about the
ordering of changes to a single customer's status. It also means that
you have a specific identifier you can use in places of the system that
care about that. IE:

CREATE TABLE loans (
      ...
    , customer_id
    , customer_status_when_issued REFERENCES customer_status_versions
);

Now, when you look at a loan there is *zero* question about what the
customer's status was when the loan was issued. Not only that, you can
absolutely reliably know all customer status changes that had taken
place up to that point. And you can do this without any complex temporal
logic or reliance on a system clock that might not be reliable.

--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net
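A runnable version of that schema might look like this; the column types
(int, text) and NOT NULL choices are assumptions filled in for
illustration, not from the mail:

    CREATE TABLE customer_status_versions (
        customer_status_version_id serial PRIMARY KEY,
        customer_id int NOT NULL,
        previous_customer_status_version_id int
            REFERENCES customer_status_versions (customer_status_version_id),
        changed_at timestamptz NOT NULL DEFAULT now(),
        new_customer_status text NOT NULL
    );

    CREATE TABLE loans (
        loan_id serial PRIMARY KEY,
        customer_id int NOT NULL,
        customer_status_when_issued int NOT NULL
            REFERENCES customer_status_versions (customer_status_version_id)
    );

The previous_customer_status_version_id self-reference is what pins down
the ordering of a customer's status changes without consulting a clock.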
2012/8/28 Jim Nasby <jim@nasby.net>:
> On 8/22/12 3:03 AM, Pavel Stehule wrote:
>>> SELECT coverage_amt
>>> FROM policy FOR SYSTEM_TIME AS OF '2010-12-01'
>>> WHERE id = 1111;
>>>
>>> SELECT count(*)
>>> FROM policy FOR SYSTEM_TIME FROM '2011-11-30' TO '9999-12-30'
>>> WHERE vin = 'A1111';
>>
>> I like this design - it is simple and doesn't require extra objects
>
> The thing I don't like about this is it assumes that time is the best
> way to refer to when things changed in a system. Not only is that a bad
> assumption, it also means that relating things to history becomes
> messy.

On the other hand, I don't have a problem with some optional counter,
although I think the database system time is very useful and other
counters for versioning are not necessary - because at any one time I
can have only one version, and it doesn't create versions from
rolled-back transactions.

> The concept that we promote at work is that if you're going to
> "version" something (I don't like the term history because it implies
> you only want a logfile), you should have an explicit way to refer to
> any given version.
>
> So if you want to track the versioning of a specific field on a table:
>
> CREATE TABLE customer_status_versions (
>       customer_status_version_id SERIAL
>     , customer_id
>     , previous_customer_status_version_id REFERENCES
>       customer_status_versions
>     , changed_at timestamptz
>     , new_customer_status
> );
>
> That kind of structure makes it impossible to be ambiguous about the
> ordering of changes to a single customer's status. It also means that
> you have a specific identifier you can use in places of the system
> that care about that. IE:
>
> CREATE TABLE loans (
>       ...
>     , customer_id
>     , customer_status_when_issued REFERENCES customer_status_versions
> );
>
> Now, when you look at a loan there is *zero* question about what the
> customer's status was when the loan was issued. Not only that, you can
> absolutely reliably know all customer status changes that had taken
> place up to that point. And you can do this without any complex
> temporal logic or reliance on a system clock that might not be
> reliable.
> --
> Jim C. Nasby, Database Architect jim@nasby.net
> 512.569.9461 (cell) http://jim.nasby.net
On 8/27/12 12:40 PM, Robert Haas wrote:
> On Sat, Aug 25, 2012 at 1:30 PM, David Johnston <polobo@yahoo.com> wrote:
>> My internals knowledge is basically zero, but it would seem that if
>> you simply wanted the end-of-transaction result you could just record
>> nothing during the transaction and then copy whatever values are
>> present at commit to whatever logging mechanism you need.
>
> "Whatever values are present at commit" could be a terabyte of data.
> Or it could be a kilobyte of changed data within a terabyte database.
> You'd need some way to identify which data actually needs to be
> copied, since you surely don't want to copy the whole database. And
> even if you can identify it, going back and visiting all those blocks
> a second time will be expensive.

This makes me think about the original time travel, which was to not
automatically remove old tuple versions. I think it would be interesting
to look at allowing old tuples to remain visible, as well as doing
something different when vacuum comes around.

--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net
On 8/28/12 2:51 PM, Pavel Stehule wrote:
>> The thing I don't like about this is it assumes that time is the best
>> way to refer to when things changed in a system. Not only is that a
>> bad assumption, it also means that relating things to history becomes
>> messy.
>
> On the other hand, I don't have a problem with some optional counter,
> although I think the database system time is very useful and other
> counters for versioning are not necessary - because at any one time I
> can have only one version, and it doesn't create versions from
> rolled-back transactions.

What happens if the system clock runs backwards?

What happens if two transactions start in the same microsecond? (And I
know for a fact that's possible, because I've seen it.)

More importantly, I believe using time to record a versioned history of
something is flawed to begin with. You might care about what time a new
version was created, but what's far more important is recording the
correct ordering of things, and time isn't actually a great way to do
that.

Note that no version control system uses time as its primary attribute.

--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net
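A hedged illustration of ordering by counter rather than by clock
(illustrative only, not something proposed in the mail): draw each
version's ordering key from a single sequence, so identical changed_at
values cannot make the history ambiguous:

    CREATE SEQUENCE version_ord;

    ALTER TABLE customer_status_versions
        ADD COLUMN ord bigint NOT NULL DEFAULT nextval('version_ord');

    -- Unambiguous even if two versions share the same changed_at value.
    SELECT * FROM customer_status_versions ORDER BY ord;

Note that, as Pavel points out below, sequence order is still not commit
order: values are assigned at insert time, and transactions can commit
in a different order.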
On Tue, Aug 28, 2012 at 5:06 PM, Jim Nasby <jim@nasby.net> wrote:
> On 8/28/12 2:51 PM, Pavel Stehule wrote:
>>> The thing I don't like about this is it assumes that time is the
>>> best way to refer to when things changed in a system. Not only is
>>> that a bad assumption, it also means that relating things to history
>>> becomes messy.
>>
>> On the other hand, I don't have a problem with some optional counter,
>> although I think the database system time is very useful and other
>> counters for versioning are not necessary - because at any one time I
>> can have only one version, and it doesn't create versions from
>> rolled-back transactions.
>
> What happens if the system clock runs backwards?
>
> What happens if two transactions start in the same microsecond? (And I
> know for a fact that's possible, because I've seen it.)
>
> More importantly, I believe using time to record a versioned history
> of something is flawed to begin with. You might care about what time a
> new version was created, but what's far more important is recording
> the correct ordering of things, and time isn't actually a great way to
> do that.
>
> Note that no version control system uses time as its primary
> attribute.

At one point, I designed a configuration system that used the DNS
concept of "serial numbers" to indicate the temporality of the
configuration. NULL was treated as 'infinitely in the future'.

The notion would be that a new piece of configuration would initially be
assigned (latest_serial+1, NULL) as its temporal visibility, and, upon
bumping the serial number, that would become "live" configuration. One
would mark a piece of configuration as "about to die" by assigning
(X, latest_serial+1) as its temporal visibility, and, again, upon
bumping the serial number, that terminates its visibility.

In that approach to things, it's an "expensive serialization event" to
bump the serial number. Not that the action is super-expensive; it's
just that it's not something you ought to do in a distributed fashion.
ONE process should bump the serial number.

I think you're right that for these sorts of cases, the use of time as
the source of versioning information is pretty dangerous.

In SCM systems, we discovered that it was pretty horrible to try to
assign serial numbers in a 'consistent' fashion; in CVS, you'd get files
assigned versions like '1.2.1.5.3.2.7', and things would get worse from
there. It turned out that using a hash code like SHA-* as a version
number was more satisfactory; just attach labels to those versions that
you care to identify.

It's possible that, for a "versioned history," assigning a hash/UUID as
the version ID is pretty satisfactory, even though it's not ordered.
Using a sequence to assign a version scales a bit less well, but is
probably usually OK.

One of the other things discovered in that "config system design" was
that it was mighty useful to assign transactions at will:

create table config_txns (
    tx_id serial primary key,
    tx_at timestamptz default now()
);

, associating data with the tx_id value, and then associating additional
information with the config_txns table as needed. For instance,
sometimes you need a label (akin to an SCM "branch" or "tag" label):

create table config_labels (
    tx_id integer primary key references config_txns(tx_id) on delete cascade,
    label text not null unique
);

--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"
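A hedged sketch of the serial-number visibility scheme Christopher
describes; the table and column names here are illustrative assumptions,
not from the mail:

    -- Single-row table holding the current serial number.
    CREATE TABLE config_serial (current_serial int NOT NULL);

    CREATE TABLE config_items (
        item_id      serial PRIMARY KEY,
        key          text NOT NULL,
        value        text NOT NULL,
        visible_from int  NOT NULL,  -- set to latest_serial + 1 at creation
        visible_to   int             -- NULL = 'infinitely in the future'
    );

    -- The "live" configuration at the current serial:
    SELECT key, value
    FROM config_items, config_serial
    WHERE visible_from <= current_serial
      AND (visible_to IS NULL OR visible_to > current_serial);

Bumping the serial (UPDATE config_serial SET current_serial =
current_serial + 1) is then the single serialization event that makes
pending configuration live and retires doomed entries.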
2012/8/28 Jim Nasby <jim@nasby.net>:
> On 8/28/12 2:51 PM, Pavel Stehule wrote:
>>> The thing I don't like about this is it assumes that time is the
>>> best way to refer to when things changed in a system. Not only is
>>> that a bad assumption, it also means that relating things to history
>>> becomes messy.
>>
>> On the other hand, I don't have a problem with some optional counter,
>> although I think the database system time is very useful and other
>> counters for versioning are not necessary - because at any one time I
>> can have only one version, and it doesn't create versions from
>> rolled-back transactions.
>
> What happens if the system clock runs backwards?

Probably then you have more significant issues than this - and the same
can happen with any counter overflowing.

> What happens if two transactions start in the same microsecond? (And I
> know for a fact that's possible, because I've seen it.)

Yes, it is possible - and you probably need to know the end of the
transaction - the commit time. An auxiliary counter doesn't help,
because it can be in a different order too - when the first transaction
was rolled back.

Pavel

> More importantly, I believe using time to record a versioned history
> of something is flawed to begin with. You might care about what time a
> new version was created, but what's far more important is recording
> the correct ordering of things, and time isn't actually a great way to
> do that.
>
> Note that no version control system uses time as its primary
> attribute.
>
> --
> Jim C. Nasby, Database Architect jim@nasby.net
> 512.569.9461 (cell) http://jim.nasby.net