Thread: Re: Implementation of Flashback Query
On 17.10.2018 06:18, Yang Jie wrote:
font{ line-height: 1.6; } ul,ol{ padding-left: 20px; list-style-position: inside; } font{ line-height: 1.6; } ul,ol{ padding-left: 20px; list-style-position: inside; } Hello,We know that in the absence of backups recovery can be cumbersome if the data is erroneously deleted or modified and the transaction is committed. For this reason, we realize the flashback query function based on PostgreSQL10.5, query the misoperated data, at the same time, we also want to know the ideas and suggestions of the flashback feature, hope that the postgres function is more powerful.Take a look at the following example:# create table test(id int, tx text, ts timestamp);# insert into test values (1, 'tx1', now());# update test set tx = 'tx2', ts = now();# update test set tx = 'tx3', ts = now();# update test set tx = 'tx4', ts = now();# SELECT lp, t_ctid AS ctid, t_xmin AS xmin, t_xmax AS xmax,(t_infomask & 128)::boolean AS xmax_is_lock,(t_infomask & 1024)::boolean AS xmax_committedFROM heap_page_item_attrs(get_raw_page('test', 0), 'test');lp | ctid | xmin | xmax | xmax_is_lock | xmax_committed----+-------+------+------+--------------+----------------1 | (0,2) | 6773 | 6774 | f | t2 | (0,3) | 6774 | 6775 | f | t3 | (0,4) | 6775 | 6776 | f | f4 | (0,4) | 6776 | 0 | f | f(4 rows)# select xmin,xmax,* from test flashback between xid 6773 and 6776;xmin | xmax | id | tx | ts------+------+----+-----+----------------------------6773 | 6774 | 1 | tx1 | 2018-10-16 15:43:17.2880236774 | 6775 | 1 | tx2 | 2018-10-16 15:43:40.5283676775 | 6776 | 1 | tx3 | 2018-10-16 15:43:46.9434836776 | 0 | 1 | tx4 | 2018-10-16 15:44:23.295938(4 rows)# select xmin,xmax,* from test flashback timestamp '2018-10-16 15:43:18';xmin | xmax | id | tx | ts------+------+----+-----+----------------------------6773 | 6774 | 1 | tx1 | 2018-10-16 15:43:17.288023(1 row)# select xmin,xmax,* from test flashback xid 6773;xmin | xmax | id | tx | ts------+------+----+-----+----------------------------6773 | 6774 | 1 | tx1 | 2018-10-16 15:43:17.288023(1 row)# select xmin,xmax,* from test flashback between timestamp '2018-10-16 15:43:45' and '2018-10-16 15:43:50';xmin | xmax | id | tx | ts------+------+----+-----+----------------------------6774 | 6775 | 1 | tx2 | 2018-10-16 15:43:40.5283676775 | 6776 | 1 | tx3 | 2018-10-16 15:43:46.943483(2 rows)To achieve:MVCC retains an older version of the data item (dead tuple).Deferred VACUUM, ensures that misoperated data has not been garbage collected.vacuum_defer_cleanup_age = 1000000.Turn on transaction commit time tracking to ensure that transaction commit time is available from xid.track_commit_timestamp = on.Add the syntax for flashback queries and flashback version queries.SELECT column_name FROM table_name.FLASHBACK TIMESTAMP/XID expression.WHERE conditionSELECT column_name FROM table_name.FLASHBACK BETWEEN TIMESTAMP/XID expression AND expression.WHERE conditionWhen a flashback query reads data, the appropriate data item is selected through the visibility rule.Performance:The above realizes the function of flashback query, but there is an inevitable problem, after setting the vacuum_defer_cleanup_age parameter, the, dead tuple accumulates too much, which leads to the bloat problem and obvious performance degradation.I have the following questions.Delayed cleanup, resulting in performance degradation, what are the solutions recommended?What do you suggest for the flashback feature?Although postgres has excellent backup and restore capabilities, have you considered adding flashbacks?Looking forward to your suggestions and improvements---font{ line-height: 1.6; } Yang Jie Highgo Software
Hello,
Did you read this thread:
https://www.postgresql.org/message-id/flat/78aadf6b-86d4-21b9-9c2a-51f1efb8a499%40postgrespro.ru
I have also posted patch with prototype implementation of AS OF queries.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
This is something I'm very interested in. Very helpful for fixing mistakes you didn't realize in time.
One question, would it be possible to allow this to be able to be configured on a hot standby and not the master?
That would be very helpful by being able to keep some arbitrary length of extra data on a replica which does not have to be kept on the primary server, thus not impacting performance.
Just a thought I had and was not sure if it was even feasible.
Happy to see work happening on this though.
-Adam