Re: WIP: System Versioned Temporal Table - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: WIP: System Versioned Temporal Table
Date
Msg-id CAMT0RQR-hyqrBeRCTgQ1jNLtnfK0nbazo9V1qpmkfzq-VADD5A@mail.gmail.com
Whole thread Raw
In response to Re: WIP: System Versioned Temporal Table  (Simon Riggs <simon.riggs@enterprisedb.com>)
Responses Re: WIP: System Versioned Temporal Table
List pgsql-hackers
A side table has the nice additional benefit that we can very easily
version the *table structure* so when we ALTER TABLE and the table
structure changes we just make a new side table with now-currents
structure.

Also we may want different set of indexes on historic table(s) for
whatever reason

And we may even want to partition history tables for speed, storage
cost  or just to drop very ancient history

-----
Hannu Krosing
Google Cloud - We have a long list of planned contributions and we are hiring.
Contact me if interested.

On Sun, Sep 19, 2021 at 8:32 PM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:
>
> On Sun, 19 Sept 2021 at 01:16, Corey Huinker <corey.huinker@gmail.com> wrote:
> >>
> >> 1. Much of what I have read about temporal tables seemed to imply or almost assume that system temporal tables
wouldbe implemented as two actual separate tables. Indeed, SQLServer appears to do it that way [1] with syntax like 
> >>
> >> WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.WebsiteUserInfoHistory));
> >>
> >>
> >> Q 1.1. Was that implementation considered and if so, what made this implementation more appealing?
> >>
> >
> > I've been digging some more on this point, and I've reached the conclusion that a separate history table is the
betterimplementation. It would make the act of removing system versioning into little more than a DROP TABLE, plus
adjustingthe base table to reflect that it is no longer system versioned. 
>
> Thanks for giving this a lot of thought. When you asked the question
> the first time you hadn't discussed how that might work, but now we
> have something to discuss.
>
> > 10. Queries that omit the FOR SYSTEM_TIME clause, as well as ones that use FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP,
wouldsimply use the base table directly with no quals to add. 
> > 11. Queries that use FOR SYSTEM_TIME and not FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP, then the query would do a
unionof the base table and the history table with quals applied to both. 
>
>
> > 14. DROP SYSTEM VERSIONING from a table would be quite straightforward - the history table would be dropped along
withthe triggers that reference it, setting relissystemversioned = 'f' on the base table. 
> >
> > I think this would have some key advantages:
> >
> > 1. MVCC bloat is no worse than it was before.
>
> The number of row versions stored in the database is the same for
> both, just it would be split across two tables in this form.
>
> > 2. No changes whatsoever to referential integrity.
>
> The changes were fairly minor, but I see your thinking about indexes
> as a simplification.
>
> > 3. DROP SYSTEM VERSIONING becomes an O(1) operation.
>
> It isn't top of mind to make this work well. The whole purpose of the
> history is to keep it, not to be able to drop it quickly.
>
>
> > Thoughts?
>
> There are 3 implementation routes that I see, so let me explain so
> that others can join the discussion.
>
> 1. Putting all data in one table. This makes DROP SYSTEM VERSIONING
> effectively impossible. It requires access to the table to be
> rewritten to add in historical quals for non-historical access and it
> requires some push-ups around indexes. (The current patch adds the
> historic quals by kludging the parser, which is wrong place, since it
> doesn't work for joins etc.. However, given that issue, the rest seems
> to follow on naturally).
>
> 2. Putting data in a side table. This makes DROP SYSTEM VERSIONING
> fairly trivial, but it complicates many DDL commands (please make a
> list?) and requires the optimizer to know about this and cater to it,
> possibly complicating plans. Neither issue is insurmountable, but it
> becomes more intrusive.
>
> The current patch could go in either of the first 2 directions with
> further work.
>
> 3. Let the Table Access Method handle it. I call this out separately
> since it avoids making changes to the rest of Postgres, which might be
> a good thing, with the right TAM implementation.
>
> My preferred approach would be to do this "for free" in the table
> access method, but we're a long way from this in terms of actual
> implementation. When Corey  suggested earlier that we just put the
> syntax in there, this was the direction I was thinking.
>
> After waiting a day since I wrote the above, I think we should go with
> (2) as Corey suggests, at least for now, and we can always add (3)
> later.
>
> --
> Simon Riggs                http://www.EnterpriseDB.com/
>
>



pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: WIP: System Versioned Temporal Table
Next
From: Corey Huinker
Date:
Subject: Re: Undocumented AT TIME ZONE INTERVAL syntax