Re: Persist MVCC forever - retain history - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Persist MVCC forever - retain history
Date
Msg-id CA+hUKGL6Zm4WgCxJ+8dn6U4=nxXws1t+Y=HRXwswrrqGLKvw8g@mail.gmail.com
Whole thread Raw
In response to Persist MVCC forever - retain history  (Mitar <mmitar@gmail.com>)
Responses Re: Persist MVCC forever - retain history
List pgsql-hackers
On Fri, Jul 3, 2020 at 6:56 AM Mitar <mmitar@gmail.com> wrote:
> I was thinking and reading about how to design the schema to keep
> records of all changes which happen to the table, at row granularity,
> when I realized that all this is already done for me by PostgreSQL
> MVCC. All rows (tuples) are already stored, with an internal version
> field as well.

This was a research topic in ancient times (somewhere I read that in
some ancient version, VACUUM didn't originally remove tuples, it moved
them to permanent write-only storage).  Even after the open source
project began, there was a "time travel" feature, but it was removed
in 6.2:

https://www.postgresql.org/docs/6.3/c0503.htm

> So I wonder, how could I hack PostgreSQL to disable vacuuming a table,
> so that all tuples persist forever, and how could I make those
> internal columns visible so that I could make queries asking for
> results at the particular historical version of table state? My
> understanding is that indices are already indexing over those internal
> columns as well, so those queries over historical versions would be
> efficient as well. Am I missing something which would make this not
> possible?

There aren't indexes on those things.

If you want to keep track of all changes in a way that lets you query
things as of historical times, including joins, and possibly including
multiple time dimensions ("on the 2nd of Feb, what address did we
think Fred lived at on the 1st of Jan?") you might want to read
"Developing Time-Oriented Database Applications in SQL" about this,
freely available as a PDF[1].  There's also a bunch of temporal
support in more recent SQL standards, not supported by PostgreSQL, and
it was designed by the author of that book.  There are people working
on trying to implement parts of the standard support for PostgreSQL.

> Is this something I would have to run a custom version of PostgreSQL
> or is this possible through an extension of sort?

There are some extensions that offer some temporal support inspired by
the standard (I haven't used any of them so I can't comment on them).

[1] http://www2.cs.arizona.edu/~rts/publications.html



pgsql-hackers by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: Persist MVCC forever - retain history
Next
From: Tom Lane
Date:
Subject: Re: Sync vs Flush