Temporal Table Proposal - Mailing list pgsql-hackers
From | Ibrar Ahmed |
---|---|
Subject | Temporal Table Proposal |
Date | |
Msg-id | CALtqXTce8PBHk-mo-V5=5fxWACQmZM=Yd_RkoR2ZJcEw0fdWJQ@mail.gmail.com Whole thread Raw |
In response to | Refactoring the checkpointer's fsync request queue (Thomas Munro <thomas.munro@enterprisedb.com>) |
Responses |
Re: Temporal Table Proposal
|
List | pgsql-hackers |
Hi,
While working on another PostgreSQL feature, I was thinking that we could use a temporal table in PostgreSQL. Some existing databases offer this. I searched for any discussion on the PostgreSQL mailing list, but could not find any. Maybe my search wasn’t accurate enough: if anyone can point me to a discussion, that would be useful.
https://www.percona.com/community-blog/2018/12/14/notes-mariadb-system-versioned-tables/
https://www.mssqltips.com/sqlservertip/3680/introduction-to-sql-server-temporal-tables/
What?
A temporal table feature has two tables “Temporal Table” and “History Table”. The Temporal Table is where our current tuples are stored. This is the main table, just like other PostgreSQL tables. The history table is the other half of the feature and is where all the history of the main table is stored. This table is created automatically. The history table is used to query certain data at a certain time, useful for a point in time analysis. It also offers built-in versioning.
Why?
Normally users write triggers or procedures to write a history of a table’s data. Some time-sensitive applications will have code to write a data history somewhere. By having this functionality, PostgreSQL would do it automatically. For example, if we have a retail table where the price of each product inventory item is stored. The temporal table would hold the current price of the product. When we update the price of a product in the temporal table, then a new row with a timestamp would be added to the history table. That means on each update of the price, a new row containing the previous price would be added to the history table. The same would apply in the case of deletes. When we delete any product from our inventory, then a row would be added to the history table storing the last price of the product prior to delete. For any point in time, we can access the price at which we sold the product.
How?
I was thinking about the implementation of this feature and read the documentation on the internet. Microsoft SQL Server, for example, offers such a feature. If we come to the conclusion we should offer the feature, I will share the complete design.
Here are some ideas I have around this:
- Syntax.
CREATE TABLE tablename
(
…
start_time DATETIME,
end_time DATETIME,
PERIOD FOR SYSTEM_TIME (start_time, end_time)
)
WITH
(
SYSTEM_VERSIONING = ON (HISTORY_TABLE = tablename_history)
);
The tablename is the temporal table and tablename_history is be the history table. The name of the history table is optional, in which case, PostgreSQL will generate a table name. These two columns are a must for a temporal table “start_time” and “end_time”. The PERIOD FOR SYSTEM_TIME is used to identify these columns.
ALTER TABLE SET SYSTEM_VERSIONING = ON/OFF
Due to this syntax addition in CREATE/ALTER TABLE, there are some grammar additions required in the parser.
PERIOD FOR SYSTEM TIME
SYSTEM VERSIONING
- Catalog Changes.
There are two options, one is to have another catalog pg_temporal which will contain the information or we could have that information in the pg_catalog
Table "public.pg_temporal"
Column | Type | Collation | Nullable | Default
-----------------+------+-----------+----------+---------
temporal_id | oid | | |
hist_id | oid | | |
start_date_name | text | | |
end_date_name | text | | |
--
While working on another PostgreSQL feature, I was thinking that we could use a temporal table in PostgreSQL. Some existing databases offer this. I searched for any discussion on the PostgreSQL mailing list, but could not find any. Maybe my search wasn’t accurate enough: if anyone can point me to a discussion, that would be useful.
https://www.percona.com/community-blog/2018/12/14/notes-mariadb-system-versioned-tables/
https://www.mssqltips.com/sqlservertip/3680/introduction-to-sql-server-temporal-tables/
What?
A temporal table feature has two tables “Temporal Table” and “History Table”. The Temporal Table is where our current tuples are stored. This is the main table, just like other PostgreSQL tables. The history table is the other half of the feature and is where all the history of the main table is stored. This table is created automatically. The history table is used to query certain data at a certain time, useful for a point in time analysis. It also offers built-in versioning.
Why?
Normally users write triggers or procedures to write a history of a table’s data. Some time-sensitive applications will have code to write a data history somewhere. By having this functionality, PostgreSQL would do it automatically. For example, if we have a retail table where the price of each product inventory item is stored. The temporal table would hold the current price of the product. When we update the price of a product in the temporal table, then a new row with a timestamp would be added to the history table. That means on each update of the price, a new row containing the previous price would be added to the history table. The same would apply in the case of deletes. When we delete any product from our inventory, then a row would be added to the history table storing the last price of the product prior to delete. For any point in time, we can access the price at which we sold the product.
How?
I was thinking about the implementation of this feature and read the documentation on the internet. Microsoft SQL Server, for example, offers such a feature. If we come to the conclusion we should offer the feature, I will share the complete design.
Here are some ideas I have around this:
- Syntax.
CREATE TABLE tablename
(
…
start_time DATETIME,
end_time DATETIME,
PERIOD FOR SYSTEM_TIME (start_time, end_time)
)
WITH
(
SYSTEM_VERSIONING = ON (HISTORY_TABLE = tablename_history)
);
The tablename is the temporal table and tablename_history is be the history table. The name of the history table is optional, in which case, PostgreSQL will generate a table name. These two columns are a must for a temporal table “start_time” and “end_time”. The PERIOD FOR SYSTEM_TIME is used to identify these columns.
ALTER TABLE SET SYSTEM_VERSIONING = ON/OFF
Due to this syntax addition in CREATE/ALTER TABLE, there are some grammar additions required in the parser.
PERIOD FOR SYSTEM TIME
SYSTEM VERSIONING
- Catalog Changes.
There are two options, one is to have another catalog pg_temporal which will contain the information or we could have that information in the pg_catalog
Table "public.pg_temporal"
Column | Type | Collation | Nullable | Default
-----------------+------+-----------+----------+---------
temporal_id | oid | | |
hist_id | oid | | |
start_date_name | text | | |
end_date_name | text | | |
--
Ibrar Ahmed
pgsql-hackers by date: