Re: Virtual generated columns - Mailing list pgsql-hackers

From Corey Huinker
Subject Re: Virtual generated columns
Date
Msg-id CADkLM=cRG-k314mtMq-S3wka8BWigKZXGR2PSmVNVv5OUYsiFA@mail.gmail.com
Whole thread Raw
In response to Virtual generated columns  (Peter Eisentraut <peter@eisentraut.org>)
Responses Re: Virtual generated columns
List pgsql-hackers
On Mon, Apr 29, 2024 at 4:24 AM Peter Eisentraut <peter@eisentraut.org> wrote:
Here is a patch set to implement virtual generated columns.

I'm very excited about this!



The main feature patch (0005 here) generally works but has a number of
open corner cases that need to be thought about and/or fixed, many of
which are marked in the code or the tests.  I'll continue working on
that.  But I wanted to see if I can get some feedback on the test
structure, so I don't have to keep changing it around later.

I'd be very interested to see virtual generated columns working, as one of my past customers had a need to reclassify data in a partitioned table, and the ability to detach a partition, alter the virtual generated columns, and re-attach would have been great. In case you care, it was basically an "expired" flag, but the rules for what data "expired" varied by country of customer and level of service.

+ * Stored generated columns cannot work: They are computed after
+ * BEFORE triggers, but partition routing is done before all
+ * triggers.  Maybe virtual generated columns could be made to
+ * work, but then they would need to be handled as an expression
+ * below.

I'd say you nailed it with the test structure. The stored/virtual copy/split is the ideal way to approach this, which makes the diff very easy to understand.

+1 for not handling domain types yet.

 -- generation expression must be immutable
-CREATE TABLE gtest_err_4 (a int PRIMARY KEY, b double precision GENERATED ALWAYS AS (random()) STORED);
+CREATE TABLE gtest_err_4 (a int PRIMARY KEY, b double precision GENERATED ALWAYS AS (random()) VIRTUAL);

Does a VIRTUAL generated column have to be immutable? I can see where the STORED one has to be, but consider the following:

CREATE TABLE foo (
created_at timestamptz DEFAULT CURRENT_TIMESTAMP,
row_age interval GENERATED ALWAYS AS CURRENT_TIMESTAMP - created_at
);

 -- can't have generated column that is a child of normal column
 CREATE TABLE gtest_normal (a int, b int);
-CREATE TABLE gtest_normal_child (a int, b int GENERATED ALWAYS AS (a * 2) STORED) INHERITS (gtest_normal);  -- error
+CREATE TABLE gtest_normal_child (a int, b int GENERATED ALWAYS AS (a * 2) VIRTUAL) INHERITS (gtest_normal);  -- error

This is the barrier to the partitioning reorganization scheme I described above. Is there any hard rule why a child table couldn't have a generated column matching the parent's regular column? I can see where it might prevent indexing that column on the parent table, but is there some other dealbreaker or is this just a "it doesn't work yet" situation?

One last thing to keep in mind is that there are two special case expressions in the spec:

GENERATED ALWAYS AS ROW START
GENERATED ALWAYS AS ROW END

and we'll need to be able to fit those into the catalog. I'll start another thread for that unless you prefer I keep it here.

pgsql-hackers by date:

Previous
From: Jacob Champion
Date:
Subject: Re: Direct SSL connection with ALPN and HBA rules
Next
From: Heikki Linnakangas
Date:
Subject: Re: Direct SSL connection with ALPN and HBA rules