Re: Allow to collect statistics on virtual generated columns - Mailing list pgsql-hackers

From Yugo Nagata
Subject Re: Allow to collect statistics on virtual generated columns
Date
Msg-id 20260324195727.43416b69c71b5f63eed3b4bc@sraoss.co.jp
Whole thread Raw
In response to Re: Allow to collect statistics on virtual generated columns  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Responses Re: Allow to collect statistics on virtual generated columns
List pgsql-hackers
On Tue, 24 Mar 2026 09:18:06 +0000
Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

Thank you for your comments!

> On Wed, 31 Dec 2025 at 10:45, Yugo Nagata <nagata@sraoss.co.jp> wrote:
> >
> > I've attached an updated patch that fixes the broken test since 10c4fe074a.
> >
> 
> I took a look at this, and the patch appears to work as intended.
> However, I have my doubts as to whether this is the best approach.
> 
> Building stats on a virtual generated column is potentially quite
> expensive, and not something that everyone will want, so I think this
> really should be an optional feature that people can selectively
> enable, if they want.

I agree that collecting statistics on a virtual generated column should
be optional. While it can be disabled or enabled with:

 ALTER TABLE ... ALTER COLUMN ... SET STATISTICS ...

this is not very intuitive or discoverable for users, so a more explicit
interface might be preferable.

> Therefore, I think that the previous approach was probably better. If
> I'm understanding it correctly, that allowed
> 
>   CREATE STATISTICS stat_name ON virt_col FROM tbl;
> 
> as well as allowing statistics to be built on expressions including
> virtual generated columns, making it more flexible.
> 
> The problem with this previous approach was that it didn't work
> correctly if a virtual generated column's expression was changed using
> ALTER TABLE ... SET EXPRESSION. I think that could be solved by
> expanding generated column expressions at ANALYZE time, rather than at
> CREATE STATISTICS time. 

Yes. I think this approach could work. 

One concern is that CREATE STATISTICS does not allow creating extended
statistics on a single colmun. If we try this, it raises the following error:

 ERROR:  extended statistics require at least 2 columns

Therefore, if we take this approach, we would need to add an exception
to this rule for virtual columns.

Regards,
Yugo Nagata

-- 
Yugo Nagata <nagata@sraoss.co.jp>



pgsql-hackers by date:

Previous
From: Srinath Reddy Sadipiralla
Date:
Subject: Re: Teach isolation tester about injection points in background workers
Next
From: John Naylor
Date:
Subject: Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?