On Tue, 24 Mar 2026 09:18:06 +0000
Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
Thank you for your comments!
> On Wed, 31 Dec 2025 at 10:45, Yugo Nagata <nagata@sraoss.co.jp> wrote:
> >
> > I've attached an updated patch that fixes the broken test since 10c4fe074a.
> >
>
> I took a look at this, and the patch appears to work as intended.
> However, I have my doubts as to whether this is the best approach.
>
> Building stats on a virtual generated column is potentially quite
> expensive, and not something that everyone will want, so I think this
> really should be an optional feature that people can selectively
> enable, if they want.
I agree that collecting statistics on a virtual generated column should
be optional. While it can be disabled or enabled with:
ALTER TABLE ... ALTER COLUMN ... SET STATISTICS ...
this is not very intuitive or discoverable for users, so a more explicit
interface might be preferable.
> Therefore, I think that the previous approach was probably better. If
> I'm understanding it correctly, that allowed
>
> CREATE STATISTICS stat_name ON virt_col FROM tbl;
>
> as well as allowing statistics to be built on expressions including
> virtual generated columns, making it more flexible.
>
> The problem with this previous approach was that it didn't work
> correctly if a virtual generated column's expression was changed using
> ALTER TABLE ... SET EXPRESSION. I think that could be solved by
> expanding generated column expressions at ANALYZE time, rather than at
> CREATE STATISTICS time.
Yes. I think this approach could work.
One concern is that CREATE STATISTICS does not allow creating extended
statistics on a single colmun. If we try this, it raises the following error:
ERROR: extended statistics require at least 2 columns
Therefore, if we take this approach, we would need to add an exception
to this rule for virtual columns.
Regards,
Yugo Nagata
--
Yugo Nagata <nagata@sraoss.co.jp>