Re: Allow to collect statistics on virtual generated columns - Mailing list pgsql-hackers

From Dean Rasheed
Subject Re: Allow to collect statistics on virtual generated columns
Msg-id CAEZATCXkZwJ_6FCM7RMKFiNC4ui+CLmL-=Y9AiYmDpnPS+ftWw@mail.gmail.com
In response to Re: Allow to collect statistics on virtual generated columns  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Responses Re: Allow to collect statistics on virtual generated columns
List pgsql-hackers
On Thu, 26 Mar 2026 at 16:00, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> On Thu, 26 Mar 2026 at 15:09, Yugo Nagata <nagata@sraoss.co.jp> wrote:
> >
> > I've attached an updated patch including the documentation and tests.

Looking at get_relation_statistics(), I think that you need to call
expand_generated_columns_in_expr() *before* ChangeVarNodes() so that
Vars in the expanded expression end up with the correct varno.

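This obviously affects queries with more than one table in the FROM
clause. In the example below, f1 and f2 have the same filter, but the
row estimates differ (500 for f1, 3 for f2):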

drop table if exists foo;
create table foo (a int, b int generated always as (a*2) virtual);
insert into foo select x from generate_series(1,10) x;
insert into foo select 100 from generate_series(1,500);
create statistics s on b from foo;
analyse foo;
explain select * from foo f1, foo f2 where f1.b = 200 and f2.b = 200;

                            QUERY PLAN
-------------------------------------------------------------------
 Nested Loop  (cost=0.00..47.56 rows=1500 width=16)
   ->  Seq Scan on foo f1  (cost=0.00..10.65 rows=500 width=4)
         Filter: ((a * 2) = 200)
   ->  Materialize  (cost=0.00..10.66 rows=3 width=4)
         ->  Seq Scan on foo f2  (cost=0.00..10.65 rows=3 width=4)
               Filter: ((a * 2) = 200)
(6 rows)

Regards,
Dean


