On 2018-08-23 11:04:30 -0400, Tom Lane wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > On Monday, August 20, 2018, Wu Ivy <ivywuyzl@gmail.com> wrote:
> >> Why are SELECT query never marked nullable?
>
> > Basically the nullability property is used by the planner for optimization
> > during the joining of physical tables. As soon as you try outputting
> > columns the ability to enforce not null goes away because of, in
> > particular, outer joins. While some changes could maybe be made the
> > cost-benefit to do so doesn't seem favorable.
>
> A further thought on this is that really it's a historical accident that
> the elements of tuple descriptors are exactly pg_attribute rows. There
> are a *whole lot* of fields in pg_attribute that aren't especially
> relevant to tuple sets generated on-the-fly within a query, and typically
> won't get filled with anything except default values. The only fields
> that really mean a lot for a dynamic tuple set are the data type and
> values derived from that, and in some usages the column name.
And arguably there's a fair bit of redundancy in pg_attribute, just
because it's convenient for tupledescs. Given that pg_attribute very
commonly is the largest catalog table by far, that very well could use
some attention. Without tupdescs in mind, there's really not much point
for pg_attribute to repeat a good portion of pg_type again, for example,
nor is attcacheoff really meaningful.
> [ wanders away wondering if it'd be worth our time to design a new,
> more compact TupleDesc struct without the meaningless fields ... ]
Yes, I think it'd would be.
Greetings,
Andres Freund