Home > mailing lists

Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty
Date	March 14, 2011 17:49:33
Msg-id	13639.1300124961@sss.pgh.pa.us Whole thread Raw
In response to	Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

Tree view

I wrote:
> In any case, I am sure that that what this describes is not what our
> current code does :-(, and that we can't get anywhere close to this with
> the existing infrastructure.  So at this point I'm thinking that the
> safest approach is to rip out the result-collation caching fields and
> perform collation assignment in a parsing post-pass.  That will allow us
> to revise the collation assignment algorithm without further catversion
> bumps.

After further examination of the patch, I'm feeling that removing the
result-collation fields isn't going to be very practical after all.
The problem is that there are now exprCollation() calls all over the
place, many of them in performance-critical places, and there is no
way that exprCollation() will be a cheap operation if we remove the
result-collation fields.  I had thought that we could avoid this if
we still stored collation in a few critical places such as TargetEntry.
But that only appears to take care of some of the call sites, and anyway
it'd be a little weird to store collation there when we don't store
column type there.  For better or worse, the system is designed around
the assumption that expression trees are self-identifying as to output
type and typmod, so I think we've now got to include result collation
in that too.

This implies further bloat in expression trees of course.  By my count,
the following node types also need to store input collation, so will now
have 2 collation fields not
one:AggrefWindowFuncFuncExprOpExprDistinctExprNullIfExprScalarArrayOpExprRowCompareExprMinMaxExpr

(No, wait, DistinctExpr and RowCompareExpr always yield boolean so
they don't need result collation fields, just the input collation.)

There are also several node types that the existing patch neglected to
add collation fields to, but really need to have it AFAICS; one pretty
obvious example being CoerceToDomain.  Actually I think we have to have
a result collation field in every node type that can possibly deliver a
result of a collatable type.

I'm still going to switch over to the post-pass collation assignment
method, because otherwise we're going to need to add collation state
fields as well to all of these node types ...
        regards, tom lane

pgsql-hackers by date:

From: Merlin Moncure
Date: 14 March 2011, 17:04:24
Subject: Re: Avoiding repeated ON COMMIT truncation for temporary tables

From: Magnus Hagander
Date: 14 March 2011, 18:52:54
Subject: Re: initdb -A ident, with params

Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty - Mailing list pgsql-hackers

Previous

Next