Re: PoC/WIP: Extended statistics on expressions - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: PoC/WIP: Extended statistics on expressions
Date
Msg-id 57ad73f5-a979-5b6c-10e4-6f4ce15fce6b@enterprisedb.com
Whole thread Raw
In response to Re: PoC/WIP: Extended statistics on expressions  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
On 9/3/21 5:56 AM, Justin Pryzby wrote:
> On Wed, Sep 01, 2021 at 06:45:29PM +0200, Tomas Vondra wrote:
>> However while polishing the second patch, I realized we're allowing
>> statistics on expressions referencing system attributes. So this fails;
>>
>> CREATE STATISTICS s ON ctid, x FROM t;
>>
>> but this passes:
>>
>> CREATE STATISTICS s ON (ctid::text), x FROM t;
>>
>> IMO we should reject such expressions, just like we reject direct references
>> to system attributes - patch attached.
> 
> Right, same as indexes.  +1
>

I've pushed this check, disallowing extended stats on expressions 
referencing system attributes. This means we'll reject both ctid and 
(ctid::text), just like for indexes.

>> Furthermore, I wonder if we should reject expressions without any Vars? This
>> works now:
>>
>> CREATE STATISTICS s ON (11:text) FROM t;
>>
>> but it seems rather silly / useless, so maybe we should reject it.
> 
> To my surprise, this is also allowed for indexes...
> 
> But (maybe this is what I was remembering) it's prohibited to have a constant
> expression as a partition key.
> 
> Expressions without a var seem like a case where the user did something
> deliberately silly, and dis-similar from the case of making a stats expression
> on a simple column - that seemed like it could be a legitimate
> mistake/confusion (it's not unreasonable to write an extra parenthesis, but
> it's strange if that causes it to behave differently).
> 
> I think it's not worth too much effort to prohibit this: if they're determined,
> they can still write an expresion with a var which is constant.  I'm not going
> to say it's worth zero effort, though.
> 

I've decided not to push this. The statistics objects on expressions not 
referencing any variables seem useless, but maybe not entirely - we 
allow volatile expressions, like

   CREATE STATISTICS s ON (random()) FROM t;

which I suppose might be useful. And we reject similar cases (except for 
the volatility, of course) for indexes too.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: "Jonathan S. Katz"
Date:
Subject: Re: Release 14 Schedule
Next
From: A Z
Date:
Subject: Improved PostgreSQL Mathematics Support.