Re: Transforming IN (...) to ORs, volatility - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Transforming IN (...) to ORs, volatility
Date
Msg-id 4DA2E0C3020000250003C64A@gw.wicourts.gov
Whole thread Raw
In response to Re: Transforming IN (...) to ORs, volatility  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Transforming IN (...) to ORs, volatility
List pgsql-hackers
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> On 05.04.2011 18:42, Heikki Linnakangas wrote:
>> On 05.04.2011 13:19, Marti Raudsepp wrote:
>>> On Fri, Apr 1, 2011 at 14:24, Heikki Linnakangas
>>> <heikki.linnakangas@enterprisedb.com> wrote:
>>>> We sometimes transform IN-clauses to a list of ORs:
>>>>
>>>> postgres=# explain SELECT * FROM foo WHERE a IN (b, c);
>>>> QUERY PLAN
>>>> Seq Scan on foo (cost=0.00..39.10 rows=19 width=12)
>>>> Filter: ((a = b) OR (a = c))
>>>>
>>>> But what if you replace "a" with a volatile function? It
>>>> doesn't seem legal to do that transformation in that case, but
>>>> we do it:
>>>>
>>>> postgres=# explain SELECT * FROM foo WHERE
>>>> (random()*2)::integer IN (b, c);
>>>> QUERY PLAN
>>>>
>>>> Seq Scan on foo (cost=0.00..68.20 rows=19 width=12)
>>>> Filter: ((((random() * 2::double precision))::integer = b) OR
>>>> (((random()
>>>> * 2::double precision))::integer = c))
>>>
>>> Is there a similar problem with the BETWEEN clause
>>> transformation into AND expressions?
>>>
>>> marti=> explain verbose select random() between 0.25 and 0.75;
>>> Result (cost=0.00..0.02 rows=1 width=0)
>>> Output: ((random()>= 0.25::double precision) AND (random()<=
>>> 0.75::double precision))
>>
>> Yes, good point.
> 
> Hmm, the SQL specification explicitly says that
> 
> X BETWEEN Y AND Z
> 
> is equal to
> 
> X >= Y AND X <= Z
> 
> It doesn't say anything about side-effects of X. Seems like an
> oversight in the specification. I would not expect X to be
> evaluated twice, and I think we should change BETWEEN to not do
> that.
Does the SQL spec explicitly say anything about how many times X
should be evaluated if you were to code it as?:
X >= Y AND X <= Z
If it does, evaluating it a different number of times for BETWEEN
would seem to be a deviation from standard.  Evaluating it once seem
less surprising, but if we're going to deviate from the standard in
doing that, it at least deserves a clear note to that effect in the
docs.
Evaluating X once for BETWEEN seems better from a POLA perspective,
unless you happen to be massaging a query to another form and
trusting that the equivalence defined in the standard will always
hold.
> Does anyone object to making BETWEEN and IN more strict about the
> data types? At the moment, you can do this:
> 
> postgres=# SELECT '1234' BETWEEN '10001'::text AND 10002::int4;
>   ?column?
> ----------
>   t
> (1 row)
> 
> I'm thinking that it should throw an error. Same with IN, if the
> values in the IN-list can't be coerced to a common type. That will
> probably simplify the code a lot, and is what the SQL standard
> assumes anyway AFAICS.
+1 for more strict.
-Kevin


pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: SSI bug?
Next
From: Tatsuo Ishii
Date:
Subject: pgfoundry down?