Re: Transforming IN (...) to ORs, volatility - Mailing list pgsql-hackers
From | Kevin Grittner |
---|---|
Subject | Re: Transforming IN (...) to ORs, volatility |
Date | |
Msg-id | 4DA2E0C3020000250003C64A@gw.wicourts.gov Whole thread Raw |
In response to | Re: Transforming IN (...) to ORs, volatility (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Responses |
Re: Transforming IN (...) to ORs, volatility
|
List | pgsql-hackers |
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > On 05.04.2011 18:42, Heikki Linnakangas wrote: >> On 05.04.2011 13:19, Marti Raudsepp wrote: >>> On Fri, Apr 1, 2011 at 14:24, Heikki Linnakangas >>> <heikki.linnakangas@enterprisedb.com> wrote: >>>> We sometimes transform IN-clauses to a list of ORs: >>>> >>>> postgres=# explain SELECT * FROM foo WHERE a IN (b, c); >>>> QUERY PLAN >>>> Seq Scan on foo (cost=0.00..39.10 rows=19 width=12) >>>> Filter: ((a = b) OR (a = c)) >>>> >>>> But what if you replace "a" with a volatile function? It >>>> doesn't seem legal to do that transformation in that case, but >>>> we do it: >>>> >>>> postgres=# explain SELECT * FROM foo WHERE >>>> (random()*2)::integer IN (b, c); >>>> QUERY PLAN >>>> >>>> Seq Scan on foo (cost=0.00..68.20 rows=19 width=12) >>>> Filter: ((((random() * 2::double precision))::integer = b) OR >>>> (((random() >>>> * 2::double precision))::integer = c)) >>> >>> Is there a similar problem with the BETWEEN clause >>> transformation into AND expressions? >>> >>> marti=> explain verbose select random() between 0.25 and 0.75; >>> Result (cost=0.00..0.02 rows=1 width=0) >>> Output: ((random()>= 0.25::double precision) AND (random()<= >>> 0.75::double precision)) >> >> Yes, good point. > > Hmm, the SQL specification explicitly says that > > X BETWEEN Y AND Z > > is equal to > > X >= Y AND X <= Z > > It doesn't say anything about side-effects of X. Seems like an > oversight in the specification. I would not expect X to be > evaluated twice, and I think we should change BETWEEN to not do > that. Does the SQL spec explicitly say anything about how many times X should be evaluated if you were to code it as?: X >= Y AND X <= Z If it does, evaluating it a different number of times for BETWEEN would seem to be a deviation from standard. Evaluating it once seem less surprising, but if we're going to deviate from the standard in doing that, it at least deserves a clear note to that effect in the docs. Evaluating X once for BETWEEN seems better from a POLA perspective, unless you happen to be massaging a query to another form and trusting that the equivalence defined in the standard will always hold. > Does anyone object to making BETWEEN and IN more strict about the > data types? At the moment, you can do this: > > postgres=# SELECT '1234' BETWEEN '10001'::text AND 10002::int4; > ?column? > ---------- > t > (1 row) > > I'm thinking that it should throw an error. Same with IN, if the > values in the IN-list can't be coerced to a common type. That will > probably simplify the code a lot, and is what the SQL standard > assumes anyway AFAICS. +1 for more strict. -Kevin
pgsql-hackers by date: