Thread: Odd error in complex query (7.2): Sub-SELECT uses un-GROUPed...
This is using an almost up-to-date CVS version. Sorry for the convoluted example: Create table t1(n text, f1 int, f2 int); create table g1(n text, t1n text); create table s1(k1 text, f1a int, f1bint, f2 int, x int, d timestamp); create view v1 as select k1, d, (select g1.n from g1, t1 where g1.t1n=t1.n and t1.f1 = s1.f1a and t1.f2 = s1.f2 limit 1) as a, (select g1.n from g1, t1 where g1.t1n=t1.n and t1.f1 = s1.f1b and t1.f2 = s1.f2 limit 1) as b,x from s1 ; explain select coalesce(a, b, 'other') as name, k1, sum(x) as tot from v1 where d>'28-oct-2001 12:00' andd<current_timestamp group by 1,2 order by tot desc limit 40; ERROR: Sub-SELECT uses un-GROUPed attribute s1.f2from outer query Maybe I am asking too much of views? ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes: > Sorry for the convoluted example: A simplified example is create table t1(n text, f1 int);create table s1(f1a int, x int);create view v1 as select x, (select t1.n from t1 where t1.f1= s1.f1a) as afrom s1;select a from v1 group by 1;ERROR: Sub-SELECT uses un-GROUPed attribute s1.f1a from outer query The expanded-out equivalent of the problem query is select (select t1.n from t1 where t1.f1 = s1.f1a) as a from s1group by 1; which I believe is indeed illegal. But it seems like it ought to be legal with the view in between ... ie, a view isn't purely a macro. The implementation issue here is how to decide not to pull up the view subquery (ie, not to flatten the query into the illegal form). We already do that for certain conditions; we just have to figure out what additional restriction should be used to preclude this case. The restriction should be as tight as possible to avoid losing the ability to optimize queries using views. A simplistic idea is to not pull up views that contain subselects in the targetlist, but I have a feeling that's not the right restriction. Or maybe it is --- maybe the point is that the view targetlist is logically evaluated *before* the outer query executes, and we can't do a pullup if evaluating it later would change the results. Comments? I suspect this is trickier than it looks :-( regards, tom lane
At 14:36 29/10/01 -0500, Tom Lane wrote: >The expanded-out equivalent of the problem query is > > select (select t1.n from t1 where t1.f1 = s1.f1a) as a from s1 > group by 1; > >which I believe is indeed illegal. But it seems like it ought to be >legal with the view in between ... ie, a view isn't purely a macro. FWIW, MS SQL/Server won't even allow the view to be defined Dec/RDB does, and it allows the query as well, with the following plannner output: Reduce Sort Cross block of 2 entries Cross block entry 1 Get Retrieval sequentially of relation S1 Cross block entry 2 Aggregate Conjunct Get Retrieval sequentially of relation T1 It also allows: select (select t1.n from t1 where t1.f1 = s1.f1a) as a from s1 group by (select t1.n from t1 where t1.f1 = s1.f1a); with the same plan. Which does not, on the face of it, seem illegal to me. RDB usually rewrites column-select-expressions as cross-joins (with appropriate checking for multiple/no rows). Which seems to work well with my expectations for both queries, although I presume this it not what the spec says? >The implementation issue here is how to decide not to pull up the view >subquery (ie, not to flatten the query into the illegal form). It's not clear to me that it should be illegal - for every row in s1, it should return the result of the column-select (which may be NULL) - or is that what 'not flattening the query' does? >We >already do that for certain conditions; we just have to figure out what >additional restriction should be used to preclude this case. The >restriction should be as tight as possible to avoid losing the ability >to optimize queries using views. How about whenenever it will throw this error? ;-)., >A simplistic idea is to not pull up views that contain subselects in >the targetlist, but I have a feeling that's not the right restriction. That does seem excessive. I'm way over my head here, but can a column select be implemented as a special JOIN that always returns 1 row (maybe NULL), and throws an error if more than one row? >Or maybe it is --- maybe the point is that the view targetlist is >logically evaluated *before* the outer query executes, This is very nasty, and would really hurt the utility of views. > and we can't do >a pullup if evaluating it later would change the results. Huh? ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes: > At 14:36 29/10/01 -0500, Tom Lane wrote: >> The expanded-out equivalent of the problem query is >> select (select t1.n from t1 where t1.f1 = s1.f1a) as a from s1 >> group by 1; >> which I believe is indeed illegal. > Dec/RDB ... allows the query > It also allows: > select (select t1.n from t1 where t1.f1 = s1.f1a) as a from s1 > group by (select t1.n from t1 where t1.f1 = s1.f1a); > with the same plan. Which does not, on the face of it, seem illegal to me. Hmm. Maybe the query is legal, and the problem is just one of an incorrect check for ungrouped vars in subselects. Need to think more. regards, tom lane
At 14:36 29/10/01 -0500, Tom Lane wrote: >Philip Warner <pjw@rhyme.com.au> writes: >> Sorry for the convoluted example: > >A simplified example is And here's a simpler one that seems to avoid views altogether: create table lkp(f1 int); create table t1(f1 int, x int); Select case when Exists(Select * From lkp where lkp.f1 = t1.f1) then 'known' else 'unknown' end as status, sum(x) from t1 group by 1; It's pretty similar to the sample you gave, but also presents the sort of operation people may well want to perform. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner wrote: > > At 14:36 29/10/01 -0500, Tom Lane wrote: > >Philip Warner <pjw@rhyme.com.au> writes: > >> Sorry for the convoluted example: > > > >A simplified example is > > And here's a simpler one that seems to avoid views altogether: > > create table lkp(f1 int); > create table t1(f1 int, x int); > > Select > case when Exists(Select * From lkp where lkp.f1 = t1.f1) then > 'known' > else > 'unknown' > end as status, > sum(x) > from t1 > group by 1; > A bit off-tppic question, but is our optimiser smart enough to recognize the query inside exists as LIMIT 1 query ? ------------ Hannu
At 10:43 30/10/01 +0200, Hannu Krosing wrote: > >A bit off-tppic question, but is our optimiser smart enough to >recognize the query inside exists as LIMIT 1 query ? > Yep. ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes: > Select > case when Exists(Select * From lkp where lkp.f1 = t1.f1) then > 'known' > else > 'unknown' > end as status, > sum(x) > from t1 > group by 1; Okay, I'm convinced: the problem is that the test for ungrouped vars used inside subselects is too simplistic. I think it's failing to consider that if the whole subselect can be considered a grouped expression, we shouldn't object to ungrouped individual vars within it. Will work on it. regards, tom lane
On Tue, 30 Oct 2001 11:49:28 +1100 Philip Warner wrote: > > It's not clear to me that it should be illegal - for every row in s1, it > should return the result of the column-select (which may be NULL) - or is > that what 'not flattening the query' does? > > >We > >already do that for certain conditions; we just have to figure out what > >additional restriction should be used to preclude this case. The > >restriction should be as tight as possible to avoid losing the ability > >to optimize queries using views. > > How about whenenever it will throw this error? ;-)., > > >A simplistic idea is to not pull up views that contain subselects in > >the targetlist, but I have a feeling that's not the right restriction. > > That does seem excessive. I'm way over my head here, but can a column > select be implemented as a special JOIN that always returns 1 row (maybe > NULL), and throws an error if more than one row? > Hi, I wouldn't think most people need a query like this, but also had been in puzzle as to how not to pull up. Finally the problem could be solved by using a statement of an ORDER BY. Therefore, if you add an ORDER BY to a view of your complex query, it will work correctly. And, as long as each of correlative subselects which are in columns always returns one row, I feel it is legal rather than illegal that its subselects can be GROUPed. -- on 7.1.2 create table t1(n text, f1 int, f2 int); create table g1(n text, t1n text); create table s1(k1 text, f1a int, f1b int, f2 int, x int, d timestamp); create view v1 as select k1, d, (select g1.n from g1, t1 where g1.t1n=t1.n and t1.f1 = s1.f1a and t1.f2 = s1.f2 limit 1) as a, (select g1.n from g1, t1 where g1.t1n=t1.n and t1.f1 = s1.f1b and t1.f2 = s1.f2 limit 1) as b, x from s1 order by 1 -- *** an additional statement *** ; explain select coalesce(a, b, 'other') as name, k1, sum(x) as tot from v1 where d > '28-oct-2001 12:00' and d < current_timestamp group by 1,2 order by tot desc limit 40; Regards, Masaru Sugawara
> Okay, I'm convinced: the problem is that the test for ungrouped vars > used inside subselects is too simplistic. Not only was that true, but the handling of GROUP BY expressions was pretty grotty in general: they'd be re-evaluated at multiple levels of the resulting plan tree. Which is not too bad for "GROUP BY a+b", but it's unpleasant when a complex subselect is involved. I've committed fixes to CVS. regards, tom lane