Thread: Strange random() Correlation
Hi, ISTM, there's a problem in the correlation of random() to outer JOINs. Here's a test case: BEGIN; CREATE TEMP TABLE nuc_codes (id serial, code char(1)); COPY nuc_codes (code) FROM stdin; A C D G H K M N R S T U V W X Y \. SELECT id, code FROM nuc_codes; SELECT T1.r1, T1.r2, T2.code, T3.code FROM (SELECT ((random() * 100)::int4 % 17), ((random() * 100)::int4 % 17) FROM generate_series(1, 10) ) AS T1 (r1, r2) LEFT OUTER JOIN nuc_codes T2 ON (T2.id = T1.r1) LEFT OUTER JOIN nuc_codes T3 ON (T3.id = T1.r2); ROLLBACK; If you run above query, you'll realize the inconsistency in the output. Furthermore, if you'd append an "OFFSET 0" to subselect, output becomes more stable but still has some inconsistencies. Regards. P.S. Query tested on 8.1.4 and a 2-3 weeks old cvs tip.
Volkan YAZICI <yazicivo@ttnet.net.tr> writes: > ISTM, there's a problem in the correlation of random() to outer JOINs. The random() functions are being evaluated more than once because the subselect gets "flattened" into the outer query, so that you have the equivalent of select random(), ... where t2.id = random() ... We've previously discussed preventing the planner from flattening if there are any volatile functions in the sub-select's output list, but I think that would probably do about as much harm as good. The cases where this actually matters are rare and the programmer knows darn well what he's doing, so the workaround of inserting an OFFSET 0 to prevent the flattening seems acceptable to me. > Furthermore, if you'd append an "OFFSET 0" to subselect, output becomes > more stable but still has some inconsistencies. I didn't see any... regards, tom lane
On May 27 11:50, Tom Lane wrote: > Volkan YAZICI <yazicivo@ttnet.net.tr> writes: > > ISTM, there's a problem in the correlation of random() to outer JOINs. > > The random() functions are being evaluated more than once because the > subselect gets "flattened" into the outer query, so that you have > the equivalent of > > select random(), ... where t2.id = random() ... Oops, sorry. I've just remembered this. > We've previously discussed preventing the planner from flattening if > there are any volatile functions in the sub-select's output list, but > I think that would probably do about as much harm as good. It can be quite informative to learn the pros and cons of this issue, but I couldn't find related discussion in archives. I'd be so appreciated if you can remember its subject or anything specifier for the thread. > > Furthermore, if you'd append an "OFFSET 0" to subselect, output becomes > > more stable but still has some inconsistencies. > > I didn't see any... That's all caused by a mis-interpretation of the output by me. Replacing "(random() * 100)::int % 17" with "1 + ((random() * 100)::int % 16)" solved my above question. Regards.