Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query - Mailing list pgsql-bugs

From Marko Tiikkaja
Subject Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query
Date
Msg-id CAL9smLBPZJOUWjHKgZOQaDO5FK+AkkX61BT06mpSfiNz4wdtgw@mail.gmail.com
In response to Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query  (Andres Freund <andres@anarazel.de>)
Responses Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On Mon, Aug 13, 2018 at 7:35 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2018-08-13 16:14:03 +0000, PG Bug reporting form wrote:
> > Execute this query (multiple times!)
> >
> > select * from events where account in (select account from events where
> > data->>'page' = 'success.html' limit 3);
>
> Well, the subselect with the limit is going to return different results from
> run to run. Unless you add an ORDER BY there's no guaranteed order in
> which tuples are returned.  So I don't think it's surprising that you're
> getting results that differ between runs.

While this is true, that's missing the point.  This output, for example:

 account |     page     
---------+--------------
      14 | a.html
      14 | success.html
      65 | b.html
      65 | success.html
      80 | b.html
      80 | success.html
   24084 | a.html
   24084 | success.html
   24085 | c.html
   24085 | success.html
   24095 | a.html
   24095 | success.html
(12 rows)

contains data from six different accounts, which should surely be impossible regardless of which three accounts the subquery returns.
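
To make that invariant concrete, here is a rough sketch in Python that counts the distinct accounts in the output above. The rows are pasted in by hand, and the names distinct_accounts and SUBQUERY_LIMIT are made up for illustration; nothing here talks to the database.

```python
# Rows copied by hand from the query output above, as (account, page) pairs.
rows = [
    (14, "a.html"), (14, "success.html"),
    (65, "b.html"), (65, "success.html"),
    (80, "b.html"), (80, "success.html"),
    (24084, "a.html"), (24084, "success.html"),
    (24085, "c.html"), (24085, "success.html"),
    (24095, "a.html"), (24095, "success.html"),
]

# The LIMIT used in the IN (...) subquery of the reported query.
SUBQUERY_LIMIT = 3


def distinct_accounts(result_rows):
    """Return the sorted distinct account ids present in a result set."""
    return sorted({account for account, _page in result_rows})


accounts = distinct_accounts(rows)
print("distinct accounts:", len(accounts), accounts)
# Whichever accounts the subquery happens to pick, the outer query can
# only ever return rows for at most SUBQUERY_LIMIT of them.
print("invariant holds:", len(accounts) <= SUBQUERY_LIMIT)
```

For the output above this reports six distinct accounts, so the check fails no matter which three accounts the subquery picked.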

The one in repro1 is also problematic, because it shows that 304873, 304875 and 304885 were all selected, but not all rows for those accounts were returned.


.m
