On 2017-11-30 08:06:44 -0800, Andres Freund wrote:
> Hi,
>
> On 2017-11-30 23:53:55 +1300, Thomas Munro wrote:
> > On Thu, Nov 30, 2017 at 4:47 PM, Thomas Munro
> > <thomas.munro@enterprisedb.com> wrote:
> > > On Thu, Nov 30, 2017 at 4:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >> Andres Freund <andres@anarazel.de> writes:
> > >>> Add some regression tests that exercise hash join code.
> > >>
> > >> At least one buildfarm member doesn't like this ...
> > >
> > > $$);
> > > initially_multibatch | increased_batches
> > > ----------------------+-------------------
> > > ! t | f
> > > (1 row)
> > >
> > > rollback to settings;
> > > --- 6002,6008 ----
> > > $$);
> > > initially_multibatch | increased_batches
> > > ----------------------+-------------------
> > > ! |
> > > (1 row)
> > >
> > > Hmm. aholehole didn't give me the hash join EXPLAIN ANALYZE output.
> > > All other animals are fine so far. Gah. My guess is that this is the
> > > following unlikely sequence:
> > >
> > > 1. Worker launched.
> > > 2. Leader descheduled.
> > > 3. Worker runs whole join.
> > > 4. Leader awakes from slumber, begins running plan and tries to scan
> > > outer relation, sees EOF, takes empty-outer optimisation and skips
> > > building hash table.
> > > 5. Explain has no data.
> > >
> > > It's arguably a bug that EXPLAIN ANALYZE can fail to give you what you
> > > asked for because of details of timing and I have a patch in
> > > development to fix that but it's not yet baked. Hmm. Wondering if
> > > there is a quick way to avoid that case in the meantime...
> >
> > A couple more have failed in the same way. If my theory above is
> > correct, the attached should fix the problem by skipping the affected
> > bits of the test for now. Once we fix EXPLAIN so that it reliably
> > prints hash table info no matter what (for which I should have a patch
> > fairly soon), we can uncomment them again. Is that an acceptable
> > solution for now? BTW this commit tipped nodeHash.c from yellow to
> > green on coverage.postgresql.org :-)
>
> At the current rate of failure I'm slightly inclined to just leave the
> occasionally failing test, which sounds like an actual bug imo, in place
> till we have the fix. But if there's any pushback I'll go with your
> disablign patch till then.
Thomas, any updates on the status of that explain fix?
Greetings,
Andres Freund