Re: Wrong results from Parallel Hash Full Join - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Wrong results from Parallel Hash Full Join
Date
Msg-id CAAKRu_YPJkVsNwq-ejDV6P-NUdLOArOn-z22o7apqVC+8y04hA@mail.gmail.com
In response to Re: Wrong results from Parallel Hash Full Join  (Melanie Plageman <melanieplageman@gmail.com>)
List pgsql-hackers
On Wed, Apr 19, 2023 at 8:43 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> On Wed, Apr 19, 2023 at 3:20 PM Andres Freund <andres@anarazel.de> wrote:
>> On 2023-04-19 12:16:24 -0500, Justin Pryzby wrote:
>> > On Wed, Apr 19, 2023 at 11:17:04AM -0400, Melanie Plageman wrote:
>> > > Ultimately this is probably fine. If we wanted to modify one of the
>> > > existing tests to cover the multi-batch case, changing the select
>> > > count(*) to a select * would do the trick. I imagine we wouldn't want to
>> > > do this because of the excessive output this would produce. I wondered
>> > > if there was a pattern in the tests for getting around this.
>> >
>> > You could use explain (ANALYZE).  But the output is machine-dependent in
>> > various ways (which is why the tests use "explain analyze" so rarely).
>>
>> I think with sufficient options it's not machine specific. We have a bunch of
>>  EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) ..
>> in our tests.
>
>
> Cool. Yea, so ultimately these options are almost enough but memory
> usage changes from execution to execution. There are some tests which do
> regexp_replace() on the memory usage part of the EXPLAIN ANALYZE output
> to allow us to still compare the plans. However, I figured if I was
> already going to go to the trouble of using regexp_replace(), I might as
> well write a function that returns the "Actual Rows" field from the
> EXPLAIN ANALYZE output.
>
> The attached patch does that. I admittedly mostly copy-pasted the
> plpgsql function from similar examples in other tests, and I suspect it
> may be overkill and also poorly written.

I renamed the function to join_hash_actual_rows to avoid potentially
affecting other tests. Nothing about the function is specific to a hash
join plan, so I think prefixing its name with the test file name makes
it clearer where it belongs. v2 attached.
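
For readers without the attachment, the idea can be sketched roughly as
follows. This is not the function from the patch: the name
sketch_actual_rows is hypothetical, and reading only the top plan node
from JSON-format output is an assumption made for illustration.

```sql
-- Hypothetical sketch, not the patch's join_hash_actual_rows.
-- Runs the given query under EXPLAIN ANALYZE and returns the
-- "Actual Rows" value of the top-level plan node.
create or replace function sketch_actual_rows(query text)
returns numeric language plpgsql as
$$
declare
  plan json;
begin
  -- COSTS, SUMMARY and TIMING are turned off, as in the options
  -- quoted above; FORMAT JSON returns the whole plan as one value,
  -- so no regexp_replace() on text output is needed.
  execute 'explain (analyze, costs off, summary off, timing off, format json) '
          || query
    into plan;

  -- The JSON output is an array with one element per statement;
  -- the top plan node sits under the "Plan" key.
  return (plan -> 0 -> 'Plan' ->> 'Actual Rows')::numeric;
end;
$$;
```

A test could then call it on a "select *" query and compare only the
returned row count, so the expected output stays at one line however
many rows the query itself produces.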

- Melanie

