Re: Wrong results from Parallel Hash Full Join - Mailing list pgsql-hackers

From	Melanie Plageman
Subject	Re: Wrong results from Parallel Hash Full Join
Date	April 20, 2023 15:49:49
Msg-id	CAAKRu_YPJkVsNwq-ejDV6P-NUdLOArOn-z22o7apqVC+8y04hA@mail.gmail.com
In response to	Re: Wrong results from Parallel Hash Full Join (Melanie Plageman <melanieplageman@gmail.com>)
List	pgsql-hackers

On Wed, Apr 19, 2023 at 8:43 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> On Wed, Apr 19, 2023 at 3:20 PM Andres Freund <andres@anarazel.de> wrote:
>> On 2023-04-19 12:16:24 -0500, Justin Pryzby wrote:
>> > On Wed, Apr 19, 2023 at 11:17:04AM -0400, Melanie Plageman wrote:
>> > > Ultimately this is probably fine. If we wanted to modify one of the
>> > > existing tests to cover the multi-batch case, changing the select
>> > > count(*) to a select * would do the trick. I imagine we wouldn't want to
>> > > do this because of the excessive output this would produce. I wondered
>> > > if there was a pattern in the tests for getting around this.
>> >
>> > You could use explain (ANALYZE).  But the output is machine-dependent in
>> > various ways (which is why the tests use "explain analyze" so rarely).
>>
>> I think with sufficient options it's not machine specific. We have a bunch of
>>  EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) ..
>> in our tests.
>
>
> Cool. Yea, so ultimately these options are almost enough but memory
> usage changes from execution to execution. There are some tests which do
> regexp_replace() on the memory usage part of the EXPLAIN ANALYZE output
> to allow us to still compare the plans. However, I figured if I was
> already going to go to the trouble of using regexp_replace(), I might as
> well write a function that returns the "Actual Rows" field from the
> EXPLAIN ANALYZE output.
>
> The attached patch does that. I admittedly mostly copy-pasted the
> plpgsql function from similar examples in other tests, and I suspect it
> may be overkill and also poorly written.

I renamed the function to join_hash_actual_rows to avoid potentially
affecting other tests. Nothing about the function is specific to a hash
join plan, so I think it is clearer to prefix the function name with the
test file name. v2 attached.
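
For readers without the attachment, here is a rough sketch of the general
shape such a helper could take. It is not the function from the patch: the
name sketch_actual_rows is made up for illustration, and reading the value
from the JSON form of the plan (rather than using regexp_replace() on the
text form) is an assumption. It returns only the "Actual Rows" of the top
plan node, so a test can check the row count of a select * without printing
every row.

```sql
-- Hypothetical helper for illustration only; not the function in the patch.
create or replace function sketch_actual_rows(query text)
returns numeric language plpgsql
as
$$
declare
  whole_plan json;
begin
  -- With FORMAT JSON, EXPLAIN returns a single row holding the whole plan.
  -- COSTS/SUMMARY/TIMING OFF drop the fields that vary between machines.
  for whole_plan in
    execute 'explain (analyze, costs off, summary off, timing off, format ''json'') '
            || query
  loop
    -- The document is a one-element array; read the top node's row count.
    return (whole_plan->0->'Plan'->>'Actual Rows')::numeric;
  end loop;
  return null;
end;
$$;

-- Example call; the query is a placeholder, not one of the join_hash tests.
select sketch_actual_rows('select * from generate_series(1, 10)');
```

Because only one number comes back, memory usage and other fields that
change from run to run never reach the expected output.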

- Melanie

Attachment

v2-0001-Test-multi-batch-PHJ-match-bit-initialization.patch
