On Wed, Apr 19, 2023 at 8:43 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> On Wed, Apr 19, 2023 at 3:20 PM Andres Freund <andres@anarazel.de> wrote:
>> On 2023-04-19 12:16:24 -0500, Justin Pryzby wrote:
>> > On Wed, Apr 19, 2023 at 11:17:04AM -0400, Melanie Plageman wrote:
>> > > Ultimately this is probably fine. If we wanted to modify one of the
>> > > existing tests to cover the multi-batch case, changing the select
>> > > count(*) to a select * would do the trick. I imagine we wouldn't want to
>> > > do this because of the excessive output this would produce. I wondered
>> > > if there was a pattern in the tests for getting around this.
>> >
>> > You could use explain (ANALYZE). But the output is machine-dependant in
>> > various ways (which is why the tests use "explain analyze so rarely).
>>
>> I think with sufficient options it's not machine specific. We have a bunch of
>> EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) ..
>> in our tests.
>
>
> Cool. Yea, so ultimately these options are almost enough but memory
> usage changes from execution to execution. There are some tests which do
> regexp_replace() on the memory usage part of the EXPLAIN ANALYZE output
> to allow us to still compare the plans. However, I figured if I was
> already going to go to the trouble of using regexp_replace(), I might as
> well write a function that returns the "Actual Rows" field from the
> EXPLAIN ANALYZE output.
>
> The attached patch does that. I admittedly mostly copy-pasted the
> plpgsql function from similar examples in other tests, and I suspect it
> may be overkill and also poorly written.
I renamed the function to join_hash_actual_rows to avoid potentially
affecting other tests. Nothing about the function is specific to a hash
join plan, so I think it is more clear to prefix the function with the
test file name. v2 attached.
- Melanie