Re: [PATCH] Add extra statistics to explain for Nested Loop - Mailing list pgsql-hackers

From Ekaterina Sokolova
Subject Re: [PATCH] Add extra statistics to explain for Nested Loop
Msg-id 420960372f05563984984f195522ff01@postgrespro.ru
In response to Re: [PATCH] Add extra statistics to explain for Nested Loop  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: [PATCH] Add extra statistics to explain for Nested Loop  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers
Hi, hackers!

We started a discussion about the overhead and how to measure it correctly.

Julien Rouhaud wrote:
> Can you give a bit more details on your bench scenario?  I see contradictory
> results, where the patched version with more code is sometimes way faster,
> sometimes way slower.  If you're using pgbench
> default queries (including write queries) I don't think that any of them will
> hit the loop code, so it's really a best case scenario.  Also write queries
> will make tests less stable for no added value wrt. this code.
>
> Ideally you would need a custom scenario with a single read-only query
> involving a nested loop or something like that to check how much overhead you
> really get when you cumulate those values.
I created 2 custom scenarios. The first one uses the VERBOSE flag, so it 
exercises the extra statistics. The second one doesn't use the new feature 
but doesn't disable it either (so the data is still collected).
I've attached the pgbench scripts to this message.

Main conclusions are:
1) using the additional statistics (VERBOSE) adds no more than 4.5% overhead;
2) collecting the data adds no more than 1.5% overhead.
I think testing on another machine would be very helpful, so if you get 
a chance, I'd be happy if you shared your observations.
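
For anyone repeating the test, here is a minimal sketch of how such a 
percentage could be derived from pgbench TPS figures. It is not the 
procedure used for the numbers above: the function name, the use of the 
median over several runs, and the sample TPS values are all assumptions 
made for illustration.

```python
from statistics import median


def overhead_percent(baseline_tps, patched_tps):
    """Relative slowdown of the patched build versus the baseline, in percent.

    Both arguments are lists of TPS values, one per pgbench run.
    The median is used (an assumption) to damp run-to-run noise.
    """
    base = median(baseline_tps)
    patched = median(patched_tps)
    if base <= 0:
        raise ValueError("baseline TPS must be positive")
    # A positive result means the patched build is slower than the baseline.
    return (base - patched) / base * 100.0


if __name__ == "__main__":
    # Hypothetical TPS values from three runs of each build.
    baseline_runs = [1000.0, 1010.0, 990.0]
    patched_runs = [985.0, 990.0, 980.0]
    print(f"overhead: {overhead_percent(baseline_runs, patched_runs):.2f}%")
```

A negative result would mean the patched build was faster in that sample, 
which usually points to noise between runs rather than a real speedup.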

Some fixes:

> Sure, but if we're going to have a branch for nloops == 0, I think it would be
> better to avoid redundant / useless instructions
Right. I've done it.

Justin Pryzby wrote:
> Maybe set parallel_leader_participation=no for this test.
Thanks for reporting the issue and for the advice. I set 
parallel_leader_participation = off. I hope this helps to solve the 
problem of inconsistent outputs.

If you have any comments on this topic or want to share your 
impressions, please write to me.
Thank you very much for your contribution to the development of this 
patch.

-- 
Ekaterina Sokolova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
