Re: Gather performance analysis - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: Gather performance analysis |
Date | |
Msg-id | CAFiTN-s2BxH17zi+_xcVjVfkayrykHkhWR52K2bJimhMBWCE7A@mail.gmail.com |
In response to | Re: Gather performance analysis (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: Gather performance analysis |
List | pgsql-hackers |
On Fri, Sep 24, 2021 at 2:01 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Sep 23, 2021 at 4:00 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
> > I did find some suspicious behavior on the bigger box I have available
> > (with 2x xeon e5-2620v3), see the attached spreadsheet. But it seems
> > pretty weird because the worst affected case is with no parallel workers
> > (so the queue changes should affect it). Not sure how to explain it, but
> > the behavior seems consistent.
>
> That is pretty odd. I'm inclined to mostly discount the runs with
> 10000 tuples because sending such a tiny number of tuples doesn't
> really take any significant amount of time, and it seems possible that
> variations in the runtime of other code due to code movement effects
> could end up mattering more than the changes to the performance of
> shm_mq. However, the results with a million tuples seem like they're
> probably delivering statistically significant results ... and I guess
> maybe what's happening is that the patch hurts when the tuples are too
> big relative to the queue size.

I am looking at the "query-results.ods" file shared by Tomas. With a
million tuples, I do not really see where the patch hurts, because in
most of the cases the time taken with the patch is 60-80% of the time
taken on head. The worst case with a million tuples is 100.32%. Are we
pointing to that 0.32%, or is there something else that I am missing
here?

>
> I guess your columns are an md5 value each, which is 32 bytes +
> overhead, so a 20-columns tuple is ~1kB. Since Dilip's patch flushes
> the value to shared memory when more than a quarter of the queue has
> been filled, that probably means we flush every 4-5 tuples. I wonder
> if that means we need a smaller threshold, like 1/8 of the queue size?
> Or maybe the behavior should be adaptive somehow, depending on whether
> the receiver ends up waiting for data? Or ... perhaps only small
> tuples are worth batching, so that the threshold for posting to shared
> memory should be a constant rather than a fraction of the queue size?
> I guess we need to know why we see the time spike up in those cases,
> if we want to improve them.

I will test with larger tuple sizes and see how the behavior changes
with different thresholds. With a 250-byte tuple size, I have already
tested different thresholds, and 1/4 of the queue size appeared to work
best. But I will do more detailed testing and share the results.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
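
To make the threshold options discussed above easier to compare, here is a rough toy model in Python. It is only a sketch, not the patch and not shm_mq code: the function name `count_flushes`, its parameters, and the 16000-byte queue size used in the usage note are all assumptions made for illustration. It counts how often a sender would publish to shared memory if it flushes once the unpublished bytes exceed either a fraction of the queue size or a fixed byte count. It ignores the receiver, queue wraparound, and blocking entirely.

```python
def count_flushes(num_tuples, tuple_bytes, queue_bytes,
                  fraction=None, fixed_bytes=None):
    """Count publishes to shared memory in a toy batching sender.

    Hypothetical model: bytes accumulate locally and are published once
    the pending amount exceeds the threshold. Exactly one of `fraction`
    (share of the queue size) or `fixed_bytes` (constant) must be given.
    """
    if (fraction is None) == (fixed_bytes is None):
        raise ValueError("pass exactly one of fraction or fixed_bytes")

    if fraction is not None:
        threshold = queue_bytes * fraction
    else:
        threshold = fixed_bytes

    pending = 0   # bytes written locally but not yet published
    flushes = 0
    for _ in range(num_tuples):
        pending += tuple_bytes
        if pending > threshold:
            flushes += 1
            pending = 0

    # Whatever is left at the end still has to be published once.
    if pending > 0:
        flushes += 1
    return flushes
```

With an assumed 16000-byte queue and 1000-byte tuples, `count_flushes(100, 1000, 16000, fraction=0.25)` publishes once every 5 tuples, while `fraction=0.125` publishes once every 3. Passing `fixed_bytes` instead makes the batch size independent of the queue size, which is the "constant threshold" variant raised in the quoted text. The model says nothing about which choice is faster; that still needs the measurements mentioned above.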