Thread: Re: Proposal: Progressive explain

Re: Proposal: Progressive explain

From
Rafael Thofehrn Castro
Date:
Hello,

Was recently benchmarking the last version of the patch and found room for
improvement when GUC progressive_explain is enabled globally.

Results with the last published version of the patch:


- progressive_explain = off:

/usr/local/pgsql/bin/pgbench -S -n -T 10 -c 30
tps = 18249.363540 (without initial connection time)

- progressive_explain = 'explain':

/usr/local/pgsql/bin/pgbench -S -n -T 10 -c 30
tps = 3536.635125 (without initial connection time)

This is because progressive explains are being printed for every query, including
the ones that finish instantly.

If we think about it, those printed plans for instant queries are useless as
other backends won't have a chance to look at the plans before they get removed
from pg_stat_progress_explain.

So this new version of the patch implements new GUC progressive_explain_min_duration
to be a used as a threshold for the plan to be printed for the first time:

- progressive_explain_min_duration: min query duration until progressive
explain starts.
  - type: int
  - default: 1s
  - min: 0
  - context: user

Results with the new version:

- progressive_explain = off:

/usr/local/pgsql/bin/pgbench -S -n -T 10 -c 30
tps = 18871.800242 (without initial connection time)

- progressive_explain = 'explain' and progressive_explain_min_duration = '5s':

/usr/local/pgsql/bin/pgbench -S -n -T 10 -c 30
tps = 18896.266538 (without initial connection time)

Implementation of the new GUC progressive_explain_min_duration was done with
timeouts. The timeout callback function is used to initialize the progressive
explain.

There is a catch to this implementation. In thread https://www.postgresql.org/message-id/flat/d68c3ae31672664876b22d2dcbb526d2%40postgrespro.ru
where torikoshia proposes logging of query plans it was raised concerns about
logging plans in the CFI, a sensible part of the code. So torikoshia implemented
a smart workaround consisting in adjusting the execProcNode wrapper of all nodes
so that the plan printing can be done there.

I'm not sure if this same concern applies to timeout callbacks so I also implemented
a second version of the latest patch that uses that execProcNode wrapper strategy.

The wrapper code was implemented by torikoshia (torikoshia@oss.nttdata.com), so
adding the credits here.

Rafael.
Attachment

Re: Proposal: Progressive explain

From
Rafael Thofehrn Castro
Date:
 
Implementation of the new GUC progressive_explain_min_duration was done with
timeouts. The timeout callback function is used to initialize the progressive
explain.

There is a catch to this implementation. In thread https://www.postgresql.org/message-id/flat/d68c3ae31672664876b22d2dcbb526d2%40postgrespro.ru
where torikoshia proposes logging of query plans it was raised concerns about
logging plans in the CFI, a sensible part of the code. So torikoshia implemented
a smart workaround consisting in adjusting the execProcNode wrapper of all nodes
so that the plan printing can be done there.

I'm not sure if this same concern applies to timeout callbacks so I also implemented
a second version of the latest patch that uses that execProcNode wrapper strategy.

The wrapper code was implemented by torikoshia (torikoshia@oss.nttdata.com), so
adding the credits here.

Did additional benchmarks and found issues with the patch that doesn't do execProcNode
wrapping. There are sporadic crashes with double free or corruption (top)

So making the patch that uses the wrapper the current one. Again, giving the credits to
torikoshia as being the owner of that section of the code.

Rafael.
Attachment