Re: Parallel leader process info in EXPLAIN - Mailing list pgsql-hackers

From: Melanie Plageman
Subject: Re: Parallel leader process info in EXPLAIN
Msg-id: CAAKRu_YPTn=-Su2PRX1=SdPmV7MRzoCfim1mDpcniA__mftMdg@mail.gmail.com
In response to: Parallel leader process info in EXPLAIN (Thomas Munro <thomas.munro@gmail.com>)
List: pgsql-hackers

On Wed, Oct 23, 2019 at 12:30 AM Thomas Munro <thomas.munro@gmail.com> wrote:

While working on some slides explaining EXPLAIN, I couldn't resist the
urge to add the missing $SUBJECT.  The attached 0001 patch gives the
following:

Gather  ... time=0.146..33.077 rows=1 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=4425
  ->  Parallel Seq Scan on public.t ... time=19.421..30.092 rows=0 loops=3)
        Filter: (t.i = 42)
        Rows Removed by Filter: 333333
        Leader: actual time=0.013..32.025 rows=1 loops=1      <--- NEW
          Buffers: shared hit=1546                            <--- NEW
        Worker 0: actual time=29.069..29.069 rows=0 loops=1
          Buffers: shared hit=1126
        Worker 1: actual time=29.181..29.181 rows=0 loops=1
          Buffers: shared hit=1753

Without that, you have to deduce what work was done in the leader, but
I'd rather just show it.

The 0002 patch adjusts Sort for consistency with that scheme, so you get:

Sort  ... time=84.303..122.238 rows=333333 loops=3)
   Output: t1.i
   Sort Key: t1.i
   Leader:  Sort Method: external merge  Disk: 5864kB       <--- DIFFERENT
   Worker 0:  Sort Method: external merge  Disk: 3376kB
   Worker 1:  Sort Method: external merge  Disk: 4504kB
   Leader: actual time=119.624..165.949 rows=426914 loops=1
   Worker 0: actual time=61.239..90.984 rows=245612 loops=1
   Worker 1: actual time=72.046..109.782 rows=327474 loops=1

Without the "Leader" label, it's not really clear to the uninitiated
whether you're looking at combined, average or single process numbers.

Cool! I dig it.

I checked out the patches a bit and noticed that the tuplesort
instrumentation uses spaceUsed. I saw this comment in
tuplesort_get_stats():

/*
* Note: it might seem we should provide both memory and disk usage for a
* disk-based sort.  However, the current code doesn't track memory space
* accurately once we have begun to return tuples to the caller (since we
* don't account for pfree's the caller is expected to do), so we cannot
* rely on availMem in a disk sort.  This does not seem worth the overhead
* to fix.  Is it worth creating an API for the memory context code to
* tell us how much is actually used in sortcontext?
*/

Might it be worth trying out the memory accounting API added in commit
5dd7fc1519461548eebf26c33eac6878ea3e8788 here?
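
To make that concrete, here is a rough sketch of the kind of change I'm
imagining. It is only an illustration against tuplesort.c, not a
proposed patch, and the helper name is made up:

/*
 * Rough illustration only, not the existing tuplesort_get_stats() code:
 * report the memory actually allocated in the sort context, even for a
 * disk-based sort, using MemoryContextMemAllocated() from that commit
 * instead of relying on availMem.
 */
static void
sketch_report_sort_memory(Tuplesortstate *state,
                          TuplesortInstrumentation *stats)
{
    int64       mem_allocated;

    /* recurse = true so that child contexts of sortcontext are included */
    mem_allocated = MemoryContextMemAllocated(state->sortcontext, true);

    /* spaceUsed is reported in kilobytes, rounded up */
    stats->spaceUsed = (mem_allocated + 1023) / 1024;
}

If I understand the accounting correctly, it tracks whole allocated
blocks rather than individual chunks, so this could report a bit more
than the sort is actively using, but that seems fine for
instrumentation purposes.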
 

Of course there are some more things that could be reported in a
similar way eventually, such as filter counters and hash join details.


This made me think about other EXPLAIN wishlist items.
For parallel hash join, I would find it useful to know which batches
each worker participated in (maybe just probing to start with, but
loading would be great too).
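
To make that concrete, something like the following per-process
bookkeeping is what I'm picturing. This is purely hypothetical (the
struct and function names don't exist anywhere), and how it would get
shipped back through the fixed-size shared instrumentation area is an
open question:

/*
 * Hypothetical per-process record of parallel hash join batch
 * participation, kept in backend-local memory and reported per worker
 * by EXPLAIN ANALYZE.
 */
typedef struct HypotheticalBatchParticipation
{
    Bitmapset  *loaded_batches;     /* batches this process helped load */
    Bitmapset  *probed_batches;     /* batches this process helped probe */
} HypotheticalBatchParticipation;

/* roughly where a process attaches to a batch to probe it */
static inline void
hypothetical_note_probed(HypotheticalBatchParticipation *part, int batchno)
{
    part->probed_batches = bms_add_member(part->probed_batches, batchno);
}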

I'm not sure anyone else (especially users) would care about this,
though.

--
Melanie Plageman
