Re: WAL usage calculation patch - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: WAL usage calculation patch
Date
Msg-id 20200329121944.GA79261@nol
Whole thread Raw
In response to Re: WAL usage calculation patch  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: WAL usage calculation patch
List pgsql-hackers
On Mon, Mar 23, 2020 at 11:24:50PM +0900, Fujii Masao wrote:
> 
> > Here are the comments for 0001 patch.
> > 
> > +            /*
> > +             * Report a full page image constructed for the WAL record
> > +             */
> > +            pgWalUsage.wal_fp_records++;
> > 
> > Isn't it better to use "fpw" or "fpi" for the variable name rather than
> > "fp" here? In other places, "fpw" and "fpi" are used for full page
> > writes/image.

Agreed, I went with fpw.

> > ISTM that this counter could be incorrect if XLogInsertRecord() determines to
> > calculate again whether FPI is necessary or not. No? IOW, this issue could
> > happen if XLogInsert() calls  XLogRecordAssemble() multiple times in
> > its do-while loop. Isn't this problematic?

Yes probably.  I also see while adding support for EXPLAIN/auto_explain that
the previous approach was incrementing both records and fpw_records, while it
should be only one of those for each record.  I fixed this using the approach I
previously mentionned in [1] which seems to work just fine.

> > +    long        wal_bytes;        /* size of wal records produced */
> > 
> > Isn't it safer to use uint64 (i.e., XLogRecPtr) as the type of this variable
> > rather than long?

Yes indeed.  I switched to uint64, and modified everything accordingly (and
changed pgss to output numeric as there's no other way to handle unsigned int8)

> > +    shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, bufusage_space);
> > 
> > bufusage_space should be walusage_space here?

Good catch, fixed.

> > /*
> >   * Finish parallel execution.  We wait for parallel workers to finish, and
> >   * accumulate their buffer usage.
> >   */
> > 
> > There are some comments mentioning buffer usage, in execParallel.c.
> > For example, the top comment for ExecParallelFinish(), as the above.
> > These should be updated.

I went through all the file and quickly checked in other places, and I think I
fixed all required comments.

> Here are the comments for 0002 patch.
> 
> +    OUT wal_write_bytes int8,
> +    OUT wal_write_records int8,
> +    OUT wal_write_fp_records int8
> 
> Isn't "write" part in the column names confusing because it's WAL
> *generated* (not written) by the statement?

Agreed, I simply dropped the "_write" part everywhere.

> +RETURNS SETOF record
> +AS 'MODULE_PATHNAME', 'pg_stat_statements_1_4'
> +LANGUAGE C STRICT VOLATILE;
> 
> PARALLEL SAFE should be specified?

Indeed, fixed.

> +/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */
> 
> ISTM it's good timing to have also pg_stat_statements--1.8.sql since
> the definition of pg_stat_statements() is changed. Thought?

As mentionned in other pgss thread, I think the general agreement is to never
provide full script anymore, so I didn't changed that.

> +-- CHECKPOINT before WAL tests to ensure test stability
> +CHECKPOINT;
> 
> Is this true? I thought you added this because the number of FPI
> should be larger than zero in the subsequent test. No? But there
> seems no such test. I'm not excited about adding the test checking
> the number of FPI because it looks fragile, though...

It should ensure a FPW for each new block touch, but yes that's quite fragile.

Since I fixed the record / FPW record counters, I saw that this was actually
already broken as there was a mix of FPW and non-FPW, so I dropped the
checkpoint and just tested (wal_record + wal_fpw_record) instead.

> +UPDATE pgss_test SET b = '333' WHERE a = 3 \;
> +UPDATE pgss_test SET b = '444' WHERE a = 4 ;
> 
> Could you tell me why several queries need to be run to test
> the WAL usage? Isn't running a few query enough for the test purpase?

As far as I can see it's used to test multiple scenario (single command /
multiple commands in or outside explicit transaction).  It shouldn't add a lot
of overhead and since some commands are issues with "\;" it's also testing
proper query string isolation when multi-command query string is provided,
which doesn't seem like a bad idea.  I didn't changed that but I'm not opposed
to remove some of the updates if needed.

Also, to answer Amit Kapila's comments about WAL records and parallel query, I
added support for both EXPLAIN and auto_explain (tab completion and
documentation are also updated), and using a simple table with an index, with
forced parallelism and no leader participation and concurrent update on the
same table, I could test that WAL usage is working as expected:

rjuju=# explain (analyze, wal, verbose) select * from t1;
                                                          QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=0.00..8805.05 rows=100010 width=14) (actual time=8.695..47.592 rows=100010 loops=1)
   Output: id, val
   Workers Planned: 2
   Workers Launched: 2
   WAL: records=204 bytes=86198
   ->  Parallel Seq Scan on public.t1  (cost=0.00..8805.05 rows=50005 width=14) (actual time=0.056..29.112 rows=50005
loops
         Output: id, val
         WAL: records=204 bytes=86198
         Worker 0:  actual time=0.060..28.995 rows=49593 loops=1
           WAL: records=105 bytes=44222
         Worker 1:  actual time=0.052..29.230 rows=50417 loops=1
           WAL: records=99 bytes=41976
 Planning Time: 0.038 ms
 Execution Time: 53.957 ms
(14 rows)

and the same query when nothing end up being modified:

rjuju=# explain (analyze, wal, verbose) select * from t1;
                                                          QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=0.00..8805.05 rows=100010 width=14) (actual time=9.413..48.187 rows=100010 loops=1)
   Output: id, val
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on public.t1  (cost=0.00..8805.05 rows=50005 width=14) (actual time=0.033..24.697 rows=50005
loops
         Output: id, val
         Worker 0:  actual time=0.028..24.786 rows=50447 loops=1
         Worker 1:  actual time=0.038..24.609 rows=49563 loops=1
 Planning Time: 0.282 ms
 Execution Time: 55.643 ms
(10 rows)

So it seems to me that WAL usage infrastructure for parallel query is working
just fine.  I added the EXPLAIN/auto_explain in a separate commit just in case.

[1] https://www.postgresql.org/message-id/CAOBaU_aECK1Z7Nn+x=MhvEwrJzK8wyPsPtWAafjqtZN1fYjEmg@mail.gmail.com

Attachment

pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: pg_stat_statements issue with parallel maintenance (Was Re: WALusage calculation patch)
Next
From: Julien Rouhaud
Date:
Subject: Re: WAL usage calculation patch