Re: WAL usage calculation patch - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: WAL usage calculation patch
Date
Msg-id 20200402144438.GF64485@nol
Whole thread Raw
In response to Re: WAL usage calculation patch  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Thu, Apr 02, 2020 at 06:40:51PM +0530, Amit Kapila wrote:
> On Thu, Apr 2, 2020 at 6:18 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create
index%';
> >                   query                       | calls | wal_bytes | wal_records | wal_num_fpw
> > ----------------------------------------------+-------+-----------+-------------+-------------
> >  create index t1_idx_parallel_0 ON t1(id)     |     1 |  20389743 |        2762 |        2758
> >  create index t1_idx_parallel_0_bis ON t1(id) |     1 |  20394391 |        2762 |        2758
> >  create index t1_idx_parallel_0_ter ON t1(id) |     1 |  20395155 |        2762 |        2758
> >  create index t1_idx_parallel_1 ON t1(id)     |     1 |  20388335 |        2762 |        2758
> >  create index t1_idx_parallel_2 ON t1(id)     |     1 |  20389091 |        2762 |        2758
> >  create index t1_idx_parallel_3 ON t1(id)     |     1 |  20389847 |        2762 |        2758
> >  create index t1_idx_parallel_4 ON t1(id)     |     1 |  20390603 |        2762 |        2758
> >  create index t1_idx_parallel_5 ON t1(id)     |     1 |  20391359 |        2762 |        2758
> >  create index t1_idx_parallel_6 ON t1(id)     |     1 |  20392115 |        2762 |        2758
> >  create index t1_idx_parallel_7 ON t1(id)     |     1 |  20392871 |        2762 |        2758
> >  create index t1_idx_parallel_8 ON t1(id)     |     1 |  20393627 |        2762 |        2758
> > (11 rows)
> >
> > =# select relname, pg_relation_size(oid) from pg_class where relname like '%t1_id%';
> >       relname          | pg_relation_size
> > -----------------------+------------------
> >  t1_idx_parallel_0     |         22487040
> >  t1_idx_parallel_0_bis |         22487040
> >  t1_idx_parallel_0_ter |         22487040
> >  t1_idx_parallel_2     |         22487040
> >  t1_idx_parallel_1     |         22487040
> >  t1_idx_parallel_4     |         22487040
> >  t1_idx_parallel_3     |         22487040
> >  t1_idx_parallel_5     |         22487040
> >  t1_idx_parallel_6     |         22487040
> >  t1_idx_parallel_7     |         22487040
> >  t1_idx_parallel_8     |         22487040
> > (9 rows)
> >
> >
> > So while the number of WAL records and full page images stay constant, we can
> > see some small fluctuations in the total amount of generated WAL data, even for
> > multiple execution of the sequential create index.  I'm wondering if the
> > fluctuations are due to some other internal details or if the WalUsage support
> > is just completely broken (although I don't see any obvious issue ATM).
> >
> 
> I think we need to know the reason for this.  Can you try with small
> size indexes and see if the problem is reproducible? If it is, then it
> will be easier to debug the same.


I did some quick testing using the attached shell script:

- one a 1k line base number of lines, scales 1 10 100 1000 (suffix _s)
- parallel workers from 0 to 8 (suffix _w)
- each index created twice (suffix _pa and _pb)
- with a vacuum;checkpoint;pg_switch_wal executed each time

I get the following results:

                   query                    | wal_bytes | wal_records | wal_num_fpw 
--------------------------------------------+-----------+-------------+-------------
 CREATE INDEX t1_idx_s001_pa_w0 ON t1 (id)  |     61871 |          22 |          18
 CREATE INDEX t1_idx_s001_pa_w1 ON t1 (id)  |     62394 |          21 |          18
 CREATE INDEX t1_idx_s001_pa_w2 ON t1 (id)  |     63150 |          21 |          18
 CREATE INDEX t1_idx_s001_pa_w3 ON t1 (id)  |     63906 |          21 |          18
 CREATE INDEX t1_idx_s001_pa_w4 ON t1 (id)  |     64662 |          21 |          18
 CREATE INDEX t1_idx_s001_pa_w5 ON t1 (id)  |     65418 |          21 |          18
 CREATE INDEX t1_idx_s001_pa_w6 ON t1 (id)  |     65450 |          21 |          18
 CREATE INDEX t1_idx_s001_pa_w7 ON t1 (id)  |     66206 |          21 |          18
 CREATE INDEX t1_idx_s001_pa_w8 ON t1 (id)  |     66962 |          21 |          18
 CREATE INDEX t1_idx_s001_pb_w0 ON t1 (id)  |     67718 |          21 |          18
 CREATE INDEX t1_idx_s001_pb_w1 ON t1 (id)  |     68474 |          21 |          18
 CREATE INDEX t1_idx_s001_pb_w2 ON t1 (id)  |     68418 |          21 |          18
 CREATE INDEX t1_idx_s001_pb_w3 ON t1 (id)  |     69174 |          21 |          18
 CREATE INDEX t1_idx_s001_pb_w4 ON t1 (id)  |     69930 |          21 |          18
 CREATE INDEX t1_idx_s001_pb_w5 ON t1 (id)  |     70686 |          21 |          18
 CREATE INDEX t1_idx_s001_pb_w6 ON t1 (id)  |     71442 |          21 |          18
 CREATE INDEX t1_idx_s001_pb_w7 ON t1 (id)  |     64922 |          21 |          18
 CREATE INDEX t1_idx_s001_pb_w8 ON t1 (id)  |     65682 |          21 |          18
 CREATE INDEX t1_idx_s010_pa_w0 ON t1 (id)  |    250460 |          47 |          44
 CREATE INDEX t1_idx_s010_pa_w1 ON t1 (id)  |    251216 |          47 |          44
 CREATE INDEX t1_idx_s010_pa_w2 ON t1 (id)  |    251972 |          47 |          44
 CREATE INDEX t1_idx_s010_pa_w3 ON t1 (id)  |    252728 |          47 |          44
 CREATE INDEX t1_idx_s010_pa_w4 ON t1 (id)  |    253484 |          47 |          44
 CREATE INDEX t1_idx_s010_pa_w5 ON t1 (id)  |    254240 |          47 |          44
 CREATE INDEX t1_idx_s010_pa_w6 ON t1 (id)  |    253552 |          47 |          44
 CREATE INDEX t1_idx_s010_pa_w7 ON t1 (id)  |    254308 |          47 |          44
 CREATE INDEX t1_idx_s010_pa_w8 ON t1 (id)  |    255064 |          47 |          44
 CREATE INDEX t1_idx_s010_pb_w0 ON t1 (id)  |    255820 |          47 |          44
 CREATE INDEX t1_idx_s010_pb_w1 ON t1 (id)  |    256576 |          47 |          44
 CREATE INDEX t1_idx_s010_pb_w2 ON t1 (id)  |    257332 |          47 |          44
 CREATE INDEX t1_idx_s010_pb_w3 ON t1 (id)  |    258088 |          47 |          44
 CREATE INDEX t1_idx_s010_pb_w4 ON t1 (id)  |    258844 |          47 |          44
 CREATE INDEX t1_idx_s010_pb_w5 ON t1 (id)  |    259600 |          47 |          44
 CREATE INDEX t1_idx_s010_pb_w6 ON t1 (id)  |    260356 |          47 |          44
 CREATE INDEX t1_idx_s010_pb_w7 ON t1 (id)  |    260012 |          47 |          44
 CREATE INDEX t1_idx_s010_pb_w8 ON t1 (id)  |    260768 |          47 |          44
 CREATE INDEX t1_idx_s1000_pa_w0 ON t1 (id) |  20400595 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pa_w1 ON t1 (id) |  20401351 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pa_w2 ON t1 (id) |  20402107 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pa_w3 ON t1 (id) |  20402863 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pa_w4 ON t1 (id) |  20403619 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pa_w5 ON t1 (id) |  20404375 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pa_w6 ON t1 (id) |  20403687 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pa_w7 ON t1 (id) |  20404443 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pa_w8 ON t1 (id) |  20405199 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pb_w0 ON t1 (id) |  20405955 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pb_w1 ON t1 (id) |  20406711 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pb_w2 ON t1 (id) |  20407467 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pb_w3 ON t1 (id) |  20408223 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pb_w4 ON t1 (id) |  20408979 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pb_w5 ON t1 (id) |  20409735 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pb_w6 ON t1 (id) |  20410491 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pb_w7 ON t1 (id) |  20410147 |        2762 |        2759
 CREATE INDEX t1_idx_s1000_pb_w8 ON t1 (id) |  20410903 |        2762 |        2759
 CREATE INDEX t1_idx_s100_pa_w0 ON t1 (id)  |   2082194 |         293 |         290
 CREATE INDEX t1_idx_s100_pa_w1 ON t1 (id)  |   2082950 |         293 |         290
 CREATE INDEX t1_idx_s100_pa_w2 ON t1 (id)  |   2083706 |         293 |         290
 CREATE INDEX t1_idx_s100_pa_w3 ON t1 (id)  |   2084462 |         293 |         290
 CREATE INDEX t1_idx_s100_pa_w4 ON t1 (id)  |   2085218 |         293 |         290
 CREATE INDEX t1_idx_s100_pa_w5 ON t1 (id)  |   2085974 |         293 |         290
 CREATE INDEX t1_idx_s100_pa_w6 ON t1 (id)  |   2085286 |         293 |         290
 CREATE INDEX t1_idx_s100_pa_w7 ON t1 (id)  |   2086042 |         293 |         290
 CREATE INDEX t1_idx_s100_pa_w8 ON t1 (id)  |   2086798 |         293 |         290
 CREATE INDEX t1_idx_s100_pb_w0 ON t1 (id)  |   2087554 |         293 |         290
 CREATE INDEX t1_idx_s100_pb_w1 ON t1 (id)  |   2088310 |         293 |         290
 CREATE INDEX t1_idx_s100_pb_w2 ON t1 (id)  |   2089066 |         293 |         290
 CREATE INDEX t1_idx_s100_pb_w3 ON t1 (id)  |   2089822 |         293 |         290
 CREATE INDEX t1_idx_s100_pb_w4 ON t1 (id)  |   2090578 |         293 |         290
 CREATE INDEX t1_idx_s100_pb_w5 ON t1 (id)  |   2091334 |         293 |         290
 CREATE INDEX t1_idx_s100_pb_w6 ON t1 (id)  |   2092090 |         293 |         290
 CREATE INDEX t1_idx_s100_pb_w7 ON t1 (id)  |   2091746 |         293 |         290
 CREATE INDEX t1_idx_s100_pb_w8 ON t1 (id)  |   2092502 |         293 |         290
(72 rows)

The fluctuations exist for all scales, but doesn't seem to depend on the input
size.


Just to be sure I tried to measure the amount of WAL for various INSERT size
using roughly the same approach, and results are stable:

                        query                        | wal_bytes | wal_records | wal_num_fpw
-----------------------------------------------------+-----------+-------------+-------------
 INSERT INTO t_001_a SELECT generate_series($1, $2)  |     59000 |        1000 |           0
 INSERT INTO t_001_b SELECT generate_series($1, $2)  |     59000 |        1000 |           0
 INSERT INTO t_010_a SELECT generate_series($1, $2)  |    590000 |       10000 |           0
 INSERT INTO t_010_b SELECT generate_series($1, $2)  |    590000 |       10000 |           0
 INSERT INTO t_1000_a SELECT generate_series($1, $2) |  59000000 |     1000000 |           0
 INSERT INTO t_1000_b SELECT generate_series($1, $2) |  59000000 |     1000000 |           0
 INSERT INTO t_100_a SELECT generate_series($1, $2)  |   5900000 |      100000 |           0
 INSERT INTO t_100_b SELECT generate_series($1, $2)  |   5900000 |      100000 |           0
(8 rows)


At this point I tend to think that this is somehow due to btbuild specific
behavior, or somewhere nearby.


> Few other minor comments
> ------------------------------------
> pg_stat_statements patch
> 1.
> +--
> +-- CRUD: INSERT SELECT UPDATE DELETE on test non-temp table to
> validate WAL generation metrics
> +--
> 
> The word 'non-temp' in the above comment appears out of place.  We
> don't need to specify it.


Fixed.


> 2.
> +-- SELECT usage data, check WAL usage is reported, wal_records equal
> rows count for INSERT/UPDATE/DELETE
> +SELECT query, calls, rows,
> +wal_bytes > 0 as wal_bytes_generated,
> +wal_records > 0 as wal_records_generated,
> +wal_records = rows as wal_records_as_rows
> +FROM pg_stat_statements ORDER BY query COLLATE "C";
> 
> The comment doesn't seem to match what we are doing in the statement.
> I think we can simplify it to something like "check WAL is generated
> for above statements:


Done.


> 3.
> @@ -185,6 +185,9 @@ typedef struct Counters
>   int64 local_blks_written; /* # of local disk blocks written */
>   int64 temp_blks_read; /* # of temp blocks read */
>   int64 temp_blks_written; /* # of temp blocks written */
> + uint64 wal_bytes; /* total amount of WAL bytes generated */
> + int64 wal_records; /* # of WAL records generated */
> + int64 wal_num_fpw; /* # of WAL full page image generated */
>   double blk_read_time; /* time spent reading, in msec */
>   double blk_write_time; /* time spent writing, in msec */
>   double usage; /* usage factor */
> 
> It is better to keep wal_bytes should be after wal_num_fpw as it is in
> the main patch.  Also, consider changing at other places in this
> patch.  I think we should add these new fields after blk_write_time or
> at the end after usage.


Done.


> 4.
> /* # of WAL full page image generated */
> Can we change it to "/* # of WAL full page image records generated */"?
> 
> If you agree, then a similar comment exists in
> v11-0001-Add-infrastructure-to-track-WAL-usage, consider changing that
> as well.


Agreed, and fixed in both place.


> v11-0002-Add-option-to-report-WAL-usage-in-EXPLAIN-and-au
> 5.
> Specifically, include the
> +      number of records, full page images and bytes generated.
> 
> How about making the above slightly clear?  "Specifically, include the
> number of records, number of full page image records and amount of WAL
> bytes generated.


Thanks, that's clearer.  Done

Attachment

pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: WAL usage calculation patch
Next
From: Tom Lane
Date:
Subject: Re: Berserk Autovacuum (let's save next Mandrill)