Home > mailing lists

Re: Track IO times in pg_stat_io - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Track IO times in pg_stat_io
Date	March 21, 2023 05:34:51
Msg-id	20230321023451.7rzy4kjj2iktrg2r@awork3.anarazel.de Whole thread Raw
In response to	Re: Track IO times in pg_stat_io (Melanie Plageman <melanieplageman@gmail.com>)
Responses	Re: Track IO times in pg_stat_io (Melanie Plageman <melanieplageman@gmail.com>)
List	pgsql-hackers

Tree view

Hi,

On 2023-03-16 17:19:16 -0400, Melanie Plageman wrote:
> > > > I wonder if we should get rid of pgStatBlockReadTime, pgStatBlockWriteTime,
> > >
> > > And then have pg_stat_reset_shared('io') reset pg_stat_database IO
> > > stats?
> >
> > Yes.
> 
> I think this makes sense but I am hesitant to do it in this patchset,
> because it feels a bit hidden...maybe?

I'd not do it in the same commit, but I don't see a problem with doing it in
the same patchset.

Now that I think about it again, this wouldn't make pg_stat_reset_shared('io')
affect pg_stat_database - I was thinking we should use pgstat_io.c stats to
provide the information for pgstat_database.c, using its own pending counter.


> > No, I don't think I am suggesting that. What I am trying to suggest is that
> > PendingIOStats should contain instr_time, but that PgStat_IO should contain
> > PgStat_Counter as microseconds, as before.
> 
> So, I've modified the code to make a union of instr_time and
> PgStat_Counter in PgStat_BktypeIO. I am not quite sure if this is okay.
> I store in microsec and then in pg_stat_io, I multiply to get
> milliseconds for display.

Not a fan - what do we gain by having this union? It seems considerably
cleaner to have a struct local to pgstat_io.c that uses instr_time and have a
clean type in PgStat_BktypeIO.  In fact, the code worked after just changing
that.

I don't think it makes sense to have pgstat_io_start()/end() as well as
pgstat_count_io*. For one, the name seems in a "too general namespace" - why
not a pgstat_count*?


> I considered refactoring pgstat_io_end() to use INSTR_TIME_ACCUM_DIFF()
> like [1], but, in the end I actually think I would end up with more
> operations because of the various different counters needing to be
> updated. As it is now, I do a single subtract and a few adds (one for
> each of the different statistics objects tracking IO times
> (pgBufferUsage, pgStatBlockWrite/ReadTime). Whereas, I would need to do
> an accum diff for every one of those.

Right - that only INSTR_TIME_ACCUM_DIFF() only makes sense if there's just a
single counter to update.


WRT:
                /* TODO: AFAICT, pgstat_count_buffer_write_time is only called */
                /* for shared buffers whereas pgstat_count_buffer_read_time is */
                /* called for temp relations and shared buffers. */
                /*
                 * is this intentional and should I match current behavior or
                 * not?
                 */

It's hard to see how that behaviour could be intentional.  Probably worth
fixing in a separate patch. I don't think we're going to backpatch, but it
would make this clearer nonetheless.


Incremental patch with some of the above changed attached.



Btw, it's quite nice how one now can attribute time more easily:

20 connections copying an 8MB file 50 times each:
SELECT reuses, evictions, writes, write_time, extends, extend_time FROM pg_stat_io WHERE backend_type = 'client
backend'AND io_object = 'relation' AND io_context='bulkwrite';
 
┌────────┬───────────┬────────┬────────────┬─────────┬─────────────┐
│ reuses │ evictions │ writes │ write_time │ extends │ extend_time │
├────────┼───────────┼────────┼────────────┼─────────┼─────────────┤
│  36112 │         0 │  36112 │        141 │ 1523176 │        8676 │
└────────┴───────────┴────────┴────────────┴─────────┴─────────────┘

20 connections copying an 80MB file 5 times each:
┌─────────┬───────────┬─────────┬────────────┬─────────┬─────────────┐
│ reuses  │ evictions │ writes  │ write_time │ extends │ extend_time │
├─────────┼───────────┼─────────┼────────────┼─────────┼─────────────┤
│ 1318539 │         0 │ 1318539 │       5013 │ 1523339 │        7873 │
└─────────┴───────────┴─────────┴────────────┴─────────┴─────────────┘
(1 row)


Greetings,

Andres

Attachment

pgsql-hackers by date:

From: Thomas Munro
Date: 21 March 2023, 05:25:04
Subject: Re: Assertion failure with barriers in parallel hash join

From: Pavel Luzanov
Date: 21 March 2023, 06:37:22
Subject: Re: psql: Add role's membership options to the \du+ command

Re: Track IO times in pg_stat_io - Mailing list pgsql-hackers

Attachment

Previous

Next