Re: Use pgBufferUsage for block reporting in analyze - Mailing list pgsql-hackers
From: Masahiko Sawada
Subject: Re: Use pgBufferUsage for block reporting in analyze
Date:
Msg-id: CAD21AoAqmKpPZHZTibLk08XWcec61TZ3eJoWymK8kSA94ffuHA@mail.gmail.com
In response to: Re: Use pgBufferUsage for block reporting in analyze (Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>)
Responses: Re: Use pgBufferUsage for block reporting in analyze
List: pgsql-hackers
Hi,

On Mon, Jul 29, 2024 at 12:12 AM Anthonin Bonnefoy
<anthonin.bonnefoy@datadoghq.com> wrote:
>
> On Sat, Jul 27, 2024 at 12:35 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > An alternative idea would be to pass StringInfo to
> > AcquireSampleRowsFunc() so that the callback can write its messages
> > there. This is somewhat similar to what we do in the EXPLAIN command
> > (cf. ExplainPropertyText() etc.). It could be too much, but I think
> > it could be better than writing logs in a single format.

I have one comment on the 0001 patch:

     /*
      * Calculate the difference in the Page Hit/Miss/Dirty that
      * happened as part of the analyze by subtracting out the
      * pre-analyze values which we saved above.
      */
-    AnalyzePageHit = VacuumPageHit - AnalyzePageHit;
-    AnalyzePageMiss = VacuumPageMiss - AnalyzePageMiss;
-    AnalyzePageDirty = VacuumPageDirty - AnalyzePageDirty;
+    memset(&bufferusage, 0, sizeof(BufferUsage));
+    BufferUsageAccumDiff(&bufferusage, &pgBufferUsage, &startbufferusage);
+
+    total_blks_hit = bufferusage.shared_blks_hit +
+                     bufferusage.local_blks_hit;
+    total_blks_read = bufferusage.shared_blks_read +
+                      bufferusage.local_blks_read;
+    total_blks_dirtied = bufferusage.shared_blks_dirtied +
+                         bufferusage.local_blks_dirtied;

The comment should also be updated or removed.

And here are some comments on the 0002 patch:

-    TimestampDifference(starttime, endtime, &secs_dur, &usecs_dur);
+    delay_in_ms = TimestampDifferenceMilliseconds(starttime, endtime);

I think this change is to make the vacuum code consistent with the
analyze code, particularly the following part:

    /*
     * We do not expect an analyze to take > 25 days and it simplifies
     * things a bit to use TimestampDifferenceMilliseconds.
     */
    delay_in_ms = TimestampDifferenceMilliseconds(starttime, endtime);

However, as the above comment says, delay_in_ms can only represent a
duration of up to 25 days.
I guess it would not be a problem for analyze cases, but it could be in
vacuum cases, as vacuum can sometimes run for a very long time. I've
seen vacuums running for almost 1 month. So I think we should keep this
part.

---

     /* measure elapsed time iff autovacuum logging requires it */
-    if (AmAutoVacuumWorkerProcess() && params->log_min_duration >= 0)
+    if (instrument)

The comment should also be updated.

---

Could you split the 0002 patch into two patches? One would have the
ANALYZE command (with the VERBOSE option) write the buffer usage, and
the second would add WAL usage to both the ANALYZE command and
autoanalyze. I think adding WAL usage to ANALYZE could be
controversial, as it should not be a WAL-intensive operation, so I'd
like to keep them separate.

> I've tested this approach, it definitely looks better. I've added a
> logbuf StringInfo to AcquireSampleRowsFunc which will receive the
> logs. elevel was removed as it is not used anymore. Since everything
> is in the same log line, I've removed the relation name in the acquire
> sample functions.
>
> For partitioned tables, I've also added the processed partition table
> being sampled. The output will look like:
>
> INFO:  analyze of table "postgres.public.test_partition"
> Sampling rows from child "public.test_partition_1"
> pages: 5 of 5 scanned
> tuples: 999 live tuples, 0 are dead; 999 tuples in sample, 999
> estimated total tuples
> Sampling rows from child "public.test_partition_2"
> pages: 5 of 5 scanned
> tuples: 1000 live tuples, 0 are dead; 1000 tuples in sample, 1000
> estimated total tuples
> avg read rate: 2.604 MB/s, avg write rate: 0.000 MB/s
> ...
>
> For a file_fdw, the output will be:
>
> INFO:  analyze of table "postgres.public.pglog"
> tuples: 60043 tuples; 30000 tuples in sample
> avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
> ...

Thank you for updating the patch.
With your patch, I got the following logs when executing ANALYZE
VERBOSE on a partitioned table:

postgres(1:3971560)=# analyze (verbose) p;
INFO:  analyzing "public.p" inheritance tree
INFO:  analyze of table "postgres.public.p"
Sampling rows from child "public.c1"
pages: 10000 of 14750 scanned
tuples: 2259833 live tuples, 0 are dead; 10000 tuples in sample,
3333254 estimated total tuples
Sampling rows from child "public.c2"
pages: 10000 of 14750 scanned
tuples: 2260000 live tuples, 0 are dead; 10000 tuples in sample,
3333500 estimated total tuples
Sampling rows from child "public.c3"
pages: 10000 of 14750 scanned
tuples: 2259833 live tuples, 0 are dead; 10000 tuples in sample,
3333254 estimated total tuples
avg read rate: 335.184 MB/s, avg write rate: 0.031 MB/s
buffer usage: 8249 hits, 21795 reads, 2 dirtied
WAL usage: 6 records, 1 full page images, 8825 bytes
system usage: CPU: user: 0.46 s, system: 0.03 s, elapsed: 0.50 s
:

Whereas the current log messages look like this:

INFO:  analyzing "public.p" inheritance tree
INFO:  "c1": scanned 10000 of 14750 pages, containing 2259833 live
rows and 0 dead rows; 10000 rows in sample, 3333254 estimated total
rows
INFO:  "c2": scanned 10000 of 14750 pages, containing 2259834 live
rows and 0 dead rows; 10000 rows in sample, 3333255 estimated total
rows
INFO:  "c3": scanned 10000 of 14750 pages, containing 2259833 live
rows and 0 dead rows; 10000 rows in sample, 3333254 estimated total
rows
:

It seems to me that the current style is more concise and readable (3
rows per table with the patch vs. 1 row per table currently). We might
need to consider a better output format for partitioned tables, as the
number of partitions could be high. I don't have a good idea right
now, though.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com