Re: Logging parallel worker draught - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Logging parallel worker draught |
Date | |
Msg-id | 2cccd814-d7b3-4c9a-a66a-6df4c45fd95e@enterprisedb.com |
In response to | Re: Logging parallel worker draught (Benoit Lobréau <benoit.lobreau@dalibo.com>) |
Responses | Re: Logging parallel worker draught |
List | pgsql-hackers |
On 2/27/24 10:55, Benoit Lobréau wrote:
> On 2/25/24 20:13, Tomas Vondra wrote:
>> 1) name of the GUC
> ...
>> 2) logging just the failures provides an incomplete view
>>
>>    log_parallel_workers = {none | failures | all}
>>
>> where "failures" only logs when at least one worker fails to start,
>> and "all" logs everything.
>>
>> AFAIK Sami made the same observation/argument in his last message [1].
>
> I like the name and different levels you propose. I was initially
> thinking that the overall picture would be better summarized (and
> easier to implement) with pg_stat_statements. But with the granularity
> you propose, the choice is in the hands of the DBA, which is great and
> provides more options when we don't have full control of the
> installation.
>

Good that you like the idea with multiple levels. I agree
pg_stat_statements may be an easier way to get a summary, but it has
limitations too (e.g. no built-in ability to show how the metrics evolve
over time, which is easier to reconstruct from logs). So I think those
approaches are complementary. (A rough sketch of how such a GUC might be
declared is after the list at the end of this mail.)

>> 3) node-level or query-level?
> ...
>> There's no value in having information about every individual temp
>> file, because we don't even know which node produced it. It'd be
>> perfectly sufficient to log some summary statistics (number of files,
>> total space, ...), either for the query as a whole, or perhaps per
>> node (because that's what matters for work_mem). So I don't think we
>> should mimic this just because log_temp_files does this.
>
> I must admit that, given my (poor) technical level, I went for the
> easiest way.
>

That's understandable, I'd do the same thing.

> If we go for the three levels you proposed, "node-level" makes even
> less sense (and has great "potential" for spam).
>

Perhaps.

> I can see one downside to the "query-level" approach: it might be more
> difficult to give information in cases where the query doesn't end
> normally. It's sometimes useful to have hints about what was going
> wrong before a query was cancelled or crashed, which log_temp_files
> kinda does.
>

That is certainly true, but it's not a new thing, I believe. IIRC we may
not report statistics until the end of the transaction, so no progress
updates, and I'm not sure what happens if the query doesn't end
correctly (e.g. the backend dies, ...). Similarly for temporary files,
we don't report those until the temporary file gets closed, so I'm not
sure you'd get a lot of info about the progress either.

I admit I haven't tried and maybe I remember the details wrong. Might be
useful to experiment with this first a little bit ...

>> I don't know how difficult it would be to track/collect this
>> information for the whole query.
>
> I am a bit worried this will be a little too much for me, given the
> time and the knowledge gap I have (both in C and PostgreSQL internals).
> I'll try to look anyway.
>

I certainly understand that, and I realize it may feel daunting and even
discouraging. What I can promise is that I'm willing to help, either by
suggesting ways to approach the problems, doing reviews, and so on.
Would that be sufficient for you to continue working on this patch?

Some random thoughts/ideas about how to approach this:

- For parallel workers I think this might be as simple as adding some
  counters into QueryDesc, and updating those during query exec (instead
  of just logging stuff directly). I'm not sure if it should be added to
  "Instrumentation" or separately. (See the counters sketch below.)

- I was thinking maybe we could just walk the execution plan and collect
  the counts that way. But I'm not sure that'd work if the Gather node
  happens to be executed repeatedly (in a loop). Also, not sure we'd
  want to walk all plans. (See the walker sketch below.)

- While log_temp_files is clearly out of scope for this patch, it might
  be useful to think about it and how it should behave. We've already
  used log_temp_files to illustrate some issues with logging the info
  right away, so maybe there's something to learn here ...

- For temporary files I think it'd be more complicated, because we can
  create temporary files from many different places, not just in the
  executor, so we can't simply update QueryDesc etc. Also, the places
  that log info about temporary files (using ReportTemporaryFileUsage)
  only really know about individual temporary files, so if a Sort or
  HashJoin creates 1000 files, we'll get one LOG for each of those temp
  files. But they're really a single "combined" file. So maybe we should
  introduce some sort of "context" to group those files, and only
  accumulate/log the size for the group as a whole? Maybe it'd even
  allow printing some info about what the temporary file is for (e.g.
  tuplestore / tuplesort / ...). (See the last sketch below.)
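Going back to the GUC from the top of this mail, a very rough sketch of
how the three levels might be declared, following the way other enum
GUCs are defined. All names are made up and nothing here is tested:

```c
#include "postgres.h"
#include "utils/guc.h"

/* hypothetical levels for log_parallel_workers */
typedef enum
{
	LOG_PARALLEL_WORKERS_NONE,
	LOG_PARALLEL_WORKERS_FAILURES,	/* only when a worker fails to start */
	LOG_PARALLEL_WORKERS_ALL		/* every parallel query */
} LogParallelWorkersLevel;

static const struct config_enum_entry log_parallel_workers_options[] = {
	{"none", LOG_PARALLEL_WORKERS_NONE, false},
	{"failures", LOG_PARALLEL_WORKERS_FAILURES, false},
	{"all", LOG_PARALLEL_WORKERS_ALL, false},
	{NULL, 0, false}
};

/* variable the GUC machinery would assign to */
int			log_parallel_workers = LOG_PARALLEL_WORKERS_NONE;
```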
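For the QueryDesc counters idea, a sketch of roughly what I have in
mind, again untested and with made-up names (gather->num_workers and
pcxt->nworkers_launched are what I remember ExecGather using, but please
double-check against the actual sources):

```c
#include "postgres.h"

/* hypothetical counters carried in QueryDesc (or EState / Instrumentation) */
typedef struct ParallelWorkerCounters
{
	int			workers_to_launch;	/* workers requested by Gather nodes */
	int			workers_launched;	/* workers actually obtained */
} ParallelWorkerCounters;

/*
 * In ExecGather() / ExecGatherMerge(), right after LaunchParallelWorkers(),
 * something along these lines:
 *
 *		counters->workers_to_launch += gather->num_workers;
 *		counters->workers_launched += pcxt->nworkers_launched;
 */

/* called from ExecutorEnd(), honoring the GUC from the previous sketch */
static void
LogParallelWorkerCounters(const ParallelWorkerCounters *counters)
{
	if (log_parallel_workers == LOG_PARALLEL_WORKERS_NONE)
		return;

	if (log_parallel_workers == LOG_PARALLEL_WORKERS_ALL ||
		counters->workers_launched < counters->workers_to_launch)
		ereport(LOG,
				(errmsg("launched %d parallel workers (%d requested)",
						counters->workers_launched,
						counters->workers_to_launch)));
}
```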
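The other option, walking the plan after the query finishes, might look
something like this. It reuses the counters struct from the previous
sketch; the GatherState / GatherMergeState field names are from memory
and may not match the sources exactly, and as noted above this probably
doesn't handle a Gather that gets rescanned in a loop:

```c
#include "postgres.h"
#include "nodes/execnodes.h"
#include "nodes/nodeFuncs.h"
#include "nodes/plannodes.h"

/* sum up planned vs. launched workers over the whole PlanState tree */
static bool
count_gather_workers(PlanState *ps, ParallelWorkerCounters *counters)
{
	if (IsA(ps, GatherState))
	{
		Gather	   *plan = (Gather *) ps->plan;

		counters->workers_to_launch += plan->num_workers;
		counters->workers_launched += ((GatherState *) ps)->nworkers_launched;
	}
	else if (IsA(ps, GatherMergeState))
	{
		GatherMerge *plan = (GatherMerge *) ps->plan;

		counters->workers_to_launch += plan->num_workers;
		counters->workers_launched += ((GatherMergeState *) ps)->nworkers_launched;
	}

	/* recurse into child nodes */
	return planstate_tree_walker(ps, count_gather_workers, counters);
}
```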
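And for the temporary files, the "context" idea could be as simple as a
small struct that the places creating temp files (tuplesort, tuplestore,
hashjoin, ...) register their files with, with a single LOG line when
the group is closed. Everything below is made up, including the
assumption that the existing log_temp_files threshold (in kB, -1 =
disabled) would apply to the group as a whole:

```c
#include "postgres.h"

/* hypothetical grouping of related temp files (one Sort, one HashJoin, ...) */
typedef struct TempFileGroup
{
	const char *purpose;		/* "tuplesort", "tuplestore", "hashjoin", ... */
	int			nfiles;			/* number of temp files in the group */
	uint64		total_bytes;	/* combined size of all the files */
} TempFileGroup;

static void
TempFileGroupAddFile(TempFileGroup *group, uint64 size)
{
	group->nfiles++;
	group->total_bytes += size;
}

static void
TempFileGroupClose(TempFileGroup *group)
{
	/* log_temp_files: existing GUC, threshold in kB, -1 = disabled */
	if (log_temp_files >= 0 &&
		group->total_bytes >= (uint64) log_temp_files * 1024)
		ereport(LOG,
				(errmsg("temporary files (%s): %d files, " UINT64_FORMAT " bytes total",
						group->purpose, group->nfiles, group->total_bytes)));
}
```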
regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company