Re: AIO v2.5 - Mailing list pgsql-hackers
From | Noah Misch |
---|---|
Subject | Re: AIO v2.5 |
Date | |
Msg-id | 20250330024513.ac.nmisch@google.com Whole thread Raw |
In response to | Re: AIO v2.5 (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
On Sat, Mar 29, 2025 at 08:39:54PM -0400, Andres Freund wrote: > On 2025-03-29 14:29:29 -0700, Noah Misch wrote: > > On Wed, Mar 26, 2025 at 09:07:40PM -0400, Andres Freund wrote: > > The choice between LOG and LOG_SERVER_ONLY doesn't matter much for $SUBJECT. > > If a client has decided to set client_min_messages that high, the client might > > be interested in the fact that it got side-tracked completing someone else's > > IO. On the other hand, almost none of those sidetrack events will produce > > messages. The main argument I'd envision for LOG_SERVER_ONLY is that we > > consider the message content sensitive, but I don't see the message content as > > materially sensitive. > > I don't think it's sensitive - it just seems a bit sillier to send the same > thing to the client twice than to the server log. Ah, that adds weight to the benefit of LOG_SERVER_ONLY. > I'm happy to change it to > LOG if you prefer. Your points below mean some comments need to be updated in > smgr/md.c anyway. Nah. > > > +/* > > > + * smgrstartreadv() -- asynchronous version of smgrreadv() > > > + * > > > + * This starts an asynchronous readv IO using the IO handle `ioh`. Other than > > > + * `ioh` all parameters are the same as smgrreadv(). > > > > I would add a comment starting with: > > > > Compared to smgrreadv(), more responsibilities fall on layers above smgr. > > Higher layers handle partial reads. smgr will ereport(LOG_SERVER_ONLY) some > > problems, but higher layers are responsible for pgaio_result_report() to > > mirror that news to the user and (for ERROR) abort the (sub)transaction. > > Hm - if we document that in all the smgrstart* we'd end up with something like > that in a lot of places - but OTOH, this is the first one so far... Alternatively, to avoid duplication: See $PLACE for the tasks that the caller's layer owns, in contrast to smgr owning them for smgrreadv(). > > I say "comment starting with", because I think there's a remaining decision > > about who owns the zeroing currently tied to smgrreadv(). An audit of > > mdreadv() vs. AIO counterparts found this part of mdreadv(): > > > > if (nbytes == 0) > > { > > /* > > * We are at or past EOF, or we read a partial block at EOF. > > * Normally this is an error; upper levels should never try to > > * read a nonexistent block. However, if zero_damaged_pages > > * is ON or we are InRecovery, we should instead return zeroes > > * without complaining. This allows, for example, the case of > > * trying to update a block that was later truncated away. > > */ > > if (zero_damaged_pages || InRecovery) > > { > > > > I didn't write a test to prove its absence, but I'm not finding such code in > > the AIO path. > > Yes, there is no such codepath > > A while ago I had started a thread about whether the above codepath is > necessary postgr.es/m/3qxxsnciyffyf3wyguiz4besdp5t5uxvv3utg75cbcszojlz7p@uibfzmnukkbd which I had forgotten completely. I've redone your audit, and I agree the InRecovery case is dead code. check-world InRecovery reaches mdstartreadv() and mdreadv() only via XLogReadBufferExtended(), vm_readbuf(), and fsm_readbuf(). The zero_damaged_pages case entails more of a judgment call about whether or not its rule breakage eclipses its utility. Fortunately, a disappointed zero_damaged_pages user could work around that code's absence by stopping the server and using "dd" to extend the segment with zeros. > I had planned to put in an error into mdreadv() at the time, but somehow lost > track of that - I kind of mentally put this issue into the "done" category :( > At the very least we need to add a comment about this though. I'm fine with either of: 1. Replace that mdreadv() code with an error. 2. Update comment in mdreadv() that we're phasing out this code due the InRecovery case's dead code status and the zero_damaged_pages problems; AIO intentionally doesn't replicate it. Maybe add Assert(false). I'd do (2) for v18, then do (1) first thing in v19 development. > > > + *zeroed_or_error_count = rem_error & ((1 << 7) - 1); > > > + rem_error >>= 7; > > > > These raw "7" are good places to use your new #define values. Likewise in > > buffer_readv_encode_error(). > > Which define value are you thinking of here? I don't think any of the ones I > added apply? But I think you're right it'd be good to have some define for > it, at least locally. It was just my imagination. Withdrawn. > It's now: > > /* > * We need a backend-local completion callback for shared buffers, to be able > * to report checksum errors correctly. Unfortunately that can only safely > * happen if the reporting backend has previously called > * pgstat_prepare_report_checksum_failure(), which we can only guarantee in > * the backend that started the IO. Hence this callback. > */ Sounds good. > Updated to: > /* > * Throw a WARNING/LOG, as instructed by PIV_LOG_*, if the checksum fails, > * but only after we've checked for the all-zeroes case. > */ > > I found one more, the newly added comment about checksum_failure_p was still > talking about ignore_checksum_failure, but it should now be IGNORE_CHECKSUM_FAILURE. That works. > > > Subject: [PATCH v2.14 18/29] aio: Add test_aio module > > > > I didn't yet re-review the v2.13 or 2.14 changes to this one. That's still in > > my queue. > > That's good - I think some of the tests need to expand a bit more. Since > that's at the end of the dependency chain... Okay, I'll delay on re-reviewing that one. When it's a good time, please put the CF entry back in Needs Review. The patches before it are all ready for commit after the above points of this mail.
pgsql-hackers by date: