Re: [HACKERS] pg_dump disaster - Mailing list pgsql-hackers
From | Alfred Perlstein |
---|---|
Subject | Re: [HACKERS] pg_dump disaster |
Date | |
Msg-id | 20000121134638.U14030@fw.wintelcom.net |
In response to | Re: [HACKERS] pg_dump disaster (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: [HACKERS] pg_dump disaster |
List | pgsql-hackers |
* Tom Lane <tgl@sss.pgh.pa.us> [000121 08:14] wrote:
> Alfred Perlstein <bright@wintelcom.net> writes:
> >>>> The answer appears to be that Perlstein's "nonblocking mode" patches
> >>>> have broken psql copy, and doubtless a lot of other applications as
> >>>> well, because pqPutBytes no longer feels any particular compulsion
> >>>> to actually send the data it's been handed. (Moreover, if it does
> >>>> do only a partial send, there is no way to discover how much it sent;
> >>>> while its callers might be blamed for not having checked for an error
> >>>> return, they'd have no way to recover anyhow.)
>
> > pqPutBytes _never_ felt any compulsion to flush the buffer to the backend,
> > or at least not since I started using it.
>
> Sorry, I was insufficiently careful about my wording. It's true that
> pqPutBytes doesn't worry about actually flushing the data out to the
> backend. (It shouldn't, either, since it is typically called with small
> fragments of a message rather than complete messages.) It did, however,
> take care to *accept* all the data it was given and ensure that the data
> was queued in the output buffer. As the code now stands, it's
> impossible to tell whether all the passed data was queued or not, or how
> much of it was queued. This is a fundamental design error, because the
> caller has no way to discover what to do after a failure return (nor
> even a way to tell if it was a hard failure or just I-won't-block).
> Moreover, no existing caller of PQputline thinks it should have to worry
> about looping around the call, so even if you put in a usable return
> convention, existing apps would still be broken.
>
> Similarly, PQendcopy is now willing to return without having gotten
> the library out of the COPY state, but the caller can't easily tell
> what to do about it --- nor do existing callers believe that they
> should have to do anything about it.
>
> > The implications of this are truly annoying, exporting the socket to
> > the backend to the client application causes all sorts of problems because
> > the person select()'ing on the socket sees that it's 'clear' but yet
> > all their data has not been sent...
>
> Yeah, the original set of exported routines was designed without any
> thought of handling a nonblock mode. But you aren't going to be able
> to fix them this way. There will need to be a new set of entry points
> that add a concept of "operation not complete" to their API, and apps
> that want to avoid blocking will need to call those instead. Compare
> what's been done for connecting (PQconnectPoll) and COPY TO STDOUT
> (PQgetlineAsync).
>
> It's possible that things were broken before you got to them --- there
> have been several sets of not-very-carefully-reviewed patches to libpq
> during the current development cycle, so someone else may have created
> the seeds of the problem. However, we weren't seeing failures in psql
> before this week...

We both missed it, and yes it was my fault.

All connections are behaving as if PQsetnonblocking(conn, TRUE) had
been called on them.

The original non-blocking patches did something weird: they seemed to
_always_ stick the socket into non-blocking mode. This would activate
my non-blocking stuff for all connections.

I assumed the only code that called the old makenonblocking function
was set up to handle this; unfortunately it's not, and you know what
they say about assumptions. :(

I should have a fix tonight.

sorry,
-Alfred
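
To illustrate the return convention discussed above, here is a minimal sketch in C of a put routine that reports how many bytes it actually queued, so a caller can tell "operation not complete" from success and retry with the remainder. Everything in it (OutBuf, outbuf_put, outbuf_drain, outbuf_put_all, the 16-byte capacity) is hypothetical and made up for the example; it is not libpq code and not the fix that was eventually applied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OUTBUF_CAPACITY 16   /* tiny on purpose, so partial puts are easy to see */

/* Hypothetical output buffer; NOT libpq's actual connection structure. */
typedef struct {
    char   data[OUTBUF_CAPACITY];
    size_t used;
} OutBuf;

/*
 * Queue up to len bytes from src.  Returns the number of bytes accepted
 * (0..len).  A result smaller than len means "operation not complete":
 * the caller knows exactly how much is left to resubmit.
 */
static size_t outbuf_put(OutBuf *buf, const char *src, size_t len)
{
    assert(buf != NULL && src != NULL);
    size_t room = OUTBUF_CAPACITY - buf->used;
    size_t n = len < room ? len : room;

    memcpy(buf->data + buf->used, src, n);
    buf->used += n;
    return n;
}

/* Stand-in for flushing to the socket: pretend every queued byte was sent. */
static void outbuf_drain(OutBuf *buf)
{
    assert(buf != NULL);
    buf->used = 0;
}

/*
 * Blocking-style wrapper for callers that do not want to loop themselves:
 * keeps draining and retrying until all of src is queued.  Returns how
 * many drains were needed.
 */
static int outbuf_put_all(OutBuf *buf, const char *src, size_t len)
{
    int drains = 0;

    while (len > 0) {
        size_t n = outbuf_put(buf, src, len);

        src += n;
        len -= n;
        if (len > 0) {       /* buffer full: make room, then retry the rest */
            outbuf_drain(buf);
            drains++;
        }
    }
    return drains;
}
```

A caller that wants to avoid blocking would call outbuf_put directly and come back later with the unsent tail; existing callers that never expected to loop would keep going through something like outbuf_put_all, which preserves the old "all data is accepted" behaviour.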