Thread: pg_dump disaster

pg_dump disaster

From
Patrick Welche
Date:
Trying to load a 6 January pg_dumpall file with today's postgresql gives
many

invalid command \N

probably because
PQexec: you gotta get out of a COPY state yourself.
db.out4:11621: invalid command \.

Somehow psql is out of sync and thinks the \N isn't within a COPY block.
The work around was to redump as insert statements..

It's tricky (for me) to debug as "stdin;" file not found..

How do you get of a COPY state yourself?

Cheers,

Patrick


Re: [HACKERS] pg_dump disaster

From
Tom Lane
Date:
Patrick Welche <prlw1@newn.cam.ac.uk> writes:
> Trying to load a 6 January pg_dumpall file with today's postgresql gives
> many
> invalid command \N
> probably because
> PQexec: you gotta get out of a COPY state yourself.
> db.out4:11621: invalid command \.
> Somehow psql is out of sync and thinks the \N isn't within a COPY block.

The answer appears to be that Perlstein's "nonblocking mode" patches
have broken psql copy, and doubtless a lot of other applications as
well, because pqPutBytes no longer feels any particular compulsion
to actually send the data it's been handed.  (Moreover, if it does
do only a partial send, there is no way to discover how much it sent;
while its callers might be blamed for not having checked for an error
return, they'd have no way to recover anyhow.)

I thought these patches should not have been applied without more
peer review, and now I'm sure of it.  I recommend reverting 'em.
        regards, tom lane


Re: [HACKERS] pg_dump disaster

From
Bruce Momjian
Date:
> Patrick Welche <prlw1@newn.cam.ac.uk> writes:
> > Trying to load a 6 January pg_dumpall file with today's postgresql gives
> > many
> > invalid command \N
> > probably because
> > PQexec: you gotta get out of a COPY state yourself.
> > db.out4:11621: invalid command \.
> > Somehow psql is out of sync and thinks the \N isn't within a COPY block.
> 
> The answer appears to be that Perlstein's "nonblocking mode" patches
> have broken psql copy, and doubtless a lot of other applications as
> well, because pqPutBytes no longer feels any particular compulsion
> to actually send the data it's been handed.  (Moreover, if it does
> do only a partial send, there is no way to discover how much it sent;
> while its callers might be blamed for not having checked for an error
> return, they'd have no way to recover anyhow.)
> 
> I thought these patches should not have been applied without more
> peer review, and now I'm sure of it.  I recommend reverting 'em.

Can we give the submitter a few days to address the issue?

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] pg_dump disaster

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> I thought these patches should not have been applied without more
>> peer review, and now I'm sure of it.  I recommend reverting 'em.

> Can we give the submitter a few days to address the issue?

Sure, we have plenty of time.  But I don't think the problem can be
fixed without starting over.  He's taken routines that had two possible
return conditions ("Success" and "Hard failure: he's dead, Jim") and
added a third condition ("I didn't do what I was supposed to do, or
perhaps only some of what I was supposed to do, because I was afraid
of blocking").  If dealing with that third condition could be hidden
entirely inside libpq, that'd be OK, but the entire point of this
set of changes is to bounce control back to the application rather
than block.  Therefore, what we are looking at is a fundamental change
in the API of existing routines, which is *guaranteed* to break existing
applications.  (Worse, it breaks them subtly: the code will compile,
and may even work under light load.)

I think the correct approach is to leave the semantics of the existing
exported routines alone, and add parallel entry points with new
semantics to be used by applications that need to avoid blocking.
That's what we've done for queries, and later for connecting, and it
hasn't broken anything.
        regards, tom lane


Re: [HACKERS] pg_dump disaster

From
Alfred Perlstein
Date:
* Bruce Momjian <pgman@candle.pha.pa.us> [000120 22:13] wrote:
> > Patrick Welche <prlw1@newn.cam.ac.uk> writes:
> > > Trying to load a 6 January pg_dumpall file with today's postgresql gives
> > > many
> > > invalid command \N
> > > probably because
> > > PQexec: you gotta get out of a COPY state yourself.
> > > db.out4:11621: invalid command \.
> > > Somehow psql is out of sync and thinks the \N isn't within a COPY block.
> > 
> > The answer appears to be that Perlstein's "nonblocking mode" patches
> > have broken psql copy, and doubtless a lot of other applications as
> > well, because pqPutBytes no longer feels any particular compulsion
> > to actually send the data it's been handed.  (Moreover, if it does
> > do only a partial send, there is no way to discover how much it sent;
> > while its callers might be blamed for not having checked for an error
> > return, they'd have no way to recover anyhow.)
> > 
> > I thought these patches should not have been applied without more
> > peer review, and now I'm sure of it.  I recommend reverting 'em.
> 
> Can we give the submitter a few days to address the issue?

Tom is wrong in his assessment, this is exactly the reason I brought
the patches in.

pqPutBytes _never_ felt any compulsion to flush the buffer to the backend,
or at least not since I started using it.  The only change I made was
for send reservations to be made for non-blocking connections, nothing
inside of postgresql uses non-blocking connections.

pqPutBytes from revision 1.33: (plus my commentary)

static int
pqPutBytes(const char *s, size_t nbytes, PGconn *conn)
{size_t avail = Max(conn->outBufSize - conn->outCount, 0);

/* while the amount to send is greater than the output buffer */while (nbytes > avail){
/* copy as much as we can into the buffer */    memcpy(conn->outBuffer + conn->outCount, s, avail);    conn->outCount
+=avail;    s += avail;    nbytes -= avail;
 
/* try to flush it.... */    if (pqFlush(conn))        return EOF;    avail = conn->outBufSize;}

/* XXX: ok, we have enough room... where is the flush? */memcpy(conn->outBuffer + conn->outCount, s,
nbytes);conn->outCount+= nbytes;
 
return 0;
}

I may have introduced bugs elsewhere, but i doubt it was in pqPutBytes.

This is the exact problem I was describing and why I felt that routines
that checked for data needed to flush beforehand, there may have been
data that still needed to be sent.

I'll be reviewing my own changes again to make sure I haven't broken
something else that could be causing this.

The implications of this is trully annoying, exporting the socket to
the backend to the client application causes all sorts of problems because
the person select()'ing on the socket sees that it's 'clear' but yet
all thier data has not been sent...

-Alfred


> 
> -- 
>   Bruce Momjian                        |  http://www.op.net/~candle
>   pgman@candle.pha.pa.us               |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]


Re: [HACKERS] pg_dump disaster

From
The Hermit Hacker
Date:
On Fri, 21 Jan 2000, Bruce Momjian wrote:

> > I thought these patches should not have been applied without more
> > peer review, and now I'm sure of it.  I recommend reverting 'em.
> 
> Can we give the submitter a few days to address the issue?

Considering that we haven't even gone BETA yet, I *heavily* second this
... Alfred, so far as I've seen, has a) spent alot of time on these
patches and b) tried to address any concerns as they've been presented
concerning them...

IMHO, if we hadn't commit'd the patches, we wouldn't have found the bug,
and getting feedback on the pathes without applying them, so far, has been
about as painful as pulling teeth ...

>From what I've seen, nobody has been spending much time in libpq *anyway*,
so it isn't as if reverting them if we have to is a big issue ... 

But, on the other side of hte coin ... Alfred, we need relatively quick
turnaround on fixing this, as libpq is kinda crucial :)


Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: [HACKERS] pg_dump disaster

From
Bruce Momjian
Date:
> On Fri, 21 Jan 2000, Bruce Momjian wrote:
> 
> > > I thought these patches should not have been applied without more
> > > peer review, and now I'm sure of it.  I recommend reverting 'em.
> > 
> > Can we give the submitter a few days to address the issue?
> 
> Considering that we haven't even gone BETA yet, I *heavily* second this
> ... Alfred, so far as I've seen, has a) spent alot of time on these
> patches and b) tried to address any concerns as they've been presented
> concerning them...
> 
> IMHO, if we hadn't commit'd the patches, we wouldn't have found the bug,
> and getting feedback on the pathes without applying them, so far, has been
> about as painful as pulling teeth ...

Yes, he has worked very hard on this.  That's why I want to give him
some time to address the issues.  In fact, he already has replied, and I
am sure a dialog will resolve the issue.

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] pg_dump disaster

From
Tom Lane
Date:
Alfred Perlstein <bright@wintelcom.net> writes:
>>>> The answer appears to be that Perlstein's "nonblocking mode" patches
>>>> have broken psql copy, and doubtless a lot of other applications as
>>>> well, because pqPutBytes no longer feels any particular compulsion
>>>> to actually send the data it's been handed.  (Moreover, if it does
>>>> do only a partial send, there is no way to discover how much it sent;
>>>> while its callers might be blamed for not having checked for an error
>>>> return, they'd have no way to recover anyhow.)

> pqPutBytes _never_ felt any compulsion to flush the buffer to the backend,
> or at least not since I started using it.

Sorry, I was insufficiently careful about my wording.  It's true that
pqPutBytes doesn't worry about actually flushing the data out to the
backend.  (It shouldn't, either, since it is typically called with small
fragments of a message rather than complete messages.)  It did, however,
take care to *accept* all the data it was given and ensure that the data
was queued in the output buffer.  As the code now stands, it's
impossible to tell whether all the passed data was queued or not, or how
much of it was queued.  This is a fundamental design error, because the
caller has no way to discover what to do after a failure return (nor
even a way to tell if it was a hard failure or just I-won't-block).
Moreover, no existing caller of PQputline thinks it should have to worry
about looping around the call, so even if you put in a usable return
convention, existing apps would still be broken.

Similarly, PQendcopy is now willing to return without having gotten
the library out of the COPY state, but the caller can't easily tell
what to do about it --- nor do existing callers believe that they
should have to do anything about it.

> The implications of this is trully annoying, exporting the socket to
> the backend to the client application causes all sorts of problems because
> the person select()'ing on the socket sees that it's 'clear' but yet
> all thier data has not been sent...

Yeah, the original set of exported routines was designed without any
thought of handling a nonblock mode.  But you aren't going to be able
to fix them this way.  There will need to be a new set of entry points
that add a concept of "operation not complete" to their API, and apps
that want to avoid blocking will need to call those instead.  Compare
what's been done for connecting (PQconnectPoll) and COPY TO STDOUT
(PQgetlineAsync).

It's possible that things were broken before you got to them --- there
have been several sets of not-very-carefully-reviewed patches to libpq
during the current development cycle, so someone else may have created
the seeds of the problem.  However, we weren't seeing failures in psql
before this week...
        regards, tom lane


Re: [HACKERS] pg_dump disaster

From
Alfred Perlstein
Date:
* Tom Lane <tgl@sss.pgh.pa.us> [000121 08:14] wrote:
> Alfred Perlstein <bright@wintelcom.net> writes:
> >>>> The answer appears to be that Perlstein's "nonblocking mode" patches
> >>>> have broken psql copy, and doubtless a lot of other applications as
> >>>> well, because pqPutBytes no longer feels any particular compulsion
> >>>> to actually send the data it's been handed.  (Moreover, if it does
> >>>> do only a partial send, there is no way to discover how much it sent;
> >>>> while its callers might be blamed for not having checked for an error
> >>>> return, they'd have no way to recover anyhow.)
> 
> > pqPutBytes _never_ felt any compulsion to flush the buffer to the backend,
> > or at least not since I started using it.
> 
> Sorry, I was insufficiently careful about my wording.  It's true that
> pqPutBytes doesn't worry about actually flushing the data out to the
> backend.  (It shouldn't, either, since it is typically called with small
> fragments of a message rather than complete messages.)  It did, however,
> take care to *accept* all the data it was given and ensure that the data
> was queued in the output buffer.  As the code now stands, it's
> impossible to tell whether all the passed data was queued or not, or how
> much of it was queued.  This is a fundamental design error, because the
> caller has no way to discover what to do after a failure return (nor
> even a way to tell if it was a hard failure or just I-won't-block).
> Moreover, no existing caller of PQputline thinks it should have to worry
> about looping around the call, so even if you put in a usable return
> convention, existing apps would still be broken.
> 
> Similarly, PQendcopy is now willing to return without having gotten
> the library out of the COPY state, but the caller can't easily tell
> what to do about it --- nor do existing callers believe that they
> should have to do anything about it.
> 
> > The implications of this is trully annoying, exporting the socket to
> > the backend to the client application causes all sorts of problems because
> > the person select()'ing on the socket sees that it's 'clear' but yet
> > all thier data has not been sent...
> 
> Yeah, the original set of exported routines was designed without any
> thought of handling a nonblock mode.  But you aren't going to be able
> to fix them this way.  There will need to be a new set of entry points
> that add a concept of "operation not complete" to their API, and apps
> that want to avoid blocking will need to call those instead.  Compare
> what's been done for connecting (PQconnectPoll) and COPY TO STDOUT
> (PQgetlineAsync).
> 
> It's possible that things were broken before you got to them --- there
> have been several sets of not-very-carefully-reviewed patches to libpq
> during the current development cycle, so someone else may have created
> the seeds of the problem.  However, we weren't seeing failures in psql
> before this week...

We both missed it, and yes it was my fault.  All connections are
behaving as if PQsetnonblocking(conn, TRUE) have been called on them.

The original non-blocking patches did something weird, they seemed
to _always_ stick the socket into non-blocking mode.  This would
activate my non-blocking stuff for all connections.

I assumed the only code that called the old makenonblocking function
was setup to handle this, unfortunatly it's not and you know what they
say about assumptions. :(

I should have a fix tonight.

sorry,
-Alfred



Re: [HACKERS] pg_dump disaster

From
Tom Lane
Date:
Alfred Perlstein <bright@wintelcom.net> writes:
> We both missed it, and yes it was my fault.  All connections are
> behaving as if PQsetnonblocking(conn, TRUE) have been called on them.
> The original non-blocking patches did something weird, they seemed
> to _always_ stick the socket into non-blocking mode.  This would
> activate my non-blocking stuff for all connections.

Yes, the present state of the code seems to activate nonblocking socket
mode all the time; possibly we could band-aid our way back to a working
psql by turning off nonblock mode by default.  But this doesn't address
the fact that the API of these routines cannot support nonblock mode
without being redesigned.
        regards, tom lane


pg_dump possible fix, need testers. (was: Re: [HACKERS] pg_dump disaster)

From
Alfred Perlstein
Date:
* Tom Lane <tgl@sss.pgh.pa.us> [000121 14:35] wrote:
> Alfred Perlstein <bright@wintelcom.net> writes:
> > We both missed it, and yes it was my fault.  All connections are
> > behaving as if PQsetnonblocking(conn, TRUE) have been called on them.
> > The original non-blocking patches did something weird, they seemed
> > to _always_ stick the socket into non-blocking mode.  This would
> > activate my non-blocking stuff for all connections.
> 
> Yes, the present state of the code seems to activate nonblocking socket
> mode all the time; possibly we could band-aid our way back to a working
> psql by turning off nonblock mode by default.  But this doesn't address
> the fact that the API of these routines cannot support nonblock mode
> without being redesigned.

These patches revert the default setting of the non-block flag back
to the old way connections were done.  Since i'm unable to reproduce
this bug I'm hoping people can _please_ give me feedback.

This is just a first shot at fixing the issue, I'll supply changes
to the docs if this all goes well (that you must explicitly set the
blocking status after a connect/disconnect)

I'm a bit concerned about busy looping because the connection is
left in a non-blocking state after the connect, however most of
the code performs select() and checks for EWOULDBLOCK/EAGAIN so it
might not be an issue.

Thanks for holding off on backing out the changes.

Summary:
don't set the nonblock flag during connections
PQsetnonblocking doesn't fiddle with socket flags anymore as the libraryseems to be setup to deal with the socket being
innon-block mode atall times
 
turn off the nonblock flag if/when the connection is torn down to ensurethat a reconnect behaves like it used to.


Index: fe-connect.c
===================================================================
RCS file: /home/pgcvs/pgsql/src/interfaces/libpq/fe-connect.c,v
retrieving revision 1.114
diff -u -c -I$Header: -r1.114 fe-connect.c
cvs diff: conflicting specifications of output style
*** fe-connect.c    2000/01/23 01:27:39    1.114
--- fe-connect.c    2000/01/23 08:56:17
***************
*** 391,397 ****     PGconn       *conn;     char       *tmp;    /* An error message from some service we call. */
bool       error = FALSE;    /* We encountered an error. */
 
-     int            i;      conn = makeEmptyPGconn();     if (conn == NULL)
--- 391,396 ----
***************
*** 586,591 ****
--- 585,614 ---- }  /* ----------
+  * connectMakeNonblocking -
+  * Make a connection non-blocking.
+  * Returns 1 if successful, 0 if not.
+  * ----------
+  */
+ static int
+ connectMakeNonblocking(PGconn *conn)
+ {
+ #ifndef WIN32
+     if (fcntl(conn->sock, F_SETFL, O_NONBLOCK) < 0)
+ #else
+     if (ioctlsocket(conn->sock, FIONBIO, &on) != 0)
+ #endif
+     {
+         printfPQExpBuffer(&conn->errorMessage,
+                           "connectMakeNonblocking -- fcntl() failed: errno=%d\n%s\n",
+                           errno, strerror(errno));
+         return 0;
+     }
+ 
+     return 1;
+ }
+ 
+ /* ----------  * connectNoDelay -  * Sets the TCP_NODELAY socket option.  * Returns 1 if successful, 0 if not.
***************
*** 755,761 ****      *   Ewan Mellor <eem21@cam.ac.uk>.      * ---------- */ #if (!defined(WIN32) ||
defined(WIN32_NON_BLOCKING_CONNECTIONS))&& !defined(USE_SSL)
 
!     if (PQsetnonblocking(conn, TRUE) != 0)         goto connect_errReturn; #endif     
--- 778,784 ----      *   Ewan Mellor <eem21@cam.ac.uk>.      * ---------- */ #if (!defined(WIN32) ||
defined(WIN32_NON_BLOCKING_CONNECTIONS))&& !defined(USE_SSL)
 
!     if (connectMakeNonblocking(conn) == 0)         goto connect_errReturn; #endif     
***************
*** 868,874 ****     /* This makes the connection non-blocking, for all those cases which forced us        not to do it
above.*/ #if (defined(WIN32) && !defined(WIN32_NON_BLOCKING_CONNECTIONS)) || defined(USE_SSL)
 
!     if (PQsetnonblocking(conn, TRUE) != 0)         goto connect_errReturn; #endif     
--- 891,897 ----     /* This makes the connection non-blocking, for all those cases which forced us        not to do it
above.*/ #if (defined(WIN32) && !defined(WIN32_NON_BLOCKING_CONNECTIONS)) || defined(USE_SSL)
 
!     if (connectMakeNonblocking(conn) == 0)         goto connect_errReturn; #endif     
***************
*** 1785,1790 ****
--- 1808,1820 ----         (void) pqPuts("X", conn);         (void) pqFlush(conn);     }
+ 
+     /* 
+      * must reset the blocking status so a possible reconnect will work
+      * don't call PQsetnonblocking() because it will fail if it's unable
+      * to flush the connection.
+      */
+     conn->nonblocking = FALSE;      /*      * Close the connection, reset all transient state, flush I/O buffers.
Index: fe-exec.c
===================================================================
RCS file: /home/pgcvs/pgsql/src/interfaces/libpq/fe-exec.c,v
retrieving revision 1.87
diff -u -c -I$Header: -r1.87 fe-exec.c
cvs diff: conflicting specifications of output style
*** fe-exec.c    2000/01/18 06:09:24    1.87
--- fe-exec.c    2000/01/23 08:56:29
***************
*** 2116,2122 **** int PQsetnonblocking(PGconn *conn, int arg) {
-     int    fcntlarg;      arg = (arg == TRUE) ? 1 : 0;     /* early out if the socket is already in the state
requested*/
 
--- 2116,2121 ----
***************
*** 2131,2174 ****      *  _from_ or _to_ blocking mode, either way we can block them.      */     /* if we are going
fromblocking to non-blocking flush here */
 
!     if (!pqIsnonblocking(conn) && pqFlush(conn))
!         return (-1);
! 
! 
! #ifdef USE_SSL
!     if (conn->ssl)
!     {
!         printfPQExpBuffer(&conn->errorMessage,
!             "PQsetnonblocking() -- not supported when using SSL\n");
!         return (-1);
!     }
! #endif /* USE_SSL */
! 
! #ifndef WIN32
!     fcntlarg = fcntl(conn->sock, F_GETFL, 0);
!     if (fcntlarg == -1)
!         return (-1);
! 
!     if ((arg == TRUE && 
!         fcntl(conn->sock, F_SETFL, fcntlarg | O_NONBLOCK) == -1) ||
!         (arg == FALSE &&
!         fcntl(conn->sock, F_SETFL, fcntlarg & ~O_NONBLOCK) == -1)) 
! #else
!     fcntlarg = arg;
!     if (ioctlsocket(conn->sock, FIONBIO, &fcntlarg) != 0)
! #endif
!     {
!         printfPQExpBuffer(&conn->errorMessage,
!             "PQsetblocking() -- unable to set nonblocking status to %s\n",
!             arg == TRUE ? "TRUE" : "FALSE");         return (-1);
-     }      conn->nonblocking = arg;
- 
-     /* if we are going from non-blocking to blocking flush here */
-     if (pqIsnonblocking(conn) && pqFlush(conn))
-         return (-1);      return (0); }
--- 2130,2139 ----      *  _from_ or _to_ blocking mode, either way we can block them.      */     /* if we are going
fromblocking to non-blocking flush here */
 
!     if (pqFlush(conn))         return (-1);      conn->nonblocking = arg;      return (0); }

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]


Re: pg_dump possible fix, need testers. (was: Re: [HACKERS] pg_dump disaster)

From
Bruce Momjian
Date:
Patch appled.  And thanks very much for the patch.  I am sorry you were
given a bad time about your previous patch.  We will try to not let it
happen again.



> 
> These patches revert the default setting of the non-block flag back
> to the old way connections were done.  Since i'm unable to reproduce
> this bug I'm hoping people can _please_ give me feedback.
> 
> This is just a first shot at fixing the issue, I'll supply changes
> to the docs if this all goes well (that you must explicitly set the
> blocking status after a connect/disconnect)
> 
> I'm a bit concerned about busy looping because the connection is
> left in a non-blocking state after the connect, however most of
> the code performs select() and checks for EWOULDBLOCK/EAGAIN so it
> might not be an issue.
> 
> Thanks for holding off on backing out the changes.
> 
> Summary:
> don't set the nonblock flag during connections
> PQsetnonblocking doesn't fiddle with socket flags anymore as the library
>  seems to be setup to deal with the socket being in non-block mode at
>  all times
> turn off the nonblock flag if/when the connection is torn down to ensure
>  that a reconnect behaves like it used to.
> 
> 
> Index: fe-connect.c
> ===================================================================
> RCS file: /home/pgcvs/pgsql/src/interfaces/libpq/fe-connect.c,v
> retrieving revision 1.114
> diff -u -c -I$Header: -r1.114 fe-connect.c
> cvs diff: conflicting specifications of output style
> *** fe-connect.c    2000/01/23 01:27:39    1.114
> --- fe-connect.c    2000/01/23 08:56:17
> ***************
> *** 391,397 ****
>       PGconn       *conn;
>       char       *tmp;    /* An error message from some service we call. */
>       bool        error = FALSE;    /* We encountered an error. */
> -     int            i;
>   
>       conn = makeEmptyPGconn();
>       if (conn == NULL)
> --- 391,396 ----
> ***************
> *** 586,591 ****
> --- 585,614 ----
>   }
>   
>   /* ----------
> +  * connectMakeNonblocking -
> +  * Make a connection non-blocking.
> +  * Returns 1 if successful, 0 if not.
> +  * ----------
> +  */
> + static int
> + connectMakeNonblocking(PGconn *conn)
> + {
> + #ifndef WIN32
> +     if (fcntl(conn->sock, F_SETFL, O_NONBLOCK) < 0)
> + #else
> +     if (ioctlsocket(conn->sock, FIONBIO, &on) != 0)
> + #endif
> +     {
> +         printfPQExpBuffer(&conn->errorMessage,
> +                           "connectMakeNonblocking -- fcntl() failed: errno=%d\n%s\n",
> +                           errno, strerror(errno));
> +         return 0;
> +     }
> + 
> +     return 1;
> + }
> + 
> + /* ----------
>    * connectNoDelay -
>    * Sets the TCP_NODELAY socket option.
>    * Returns 1 if successful, 0 if not.
> ***************
> *** 755,761 ****
>        *   Ewan Mellor <eem21@cam.ac.uk>.
>        * ---------- */
>   #if (!defined(WIN32) || defined(WIN32_NON_BLOCKING_CONNECTIONS)) && !defined(USE_SSL)
> !     if (PQsetnonblocking(conn, TRUE) != 0)
>           goto connect_errReturn;
>   #endif    
>   
> --- 778,784 ----
>        *   Ewan Mellor <eem21@cam.ac.uk>.
>        * ---------- */
>   #if (!defined(WIN32) || defined(WIN32_NON_BLOCKING_CONNECTIONS)) && !defined(USE_SSL)
> !     if (connectMakeNonblocking(conn) == 0)
>           goto connect_errReturn;
>   #endif    
>   
> ***************
> *** 868,874 ****
>       /* This makes the connection non-blocking, for all those cases which forced us
>          not to do it above. */
>   #if (defined(WIN32) && !defined(WIN32_NON_BLOCKING_CONNECTIONS)) || defined(USE_SSL)
> !     if (PQsetnonblocking(conn, TRUE) != 0)
>           goto connect_errReturn;
>   #endif    
>   
> --- 891,897 ----
>       /* This makes the connection non-blocking, for all those cases which forced us
>          not to do it above. */
>   #if (defined(WIN32) && !defined(WIN32_NON_BLOCKING_CONNECTIONS)) || defined(USE_SSL)
> !     if (connectMakeNonblocking(conn) == 0)
>           goto connect_errReturn;
>   #endif    
>   
> ***************
> *** 1785,1790 ****
> --- 1808,1820 ----
>           (void) pqPuts("X", conn);
>           (void) pqFlush(conn);
>       }
> + 
> +     /* 
> +      * must reset the blocking status so a possible reconnect will work
> +      * don't call PQsetnonblocking() because it will fail if it's unable
> +      * to flush the connection.
> +      */
> +     conn->nonblocking = FALSE;
>   
>       /*
>        * Close the connection, reset all transient state, flush I/O buffers.
> Index: fe-exec.c
> ===================================================================
> RCS file: /home/pgcvs/pgsql/src/interfaces/libpq/fe-exec.c,v
> retrieving revision 1.87
> diff -u -c -I$Header: -r1.87 fe-exec.c
> cvs diff: conflicting specifications of output style
> *** fe-exec.c    2000/01/18 06:09:24    1.87
> --- fe-exec.c    2000/01/23 08:56:29
> ***************
> *** 2116,2122 ****
>   int
>   PQsetnonblocking(PGconn *conn, int arg)
>   {
> -     int    fcntlarg;
>   
>       arg = (arg == TRUE) ? 1 : 0;
>       /* early out if the socket is already in the state requested */
> --- 2116,2121 ----
> ***************
> *** 2131,2174 ****
>        *  _from_ or _to_ blocking mode, either way we can block them.
>        */
>       /* if we are going from blocking to non-blocking flush here */
> !     if (!pqIsnonblocking(conn) && pqFlush(conn))
> !         return (-1);
> ! 
> ! 
> ! #ifdef USE_SSL
> !     if (conn->ssl)
> !     {
> !         printfPQExpBuffer(&conn->errorMessage,
> !             "PQsetnonblocking() -- not supported when using SSL\n");
> !         return (-1);
> !     }
> ! #endif /* USE_SSL */
> ! 
> ! #ifndef WIN32
> !     fcntlarg = fcntl(conn->sock, F_GETFL, 0);
> !     if (fcntlarg == -1)
> !         return (-1);
> ! 
> !     if ((arg == TRUE && 
> !         fcntl(conn->sock, F_SETFL, fcntlarg | O_NONBLOCK) == -1) ||
> !         (arg == FALSE &&
> !         fcntl(conn->sock, F_SETFL, fcntlarg & ~O_NONBLOCK) == -1)) 
> ! #else
> !     fcntlarg = arg;
> !     if (ioctlsocket(conn->sock, FIONBIO, &fcntlarg) != 0)
> ! #endif
> !     {
> !         printfPQExpBuffer(&conn->errorMessage,
> !             "PQsetblocking() -- unable to set nonblocking status to %s\n",
> !             arg == TRUE ? "TRUE" : "FALSE");
>           return (-1);
> -     }
>   
>       conn->nonblocking = arg;
> - 
> -     /* if we are going from non-blocking to blocking flush here */
> -     if (pqIsnonblocking(conn) && pqFlush(conn))
> -         return (-1);
>   
>       return (0);
>   }
> --- 2130,2139 ----
>        *  _from_ or _to_ blocking mode, either way we can block them.
>        */
>       /* if we are going from blocking to non-blocking flush here */
> !     if (pqFlush(conn))
>           return (-1);
>   
>       conn->nonblocking = arg;
>   
>       return (0);
>   }
> 
> --
> -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
> 


--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: pg_dump possible fix, need testers. (was: Re: [HACKERS] pg_dump disaster)

From
Alfred Perlstein
Date:
* Bruce Momjian <pgman@candle.pha.pa.us> [000122 21:50] wrote:
>
> >
> > These patches revert the default setting of the non-block flag back
> > to the old way connections were done.  Since i'm unable to reproduce
> > this bug I'm hoping people can _please_ give me feedback.
>  
> Patch appled.  And thanks very much for the patch.  I am sorry you were
> given a bad time about your previous patch.  We will try to not let it
> happen again.

I really hope the originator of the problem report will get back to
us to make sure it's settled.

*poking Patrick Welche*

:)

thanks,
-Alfred



Alfred Perlstein <bright@wintelcom.net> writes:
> I really hope the originator of the problem report will get back to
> us to make sure it's settled.

> *poking Patrick Welche*

Um, I didn't have any trouble at all reproducing Patrick's complaint.
pg_dump any moderately large table (I used tenk1 from the regress
database) and try to load the script with psql.  Kaboom.

Also, I still say that turning off nonblock mode by default is only
a band-aid: this code *will fail* whenever nonblock mode is enabled,
because it does not return enough info to the caller to allow the
caller to do the right thing.  If you haven't seen it fail, that
only proves that you haven't actually stressed it to the point of
exercising the buffer-overrun code paths.
        regards, tom lane


Re: pg_dump possible fix, need testers. (was: Re: [HACKERS] pg_dump disaster)

From
Alfred Perlstein
Date:
* Tom Lane <tgl@sss.pgh.pa.us> [000122 22:17] wrote:
> Alfred Perlstein <bright@wintelcom.net> writes:
> > I really hope the originator of the problem report will get back to
> > us to make sure it's settled.
> 
> > *poking Patrick Welche*
> 
> Um, I didn't have any trouble at all reproducing Patrick's complaint.
> pg_dump any moderately large table (I used tenk1 from the regress
> database) and try to load the script with psql.  Kaboom.

Can you try it on sources from before my initial patches were
applied, and from before the initial non-blocking connections
patches from Ewan Mellor were applied?

>From my point of view none of my code should be affecting those that
don't explicitly use PQsetnonblocking(), which is nothing besides
the application i'm developing in-house.

I'm currently investigating that possibility.

thanks,
-Alfred


>> Um, I didn't have any trouble at all reproducing Patrick's complaint.
>> pg_dump any moderately large table (I used tenk1 from the regress
>> database) and try to load the script with psql.  Kaboom.

> This is after or before my latest patch?

Before.  I haven't updated since yesterday...

> I can't seem to reproduce this problem,

Odd.  Maybe there is something different about the kernel's timing of
message sending on your platform.  I see it very easily on HPUX 10.20,
and Patrick sees it very easily on whatever he's using (netbsd I think).
You might try varying the situation a little, saypsql mydb <dumpfilepsql -f dumpfile mydbpsql mydb    \i dumpfile
and the same with -h localhost (to get a TCP/IP connection instead of
Unix domain).  At the moment (pre-patch) I see failures with the
first two of these, but not with the \i method.  -h doesn't seem to
matter for me, but it might for you.

> Telling me something is wrong without giving suggestions on how
> to fix it, nor direct pointers to where it fails doesn't help me
> one bit.  You're not offering constructive critism, you're not
> even offering valid critism, you're just waving your finger at
> "problems" that you say exist but don't pin down to anything specific.

I have been explaining it as clearly as I could.  Let's try it
one more time.

> I spent hours looking over what I did to pqFlush and pqPutnBytes
> because of what you said earlier when all the bug seems to have
> come down to is that I missed that the socket is set to non-blocking
> in all cases now.

Letting the socket mode default to blocking will hide the problems from
existing clients that don't care about non-block mode.  But people who
try to actually use the nonblock mode are going to see the same kinds of
problems that psql is exhibiting.

> The old sequence of events that happened was as follows:

>   user sends data almost filling the output buffer...
>   user sends another line of text overflowing the buffer...
>   pqFlush is invoked blocking the user until the output pipe clears...
>   and repeat.

Right.

> The nonblocking code allows sends to fail so the user can abort
> sending stuff to the backend in order to process other work:

>   user sends data almost filling the output buffer...
>   user sends another line of text that may overflow the buffer...
>   pqFlush is invoked, 
>     if the pipe can't be cleared an error is returned allowing the user to
>       retry the send later.
>     if the flush succeeds then more data is queued and success is returned

But you haven't thought through the mechanics of the "error is returned
allowing the user to retry" code path clearly enough.  Let's take
pqPutBytes for an example.  If it returns EOF, is that a hard error or
does it just mean that the application needs to wait a while?  The
application *must* distinguish these cases, or it will do the wrong
thing: for example, if it mistakes a hard error for "wait a while",
then it will wait forever without making any progress or producing
an error report.

You need to provide a different return convention that indicates
what happened, sayEOF (-1)    => hard error (same as old code)0        => OK1        => no data was queued due to risk
ofblocking
 
And you need to guarantee that the application knows what the state is
when the can't-do-it-yet return is made; note that I specified "no data
was queued" above.  If pqPutBytes might queue some of the data before
returning 1, the application is in trouble again.  While you apparently
foresaw that in recoding pqPutBytes, your code doesn't actually work.
There is the minor code bug that you fail to update "avail" after the
first pqFlush call, and the much more fundamental problem that you
cannot guarantee to have queued all or none of the data.  Think about
what happens if the passed nbytes is larger than the output buffer size.
You may pass the first pqFlush successfully, then get into the loop and
get a won't-block return from pqFlush in the loop.  What then?
You can't simply refuse to support the case nbytes > bufsize at all,
because that will cause application failures as well (too long query
sends it into an infinite loop trying to queue data, most likely).

A possible answer is to specify that a return of +N means "N bytes
remain unqueued due to risk of blocking" (after having queued as much
as you could).  This would put the onus on the caller to update his
pointers/counts properly; propagating that into all the internal uses
of pqPutBytes would be no fun.  (Of course, so far you haven't updated
*any* of the internal callers to behave reasonably in case of a
won't-block return; PQfn is just one example.)

Another possible answer is to preserve pqPutBytes' old API, "queue or
bust", by the expedient of enlarging the output buffer to hold whatever
we can't send immediately.  This is probably more attractive, even
though a long query might suck up a lot of space that won't get
reclaimed as long as the connection lives.  If you don't do this then
you are going to have to make a lot of ugly changes in the internal
callers to deal with won't-block returns.  Actually, a bulk COPY IN
would probably be the worst case --- the app could easily load data into
the buffer far faster than it could be sent.  It might be best to extend
PQputline to have a three-way return and add code there to limit the
growth of the output buffer, while allowing all internal callers to
assume that the buffer is expanded when they need it.

pqFlush has the same kind of interface design problem: the same EOF code
is returned for either a hard error or can't-flush-yet, but it would be
disastrous to treat those cases alike.  You must provide a 3-way return
code.

Furthermore, the same sort of 3-way return code convention will have to
propagate out through anything that calls pqFlush (with corresponding
documentation updates).  pqPutBytes can be made to hide a pqFlush won't-
block return by trying to enlarge the output buffer, but in most other
places you won't have a choice except to punt it back to the caller.

PQendcopy has the same interface design problem.  It used to be that
(unless you passed a null pointer) PQendcopy would *guarantee* that
the connection was no longer in COPY state on return --- by resetting
it, if necessary.  So the return code was mainly informative; the
application didn't have to do anything different if PQendcopy reported
failure.  But now, a nonblocking application does need to pay attention
to whether PQendcopy completed or not --- and you haven't provided a way
for it to tell.  If 1 is returned, the connection might still be in
COPY state, or it might not (PQendcopy might have reset it).  If the
application doesn't distinguish these cases then it will fail.

I also think that you want to take a hard look at the automatic "reset"
behavior upon COPY failure, since a PQreset call will block the
application until it finishes.  Really, what is needed to close down a
COPY safely in nonblock mode is a pair of entry points along the line of
"PQendcopyStart" and "PQendcopyPoll", with API conventions similar to
PQresetStart/PQresetPoll.  This gives you the ability to do the reset
(if one is necessary) without blocking the application.  PQendcopy
itself will only be useful to blocking applications.

> I'm sorry if they don't work for some situations other than COPY IN,
> but it's functionality that I needed and I expect to be expanded on
> by myself and others that take interest in nonblocking operation.

I don't think that the nonblock code is anywhere near production quality
at this point.  It may work for you, if you don't stress it too hard and
never have a communications failure; but I don't want to see us ship it
as part of Postgres unless these issues get addressed.
        regards, tom lane


Re: pg_dump possible fix, need testers. (was: Re: [HACKERS] pg_dump disaster)

From
Alfred Perlstein
Date:
* Tom Lane <tgl@sss.pgh.pa.us> [000123 10:19] wrote:
> >> Um, I didn't have any trouble at all reproducing Patrick's complaint.
> >> pg_dump any moderately large table (I used tenk1 from the regress
> >> database) and try to load the script with psql.  Kaboom.
> 
> > This is after or before my latest patch?
> 
> Before.  I haven't updated since yesterday...
> 
> > I can't seem to reproduce this problem,
> 
> Odd.  Maybe there is something different about the kernel's timing of
> message sending on your platform.  I see it very easily on HPUX 10.20,
> and Patrick sees it very easily on whatever he's using (netbsd I think).
> You might try varying the situation a little, say
>     psql mydb <dumpfile
>     psql -f dumpfile mydb
>     psql mydb
>         \i dumpfile
> and the same with -h localhost (to get a TCP/IP connection instead of
> Unix domain).  At the moment (pre-patch) I see failures with the
> first two of these, but not with the \i method.  -h doesn't seem to
> matter for me, but it might for you.

Ok, the latest patch I posted fixes that, with and without the -h
flag, at least for me.

> > Telling me something is wrong without giving suggestions on how
> > to fix it, nor direct pointers to where it fails doesn't help me
> > one bit.  You're not offering constructive critism, you're not
> > even offering valid critism, you're just waving your finger at
> > "problems" that you say exist but don't pin down to anything specific.
> 
> I have been explaining it as clearly as I could.  Let's try it
> one more time.

What I needed was the above steps to validate that the problem is fixed.

> > I spent hours looking over what I did to pqFlush and pqPutnBytes
> > because of what you said earlier when all the bug seems to have
> > come down to is that I missed that the socket is set to non-blocking
> > in all cases now.
> 
> Letting the socket mode default to blocking will hide the problems from
> existing clients that don't care about non-block mode.  But people who
> try to actually use the nonblock mode are going to see the same kinds of
> problems that psql is exhibiting.

There is no non-block mode, there's the old mode, and the _real_
non-blocking mode that I'm trying to get working.

> > The old sequence of events that happened was as follows:
> 
> >   user sends data almost filling the output buffer...
> >   user sends another line of text overflowing the buffer...
> >   pqFlush is invoked blocking the user until the output pipe clears...
> >   and repeat.
> 
> Right.

You agree that it's somewhat broken to do that right?

> 
> > The nonblocking code allows sends to fail so the user can abort
> > sending stuff to the backend in order to process other work:
> 
> >   user sends data almost filling the output buffer...
> >   user sends another line of text that may overflow the buffer...
> >   pqFlush is invoked, 
> >     if the pipe can't be cleared an error is returned allowing the user to
> >       retry the send later.
> >     if the flush succeeds then more data is queued and success is returned
> 
> But you haven't thought through the mechanics of the "error is returned
> allowing the user to retry" code path clearly enough.  Let's take
> pqPutBytes for an example.  If it returns EOF, is that a hard error or
> does it just mean that the application needs to wait a while?  The
> application *must* distinguish these cases, or it will do the wrong
> thing: for example, if it mistakes a hard error for "wait a while",
> then it will wait forever without making any progress or producing
> an error report.
> 
> You need to provide a different return convention that indicates
> what happened, say
>     EOF (-1)    => hard error (same as old code)
>     0        => OK
>     1        => no data was queued due to risk of blocking
> And you need to guarantee that the application knows what the state is
> when the can't-do-it-yet return is made; note that I specified "no data
> was queued" above.  If pqPutBytes might queue some of the data before
> returning 1, the application is in trouble again.  While you apparently
> foresaw that in recoding pqPutBytes, your code doesn't actually work.
> There is the minor code bug that you fail to update "avail" after the
> first pqFlush call, and the much more fundamental problem that you
> cannot guarantee to have queued all or none of the data.  Think about
> what happens if the passed nbytes is larger than the output buffer size.
> You may pass the first pqFlush successfully, then get into the loop and
> get a won't-block return from pqFlush in the loop.  What then?
> You can't simply refuse to support the case nbytes > bufsize at all,
> because that will cause application failures as well (too long query
> sends it into an infinite loop trying to queue data, most likely).

I don't have to think about this too much (nbytes > conn->outBufSize),
see:  http://www.postgresql.org/docs/programmer/libpq-chapter4142.htm
the Caveats section for libpq:
 Caveats
 The query buffer is 8192 bytes long, and queries over that length will be rejected.

If I can enforce this limit then i'm fine, also there's nothing
stopping me from temporarily realloc()'ing the send buffer, or
chaining another sendbuffer if the first fills mid query, it would
be easy to restrict the application to only a single overcommit of
the send buffer, or a single circular buffer to avoid memory
copies.

> A possible answer is to specify that a return of +N means "N bytes
> remain unqueued due to risk of blocking" (after having queued as much
> as you could).  This would put the onus on the caller to update his
> pointers/counts properly; propagating that into all the internal uses
> of pqPutBytes would be no fun.  (Of course, so far you haven't updated
> *any* of the internal callers to behave reasonably in case of a
> won't-block return; PQfn is just one example.)

No way dude. :)  I would like to get started on a PQfnstart()/PQfnpoll
interface soon.

> Another possible answer is to preserve pqPutBytes' old API, "queue or
> bust", by the expedient of enlarging the output buffer to hold whatever
> we can't send immediately.  This is probably more attractive, even
> though a long query might suck up a lot of space that won't get
> reclaimed as long as the connection lives.  If you don't do this then
> you are going to have to make a lot of ugly changes in the internal
> callers to deal with won't-block returns.  Actually, a bulk COPY IN
> would probably be the worst case --- the app could easily load data into
> the buffer far faster than it could be sent.  It might be best to extend
> PQputline to have a three-way return and add code there to limit the
> growth of the output buffer, while allowing all internal callers to
> assume that the buffer is expanded when they need it.

It's not too difficult to only allow one over-commit into the buffer,
and enforcing the 8k limit, what do you think about that?

> pqFlush has the same kind of interface design problem: the same EOF code
> is returned for either a hard error or can't-flush-yet, but it would be
> disastrous to treat those cases alike.  You must provide a 3-way return
> code.
> 
> Furthermore, the same sort of 3-way return code convention will have to
> propagate out through anything that calls pqFlush (with corresponding
> documentation updates).  pqPutBytes can be made to hide a pqFlush won't-
> block return by trying to enlarge the output buffer, but in most other
> places you won't have a choice except to punt it back to the caller.

I'm not sure about this, the old pqFlush would reset the connection
if it went bad, (I was suprised to see that it didn't) it doesn't do
this so it can read the dying words from the backend,  imo all that's
needed is a:conn->status = CONNECTION_BAD;

before returning EOF in pqFlush()

pqReadData will happily read from pgconn marked 'CONNECTION_BAD' and
user applications can then QPstatus() and reset the connection.

> PQendcopy has the same interface design problem.  It used to be that
> (unless you passed a null pointer) PQendcopy would *guarantee* that
> the connection was no longer in COPY state on return --- by resetting
> it, if necessary.  So the return code was mainly informative; the
> application didn't have to do anything different if PQendcopy reported
> failure.  But now, a nonblocking application does need to pay attention
> to whether PQendcopy completed or not --- and you haven't provided a way
> for it to tell.  If 1 is returned, the connection might still be in
> COPY state, or it might not (PQendcopy might have reset it).  If the
> application doesn't distinguish these cases then it will fail.

PQstatus will allow you to determine if the connection has gone to
CONNECTION_BAD.

> I also think that you want to take a hard look at the automatic "reset"
> behavior upon COPY failure, since a PQreset call will block the
> application until it finishes.  Really, what is needed to close down a
> COPY safely in nonblock mode is a pair of entry points along the line of
> "PQendcopyStart" and "PQendcopyPoll", with API conventions similar to
> PQresetStart/PQresetPoll.  This gives you the ability to do the reset
> (if one is necessary) without blocking the application.  PQendcopy
> itself will only be useful to blocking applications.

I'd be willing to work on fixing this up, but currently other issues
you've mentioned seem higher on the priority list.

> > I'm sorry if they don't work for some situations other than COPY IN,
> > but it's functionality that I needed and I expect to be expanded on
> > by myself and others that take interest in nonblocking operation.
> 
> I don't think that the nonblock code is anywhere near production quality
> at this point.  It may work for you, if you don't stress it too hard and
> never have a communications failure; but I don't want to see us ship it
> as part of Postgres unless these issues get addressed.

I'd really appreciate if it was instead left as undocumented until we
have it completed.

Doing that allows people like myself to see that work is in progress
to provide this functionality and possibly contribute to polishing
it up and expanding its usefullness instead of giving up entirely
or starting from scratch.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]


Alfred Perlstein <bright@wintelcom.net> writes:
> There is no non-block mode, there's the old mode, and the _real_
> non-blocking mode that I'm trying to get working.

Ah --- well, I guess we can agree that it doesn't work yet ;-)

>>>> user sends data almost filling the output buffer...
>>>> user sends another line of text overflowing the buffer...
>>>> pqFlush is invoked blocking the user until the output pipe clears...
>>>> and repeat.
>> 
>> Right.

> You agree that it's somewhat broken to do that right?

It's only broken for a client that doesn't want to block --- but,
clearly, for such a client we can't do it that way.  I like the way
that fe-connect.c has been rewritten over the past couple of months:
provide a "start" and a "poll" function for each operation, and then
implement the old blocking-style function as "start" followed by a
loop around the "poll" function.  Note (just to beat the drum one
more time) that neither the start nor the poll function has the
same API as the old-style function.

>> You can't simply refuse to support the case nbytes > bufsize at all,
>> because that will cause application failures as well (too long query
>> sends it into an infinite loop trying to queue data, most likely).

> I don't have to think about this too much (nbytes > conn->outBufSize),
> see:  http://www.postgresql.org/docs/programmer/libpq-chapter4142.htm
> the Caveats section for libpq:

>   The query buffer is 8192 bytes long, and queries over that length
>   will be rejected.

Is that still there?  Well, I'm sorry if you relied on that statement,
but it's obsolete.  Postgres no longer has any fixed limit on the
length of queries, and it won't do for libpq to re-introduce one.

(Quick grep: yipes, that statement is indeed still in the docs, in
more than one place even.  Thanks for pointing this out.)

> If I can enforce this limit then i'm fine, also there's nothing
> stopping me from temporarily realloc()'ing the send buffer, or
> chaining another sendbuffer if the first fills mid query, it would
> be easy to restrict the application to only a single overcommit of
> the send buffer, or a single circular buffer to avoid memory
> copies.

I think reallocing the send buffer is perfectly acceptable.  I'm not
quite following what you mean by "only one overcommit", though.

>> PQendcopy has the same interface design problem.  It used to be that
>> (unless you passed a null pointer) PQendcopy would *guarantee* that
>> the connection was no longer in COPY state on return --- by resetting
>> it, if necessary.  So the return code was mainly informative; the
>> application didn't have to do anything different if PQendcopy reported
>> failure.  But now, a nonblocking application does need to pay attention
>> to whether PQendcopy completed or not --- and you haven't provided a way
>> for it to tell.  If 1 is returned, the connection might still be in
>> COPY state, or it might not (PQendcopy might have reset it).  If the
>> application doesn't distinguish these cases then it will fail.

> PQstatus will allow you to determine if the connection has gone to
> CONNECTION_BAD.

Neither of the states that we need to distinguish will be CONNECTION_BAD.

I suppose the application could look at the asyncStatus to see if it is
COPY_IN/COPY_OUT, but I think that would be very poor software design:
asyncStatus is intended to be private to libpq and readily redefinable.
(It's not presently visible in libpq-fe.h at all.)  If we start
exporting it as public data we will regret it, IMHO.  Much better to
provide a new exported function whose API is appropriate for this purpose.

>> I don't think that the nonblock code is anywhere near production quality
>> at this point.  It may work for you, if you don't stress it too hard and
>> never have a communications failure; but I don't want to see us ship it
>> as part of Postgres unless these issues get addressed.

> I'd really appreciate if it was instead left as undocumented until we
> have it completed.

I'd be willing to consider that if I were sure that it couldn't break
any ordinary blocking-mode cases.  Right now I don't have much
confidence about that...
        regards, tom lane


Re: pg_dump possible fix, need testers.

From
Patrick Welche
Date:
On Sat, Jan 22, 2000 at 10:02:56PM -0800, Alfred Perlstein wrote:
> 
> I really hope the originator of the problem report will get back to
> us to make sure it's settled.
> 
> *poking Patrick Welche*
> 
> :)

Morning all!

Things are still not so good for me. The pg_dumpall > file, psql < file did
work, but later:

newnham=> select * from crsids,"tblPerson" where
newnham-> crsids.crsid != "tblPerson"."CRSID";
Backend sent B message without prior T
D21Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 

which smells like a similar problem. (Note that this is a join. Straight
selects didn't cause problems)

While running that query, I ran the regression tests, so the ERRORs in the
log are OK, but just in case, here are the last two lines before the above
message:

postmaster: dumpstatus:       sock 5

After typing \. at the prompt

Unknown protocol character 'M' read from backend.  (The protocol character
is the first character the backend sends in response to a query it
receives).
PQendcopy: resetting connection
Asynchronous NOTIFY 'ndropoulou' from backend with pid '1818589281'
received.
Asynchronous NOTIFY 'ndropoulou' from backend with pid '1818589281'
received.

and in the log:

pq_flush: send() failed: Broken pipe
FATAL: pq_endmessage failed: errno=32
pq_flush: send() failed: Broken pipe
...

"ndropoulou" is an incomplete piece of select * data.

(New also, though probably unrelated: the sanity check fails with number of
index tuples exactly half number in heap - not equal)

For any more info, just ask!

Cheers,

Patrick


Re: [HACKERS] Re: pg_dump possible fix, need testers.

From
Tom Lane
Date:
Patrick Welche <prlw1@newn.cam.ac.uk> writes:
> Things are still not so good for me. The pg_dumpall > file, psql < file did
> work, but later:

> newnham=> select * from crsids,"tblPerson" where
newnham-> crsids.crsid != "tblPerson"."CRSID";
> Backend sent B message without prior T
> D21Enter data to be copied followed by a newline.
> End with a backslash and a period on a line by itself.
>>> 

> which smells like a similar problem. (Note that this is a join. Straight
> selects didn't cause problems)

Bizarre.  Obviously, the frontend and backend have gotten out of sync,
but it's not too clear who's to blame.  Since you also mention

> (New also, though probably unrelated: the sanity check fails with number of
> index tuples exactly half number in heap - not equal)

I think that you may have some subtle platform-specific problems in the
backend.

What exactly is your platform/compiler/configuration, anyway?
        regards, tom lane


Re: [HACKERS] Re: pg_dump possible fix, need testers.

From
Patrick Welche
Date:
On Mon, Jan 24, 2000 at 12:29:43PM -0500, Tom Lane wrote:
> Patrick Welche <prlw1@newn.cam.ac.uk> writes:
> > Things are still not so good for me. The pg_dumpall > file, psql < file did
> > work, but later:
> 
> > newnham=> select * from crsids,"tblPerson" where
> newnham-> crsids.crsid != "tblPerson"."CRSID";
> > Backend sent B message without prior T
> > D21Enter data to be copied followed by a newline.
> > End with a backslash and a period on a line by itself.
> >>> 
> 
> > which smells like a similar problem. (Note that this is a join. Straight
> > selects didn't cause problems)
> 
> Bizarre.  Obviously, the frontend and backend have gotten out of sync,
> but it's not too clear who's to blame.  Since you also mention
> 
> > (New also, though probably unrelated: the sanity check fails with number of
> > index tuples exactly half number in heap - not equal)
> 
> I think that you may have some subtle platform-specific problems in the
> backend.
> 
> What exactly is your platform/compiler/configuration, anyway?

NetBSD-1.4P/i386 / egcs-2.91.66 19990314 (egcs-1.1.2 release)
configure --enable-debug

This morning I cvs'd made, installed, moved data->data2, initdb, started
up postmaster, reloaded data (this worked!), then tried the join. It's a
big one, so I thought I might as well stress it at the same time, and did
a regression test.

Anything I could try to narrow the problem down?

Cheers,

Patrick


Re: [HACKERS] Re: pg_dump possible fix, need testers.

From
Tom Lane
Date:
Patrick Welche <prlw1@newn.cam.ac.uk> writes:
> This morning I cvs'd made, installed, moved data->data2, initdb, started
> up postmaster, reloaded data (this worked!), then tried the join. It's a
> big one, so I thought I might as well stress it at the same time, and did
> a regression test.

> Anything I could try to narrow the problem down?

Hmm.  Why don't you try running the parallel regression test, to see
if that blows up too?
        regards, tom lane


Re: [HACKERS] Re: pg_dump possible fix, need testers.

From
Alfred Perlstein
Date:
* Tom Lane <tgl@sss.pgh.pa.us> [000124 10:54] wrote:
> Patrick Welche <prlw1@newn.cam.ac.uk> writes:
> > This morning I cvs'd made, installed, moved data->data2, initdb, started
> > up postmaster, reloaded data (this worked!), then tried the join. It's a
> > big one, so I thought I might as well stress it at the same time, and did
> > a regression test.
> 
> > Anything I could try to narrow the problem down?
> 
> Hmm.  Why don't you try running the parallel regression test, to see
> if that blows up too?

I just ran the regression tests as best as I know how:

~/pgcvs/pgsql/src/test/regress % gmake runcheck
~/pgcvs/pgsql/src/test/regress % grep FAIL run_check.out            test int2                 ... FAILED          test
int4                ... FAILED          test float8               ... FAILED
 
sequential test geometry             ... FAILED
~/pgcvs/pgsql/src/test/regress %

no int2/int4? yipes!

I ran it 10 more times and one time I got:          test constraints          ... FAILED

but i got no weird parse errors or anything from the backend.

Have you been able to find any weirdness with the fix I posted,
or is this more likely an issue with Patrick Welche's setup?

-Alfred


Re: [HACKERS] Re: pg_dump possible fix, need testers.

From
Patrick Welche
Date:
Earlier I wrote:
> (New also, though probably unrelated: the sanity check fails with number of
> index tuples exactly half number in heap - not equal) 
TL> I think that you may have some subtle platform-specific problems in the
TL> backend.

On Mon, Jan 24, 2000, Tom Lane wrote:
TL> Hmm.  Why don't you try running the parallel regression test, to see
TL> if that blows up too?

Rerunning the ordinary regression "runtest", the sanity_check passes. The
difference being that this time I wasn't running a select at the same time.
The parallel test "runcheck" fails on different parts at different times eg:
          test select_into          ... FAILED
because
! psql: connection to database "regression" failed - Backend startup failed

(BTW in resultmap, I need the .*-.*-netbsd rather than just netbsd, I think
it's because config.guess returns i386-unknown-netbsd1.4P, and just netbsd
would imply the string starts with netbsd)

3 times in a row now, gmake runtest on its own is fine, gmake runtest with a
concurrent join select makes sanity_check fail with

+ NOTICE:  RegisterSharedInvalid: SI buffer overflow
+ NOTICE:  InvalidateSharedInvalid: cache state reset
+ NOTICE:  Index onek_stringu1: NUMBER OF INDEX' TUPLES (1000) IS NOT THE SAME AS HEAP' (2000).
+       Recreate the index.
+ NOTICE:  Index onek_hundred: NUMBER OF INDEX' TUPLES (1000) IS NOT THE SAME ASHEAP' (2000).
+       Recreate the index.

and the join will still get itself confused

select * from crsids,"tblPerson" where
newnham-> crsids.crsid != "tblPerson"."CRSID";
Backend sent B message without prior T
D21Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> \.
Unknown protocol character 'M' read from backend.  (The protocol character
is the first character the backend sends in response to a query it
receives).
PQendcopy: resetting connection
Asynchronous NOTIFY 'ndropoulou' from backend with pid '1818589281'
received.
Asynchronous NOTIFY 'ndropoulou' from backend with pid '1818589281'
received.

Then plenty of

pq_flush: send() failed: Broken pipe
FATAL: pq_endmessage failed: errno=32

keep on happening even though the connection is apparently dropped and psql
exited. Running a regression test during this fails sanity_check. Restart
postmaster, and the sanity_check passes. Run the join, and all breaks.

Ah - some joins work. The above join works if I replace * by "Surname". It
should return 750440 rows. It seems to just be a matter of quantity of data
going down the connection. A * row contains 428 varchars worth and about 10
numbers. Just in case it's just me, I'll build a new kernel (when in
kern_proc doubt..)

Cheers,

Patrick


Re: [HACKERS] Re: pg_dump possible fix, need testers.

From
Patrick Welche
Date:
On Mon, Jan 24, 2000 at 03:49:26PM -0800, Alfred Perlstein wrote:
> I just ran the regression tests as best as I know how:
> 
> ~/pgcvs/pgsql/src/test/regress % gmake runcheck
> ~/pgcvs/pgsql/src/test/regress % grep FAIL run_check.out  
>            test int2                 ... FAILED
>            test int4                 ... FAILED
>            test float8               ... FAILED
> sequential test geometry             ... FAILED
> ~/pgcvs/pgsql/src/test/regress %
> 
> no int2/int4? yipes!

Not to worry, those will be differences in error message wording, but

> I ran it 10 more times and one time I got:
>            test constraints          ... FAILED

What did this error come from? (cf regression.diffs)

> but i got no weird parse errors or anything from the backend.
> 
> Have you been able to find any weirdness with the fix I posted,
> or is this more likely an issue with Patrick Welche's setup?

I'm not sure: on the one hand, that evil join of mine returns the entire
contents of a table, and the connection gets confused. Smaller joins work.
Maybe it doesn't happen to you because you don't put in such a useless
select (What do you want 750440 rows for?) On the other hand vacuum analyze
table_name doesn't work for me but obviously does for everyone else, so at
least something is wrong with my setup.

Cheers,

Patrick


Re: [HACKERS] Re: pg_dump possible fix, need testers.

From
Alfred Perlstein
Date:
* Patrick Welche <prlw1@newn.cam.ac.uk> [000124 16:02] wrote:
> On Mon, Jan 24, 2000 at 03:49:26PM -0800, Alfred Perlstein wrote:
> > I just ran the regression tests as best as I know how:
> > 
> > ~/pgcvs/pgsql/src/test/regress % gmake runcheck
> > ~/pgcvs/pgsql/src/test/regress % grep FAIL run_check.out  
> >            test int2                 ... FAILED
> >            test int4                 ... FAILED
> >            test float8               ... FAILED
> > sequential test geometry             ... FAILED
> > ~/pgcvs/pgsql/src/test/regress %
> > 
> > no int2/int4? yipes!
> 
> Not to worry, those will be differences in error message wording, but
> 
> > I ran it 10 more times and one time I got:
> >            test constraints          ... FAILED
> 
> What did this error come from? (cf regression.diffs)
> 
> > but i got no weird parse errors or anything from the backend.
> > 
> > Have you been able to find any weirdness with the fix I posted,
> > or is this more likely an issue with Patrick Welche's setup?
> 
> I'm not sure: on the one hand, that evil join of mine returns the entire
> contents of a table, and the connection gets confused. Smaller joins work.
> Maybe it doesn't happen to you because you don't put in such a useless
> select (What do you want 750440 rows for?) On the other hand vacuum analyze
> table_name doesn't work for me but obviously does for everyone else, so at
> least something is wrong with my setup.

whoa whoa whoa... I just updated my snapshot to today's code and lot
more seems broken:

10 runs of the regression test:
          test aggregates           ... FAILED          test alter_table          ... FAILED          test btree_index
       ... FAILED          test create_misc          ... FAILED          test create_operator      ... FAILED
testfloat8               ... FAILED          test hash_index           ... FAILED          test int2
...FAILED          test int4                 ... FAILED          test limit                ... FAILED          test
portals             ... FAILED          test portals_p2           ... FAILED          test random               ...
FAILED         test rules                ... FAILED          test select_distinct      ... FAILED          test
select_distinct_on  ... FAILED          test select_views         ... FAILED          test transactions         ...
FAILED         test triggers             ... FAILED
 
sequential test create_type          ... FAILED
sequential test create_view          ... FAILED
sequential test geometry             ... FAILED
sequential test select               ... FAILED

I'm going to see what happens if i revert my changes 
to libpq completely, but before updating from and old
repo of 2-3 daysall i get are the int/float single constraints
error.

runtests and regression diffs are at:
http://www.freebsd.org/~alfred/postgresql/

-Alfred


Re: [HACKERS] Re: pg_dump possible fix, need testers.

From
Alfred Perlstein
Date:
* Alfred Perlstein <bright@wintelcom.net> [000124 17:40] wrote:
> * Patrick Welche <prlw1@newn.cam.ac.uk> [000124 16:02] wrote:
> > On Mon, Jan 24, 2000 at 03:49:26PM -0800, Alfred Perlstein wrote:
> > > I just ran the regression tests as best as I know how:
> > > 
> > > ~/pgcvs/pgsql/src/test/regress % gmake runcheck
> > > ~/pgcvs/pgsql/src/test/regress % grep FAIL run_check.out  
> > >            test int2                 ... FAILED
> > >            test int4                 ... FAILED
> > >            test float8               ... FAILED
> > > sequential test geometry             ... FAILED
> > > ~/pgcvs/pgsql/src/test/regress %
> > > 
> > > no int2/int4? yipes!
> > 
> > Not to worry, those will be differences in error message wording, but
> > 
> > > I ran it 10 more times and one time I got:
> > >            test constraints          ... FAILED
> > 
> > What did this error come from? (cf regression.diffs)
> > 
> > > but i got no weird parse errors or anything from the backend.
> > > 
> > > Have you been able to find any weirdness with the fix I posted,
> > > or is this more likely an issue with Patrick Welche's setup?
> > 
> > I'm not sure: on the one hand, that evil join of mine returns the entire
> > contents of a table, and the connection gets confused. Smaller joins work.
> > Maybe it doesn't happen to you because you don't put in such a useless
> > select (What do you want 750440 rows for?) On the other hand vacuum analyze
> > table_name doesn't work for me but obviously does for everyone else, so at
> > least something is wrong with my setup.
> 
> whoa whoa whoa... I just updated my snapshot to today's code and lot
> more seems broken:

because I forgot to clean & rebuild in the regression dir... oops.

Patrick, I'll have a delta shortly that totally backs out my non-blocking
changes for you to test.

-Alfred


Re: [HACKERS] Re: pg_dump possible fix, need testers.

From
Tom Lane
Date:
Patrick Welche <prlw1@newn.cam.ac.uk> writes:
> Rerunning the ordinary regression "runtest", the sanity_check passes. The
> difference being that this time I wasn't running a select at the same time.
> The parallel test "runcheck" fails on different parts at different times eg:

>            test select_into          ... FAILED
> because
> ! psql: connection to database "regression" failed - Backend startup failed

Do you see these failures just from running the parallel regress tests,
without anything else going on?  (Theoretically, since the parallel test
script is running its own private postmaster, whatever you might be
doing under other postmasters shouldn't affect it.  But you know the
difference between theory and practice...)

> (BTW in resultmap, I need the .*-.*-netbsd rather than just netbsd, I think
> it's because config.guess returns i386-unknown-netbsd1.4P, and just netbsd
> would imply the string starts with netbsd)

Right.  My oversight --- fix committed.

> 3 times in a row now, gmake runtest on its own is fine, gmake runtest with a
> concurrent join select makes sanity_check fail with

> + NOTICE:  RegisterSharedInvalid: SI buffer overflow
> + NOTICE:  InvalidateSharedInvalid: cache state reset
> + NOTICE:  Index onek_stringu1: NUMBER OF INDEX' TUPLES (1000) IS NOT THE SAME AS HEAP' (2000).
> +       Recreate the index.
> + NOTICE:  Index onek_hundred: NUMBER OF INDEX' TUPLES (1000) IS NOT THE SAME AS
>  HEAP' (2000).
> +       Recreate the index.

> Ah - some joins work. ... It seems to just be a matter of quantity of data
> going down the connection.

Hmm.  I betcha that the critical factor here is how much *time* the
outside join takes, not exactly how much data it emits.

If that backend is tied up for long enough then it will cause the SI
buffer to overflow, just as you show above.  In theory, that shouldn't
cause any problems beyond the appearance of the overflow/cache reset
NOTICEs (and in fact it did not the last time I tried running things
with a very small SI buffer).  It looks like we have recently broken
something in SI overrun handling.

(In other words, I don't think this has anything to do with Alfred's
changes ... he'll be glad to hear that ;-).  Hiroshi may be on the
hook though.  I'm going to rebuild with a small SI buffer and see
if it breaks.)
        regards, tom lane


Re: pg_dump possible fix, need testers. (was: Re: pg_dump disaster)

From
Bruce Momjian
Date:
Can someone remind me of where we left this?


> >> Um, I didn't have any trouble at all reproducing Patrick's complaint.
> >> pg_dump any moderately large table (I used tenk1 from the regress
> >> database) and try to load the script with psql.  Kaboom.
> 
> > This is after or before my latest patch?
> 
> Before.  I haven't updated since yesterday...
> 
> > I can't seem to reproduce this problem,
> 
> Odd.  Maybe there is something different about the kernel's timing of
> message sending on your platform.  I see it very easily on HPUX 10.20,
> and Patrick sees it very easily on whatever he's using (netbsd I think).
> You might try varying the situation a little, say
>     psql mydb <dumpfile
>     psql -f dumpfile mydb
>     psql mydb
>         \i dumpfile
> and the same with -h localhost (to get a TCP/IP connection instead of
> Unix domain).  At the moment (pre-patch) I see failures with the
> first two of these, but not with the \i method.  -h doesn't seem to
> matter for me, but it might for you.
> 
> > Telling me something is wrong without giving suggestions on how
> > to fix it, nor direct pointers to where it fails doesn't help me
> > one bit.  You're not offering constructive critism, you're not
> > even offering valid critism, you're just waving your finger at
> > "problems" that you say exist but don't pin down to anything specific.
> 
> I have been explaining it as clearly as I could.  Let's try it
> one more time.
> 
> > I spent hours looking over what I did to pqFlush and pqPutnBytes
> > because of what you said earlier when all the bug seems to have
> > come down to is that I missed that the socket is set to non-blocking
> > in all cases now.
> 
> Letting the socket mode default to blocking will hide the problems from
> existing clients that don't care about non-block mode.  But people who
> try to actually use the nonblock mode are going to see the same kinds of
> problems that psql is exhibiting.
> 
> > The old sequence of events that happened was as follows:
> 
> >   user sends data almost filling the output buffer...
> >   user sends another line of text overflowing the buffer...
> >   pqFlush is invoked blocking the user until the output pipe clears...
> >   and repeat.
> 
> Right.
> 
> > The nonblocking code allows sends to fail so the user can abort
> > sending stuff to the backend in order to process other work:
> 
> >   user sends data almost filling the output buffer...
> >   user sends another line of text that may overflow the buffer...
> >   pqFlush is invoked, 
> >     if the pipe can't be cleared an error is returned allowing the user to
> >       retry the send later.
> >     if the flush succeeds then more data is queued and success is returned
> 
> But you haven't thought through the mechanics of the "error is returned
> allowing the user to retry" code path clearly enough.  Let's take
> pqPutBytes for an example.  If it returns EOF, is that a hard error or
> does it just mean that the application needs to wait a while?  The
> application *must* distinguish these cases, or it will do the wrong
> thing: for example, if it mistakes a hard error for "wait a while",
> then it will wait forever without making any progress or producing
> an error report.
> 
> You need to provide a different return convention that indicates
> what happened, say
>     EOF (-1)    => hard error (same as old code)
>     0        => OK
>     1        => no data was queued due to risk of blocking
> And you need to guarantee that the application knows what the state is
> when the can't-do-it-yet return is made; note that I specified "no data
> was queued" above.  If pqPutBytes might queue some of the data before
> returning 1, the application is in trouble again.  While you apparently
> foresaw that in recoding pqPutBytes, your code doesn't actually work.
> There is the minor code bug that you fail to update "avail" after the
> first pqFlush call, and the much more fundamental problem that you
> cannot guarantee to have queued all or none of the data.  Think about
> what happens if the passed nbytes is larger than the output buffer size.
> You may pass the first pqFlush successfully, then get into the loop and
> get a won't-block return from pqFlush in the loop.  What then?
> You can't simply refuse to support the case nbytes > bufsize at all,
> because that will cause application failures as well (too long query
> sends it into an infinite loop trying to queue data, most likely).
> 
> A possible answer is to specify that a return of +N means "N bytes
> remain unqueued due to risk of blocking" (after having queued as much
> as you could).  This would put the onus on the caller to update his
> pointers/counts properly; propagating that into all the internal uses
> of pqPutBytes would be no fun.  (Of course, so far you haven't updated
> *any* of the internal callers to behave reasonably in case of a
> won't-block return; PQfn is just one example.)
> 
> Another possible answer is to preserve pqPutBytes' old API, "queue or
> bust", by the expedient of enlarging the output buffer to hold whatever
> we can't send immediately.  This is probably more attractive, even
> though a long query might suck up a lot of space that won't get
> reclaimed as long as the connection lives.  If you don't do this then
> you are going to have to make a lot of ugly changes in the internal
> callers to deal with won't-block returns.  Actually, a bulk COPY IN
> would probably be the worst case --- the app could easily load data into
> the buffer far faster than it could be sent.  It might be best to extend
> PQputline to have a three-way return and add code there to limit the
> growth of the output buffer, while allowing all internal callers to
> assume that the buffer is expanded when they need it.
> 
> pqFlush has the same kind of interface design problem: the same EOF code
> is returned for either a hard error or can't-flush-yet, but it would be
> disastrous to treat those cases alike.  You must provide a 3-way return
> code.
> 
> Furthermore, the same sort of 3-way return code convention will have to
> propagate out through anything that calls pqFlush (with corresponding
> documentation updates).  pqPutBytes can be made to hide a pqFlush won't-
> block return by trying to enlarge the output buffer, but in most other
> places you won't have a choice except to punt it back to the caller.
> 
> PQendcopy has the same interface design problem.  It used to be that
> (unless you passed a null pointer) PQendcopy would *guarantee* that
> the connection was no longer in COPY state on return --- by resetting
> it, if necessary.  So the return code was mainly informative; the
> application didn't have to do anything different if PQendcopy reported
> failure.  But now, a nonblocking application does need to pay attention
> to whether PQendcopy completed or not --- and you haven't provided a way
> for it to tell.  If 1 is returned, the connection might still be in
> COPY state, or it might not (PQendcopy might have reset it).  If the
> application doesn't distinguish these cases then it will fail.
> 
> I also think that you want to take a hard look at the automatic "reset"
> behavior upon COPY failure, since a PQreset call will block the
> application until it finishes.  Really, what is needed to close down a
> COPY safely in nonblock mode is a pair of entry points along the line of
> "PQendcopyStart" and "PQendcopyPoll", with API conventions similar to
> PQresetStart/PQresetPoll.  This gives you the ability to do the reset
> (if one is necessary) without blocking the application.  PQendcopy
> itself will only be useful to blocking applications.
> 
> > I'm sorry if they don't work for some situations other than COPY IN,
> > but it's functionality that I needed and I expect to be expanded on
> > by myself and others that take interest in nonblocking operation.
> 
> I don't think that the nonblock code is anywhere near production quality
> at this point.  It may work for you, if you don't stress it too hard and
> never have a communications failure; but I don't want to see us ship it
> as part of Postgres unless these issues get addressed.
> 
>             regards, tom lane
> 
> ************
> 


--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: pg_dump possible fix, need testers. (was: Re: pg_dump disaster)

From
Alfred Perlstein
Date:
* Bruce Momjian <pgman@candle.pha.pa.us> [000929 19:30] wrote:
> Can someone remind me of where we left this?

I really haven't figured a correct way to deal with the output buffer.
I'll try to consider ways to deal with this.

-Alfred



Re: pg_dump possible fix, need testers. (was: Re: pg_dump disaster)

From
Alfred Perlstein
Date:
* Bruce Momjian <pgman@candle.pha.pa.us> [000929 19:32] wrote:
> Can someone remind me of where we left this?

[snip]

Sorry for the really long follow-up, but this was posted a long time
ago and I wanted to refresh everyone's memory as to the problem.

The problem is that when doing a COPY from stdin using libpq and 
my non-blocking query settings there is a big problem, namely that
the size of the query is limited to the size of the output buffer.

The reason is that I couldn't figure any way to determine what the
application was doing, basically, if i'm sending a query I want to
expand the output buffer to accomidate the new query, however if
I'm doing a COPY out, I would like to be informed if the backend
is lagging behind.

The concept is that someone doing queries needs to be retrieving
results from the database backend between queries, however someone
doing a COPY doesn't until they issue the terminating \\. line.

I'm pretty sure I know what to do now, it's pretty simple actually,
I can examine the state of the connection, if it's in PGASYNC_COPY_IN
then I don't grow the buffer, I inform the application that the 
data will block, if it's no PGASYNC_COPY_IN I allow the buffer to grow
protecting the application from blocking.

Tom, I'd like to know what do you think before running off to implement
this.

thanks,
-Alfred

> > Letting the socket mode default to blocking will hide the problems from
> > existing clients that don't care about non-block mode.  But people who
> > try to actually use the nonblock mode are going to see the same kinds of
> > problems that psql is exhibiting.
> > 
> > > The old sequence of events that happened was as follows:
> > 
> > >   user sends data almost filling the output buffer...
> > >   user sends another line of text overflowing the buffer...
> > >   pqFlush is invoked blocking the user until the output pipe clears...
> > >   and repeat.
> > 
> > Right.
> > 
> > > The nonblocking code allows sends to fail so the user can abort
> > > sending stuff to the backend in order to process other work:
> > 
> > >   user sends data almost filling the output buffer...
> > >   user sends another line of text that may overflow the buffer...
> > >   pqFlush is invoked, 
> > >     if the pipe can't be cleared an error is returned allowing the user to
> > >       retry the send later.
> > >     if the flush succeeds then more data is queued and success is returned
> > 
> > But you haven't thought through the mechanics of the "error is returned
> > allowing the user to retry" code path clearly enough.  Let's take
> > pqPutBytes for an example.  If it returns EOF, is that a hard error or
> > does it just mean that the application needs to wait a while?  The
> > application *must* distinguish these cases, or it will do the wrong
> > thing: for example, if it mistakes a hard error for "wait a while",
> > then it will wait forever without making any progress or producing
> > an error report.
> > 
> > You need to provide a different return convention that indicates
> > what happened, say
> >     EOF (-1)    => hard error (same as old code)
> >     0        => OK
> >     1        => no data was queued due to risk of blocking
> > And you need to guarantee that the application knows what the state is
> > when the can't-do-it-yet return is made; note that I specified "no data
> > was queued" above.  If pqPutBytes might queue some of the data before
> > returning 1, the application is in trouble again.  While you apparently
> > foresaw that in recoding pqPutBytes, your code doesn't actually work.
> > There is the minor code bug that you fail to update "avail" after the
> > first pqFlush call, and the much more fundamental problem that you
> > cannot guarantee to have queued all or none of the data.  Think about
> > what happens if the passed nbytes is larger than the output buffer size.
> > You may pass the first pqFlush successfully, then get into the loop and
> > get a won't-block return from pqFlush in the loop.  What then?
> > You can't simply refuse to support the case nbytes > bufsize at all,
> > because that will cause application failures as well (too long query
> > sends it into an infinite loop trying to queue data, most likely).
> > 
> > A possible answer is to specify that a return of +N means "N bytes
> > remain unqueued due to risk of blocking" (after having queued as much
> > as you could).  This would put the onus on the caller to update his
> > pointers/counts properly; propagating that into all the internal uses
> > of pqPutBytes would be no fun.  (Of course, so far you haven't updated
> > *any* of the internal callers to behave reasonably in case of a
> > won't-block return; PQfn is just one example.)
> > 
> > Another possible answer is to preserve pqPutBytes' old API, "queue or
> > bust", by the expedient of enlarging the output buffer to hold whatever
> > we can't send immediately.  This is probably more attractive, even
> > though a long query might suck up a lot of space that won't get
> > reclaimed as long as the connection lives.  If you don't do this then
> > you are going to have to make a lot of ugly changes in the internal
> > callers to deal with won't-block returns.  Actually, a bulk COPY IN
> > would probably be the worst case --- the app could easily load data into
> > the buffer far faster than it could be sent.  It might be best to extend
> > PQputline to have a three-way return and add code there to limit the
> > growth of the output buffer, while allowing all internal callers to
> > assume that the buffer is expanded when they need it.
> > 
> > pqFlush has the same kind of interface design problem: the same EOF code
> > is returned for either a hard error or can't-flush-yet, but it would be
> > disastrous to treat those cases alike.  You must provide a 3-way return
> > code.
> > 
> > Furthermore, the same sort of 3-way return code convention will have to
> > propagate out through anything that calls pqFlush (with corresponding
> > documentation updates).  pqPutBytes can be made to hide a pqFlush won't-
> > block return by trying to enlarge the output buffer, but in most other
> > places you won't have a choice except to punt it back to the caller.
> > 
> > PQendcopy has the same interface design problem.  It used to be that
> > (unless you passed a null pointer) PQendcopy would *guarantee* that
> > the connection was no longer in COPY state on return --- by resetting
> > it, if necessary.  So the return code was mainly informative; the
> > application didn't have to do anything different if PQendcopy reported
> > failure.  But now, a nonblocking application does need to pay attention
> > to whether PQendcopy completed or not --- and you haven't provided a way
> > for it to tell.  If 1 is returned, the connection might still be in
> > COPY state, or it might not (PQendcopy might have reset it).  If the
> > application doesn't distinguish these cases then it will fail.
> > 
> > I also think that you want to take a hard look at the automatic "reset"
> > behavior upon COPY failure, since a PQreset call will block the
> > application until it finishes.  Really, what is needed to close down a
> > COPY safely in nonblock mode is a pair of entry points along the line of
> > "PQendcopyStart" and "PQendcopyPoll", with API conventions similar to
> > PQresetStart/PQresetPoll.  This gives you the ability to do the reset
> > (if one is necessary) without blocking the application.  PQendcopy
> > itself will only be useful to blocking applications.
> > 
> > > I'm sorry if they don't work for some situations other than COPY IN,
> > > but it's functionality that I needed and I expect to be expanded on
> > > by myself and others that take interest in nonblocking operation.
> > 
> > I don't think that the nonblock code is anywhere near production quality
> > at this point.  It may work for you, if you don't stress it too hard and
> > never have a communications failure; but I don't want to see us ship it
> > as part of Postgres unless these issues get addressed.
> > 
> >             regards, tom lane
> > 
> > ************
> > 
> 
> 
> -- 
>   Bruce Momjian                        |  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us               |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."


Re: pg_dump possible fix, need testers. (was: Re: pg_dump disaster)

From
Tom Lane
Date:
Alfred Perlstein <bright@wintelcom.net> writes:
> I'm pretty sure I know what to do now, it's pretty simple actually,
> I can examine the state of the connection, if it's in PGASYNC_COPY_IN
> then I don't grow the buffer, I inform the application that the 
> data will block, if it's no PGASYNC_COPY_IN I allow the buffer to grow
> protecting the application from blocking.

From what I recall of the prior discussion, it seemed that a state-based
approach probably isn't the way to go.  The real issue is how many
routines are you going to have to change to deal with a three-way return
convention; you want to minimize the number of places that have to cope
with that.  IIRC the idea was to let pqPutBytes grow the buffer so that
its callers didn't need to worry about a "sorry, won't block" return
condition.  If you feel that growing the buffer is inappropriate for a
specific caller, then probably the right answer is for that particular
caller to make an extra check to see if the buffer will overflow, and
refrain from calling pqPutBytes if it doesn't like what will happen.

If you make pqPutByte's behavior state-based, then callers that aren't
expecting a "won't block" return will fail (silently :-() in some states.
While you might be able to get away with that for PGASYNC_COPY_IN state
because not much of libpq is expected to be exercised in that state,
it strikes me as an awfully fragile coding convention.  I think you will
regret that choice eventually, if you make it.
        regards, tom lane


Re: pg_dump possible fix, need testers. (was: Re: pg_dump disaster)

From
Alfred Perlstein
Date:
* Tom Lane <tgl@sss.pgh.pa.us> [001012 12:14] wrote:
> Alfred Perlstein <bright@wintelcom.net> writes:
> > I'm pretty sure I know what to do now, it's pretty simple actually,
> > I can examine the state of the connection, if it's in PGASYNC_COPY_IN
> > then I don't grow the buffer, I inform the application that the 
> > data will block, if it's no PGASYNC_COPY_IN I allow the buffer to grow
> > protecting the application from blocking.
> 
> >From what I recall of the prior discussion, it seemed that a state-based
> approach probably isn't the way to go.  The real issue is how many
> routines are you going to have to change to deal with a three-way return
> convention; you want to minimize the number of places that have to cope
> with that.  IIRC the idea was to let pqPutBytes grow the buffer so that
> its callers didn't need to worry about a "sorry, won't block" return
> condition.  If you feel that growing the buffer is inappropriate for a
> specific caller, then probably the right answer is for that particular
> caller to make an extra check to see if the buffer will overflow, and
> refrain from calling pqPutBytes if it doesn't like what will happen.
> 
> If you make pqPutByte's behavior state-based, then callers that aren't
> expecting a "won't block" return will fail (silently :-() in some states.
> While you might be able to get away with that for PGASYNC_COPY_IN state
> because not much of libpq is expected to be exercised in that state,
> it strikes me as an awfully fragile coding convention.  I think you will
> regret that choice eventually, if you make it.

It is a somewhat fragile change, but much less intrusive than anything
else I could think of.  It removes the three-way return value from
the send code for everything except when in a COPY IN state where
it's the application's job to handle it.  If there would be a
blocking condition we do as the non-blocking code handles it except
instead of blocking we buffer it in it's entirety.

My main question was wondering if there's any cases where we might 
"go nuts" sending data to the backend with the exception of the 
COPY code?

-- or --

I could make a function to check the buffer space and attempt to
flush the buffer (in a non-blocking manner) to be called from
PQputline and PQputnbytes if the connection is non-blocking.

However since those are external functions my question is if you
know of any other uses for those function besideds when COPY'ing
into the backend?

Since afaik I'm the only schmoe using my non-blocking stuff,
restricting the check for bufferspace to non-blocking connections
wouldn't break any APIs from my PoV.

How should I proceed?

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."


Libpq async issues

From
Bruce Momjian
Date:
I have added this email to TODO.detail and a mention in the TODO list.

> >> Um, I didn't have any trouble at all reproducing Patrick's complaint.
> >> pg_dump any moderately large table (I used tenk1 from the regress
> >> database) and try to load the script with psql.  Kaboom.
> 
> > This is after or before my latest patch?
> 
> Before.  I haven't updated since yesterday...
> 
> > I can't seem to reproduce this problem,
> 
> Odd.  Maybe there is something different about the kernel's timing of
> message sending on your platform.  I see it very easily on HPUX 10.20,
> and Patrick sees it very easily on whatever he's using (netbsd I think).
> You might try varying the situation a little, say
>     psql mydb <dumpfile
>     psql -f dumpfile mydb
>     psql mydb
>         \i dumpfile
> and the same with -h localhost (to get a TCP/IP connection instead of
> Unix domain).  At the moment (pre-patch) I see failures with the
> first two of these, but not with the \i method.  -h doesn't seem to
> matter for me, but it might for you.
> 
> > Telling me something is wrong without giving suggestions on how
> > to fix it, nor direct pointers to where it fails doesn't help me
> > one bit.  You're not offering constructive critism, you're not
> > even offering valid critism, you're just waving your finger at
> > "problems" that you say exist but don't pin down to anything specific.
> 
> I have been explaining it as clearly as I could.  Let's try it
> one more time.
> 
> > I spent hours looking over what I did to pqFlush and pqPutnBytes
> > because of what you said earlier when all the bug seems to have
> > come down to is that I missed that the socket is set to non-blocking
> > in all cases now.
> 
> Letting the socket mode default to blocking will hide the problems from
> existing clients that don't care about non-block mode.  But people who
> try to actually use the nonblock mode are going to see the same kinds of
> problems that psql is exhibiting.
> 
> > The old sequence of events that happened was as follows:
> 
> >   user sends data almost filling the output buffer...
> >   user sends another line of text overflowing the buffer...
> >   pqFlush is invoked blocking the user until the output pipe clears...
> >   and repeat.
> 
> Right.
> 
> > The nonblocking code allows sends to fail so the user can abort
> > sending stuff to the backend in order to process other work:
> 
> >   user sends data almost filling the output buffer...
> >   user sends another line of text that may overflow the buffer...
> >   pqFlush is invoked, 
> >     if the pipe can't be cleared an error is returned allowing the user to
> >       retry the send later.
> >     if the flush succeeds then more data is queued and success is returned
> 
> But you haven't thought through the mechanics of the "error is returned
> allowing the user to retry" code path clearly enough.  Let's take
> pqPutBytes for an example.  If it returns EOF, is that a hard error or
> does it just mean that the application needs to wait a while?  The
> application *must* distinguish these cases, or it will do the wrong
> thing: for example, if it mistakes a hard error for "wait a while",
> then it will wait forever without making any progress or producing
> an error report.
> 
> You need to provide a different return convention that indicates
> what happened, say
>     EOF (-1)    => hard error (same as old code)
>     0        => OK
>     1        => no data was queued due to risk of blocking
> And you need to guarantee that the application knows what the state is
> when the can't-do-it-yet return is made; note that I specified "no data
> was queued" above.  If pqPutBytes might queue some of the data before
> returning 1, the application is in trouble again.  While you apparently
> foresaw that in recoding pqPutBytes, your code doesn't actually work.
> There is the minor code bug that you fail to update "avail" after the
> first pqFlush call, and the much more fundamental problem that you
> cannot guarantee to have queued all or none of the data.  Think about
> what happens if the passed nbytes is larger than the output buffer size.
> You may pass the first pqFlush successfully, then get into the loop and
> get a won't-block return from pqFlush in the loop.  What then?
> You can't simply refuse to support the case nbytes > bufsize at all,
> because that will cause application failures as well (too long query
> sends it into an infinite loop trying to queue data, most likely).
> 
> A possible answer is to specify that a return of +N means "N bytes
> remain unqueued due to risk of blocking" (after having queued as much
> as you could).  This would put the onus on the caller to update his
> pointers/counts properly; propagating that into all the internal uses
> of pqPutBytes would be no fun.  (Of course, so far you haven't updated
> *any* of the internal callers to behave reasonably in case of a
> won't-block return; PQfn is just one example.)
> 
> Another possible answer is to preserve pqPutBytes' old API, "queue or
> bust", by the expedient of enlarging the output buffer to hold whatever
> we can't send immediately.  This is probably more attractive, even
> though a long query might suck up a lot of space that won't get
> reclaimed as long as the connection lives.  If you don't do this then
> you are going to have to make a lot of ugly changes in the internal
> callers to deal with won't-block returns.  Actually, a bulk COPY IN
> would probably be the worst case --- the app could easily load data into
> the buffer far faster than it could be sent.  It might be best to extend
> PQputline to have a three-way return and add code there to limit the
> growth of the output buffer, while allowing all internal callers to
> assume that the buffer is expanded when they need it.
> 
> pqFlush has the same kind of interface design problem: the same EOF code
> is returned for either a hard error or can't-flush-yet, but it would be
> disastrous to treat those cases alike.  You must provide a 3-way return
> code.
> 
> Furthermore, the same sort of 3-way return code convention will have to
> propagate out through anything that calls pqFlush (with corresponding
> documentation updates).  pqPutBytes can be made to hide a pqFlush won't-
> block return by trying to enlarge the output buffer, but in most other
> places you won't have a choice except to punt it back to the caller.
> 
> PQendcopy has the same interface design problem.  It used to be that
> (unless you passed a null pointer) PQendcopy would *guarantee* that
> the connection was no longer in COPY state on return --- by resetting
> it, if necessary.  So the return code was mainly informative; the
> application didn't have to do anything different if PQendcopy reported
> failure.  But now, a nonblocking application does need to pay attention
> to whether PQendcopy completed or not --- and you haven't provided a way
> for it to tell.  If 1 is returned, the connection might still be in
> COPY state, or it might not (PQendcopy might have reset it).  If the
> application doesn't distinguish these cases then it will fail.
> 
> I also think that you want to take a hard look at the automatic "reset"
> behavior upon COPY failure, since a PQreset call will block the
> application until it finishes.  Really, what is needed to close down a
> COPY safely in nonblock mode is a pair of entry points along the line of
> "PQendcopyStart" and "PQendcopyPoll", with API conventions similar to
> PQresetStart/PQresetPoll.  This gives you the ability to do the reset
> (if one is necessary) without blocking the application.  PQendcopy
> itself will only be useful to blocking applications.
> 
> > I'm sorry if they don't work for some situations other than COPY IN,
> > but it's functionality that I needed and I expect to be expanded on
> > by myself and others that take interest in nonblocking operation.
> 
> I don't think that the nonblock code is anywhere near production quality
> at this point.  It may work for you, if you don't stress it too hard and
> never have a communications failure; but I don't want to see us ship it
> as part of Postgres unless these issues get addressed.
> 
>             regards, tom lane
> 
> ************
> 


--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Libpq async issues

From
Tom Lane
Date:
Alfred Perlstein <bright@wintelcom.net> writes:
> * Bruce Momjian <pgman@candle.pha.pa.us> [010124 07:58] wrote:
>> I have added this email to TODO.detail and a mention in the TODO list.

> The bug mentioned here is long gone,

Au contraire, the misdesign is still there.  The nonblock-mode code
will *never* be reliable under stress until something is done about
that, and that means fairly extensive code and API changes.
        regards, tom lane


Re: Libpq async issues

From
Alfred Perlstein
Date:
* Bruce Momjian <pgman@candle.pha.pa.us> [010124 07:58] wrote:
> 
> I have added this email to TODO.detail and a mention in the TODO list.

The bug mentioned here is long gone, however the problem with
issuing non-blocking COPY commands is still present (8k limit on
buffer size).  I hope to get to fix this sometime soon, but you
shouldn't worry about the "normal" path.

There's also a bug with PQCopyEnd(sp?) where it can still block because
it automagically calls into a routine that select()'s waiting for data.

It's on my TODO list as well, but a little behind a several thousand
line server I'm almost complete with.

-Alfred

> 
> > >> Um, I didn't have any trouble at all reproducing Patrick's complaint.
> > >> pg_dump any moderately large table (I used tenk1 from the regress
> > >> database) and try to load the script with psql.  Kaboom.
> > 
> > > This is after or before my latest patch?
> > 
> > Before.  I haven't updated since yesterday...
> > 
> > > I can't seem to reproduce this problem,
> > 
> > Odd.  Maybe there is something different about the kernel's timing of
> > message sending on your platform.  I see it very easily on HPUX 10.20,
> > and Patrick sees it very easily on whatever he's using (netbsd I think).
> > You might try varying the situation a little, say
> >     psql mydb <dumpfile
> >     psql -f dumpfile mydb
> >     psql mydb
> >         \i dumpfile
> > and the same with -h localhost (to get a TCP/IP connection instead of
> > Unix domain).  At the moment (pre-patch) I see failures with the
> > first two of these, but not with the \i method.  -h doesn't seem to
> > matter for me, but it might for you.
> > 
> > > Telling me something is wrong without giving suggestions on how
> > > to fix it, nor direct pointers to where it fails doesn't help me
> > > one bit.  You're not offering constructive critism, you're not
> > > even offering valid critism, you're just waving your finger at
> > > "problems" that you say exist but don't pin down to anything specific.
> > 
> > I have been explaining it as clearly as I could.  Let's try it
> > one more time.
> > 
> > > I spent hours looking over what I did to pqFlush and pqPutnBytes
> > > because of what you said earlier when all the bug seems to have
> > > come down to is that I missed that the socket is set to non-blocking
> > > in all cases now.
> > 
> > Letting the socket mode default to blocking will hide the problems from
> > existing clients that don't care about non-block mode.  But people who
> > try to actually use the nonblock mode are going to see the same kinds of
> > problems that psql is exhibiting.
> > 
> > > The old sequence of events that happened was as follows:
> > 
> > >   user sends data almost filling the output buffer...
> > >   user sends another line of text overflowing the buffer...
> > >   pqFlush is invoked blocking the user until the output pipe clears...
> > >   and repeat.
> > 
> > Right.
> > 
> > > The nonblocking code allows sends to fail so the user can abort
> > > sending stuff to the backend in order to process other work:
> > 
> > >   user sends data almost filling the output buffer...
> > >   user sends another line of text that may overflow the buffer...
> > >   pqFlush is invoked, 
> > >     if the pipe can't be cleared an error is returned allowing the user to
> > >       retry the send later.
> > >     if the flush succeeds then more data is queued and success is returned
> > 
> > But you haven't thought through the mechanics of the "error is returned
> > allowing the user to retry" code path clearly enough.  Let's take
> > pqPutBytes for an example.  If it returns EOF, is that a hard error or
> > does it just mean that the application needs to wait a while?  The
> > application *must* distinguish these cases, or it will do the wrong
> > thing: for example, if it mistakes a hard error for "wait a while",
> > then it will wait forever without making any progress or producing
> > an error report.
> > 
> > You need to provide a different return convention that indicates
> > what happened, say
> >     EOF (-1)    => hard error (same as old code)
> >     0        => OK
> >     1        => no data was queued due to risk of blocking
> > And you need to guarantee that the application knows what the state is
> > when the can't-do-it-yet return is made; note that I specified "no data
> > was queued" above.  If pqPutBytes might queue some of the data before
> > returning 1, the application is in trouble again.  While you apparently
> > foresaw that in recoding pqPutBytes, your code doesn't actually work.
> > There is the minor code bug that you fail to update "avail" after the
> > first pqFlush call, and the much more fundamental problem that you
> > cannot guarantee to have queued all or none of the data.  Think about
> > what happens if the passed nbytes is larger than the output buffer size.
> > You may pass the first pqFlush successfully, then get into the loop and
> > get a won't-block return from pqFlush in the loop.  What then?
> > You can't simply refuse to support the case nbytes > bufsize at all,
> > because that will cause application failures as well (too long query
> > sends it into an infinite loop trying to queue data, most likely).
> > 
> > A possible answer is to specify that a return of +N means "N bytes
> > remain unqueued due to risk of blocking" (after having queued as much
> > as you could).  This would put the onus on the caller to update his
> > pointers/counts properly; propagating that into all the internal uses
> > of pqPutBytes would be no fun.  (Of course, so far you haven't updated
> > *any* of the internal callers to behave reasonably in case of a
> > won't-block return; PQfn is just one example.)
> > 
> > Another possible answer is to preserve pqPutBytes' old API, "queue or
> > bust", by the expedient of enlarging the output buffer to hold whatever
> > we can't send immediately.  This is probably more attractive, even
> > though a long query might suck up a lot of space that won't get
> > reclaimed as long as the connection lives.  If you don't do this then
> > you are going to have to make a lot of ugly changes in the internal
> > callers to deal with won't-block returns.  Actually, a bulk COPY IN
> > would probably be the worst case --- the app could easily load data into
> > the buffer far faster than it could be sent.  It might be best to extend
> > PQputline to have a three-way return and add code there to limit the
> > growth of the output buffer, while allowing all internal callers to
> > assume that the buffer is expanded when they need it.
> > 
> > pqFlush has the same kind of interface design problem: the same EOF code
> > is returned for either a hard error or can't-flush-yet, but it would be
> > disastrous to treat those cases alike.  You must provide a 3-way return
> > code.
> > 
> > Furthermore, the same sort of 3-way return code convention will have to
> > propagate out through anything that calls pqFlush (with corresponding
> > documentation updates).  pqPutBytes can be made to hide a pqFlush won't-
> > block return by trying to enlarge the output buffer, but in most other
> > places you won't have a choice except to punt it back to the caller.
> > 
> > PQendcopy has the same interface design problem.  It used to be that
> > (unless you passed a null pointer) PQendcopy would *guarantee* that
> > the connection was no longer in COPY state on return --- by resetting
> > it, if necessary.  So the return code was mainly informative; the
> > application didn't have to do anything different if PQendcopy reported
> > failure.  But now, a nonblocking application does need to pay attention
> > to whether PQendcopy completed or not --- and you haven't provided a way
> > for it to tell.  If 1 is returned, the connection might still be in
> > COPY state, or it might not (PQendcopy might have reset it).  If the
> > application doesn't distinguish these cases then it will fail.
> > 
> > I also think that you want to take a hard look at the automatic "reset"
> > behavior upon COPY failure, since a PQreset call will block the
> > application until it finishes.  Really, what is needed to close down a
> > COPY safely in nonblock mode is a pair of entry points along the line of
> > "PQendcopyStart" and "PQendcopyPoll", with API conventions similar to
> > PQresetStart/PQresetPoll.  This gives you the ability to do the reset
> > (if one is necessary) without blocking the application.  PQendcopy
> > itself will only be useful to blocking applications.
> > 
> > > I'm sorry if they don't work for some situations other than COPY IN,
> > > but it's functionality that I needed and I expect to be expanded on
> > > by myself and others that take interest in nonblocking operation.
> > 
> > I don't think that the nonblock code is anywhere near production quality
> > at this point.  It may work for you, if you don't stress it too hard and
> > never have a communications failure; but I don't want to see us ship it
> > as part of Postgres unless these issues get addressed.
> > 
> >             regards, tom lane
> > 
> > ************
> > 
> 
> 
> -- 
>   Bruce Momjian                        |  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us               |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."


Re: Libpq async issues

From
Alfred Perlstein
Date:
* Tom Lane <tgl@sss.pgh.pa.us> [010124 10:27] wrote:
> Alfred Perlstein <bright@wintelcom.net> writes:
> > * Bruce Momjian <pgman@candle.pha.pa.us> [010124 07:58] wrote:
> >> I have added this email to TODO.detail and a mention in the TODO list.
> 
> > The bug mentioned here is long gone,
> 
> Au contraire, the misdesign is still there.  The nonblock-mode code
> will *never* be reliable under stress until something is done about
> that, and that means fairly extensive code and API changes.

The "bug" is the one mentioned in the first paragraph of the email
where I broke _blocking_ connections for a short period.

I still need to fix async connections for myself (and of course
contribute it back), but I just haven't had the time.  If anyone
else wants it fixed earlier they can wait for me to do it, do it
themself, contract me to do it or hope someone else comes along
to fix it.

I'm thinking that I'll do what you said and have seperate paths
for writing/reading to the socket and API's to do so that give
the user the option of a boundry, basically:
buffer this, but don't allow me to write until it's flushed

which would allow for larger than 8k COPY rows to go into the
backend.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."