Re: pgsql: Add tests for '-f' option in dropdb utility. - Mailing list pgsql-committers

From Amit Kapila
Subject Re: pgsql: Add tests for '-f' option in dropdb utility.
Date
Msg-id CAA4eK1+GNyjaPK77y+euh5eAgM75pncG1JYZhxYZF+SgS6NpjA@mail.gmail.com
Whole thread Raw
In response to Re: pgsql: Add tests for '-f' option in dropdb utility.  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: pgsql: Add tests for '-f' option in dropdb utility.  (Amit Kapila <amit.kapila16@gmail.com>)
Re: pgsql: Add tests for '-f' option in dropdb utility.  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-committers
On Fri, Nov 29, 2019 at 7:42 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Nov 29, 2019 at 2:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
>
> > * On all the failing machines, it's very clear from the postmaster
> > log that the backend knows why it's being terminated:
> >
> > 2019-11-28 13:47:56.320 UTC [212024:4] 051_dropdb_force.pl FATAL:  terminating connection due to administrator
command
> >
>
> Yeah, I can confirm this behavior on the Windows machines.  I have
> also seen that we already expose this behavior in more than one way
> and all behave similar to this.  If I use pg_terminate_backend(<pid>)
> or pg_ctl kill TERM <pid>, the behavior is exactly the same
> (terminated backend doesn't get the message (FATAL:  terminating
> connection due to administrator command), but it is present in
> postmaster log.
>
> > So the question seems to be why libpq isn't reporting that
> > message before it detects connection-closed.
> >
> > This triggered a vague memory, and after a bit of archives-digging,
> > I found this thread from a few months back:
> >
> >
https://www.postgresql.org/message-id/flat/CA%2BhUKGJowQypXSKsjws9A%2BnEQDD0-mExHZqFXtJ09N209rCO5A%40mail.gmail.com#0629f079bc59ecdaa0d6ac9f8f2c18ac
> >
> > in which it's alleged that Windows' TCP stack is flat-out
> > broken and will drop not-yet-delivered data when the server
> > closes the connection.
> >
> > If that's true, it's pretty nasty.  Windows is about the
> > last platform where I'd want us to have behavior like this,
> > because we *will* get bug reports about it from novices.
> >
> > If there's no other workaround, I'm tempted to propose
> >
> > #ifdef WIN32
> >         pg_sleep(1 second);
> > #endif
> >
> > or something close to that, before we close the socket.
> >
>
> I can experiment with this or if something else occurred to me.
>

The above experiment suggested by you works and the test passes after
this in the local Windows setup.   One more thing, I think we don't
need to do above for graceful exits and during socket_close we have
access to exit_status_code which can be used in this context.  I have
attached the patch for this.  I think something on these lines is even
suggested by MSDN [1], but I am not sure.

I have also tried calling closesocket() explicitly in our function
socket_close which has changed the error message to "could not receive
data from server: Software caused connection abort
(0x00002745/10053)".  I am not sure if it is any better.  Another
thing I have tried to use is SO_LINGER option of socket but that
didn't work out.

> > Or we could revert the whole feature.
> >
>
> Yeah, that is also one possibility, but I think given we already have
> this behavior in existing features, it is better to either come up
> with some solution or maybe mention in docs that in such cases users
> need to check postmaster log to know the actual reason.
>
> I think we can further explore this, but for now, we might want to (a)
> revert this test, or (b) change the expected output to match.
>

I think if we decide to go with the above solution for Windows, then
we can keep the current test as it is or probably remove one test
related to <pid> test, otherwise, it is better to go with (a) for now
and once we decide any solution for this we can again commit these
tests.


[1] - https://docs.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-closesocket
See below note in the above link:
Using the closesocket or shutdown functions with SD_SEND or SD_BOTH
results in a RELEASE signal being sent out on the control channel. Due
to ATM's use of separate signal and data channels, it is possible that
a RELEASE signal could reach the remote end before the last of the
data reaches its destination, resulting in a loss of that data. One
possible solutions is programming a sufficient delay between the last
data sent and the closesocket or shutdown function calls for an ATM
socket.
Half close is not supported by ATM.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment

pgsql-committers by date:

Previous
From: Peter Eisentraut
Date:
Subject: pgsql: Make allow_system_table_mods settable at run time
Next
From: Peter Eisentraut
Date:
Subject: pgsql: Small code simplification