Re: Why is citext/regress failing on hamerkop? - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Why is citext/regress failing on hamerkop?
Date
Msg-id CA+hUKG+idKwmJY9g5n=jQJ99BdvzqACY_mbiWjp+4gsbvGk2nA@mail.gmail.com
In response to Re: Why is citext/regress failing on hamerkop?  (Alexander Lakhin <exclusion@gmail.com>)
Responses Re: Why is citext/regress failing on hamerkop?
List pgsql-hackers
On Tue, May 14, 2024 at 9:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> 14.05.2024 03:38, Thomas Munro wrote:
> > I was beginning to suspect that lingering odour myself.  I haven't
> > looked at the GSS code but I was imagining that what we have here is
> > perhaps not unsent data dropped on the floor due to linger policy
> > (unclean socket close on process exit), but rather that the server
> > didn't see the socket as ready to read because it lost track of the
> > FD_CLOSE somewhere because the client closed it gracefully, and our
> > server-side FD_CLOSE handling has always been a bit suspect.  I wonder
> > if the GSS code is somehow more prone to brokenness.  One thing we
> > learned in earlier problems was that abortive/error disconnections
> > generate FD_CLOSE repeatedly, while graceful ones give you only one.
> > In other words, if the other end politely calls closesocket(), the
> > server had better not miss the FD_CLOSE event, because it won't come
> > again.  That's what
> >
> > https://commitfest.postgresql.org/46/3523/
> >
> > is intended to fix.  Does it help here?  Unfortunately that's
> > unpleasantly complicated and unbackpatchable (keeping a side-table of
> > socket FDs and event handles, so we don't lose events between the
> > cracks).
>
> Yes, that cure helps here too. I've tested it on b282fa88d~1 (the last
> commit at which that patch set still applies).

Thanks for checking, and generally for your infinite patience with all
these horrible Windows problems.

OK, so we know what the problem is here.  Here is the simplest
solution I know of for that problem.  I have proposed this in the past
and received negative feedback because it's a really gross hack.  But
I don't personally know what else to do about the back-branches (or
even if that complex solution is the right way forward for master).
The attached kludge at least has the [de]merit of being a mirror image
of the kludge that follows it for the "opposite" event.  Does this fix
it?
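
For anyone following along, the contrast driving this bug is between POSIX-style level-triggered readiness and Windows' one-shot FD_CLOSE event for a graceful close. The sketch below is not the server code in question, just a small illustration of the POSIX side: after the peer calls close(), select() keeps reporting the socket readable on every call until the EOF is consumed, so a missed wakeup is recoverable. With WSAEventSelect(), a graceful close delivers FD_CLOSE as an event only once, which is why losing it is fatal.

```python
import select
import socket

# A connected pair stands in for the client/server sockets in the thread.
server, client = socket.socketpair()

# The client closes its end gracefully (the closesocket() case above).
client.close()

# Level-triggered readiness: select() reports the server socket readable
# on every call until the EOF is read, not just the first time.
hits = sum(
    server in select.select([server], [], [], 0)[0]
    for _ in range(3)
)

eof = server.recv(1024)  # b'' signals the graceful close
server.close()
```

Running this, hits is 3 and recv() returns b''; on Windows the analogous FD_CLOSE notification would have fired exactly once, which is the event the server must not drop.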

