Thread: Re: pgsql: Add WL_EXIT_ON_PM_DEATH pseudo-event.

Re: pgsql: Add WL_EXIT_ON_PM_DEATH pseudo-event.

From
Christoph Berg
Date:
Re: Thomas Munro 2018-11-23 <E1gQ6IU-0002sR-Fm@gemulon.postgresql.org>
> Add WL_EXIT_ON_PM_DEATH pseudo-event.

I think this broke something:

TRAP: FailedAssertion("!(!IsUnderPostmaster || (wakeEvents & (1 << 5)) || (wakeEvents & (1 << 4)))", File:
"/build/postgresql-12-JElZNq/postgresql-12-12~~devel~20181124.1158/build/../src/backend/storage/ipc/latch.c", Line:
389)
2018-11-24 15:20:43.193 CET [17834] LOG:  server process (PID 18425) was terminated by signal 6: Aborted

I can trigger it just by opening an ssl connection; non-ssl tcp
connections are fine.

Debian unstable/amd64.

Christoph


Re: pgsql: Add WL_EXIT_ON_PM_DEATH pseudo-event.

From
Thomas Munro
Date:
On Sun, Nov 25, 2018 at 3:38 AM Christoph Berg <myon@debian.org> wrote:
> Re: Thomas Munro 2018-11-23 <E1gQ6IU-0002sR-Fm@gemulon.postgresql.org>
> > Add WL_EXIT_ON_PM_DEATH pseudo-event.
>
> I think this broke something:
>
> TRAP: FailedAssertion("!(!IsUnderPostmaster || (wakeEvents & (1 << 5)) || (wakeEvents & (1 << 4)))", File:
> "/build/postgresql-12-JElZNq/postgresql-12-12~~devel~20181124.1158/build/../src/backend/storage/ipc/latch.c", Line:
> 389)
> 2018-11-24 15:20:43.193 CET [17834] LOG:  server process (PID 18425) was terminated by signal 6: Aborted
>
> I can trigger it just by opening an ssl connection; non-ssl tcp
> connections are fine.

Thanks.  I was initially surprised that this didn't come up in
check-world, but I see now that I need to go and add
PG_TEST_EXTRA="ssl ldap" to my testing routine (and cfbot's).
Reproduced here, and it's a case where we were not handling postmaster
death, which is exactly what this assertion was designed to find.  The
following is one way to fix the assertion failure, though I'm not sure
if it would be better to request WL_POSTMASTER_DEATH and generate a
FATAL error like secure_read() does:

--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -406,9 +406,9 @@ aloop:
                                 * StartupPacketTimeoutHandler() which directly exits.
                                 */
                                if (err == SSL_ERROR_WANT_READ)
-                                       waitfor = WL_SOCKET_READABLE;
+                                       waitfor = WL_SOCKET_READABLE | WL_EXIT_ON_PM_DEATH;
                                else
-                                       waitfor = WL_SOCKET_WRITEABLE;
+                                       waitfor = WL_SOCKET_WRITEABLE | WL_EXIT_ON_PM_DEATH;

--
Thomas Munro
http://www.enterprisedb.com


Re: pgsql: Add WL_EXIT_ON_PM_DEATH pseudo-event.

From
Thomas Munro
Date:
On Sun, Nov 25, 2018 at 12:59 PM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Sun, Nov 25, 2018 at 3:38 AM Christoph Berg <myon@debian.org> wrote:
> > TRAP: FailedAssertion("!(!IsUnderPostmaster || (wakeEvents & (1 << 5)) || (wakeEvents & (1 << 4)))", File:
> > "/build/postgresql-12-JElZNq/postgresql-12-12~~devel~20181124.1158/build/../src/backend/storage/ipc/latch.c", Line:
> > 389)
> > 2018-11-24 15:20:43.193 CET [17834] LOG:  server process (PID 18425) was terminated by signal 6: Aborted

Fix pushed.

By way of penance, I have now configured PG_TEST_EXTRA="ssl ldap
kerberos" for my build farm animals elver and eelpout.  elver should
pass at the next build, as I just tested it with --nosend, but eelpout
is so slow I'll just take my chances and see if that works.  I'll also
review the firewall config on those VMs, since apparently everyone is
too chicken to run those tests, perhaps for those sorts of reasons.
I've also set those tests up for cfbot, which would have caught this
when draft patches were posted, and also enabled -Werror on cfbot
which would have caught a GCC warning I missed because I usually
develop/test with clang.

--
Thomas Munro
http://www.enterprisedb.com


Re: pgsql: Add WL_EXIT_ON_PM_DEATH pseudo-event.

From
Tom Lane
Date:
Thomas Munro <thomas.munro@enterprisedb.com> writes:
> Fix pushed.
> By way of penance, I have now configured PG_TEST_EXTRA="ssl ldap
> kerberos" for my build farm animals elver and eelpout.  elver should
> pass at the next build, as I just tested it with --nosend, but eelpout
> is so slow I'll just take my chances and see if that works.

Nope :-(.  Looks like something about key length ... probably just
misconfiguration?

> I'll also
> review the firewall config on those VMs, since apparently everyone is
> too chicken to run those tests, perhaps for those sorts of reasons.

I think in many cases the answer is just "it's not in the default
buildfarm configuration".  I couldn't think of a strong reason not
to run the ssl check on longfin, so I've just updated that to do so.

            regards, tom lane


Re: pgsql: Add WL_EXIT_ON_PM_DEATH pseudo-event.

From
Thomas Munro
Date:
On Mon, Nov 26, 2018 at 6:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
> > Fix pushed.
> > By way of penance, I have now configured PG_TEST_EXTRA="ssl ldap
> > kerberos" for my build farm animals elver and eelpout.  elver should
> > pass at the next build, as I just tested it with --nosend, but eelpout
> > is so slow I'll just take my chances and see if that works.
>
> Nope :-(.  Looks like something about key length ... probably just
> misconfiguration?

It seems that we have keys in our tree that are unacceptable to
OpenSSL 1.1.1 as shipped in Debian buster:

2018-11-25 20:32:22.519 UTC [26882] FATAL:  could not load server
certificate file "server-cn-only.crt": ee key too small

That's what you get if you use the libssl-dev package (1.1.1a-1), but
you can still install libssl1.0-dev (which uninstalls 1.1's dev
package).  I've done that and the ssl test passes on that machine,
so fingers crossed for the next build farm run.

I see now that Michael already wrote about this recently[1], but that
thread hasn't yet reached a conclusion.

[1] https://www.postgresql.org/message-id/flat/20180917131340.GE31460%40paquier.xyz

-- 
Thomas Munro
http://www.enterprisedb.com


Re: pgsql: Add WL_EXIT_ON_PM_DEATH pseudo-event.

From
Michael Paquier
Date:
On Mon, Nov 26, 2018 at 09:53:19AM +1300, Thomas Munro wrote:
> I see now that Michael already wrote about this recently[1], but that
> thread hasn't yet reached a conclusion.
>
> [1] https://www.postgresql.org/message-id/flat/20180917131340.GE31460%40paquier.xyz

Yes, I heard nothing but crickets on this one.  So what I have been
doing is just to update my SSL configuration when running the tests.
That's annoying...  Still not impossible to solve.  If there is more
support for moving ahead with a key replacement, I could always do so.
--
Michael
