Thread: [PATCH] Fix hostaddr crash during non-blocking cancellation

[PATCH] Fix hostaddr crash during non-blocking cancellation

From
Jacob Champion
Date:
Hi all,

A connection with only a hostaddr (no host) can't be cancelled via
PQcreateCancel(), because we'll crash in emitHostIdentityInfo(). The
problem is that the synthetic connhost entry we've created for
cancellation has an incorrect type field, which causes the following
code to make bad decisions if connhost[0].host is NULL:

> emitHostIdentifyInfo(...)
> {
>     ...
>     if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
>         displayed_host = conn->connhost[conn->whichhost].hostaddr;
>     else
>         displayed_host = conn->connhost[conn->whichhost].host;
> ...
>     if (conn->connhost[conn->whichhost].type != CHT_HOST_ADDRESS &&
>         host_addr[0] &&
>         strcmp(displayed_host, host_addr) != 0)  <- crashes here

I think the solution is just to copy over the correct type, as done in
the attached 0001.

Putting in a regression test requires us to once again answer the
question of "where do we test TCP-only features". I'd like to have a
PG_TEST_EXTRA entry for these, so 0002 adds a `tcp` group. That's
going to need more debate, and 002_tcp.pl is very quick-and-dirty, but
I also _really_ want to stop throwing tests away just because we don't
have a nice place to put them... [1]

So: if 0001 looks good, I propose that I backpatch it after beta1, but
hold onto 0002 until REL_18_STABLE is split off. Then we can figure
out the "test TCP" semantics for the full 19 cycle, and maybe
eventually backpatch tests once we're happy with how they work.

WDYT?

--Jacob

[1] https://postgr.es/m/flat/CAOYmi%2Bkx8eOmKj01dV4vSBeq9pvqR8dt6rGw%2BB_pBOE2_GOj%2Bg%40mail.gmail.com

Attachment
Jacob Champion <jacob.champion@enterprisedb.com> writes:
> A connection with only a hostaddr (no host) can't be cancelled via
> PQcreateCancel(), because we'll crash in emitHostIdentityInfo(). The
> problem is that the synthetic connhost entry we've created for
> cancellation has an incorrect type field, which causes the following
> code to make bad decisions if connhost[0].host is NULL:

I hadn't noticed (or maybe I forgot) this thread, so when the
same problem was reported at [1] I just went ahead and pushed the
submitted patch, which is only cosmetically different from your 0001.
Apologies for treading on your toes.

As for the question of how to test this sort of thing, I'm not
too excited about the narrow-gauge test case your 0002 proposes.
What I did for manual testing in [1] was to hack the postgres_fdw
tests to connect using hostaddr instead of the default.  I think
formalizing that sort of approach would yield much better coverage.
I don't have any specific ideas about how to do it, though.
Maybe get our tests to respond to an environment variable that
allows overriding the default choices of connection properties?

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/18974-575f02b2168b36b3%40postgresql.org