Thread: Regression in pipeline mode in libpq 14.5

From: Daniele Varrazzo

Hello,

I believe that pipeline mode was broken in libpq 14.5, likely after
the refactoring performed to solve the problem of the unexpected Close
messages sent by PQsendQuery() [1].

The psycopg 3.1 test suite hangs when running with libpq 14.5
(reported at [2]). I have written a script to reproduce the issue,
which can be executed running:

```
git clone -b fix-350 git@github.com:psycopg/psycopg.git
cd psycopg
python3 -m venv .venv
source .venv/bin/activate
pip install -e ./psycopg
PSYCOPG_IMPL=debug python ./test-350.py
```

The script prints all the libpq calls and the backend/frontend protocol trace to stderr.
You can find attached the two logs obtained running the script with
libpq 14.4 and 14.5. Differences can be seen online in [3].

The script runs, in Python:
```
    with conn.cursor() as cur:
        with conn.pipeline() as p:
            cur.execute("SELECT 1")
```

The execute() runs an implicit BEGIN, which is also executed in
pipeline mode. Exiting the pipeline() block causes a Sync. So we
expect 3 results in the pipeline (a COMMAND_OK after BEGIN, a
TUPLES_OK after SELECT, a PIPELINE_SYNC). At a glance I see the
following behaviours in 14.5, which seem to be errors:

- the result of the SELECT (TUPLES_OK) is lost.
- later, PQisBusy() returns 1, but the following epoll() call blocks
  until it times out: nothing is received from the network.
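
The second symptom (the connection reports busy, yet nothing ever arrives) can be turned into an explicit failure rather than a silent hang. The following is a minimal sketch, not code from the reproduction script: `wait_not_busy` is a hypothetical helper, and `pgconn` is assumed to be an object shaped like `psycopg.pq.PGconn`, exposing `socket`, `is_busy()` and `consume_input()` as thin wrappers over `PQsocket()`, `PQisBusy()` and `PQconsumeInput()`. It uses the standard-library `selectors` module in place of a direct epoll() call.

```python
import selectors


def wait_not_busy(pgconn, timeout=5.0):
    """Consume input until the connection stops being busy.

    Hypothetical helper: `pgconn` is assumed to look like psycopg.pq.PGconn
    (attributes `socket`, `is_busy()`, `consume_input()`).
    Raises TimeoutError if the connection is busy but the socket stays
    silent for `timeout` seconds.
    """
    with selectors.DefaultSelector() as sel:
        # Watch the connection's file descriptor for incoming data.
        sel.register(pgconn.socket, selectors.EVENT_READ)
        while pgconn.is_busy():
            # select() returns an empty list when the timeout expires.
            if not sel.select(timeout):
                raise TimeoutError(
                    "connection is busy, but no data arrived from the server"
                )
            # Data is available: let libpq read and parse it.
            pgconn.consume_input()
```

In the scenario described above, a loop of this shape would presumably raise `TimeoutError` instead of blocking for the whole wait.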

I'm happy to hear if we need to do something different to accommodate
changes in 14.5; however, these seem like regressions to me.

Thank you very much

-- Daniele

[1] https://www.postgresql.org/message-id/CA%2Bmi_8bvD0_CW3sumgwPvWdNzXY32itoG_16tDYRu_1S2gV2iw%40mail.gmail.com
[2] https://github.com/psycopg/psycopg/issues/350
[3] https://www.diffchecker.com/oe0yA6lu

Attachment

Re: Regression in pipeline mode in libpq 14.5

From: Alvaro Herrera

On 2022-Aug-14, Daniele Varrazzo wrote:

> The execute() runs an implicit BEGIN, which is also executed in
> pipeline mode. Exiting the pipeline() block causes a Sync. So we
> expect 3 results in the pipeline (a COMMAND_OK after BEGIN, a
> TUPLES_OK after SELECT, a PIPELINE_SYNC).

Hmm, it seems (judging only from comparing your two traces) that the
problem stems from the newly added hack to handle CloseComplete.  I'll
have a look later in the week.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"No hay ausente sin culpa ni presente sin disculpa" (Prov. francés)



Re: Regression in pipeline mode in libpq 14.5

From: Daniele Varrazzo

On Mon, 15 Aug 2022 at 17:24, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> Hmm, it seems (judging only from comparing your two traces) that the
> problem stems from the newly added hack to handle CloseComplete.  I'll
> have a look later in the week.

We worked around the problem in psycopg by dropping every use of
`PQsendQuery()` and only using `PQsendQueryParams()` for internal
queries too. So this is no longer a blocker for our 3.1 release. I will
try to perform periodic test runs against Postgres master in order to
catch future breakages before a Postgres release.
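
The workaround can be sketched at the `psycopg.pq` level as follows. This is an illustration under assumptions, not psycopg's actual internal code: `send_internal_query` is a hypothetical name, and `pgconn` is assumed to be a `psycopg.pq.PGconn`, whose `send_query_params()` method wraps `PQsendQueryParams()`.

```python
def send_internal_query(pgconn, command: bytes) -> None:
    """Send a parameterless internal query (e.g. b"BEGIN").

    Hypothetical helper: `pgconn` is assumed to be a psycopg.pq.PGconn.
    """
    # Before the workaround this would have been:
    #     pgconn.send_query(command)        # wraps PQsendQuery()
    # Passing None as param_values sends the same command through the
    # extended query protocol, with no parameters.
    pgconn.send_query_params(command, None)
```

One consequence of this choice: `PQsendQueryParams()` accepts a single SQL command per call, so any internal query that relied on several semicolon-separated statements would need to be split.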

Please find attached a smaller test to reproduce the issue. It's
written in Python and uses psycopg master branch, but it only uses
libpq calls so it can be easily converted to C or whatever is useful
to add to your test suite.

In order to run:

```
python3 -m venv venv
source venv/bin/activate
pip install "git+https://github.com/psycopg/psycopg.git@e5079184#subdirectory=psycopg&egg=psycopg"
python test-pipeline-bug.py
```

The script succeeds when run with libpq 14.4 and fails with libpq
14.5. The difference in the traces is similar to what was attached
upthread.

Best regards

-- Daniele

Attachment

Re: Regression in pipeline mode in libpq 14.5

From: Alvaro Herrera

On 2022-Aug-14, Daniele Varrazzo wrote:

> I believe that pipeline mode was broken in libpq 14.5, likely after
> the refactoring performed to solve the problem of the unexpected Close
> messages sent by PQsendQuery() [1].

So I've spent a lot of time trying to understand what is going on here,
and my impression is that this stuff is thoroughly broken, and I don't
know how to fix it.  So I propose to rip it out -- specifically: make it
an error to call PQsendQuery when in pipeline mode.  PQsendQueryParams
can be used instead, and all is well.  The problem is that the
CloseComplete message remains a mess, and the hack I added made things
worse, or maybe it just moved the mess elsewhere.

More specifically, I propose to remove its handling from 15 and master;
but leave it in place in 14, to avoid breaking things in a minor release
if somebody is already using it and they haven't run into this
particular bug.
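
For client code that wants the same guarantee regardless of the libpq version in use, the proposed rule can be approximated on the application side. This is a minimal sketch, assuming a `psycopg.pq.PGconn`-like object with a `pipeline_status` attribute and a `send_query()` method; `checked_send_query` and `PipelineModeError` are hypothetical names, and this is not the libpq change itself.

```python
# PQ_PIPELINE_OFF is the first member (value 0) of libpq's PGpipelineStatus enum.
PQ_PIPELINE_OFF = 0


class PipelineModeError(Exception):
    """Hypothetical error type for a call not allowed in pipeline mode."""


def checked_send_query(pgconn, command: bytes) -> None:
    """Refuse to use PQsendQuery() while the connection is in pipeline mode.

    Hypothetical helper: `pgconn` is assumed to look like psycopg.pq.PGconn.
    """
    # Any status other than OFF (that is, ON or ABORTED) means a pipeline
    # is open on this connection.
    if int(pgconn.pipeline_status) != PQ_PIPELINE_OFF:
        raise PipelineModeError(
            "send_query() is not allowed in pipeline mode; "
            "use send_query_params() instead"
        )
    pgconn.send_query(command)
```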

This should be OK for psycopg, since Daniele said he already stopped
using PQsendQuery in pipeline mode.

PS: it's quite likely that there *is* a way to fix it, if we're OK with
more coupling between fe-exec (PQgetResult) and fe-protocol3
(parseInput3).  But we probably don't want that and I don't want to
spend more time figuring out exactly how.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/



Re: Regression in pipeline mode in libpq 14.5

From: Alvaro Herrera

So it'd be as in the attached.

In writing this, I also noticed that the extended query protocol
emulation I wrote for PQsendQuery had a bug, so the traces that result
from using PQsendQueryParams instead show a small difference.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"You don't solve a bad join with SELECT DISTINCT" #CupsOfFail
https://twitter.com/connor_mc_d/status/1431240081726115845

Attachment