Thread: Entities created in one query not available in another in extended protocol

From: Shay Rojansky
Date: 11 June 2015
In Npgsql, the .NET driver for PostgreSQL, we've switched from simple to extended protocol and have received a user complaint.

It appears that when we send two messages in the extended protocol (two Parse/Bind/Execute sequences followed by a single Sync), where the first one creates some entity (function, table), the second one can't query that entity (not found). This isn't terribly important but does seem a bit odd; I wanted to make sure you're aware of this.
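
For concreteness, here's a minimal sketch of the traffic in question. send_msg() is a hypothetical helper (not Npgsql code) that frames a frontend message as a type byte, an int32 length and the payload; 'P', 'B', 'E' and 'S' are the actual protocol codes for Parse, Bind, Execute and Sync:

    /* Sketch only: payload construction elided, helper hypothetical. */
    extern void send_msg(int sock, char type, const char *payload, int len);
    extern const char *parse1, *bind1, *exec1, *parse2, *bind2, *exec2;
    extern int parse1_len, bind1_len, exec1_len,
               parse2_len, bind2_len, exec2_len;

    void
    send_two_statements(int sock)
    {
        /* statement 1: creates the entity, e.g. CREATE TABLE foo ... */
        send_msg(sock, 'P', parse1, parse1_len);   /* Parse */
        send_msg(sock, 'B', bind1, bind1_len);     /* Bind */
        send_msg(sock, 'E', exec1, exec1_len);     /* Execute */

        /* statement 2: queries the new entity, e.g. SELECT * FROM foo */
        send_msg(sock, 'P', parse2, parse2_len);   /* Parse */
        send_msg(sock, 'B', bind2, bind2_len);     /* Bind */
        send_msg(sock, 'E', exec2, exec2_len);     /* Execute */

        send_msg(sock, 'S', NULL, 0);              /* single Sync for both */
    }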

Thanks,

Shay
On 11 June 2015 at 11:20, Shay Rojansky <roji@roji.org> wrote:
> In Npgsql, the .NET driver for PostgreSQL, we've switched from simple to
> extended protocol and have received a user complaint.
>
> It appears that when we send two messages in the extended protocol (two
> Parse/Bind/Execute sequences followed by a single Sync), where the first
> one creates some entity (function, table), the second one can't query
> that entity (not found). This isn't terribly important but does seem a
> bit odd; I wanted to make sure you're aware of this.

Sounds somewhat unlikely, but thank you for the report. Can we see a test case?

Most commonly in such cases the first request failed and error messages weren't checked before running the second message.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Jun 11, 2015 at 5:38 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 11 June 2015 at 11:20, Shay Rojansky <roji@roji.org> wrote:
>>
>> In Npgsql, the .NET driver for PostgreSQL, we've switched from simple to
>> extended protocol and have received a user complaint.
>>
>> It appears that when we send two messages in the extended protocol (two
>> Parse/Bind/Execute sequences followed by a single Sync), where the first
>> one creates some entity (function, table), the second one can't query
>> that entity (not found). This isn't terribly important but does seem a
>> bit odd; I wanted to make sure you're aware of this.
>
> Sounds somewhat unlikely, but thank you for the report. Can we see a test
> case?

Actually, I think I've seen this before.   The code that handles the
Sync message does this:
                case 'S':           /* sync */
                    pq_getmsgend(&input_message);
                    finish_xact_command();
                    send_ready_for_query = true;
                    break;

finish_xact_command() calls CommitTransactionCommand(), which does
CommandCounterIncrement() or CommitTransaction() as appropriate.  So
without the Sync, I think it's expected that you don't see the results
of the previous command.
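
In outline (a paraphrase, not the verbatim source), that path is:

    /* Paraphrased outline of finish_xact_command() -- details elided. */
    static bool xact_started = false;   /* set when an implicit xact opens */

    static void
    finish_xact_command(void)
    {
        if (xact_started)
        {
            /* CommandCounterIncrement() or CommitTransaction(),
             * as appropriate */
            CommitTransactionCommand();
            xact_started = false;
        }
    }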

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Simon Riggs <simon@2ndQuadrant.com> writes:
> On 11 June 2015 at 11:20, Shay Rojansky <roji@roji.org> wrote:
>> It appears that when we send two messages in the extended protocol (two
>> Parse/Bind/Execute sequences followed by a single Sync), where the first
>> one creates some entity (function, table), the second one can't query
>> that entity (not found). This isn't terribly important but does seem a
>> bit odd; I wanted to make sure you're aware of this.

> Sounds somewhat unlikely, but thank you for the report. Can we see a test
> case?

> Most commonly in such cases the first request failed and error messages
> weren't checked before running second message.

I'm wondering if it was really more like
Parse/Parse/Bind/Bind/Execute/Execute/Sync, in which case the described
behavior wouldn't be too surprising at all.

I do note that if a transaction is implicitly started to execute these
messages, it's not closed until Sync.  But that should only affect the
visibility of the results to other sessions, not to the current one.
There's definitely a CommandCounterIncrement in exec_execute_message ...
        regards, tom lane



I just understood the same thing Tom wrote: yes, Npgsql (currently) sends Parse for the second command before sending Execute for the first one. I will look into that implementation decision. It might be worth looking into Simon's comment though; I'll report back if I still see the same problematic behavior after reordering the messages (assuming we do reorder).

Thanks for your inputs...


On 2015-06-11 10:50:31 -0400, Tom Lane wrote:
> I do note that if a transaction is implicitly started to execute these
> messages, it's not closed until Sync.  But that should only affect the
> visibility of the results to other sessions, not to the current one.
> There's definitely a CommandCounterIncrement in exec_execute_message ...

exec_execute_message() only does so if the command has run to completion. Shay, is it possible that a row limit was used and the commands didn't run fully?
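
(For reference, the row limit travels in the Execute message itself: a cstring portal name followed by an int32 maximum-row count, 0 meaning no limit. Reusing the hypothetical send_msg() framing helper sketched earlier, a limited Execute looks roughly like:

    extern void send_msg(int sock, char type, const char *payload, int len);

    /* Sketch: Execute the unnamed portal, fetching at most 100 rows. */
    void
    execute_with_limit(int sock)
    {
        /* empty portal name (just its NUL) + big-endian int32 100 */
        char payload[5] = { '\0', 0, 0, 0, 100 };

        send_msg(sock, 'E', payload, 5);
    }

If the portal holds more rows than that, execution suspends instead of running to completion - the case described above.)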



Andres Freund <andres@anarazel.de> writes:
> On 2015-06-11 10:50:31 -0400, Tom Lane wrote:
>> I do note that if a transaction is implicitly started to execute these
>> messages, it's not closed until Sync.  But that should only affect the
>> visibility of the results to other sessions, not to the current one.
>> There's definitely a CommandCounterIncrement in exec_execute_message ...

> exec_execute_message() only does so if the command has run to
> completion. Shay, Is it possible that a row limit was used and the
> commands didn't run fully?

That wouldn't affect utility statements like CREATE FUNCTION or CREATE
TABLE, though.
        regards, tom lane



On 11 June 2015 at 16:56, Shay Rojansky <roji@roji.org> wrote:
> Npgsql (currently) sends Parse for the second command before sending
> Execute for the first one.

Look no further than that.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Thanks everyone for your time (or rather sorry for having wasted it).

Just in case it's interesting to you... The reason we implemented things this way is in order to avoid a deadlock situation - if we send two queries as P1/D1/B1/E1/P2/D2/B2/E2, and the first query has a large resultset, PostgreSQL may block writing the resultset, since Npgsql isn't reading it at that point. Npgsql on its part may get stuck writing the second query (if it's big enough) since PostgreSQL isn't reading on its end (thanks to Emil Lenngren for pointing this out originally).
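
In pseudo-C (hypothetical helpers, blocking I/O), the failure mode looks like this:

    /* Naive client: writes the entire batch before reading anything. */
    extern void write_all(int sock, const char *buf, int len);  /* blocking */
    extern void read_results(int sock);

    void
    naive_batch(int sock, const char *q1, int l1, const char *q2, int l2)
    {
        write_all(sock, q1, l1);    /* P1/D1/B1/E1 */
        write_all(sock, q2, l2);    /* P2/D2/B2/E2: can block forever once
                                     * our send buffer fills while the
                                     * server is itself blocked writing
                                     * query 1's resultset */
        read_results(sock);         /* never reached in the deadlock case */
    }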

Of course this isn't an excuse for anything, we're looking into ways of solving this problem differently in our driver implementation.

Shay


On 11 June 2015 at 22:12, Shay Rojansky <roji@roji.org> wrote:
> Just in case it's interesting to you... The reason we implemented things
> this way is in order to avoid a deadlock situation - if we send two
> queries as P1/D1/B1/E1/P2/D2/B2/E2, and the first query has a large
> resultset, PostgreSQL may block writing the resultset, since Npgsql isn't
> reading it at that point. Npgsql on its part may get stuck writing the
> second query (if it's big enough) since PostgreSQL isn't reading on its
> end (thanks to Emil Lenngren for pointing this out originally).

That part does sound like a problem that we have no good answer to. Sounds worth starting a new thread on that.
 



--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Simon Riggs <simon@2ndquadrant.com> writes:
> On 11 June 2015 at 22:12, Shay Rojansky <roji@roji.org> wrote:
>> Just in case it's interesting to you... The reason we implemented things
>> this way is in order to avoid a deadlock situation - if we send two queries
>> as P1/D1/B1/E1/P2/D2/B2/E2, and the first query has a large resultset,
>> PostgreSQL may block writing the resultset, since Npgsql isn't reading it
>> at that point. Npgsql on its part may get stuck writing the second query
>> (if it's big enough) since PostgreSQL isn't reading on its end (thanks to
>> Emil Lenngren for pointing this out originally).

> That part does sound like a problem that we have no good answer to. Sounds
> worth starting a new thread on that.

I do not accept that the backend needs to deal with that; it's the
responsibility of the client side to manage buffering properly if it is
trying to overlap sending the next query with receipt of data from a
previous one.  See commit 2a3f6e368 for a related issue in libpq.
        regards, tom lane



Re: Entities created in one query not available in another in extended protocol

From: Sehrope Sarkuni
The JDBC driver tries to handle this by estimating how much data has been buffered. It mainly comes up when executing batch INSERTs, as a large number of statements may be sent to the backend prior to reading back any results.
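
(Very roughly, and with invented names - this is not the actual pgjdbc code - the accounting looks something like:

    /* Hypothetical sketch of "estimate how much the server may have
     * buffered": force a round-trip before the estimate could exceed
     * a guessed server-side buffer size. */
    #define SERVER_BUF_GUESS 65536

    static int estimated_pending;   /* response bytes possibly queued */

    extern void write_all(int sock, const char *msgs, int len);
    extern void sync_and_drain(int sock);   /* send Sync, read replies */

    void
    queue_statement(int sock, const char *msgs, int len, int est_response)
    {
        if (estimated_pending + est_response > SERVER_BUF_GUESS)
        {
            sync_and_drain(sock);
            estimated_pending = 0;
        }
        write_all(sock, msgs, len);
        estimated_pending += est_response;
    }
)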

There's a nice write up of the potential deadlock and the driver's logic to avoid it here:



Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/

On 12 June 2015 at 20:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I do not accept that the backend needs to deal with that; it's the
> responsibility of the client side to manage buffering properly if it is
> trying to overlap sending the next query with receipt of data from a
> previous one.  See commit 2a3f6e368 for a related issue in libpq.

Then it's our responsibility to define what "manage buffering properly" means and document it.

People should be able to talk to us without risk of deadlock.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tom, I agree this is entirely a client-side issue. Regardless, as Simon says, it would be useful to have some documentation for client implementors.

Sehrope, thanks for the JDBC link! I was actually thinking of going about it another way in Npgsql:
  1. Send messages normally until the first Execute message is sent.
  2. From that point on, socket writes should simply be non-blocking. As long as buffers aren't full, there's no issue and we continue writing. The moment a non-blocking write exits because it would block, we transfer control to the user, who can now read data from queries (the ADO.NET API allows for multiple resultsets).
  3. When the user finishes processing the resultsets, control is transferred back to Npgsql, which continues sending messages (back to step 1).
This approach has the advantage of not caring about buffer sizes or trying to assume how many bytes are sent by the server: we simply write as much as we can without blocking, then switch to reading until we've exhausted outstanding data, and back to writing. The main issue I'm concerned about is SSL/TLS, which is a layer on top of the sockets and which might not work well with non-blocking sockets...
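
(In C terms - POSIX sockets, with the socket already put in non-blocking mode via fcntl(sock, F_SETFL, O_NONBLOCK); the real implementation would of course be .NET - step 2 might look like:

    #include <errno.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Write as much as possible without blocking; returns bytes written,
     * or -1 on a real error.  When it stops short of len, the caller
     * hands control to the user to drain pending resultsets, then calls
     * this again with the remainder. */
    ssize_t
    write_nonblocking(int sock, const char *buf, size_t len)
    {
        size_t sent = 0;

        while (sent < len)
        {
            ssize_t n = write(sock, buf + sent, len - sent);

            if (n > 0)
                sent += n;
            else if (n < 0 && errno == EINTR)
                continue;               /* interrupted: just retry */
            else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
                break;                  /* would block: switch to reading */
            else
                return -1;              /* real error */
        }
        return (ssize_t) sent;
    }
)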

Any comments?

Shay


Shay Rojansky <roji@roji.org> writes:
> [ rely on non-blocking sockets to avoid deadlock ]

Yeah, that's pretty much the approach libpq has taken: write (or read)
when you can, but press on when you can't.

> The main issue I'm concerned about
> is SSL/TLS, which is a layer on top of the sockets and which might not work
> well with non-blocking sockets...

We have not had word of any such problem with libpq.  It's possible that
the intersection of SSL users with non-blocking-mode users is nil, but
I kinda doubt that.  You do need to interpret openssl's return codes
correctly ...
        regards, tom lane
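
(Concretely, "interpret openssl's return codes correctly" on a non-blocking socket means checking SSL_get_error() for SSL_ERROR_WANT_READ / SSL_ERROR_WANT_WRITE rather than treating a short result as fatal. A minimal sketch:

    #include <openssl/ssl.h>

    /* Returns bytes written, 0 if the caller should wait for socket
     * readiness and retry, or -1 on a hard failure.  Note SSL_write()
     * can demand a *read* (e.g. during renegotiation), so both
     * WANT_READ and WANT_WRITE must be handled. */
    int
    ssl_write_nonblocking(SSL *ssl, const void *buf, int len)
    {
        int n = SSL_write(ssl, buf, len);

        if (n > 0)
            return n;

        switch (SSL_get_error(ssl, n))
        {
            case SSL_ERROR_WANT_WRITE:  /* retry once writable */
            case SSL_ERROR_WANT_READ:   /* retry once readable */
                return 0;
            default:
                return -1;
        }
    }
)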





On Sun, Jun 14, 2015 at 6:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Shay Rojansky <roji@roji.org> writes:
>> [ rely on non-blocking sockets to avoid deadlock ]

> Yeah, that's pretty much the approach libpq has taken: write (or read)
> when you can, but press on when you can't.

Good to hear.

>> The main issue I'm concerned about
>> is SSL/TLS, which is a layer on top of the sockets and which might not
>> work well with non-blocking sockets...

> We have not had word of any such problem with libpq.  It's possible that
> the intersection of SSL users with non-blocking-mode users is nil, but
> I kinda doubt that.  You do need to interpret openssl's return codes
> correctly ...

I don't think there's a problem with non-blocking I/O and SSL per se; the question is about the .NET TLS/SSL implementation Npgsql uses - so it's really unrelated to PostgreSQL...

Shay