Re: Slowness of extended protocol - Mailing list pgsql-hackers

From Shay Rojansky
Subject Re: Slowness of extended protocol
Date
Msg-id CADT4RqBYuPUW3LMTqTdOqBdScAm0JHkebEYQPZ-zCVX2r+7sXQ@mail.gmail.com
In response to Re: Slowness of extended protocol  (Vladimir Sitnikov <sitnikov.vladimir@gmail.com>)
Responses Re: Slowness of extended protocol  (Dave Cramer <pg@fastcrypt.com>)
Re: Slowness of extended protocol  (Vladimir Sitnikov <sitnikov.vladimir@gmail.com>)
List pgsql-hackers


On Mon, Aug 15, 2016 at 3:16 PM, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote:
Vladimir>> Yes, that is what happens.
Vladimir>> The idea is not to mess with gucs.

Shay> Wow... That is... insane...

Someone might say that "programming languages that enable side-effects
are insane".
Lots of connection pools work by sharing the connections, and it is up
to the developer to behave well (see "It is not" below).

The insane part is that there's really almost no reason to allow these side-effects... It's possible to efficiently reset connection state with a truly negligible impact on performance.


Shay> it's entirely reasonable for
Shay> more than one app to use the same pool

Sharing connections between different applications might not be that good an idea.

Not sure why... It seems like a bad idea mostly if your pool is leaking state.
 
However, I would not agree that "having an out-of-process connection
pool" is the only sane way to go.
I do admit there might be valid cases for out-of-process pooling,
however I do not agree it is the only way to go. Not only is "in-process"
one of the ways, the "in-process" way is widely used
in at least one of the top enterprise languages.

If you agree with that, you might agree that an "in-process connection
pool that serves exactly one application" might work in a reasonable
fashion even without DISCARD ALL.

I never said out-of-process pooling is the way to go - in the .NET world in-process pooling is very valid. But as I carefully said in my last email, the same problems you have with multiple applications can occur with multiple components within the same application. The process boundary isn't very important here - in some scenarios programmers choose process separation, in others they choose threads. The same basic problems that occur in one model can occur in the other, including usage strategies which make LRU caching very bad.
 
Shay> Even with in-process pools it's standard practice (and a good idea) to
Shay> reset state.

It is not. Can you quote where you got that "standard practice is
to reset state" from?

I guess I take that back, I haven't actually made a thorough comparison here.

Shay> If some part of a big complex
Shay> application modified state, and then some other part happens to get that
Shay> physical connection, it's extremely hard to make sense of things.

Let's not go into a "functional programming" vs "imperative programming"
discussion.
Of course you might argue that "programming in Haskell or OCaml or F#" makes
it "extremely easy to make sense of things", but that's completely
another discussion.

I'm not sure what this is referring to... If you're talking about my comment that "isolation/separation of layers is a good thing in programming", then I don't think it's the functional vs. imperative kind of argument.

Shay> One note - in Npgsql's implementation of persistent prepared statements,
Shay> instead of sending DISCARD ALL Npgsql sends the commands listed in
Shay> https://www.postgresql.org/docs/current/static/sql-discard.html, except for
Shay> DEALLOCATE ALL. This cleans up state changes but leaves prepared statements
Shay> in place.

Ok. At least you consider that "always discarding all the state" might be bad.

Yes I do. I actually implemented persistent prepared statements before this conversation started - I think it's a great performance booster. I'm still not sure if it should be opt-in or default, although I admit I'm leaning towards default. But that feature has very little to do with *implicit* preparation.

Shay> This is somewhat similar to the CPU reordering you
Shay> keep coming back to - it's totally invisible

I would disagree. CPU reordering is easily visible if you are dealing
with the multithreaded case.
It can easily result in application bugs if the application misses some
synchronization.

CPU reordering is very visible to regular programmers, and it is a compromise:
1) Developers let the compiler and CPU do certain "reorderings"
2) Developers agree to follow the rules, like "missing synchronization
might screw things up"
3) As a result, the code gets executed faster.

The point is that AFAIK the same bugs that can result from reordering can also result from other race conditions as well. If you're writing multithreaded code then you must handle synchronization - this is not a reordering-specific problem. Therefore if your program is multithreaded but doesn't do proper synchronization you have a bug - regardless of whether its manifestation is triggered by CPU reordering or not. I admit I'm not an expert on this and may be wrong (it would be interesting to know).
 
Vladimir> Just in case: PostgreSQL does not execute "discard all" on its own.

Shay> Of course it doesn't - it doesn't know anything about connection pooling,
Shay> it only knows about physical connections. When would it execute "discard
Shay> all" on its own?

My point there was about "pgpool aiming to look like a regular postgresql
connection".
The point was: "postgresql does not discard on its own, so pgpool
should not discard".

That makes no sense at all... PostgreSQL is not a pool, nor is it a driver. Within this conversation, DISCARD ALL is only relevant in the context of connection pooling. PostgreSQL only deals with physical connections, so there's no possible relevance here.
 
Shay> To enforce isolation, which is maybe the most important way for programs to
Shay> be reliable - but this is a general theme which you don't seem to agree
Shay> with.

If you want to isolate something, you might be better off with a
per-application connection pool.
That way, if a particular application consumes all the connections, it
would not impact other applications. If all the applications use the same
out-of-process pool, there might be trouble with resource hogging.

Sometimes that's true, sometimes it's not... Different scenarios require different degrees of isolation. That's the thing about writing infrastructure like drivers and pools - it's not a good idea to assume too much about how people will use your component, and it's especially bad to impose things on them. It's perfectly reasonable to have two applications using the same pool - this removes pressure from PostgreSQL. Of course some attention has to be paid to resource hogging, but that doesn't mean pools should always be per-application.
 
Shay> Regardless, resetting state doesn't have to have a necessary effect
Shay> on response times/throughput.

Even if you do not reset prepared statements, "reset query" takes time.

It does only if you do it in a roundtrip of its own. When you close a pooled connection in Npgsql, the reset query is written to an internal buffer but not sent. The first query that actually gets sent by the user after opening will therefore have the reset query prepended to it (basically the reset query is batched). You can argue there's still some overhead there because of the extra PostgreSQL message, but that really seems like a negligible price to pay for the isolation advantages. And I accept that there should be an option for performance-hungry programmers to remove the reset query for extreme situations.
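
To make the batching idea concrete, here's a rough Java sketch of the mechanism (Npgsql itself is written in C# and does this inside its internal write buffer, so take the class name and the exact list of reset commands as illustration only - they roughly mirror the DISCARD ALL components minus DEALLOCATE ALL):

import java.sql.Connection;
import java.sql.Statement;

class ResetBatchingSketch {
    // Roughly the components of DISCARD ALL, minus DEALLOCATE ALL so that
    // server-prepared statements survive the reset.
    private static final String RESET_SQL =
        "CLOSE ALL; SET SESSION AUTHORIZATION DEFAULT; RESET ALL; " +
        "UNLISTEN *; SELECT pg_advisory_unlock_all(); DISCARD PLANS; DISCARD TEMP; ";

    private boolean resetPending;

    // Called when the pooled connection is "closed" by the application:
    // remember that a reset is due, but don't spend a round trip on it now.
    void returnToPool() {
        resetPending = true;
    }

    // Called for the first query after the connection is handed out again:
    // the reset commands ride along with the user's statement, so the reset
    // costs a few extra messages but no extra network round trip.
    void execute(Connection conn, String userSql) throws Exception {
        try (Statement st = conn.createStatement()) {
            st.execute(resetPending ? RESET_SQL + userSql : userSql);
            resetPending = false;
        }
    }
}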
 
For instance: there's a common problem of "validating connections before use".
That is, the pooler should ensure the connection is working before handing it
to the application.
Both WebLogic Server and HikariCP have connection validation built in,
and the validation is enabled by default.

However, it turns out that "connection validation" takes too much time,
it is visible in the application response times, etc, so they both implement a
grace period. That is, "if the connection was recently used, it is
assumed to be fine".
WebLogic trusts 15 seconds by default, so if you borrow connections
every 10 seconds, they will not be tested.
Well, there's additional background validation, but my main point is
that "even SELECT 1" is visible in the application response times.

That's all very true. When I started contributing to Npgsql (back in the 2.x days), it did a SELECT 1 validation roundtrip on each pooled connection open. This is obviously a very bad idea. It also doesn't make much sense, because connections can break at any point in time - why check only when a connection is returned from a pool? One thing Npgsql does to help is to have an opt-in keepalive feature, which sends SELECT 1 after X seconds of inactivity. Aside from preventing nosy routers from killing connections after inactivity, it performs something like what you describe with the grace period, but the keepalive isn't tied to the actual open in any way.
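
The keepalive itself is simple; roughly something like this (illustrative Java only - Npgsql's actual implementation is C# and resets the timer on real activity instead of firing on a fixed period):

import java.sql.Connection;
import java.sql.Statement;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class KeepaliveSketch {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();

    void start(Connection conn, long keepaliveSeconds) {
        scheduler.scheduleAtFixedRate(() -> {
            try (Statement st = conn.createStatement()) {
                st.execute("SELECT 1"); // cheap liveness probe, also keeps routers happy
            } catch (Exception e) {
                // the connection is broken; a real driver/pool would mark it dead here
            }
        }, keepaliveSeconds, keepaliveSeconds, TimeUnit.SECONDS);
    }
}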

But this interesting subject has nothing to do with the reset query, which can be done without an extra roundtrip as I said above.
 
The concept is "the implementation of the PreparedStatement interface is
free to choose how it will execute the queries". It can go with "server-prepare
on each execution", it can go with "cache server-prepared statements", etc, etc.
The whole purpose of having that "vague API" is to enable database vendors to
make the most of their databases.

What exactly does "server-prepare on each execution" mean? Sending Parse on each execution? How can that be considered prepared at all? In my mind there are basically two possibilities:
1. You send Parse/Describe/Bind/Execute/Sync (or a single Query). This is the non-prepared mode (which I proposed to optimize originally).
2. You send Parse/Describe/Sync when prepareStatement is called (when the PreparedStatement instance is created - you'll forgive me if I'm not too familiar with the JDBC API). Then, when the prepared statement is executed, you send Bind/Execute/Sync. This is a server-prepared (or simply prepared) statement.

Do you see some other mode of query execution? Does pgjdbc consider something "prepared" without it being the 2nd option above? Note that I'm genuinely interested in case I'm missing something.
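
Just so we're talking about the same thing, here's how those two modes look from the application side in JDBC (illustrative only - whether and when the driver actually server-prepares in the second case is up to the driver, e.g. pgjdbc's prepareThreshold, and the "users" table is made up):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

class ExecutionModesSketch {
    void run(Connection conn) throws Exception {
        // Mode 1: one-shot execution; the statement text travels with every execution.
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT now()")) {
            rs.next();
        }

        // Mode 2: explicit preparation; parse once, then only Bind/Execute/Sync
        // per execution for the lifetime of the PreparedStatement.
        try (PreparedStatement ps =
                 conn.prepareStatement("SELECT name FROM users WHERE id = ?")) {
            for (int id = 1; id <= 3; id++) {
                ps.setInt(1, id);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) { /* consume rows */ }
                }
            }
        }
    }
}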
 
Shay> Regardless of any optimizations you may be doing, in every database driver
Shay> I've ever seen in my life, "prepared" simply means "server-prepared". And
Shay> in every driver I've ever seen, there's an explicit API for that. Therefore
Shay> server-prepare is something that's exposed to the developer, rather than
Shay> some internal detail left to the driver.

Please, take the CPU example (or the TCP example) seriously. Seriously.
CPUs did not always have an L2 cache. The L2 cache was invented to improve the
performance of existing and future applications.

The same applies to "prepared statement cache at the database driver level".
It is implemented in pgjdbc to improve the performance of existing and
future applications.

I really am listening to you, but I'm really not agreeing with you, so I'm going to repeat myself. CPU caching is an even clearer case than instruction reordering, in that it's totally invisible - and that is why your analogy breaks down. Of course L2 was invented to improve performance, but that doesn't mean that all caches are the same. More precisely, what I find objectionable about your approach is not any caching - it's the implicit or automatic preparation of statements. This practice isn't invisible in that a) it may cause errors that wouldn't have been there otherwise (e.g. because of DDL), and b) it imposes a resource drain on the server. The second point is very important: the L2 cache doesn't impose a resource drain on anyone - it's just there, speeding up your application. I hope that point makes it across - it's why I don't accept your analogy.
 
Vladimir> Suppose backend can handle 20 server-prepared statements at most (if
Vladimir> using more it would run out of memory).
Vladimir> Suppose an application has 100 statements with ".prepare()" call.
Vladimir> I think it is reasonable for the DB driver to figure out which statements
Vladimir> are most important and server-prepare just the "20 most important ones",
Vladimir> and leave the rest 80 as regular non-prepared statements.

Shay> I seriously don't find it reasonable at all.

Would you please answer the "should the db driver silently server-prepare
all 100 statements and crash the DB" question?

Sure, the answer is no. A driver shouldn't silently server-prepare anything. This is what I've been arguing all along (I'm starting to think we might have some communication failure here). But the way I understand the database APIs (including JDBC), if the programmer calls .prepare() on 100 statements and this overloads the database, then that's expected behavior, because there's nothing silent about it. Calling .prepare() means "please server-prepare my statement". I really don't understand how it can have any other meaning.
 
Shay> Each connection fills
Shay> the cache, ejecting the other component's prepared statements from the
Shay> cache. As we switch between the two components, statements have to be
Shay> reprepared.

This is exactly what happens when several applications use the same CPU.
L2/L3 can be shared between cores, so if one cares about the performance of
a particular application, he should isolate the critical task to its
own CPU (or set of cores).

The same principle applies to connection pools.
Either multiple pools should be used, or the cache size increased, or more
sophisticated algorithms should be used to provide a better hit rate.

You're absolutely right that it happens with CPU caching, but that doesn't mean that it should be the same at the database driver level (see below).
 
Shay> Manual work here would allow picking truly hottest statements from each
Shay> app/component, and preparing those - allowing the hottest statements across
Shay> both components to always remain live.
Shay> This is exactly what is meant by
Shay> "programmer knowledge" which the driver doesn't have - it only has its
Shay> recently-used logic which sometimes breaks down.

Let me exaggerate a bit.
Compiler optimizations sometimes break down (they produce code that does not
perform that well), so all the optimizations should be disabled by default,
and every developer should manually examine each line of code, identify the
hottest statements from each app/component and assign variables to CPU registers.
This is exactly the "programmer knowledge" which the compiler doesn't have.

Does it sound good to you?

Of course not. But I don't think it's a very valid analogy.

Here's how I see things, we have a pretty well-defined stack in the database world. ORMs exist on top of drivers as a way of allowing users to write higher-level code. It's possible to loosely see a database driver as, say, C, whereas using an ORM would be something like Java or C#, where you delegate memory management to the language runtime. You can even decide to drop down to "assembly", dumping the database driver and communicating with PostgreSQL over TCP to squeeze out some more performance (extreme scenario obviously).

All these are valid choices, and programmers should be able to choose at what layer they want to operate. By including all the high-level functionality (e.g. silent server preparation) in the driver layer, you're effectively saying that C shouldn't exist - there's no reason a programmer should ever choose a low-level approach.

And the thing I find hardest to understand here, if we leave ORMs aside for a minute, is that it's really trivial for programmers to explicitly server-prepare their statements - as database APIs universally allow and encourage. Simply calling prepare on your statements is nowhere near the complexity of what a compiler optimizer does. Whereas it's totally unreasonable to expect programmers to do the work of the compiler optimizer, it seems really reasonable to ask them to think about server preparation, and even about multivalue inserts (at least if performance is important in the application).
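
For example, asking the programmer to write something like the following (an illustrative JDBC sketch; the table and columns are made up) doesn't seem like an unreasonable burden compared to what a compiler optimizer does:

import java.sql.Connection;
import java.sql.PreparedStatement;

class MultiValueInsertSketch {
    // An explicitly prepared multi-row ("multivalue") INSERT: one statement,
    // one round trip, three rows.
    void insertThree(Connection conn) throws Exception {
        String sql = "INSERT INTO events (name, payload) VALUES (?, ?), (?, ?), (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (int row = 0; row < 3; row++) {
                ps.setString(row * 2 + 1, "event-" + row);
                ps.setString(row * 2 + 2, "{}");
            }
            ps.executeUpdate();
        }
    }
}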

The silent server preparation simply doesn't contribute that much, except when working with an ORM that doesn't support preparation. I've already said that for that case I think silent preparation is a *great* opt-in feature (and intend to implement it), but it's an exceptional thing that should be part of the ORM, not the driver.
 
Come on. Suppose you are writing a framework (ORM or whatever) that
happens to execute queries against the database.
Should that API have a special parameter "should_use_server_prepared"
for each and every method?
How should the framework developer tell which statements should be
server-prepared and which should not?

As I said above, I think this is a critical point of misunderstanding between us. The developer tells the driver which statements should be server-prepared by calling .prepareStatement(). I'm guessing you have a totally different understanding here.

I'm not going to respond to the part about dealing with prepared statement errors, since I think we've already covered that and there's nothing new being said. I don't find automatic savepointing acceptable, and a significant change of the PostgreSQL protocol to support this doesn't seem reasonable (but you can try proposing it).
