Re: RFC: Async query processing - Mailing list pgsql-hackers
From | Claudio Freire |
---|---|
Subject | Re: RFC: Async query processing |
Date | |
Msg-id | CAGTBQpak7r6Woyc4FqK4C+Yh7=9w1OMBxETxivCP=sg_s35Ytg@mail.gmail.com |
In response to | Re: RFC: Async query processing (Florian Weimer <fweimer@redhat.com>) |
Responses | Re: RFC: Async query processing |
List | pgsql-hackers |
On Fri, Jan 3, 2014 at 10:22 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 01/02/2014 07:52 PM, Claudio Freire wrote:
>
>>> No, because this doesn't scale automatically with the bandwidth-delay
>>> product. It also requires that the client buffers queries and their
>>> parameters even though the network has to do that anyway.
>>
>>
>> Why not? I'm talking about transport-level packets, btw, not libpq
>> frames/whatever.
>>
>> Yes, the network stack will sometimes do that. But it doesn't have
>> to do it. It does it sometimes, which is not the same.
>
>
> The network inevitably buffers because the speed of light is not infinite.
>
> Here's a concrete example. Suppose the server is 100ms away, and you want
> to send data at a constant rate of 10 Mbps. The server needs to acknowledge
> the data you sent, but this acknowledgment arrives after 200 ms. As a
> result, you've sent 2 Mbits before the acknowledgment arrives, so the
> network appears to have buffered 250 KB. This effect can actually be used
> for data storage, called "delay line memory", but it is somewhat out of
> fashion now.

...

>> So, trusting the network start to do the quick start won't work. For
>> steady streams of queries, it will work. But not for short bursts,
>> which will be the most heavily used case I believe (most apps create
>> short bursts of inserts and not continuous streams at full bandwidth).
>
>
> Loading data into the database isn't such an uncommon task. Not everything
> is OLTP.

True, but a sustained insert stream of 10 Mbps is certainly way
beyond common non-OLTP loads. This is far more specific than non-OLTP.

Buffering will benefit the vast majority of applications that don't do
steady, sustained query streams. Which is the vast majority of
applications. An ORM doing a flush falls in this category, so it's an
overwhelmingly common case.

>> And buffering algorithms are quite platform-dependent anyway, so it's
>> not the best idea to make libpq highly reliant on them.
>
>
> That is why I think libpq needs to keep sending until the first response
> from the server arrives. Batching a fixed number of INSERTs together in a
> single conceptual query does not achieve auto-tuning to the buffering
> characteristics of the path.

Not on its own, but it does improve throughput during slow start, which
benefits OLTP, which is a hugely common use case. As you say, the
network will then auto-tune when the query stream is consistent
enough, so what's the problem with explicitly buffering a little then?
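
For reference, the arithmetic in Florian's example (100 ms one-way delay, 10 Mbps, acknowledgment after 200 ms) can be checked with a short sketch. The function name and the use of decimal kilobytes (1 KB = 1000 bytes) are assumptions made for illustration; nothing here is part of libpq.

```python
def bandwidth_delay_product_bytes(rate_bits_per_s: float, one_way_delay_s: float) -> float:
    """Bytes in flight before the first acknowledgment can arrive.

    Hypothetical helper for illustration only; assumes a symmetric path,
    so the round-trip time is twice the one-way delay.
    """
    rtt_s = 2 * one_way_delay_s                 # 100 ms each way gives 200 ms
    bits_in_flight = rate_bits_per_s * rtt_s    # 10 Mbps * 0.2 s = 2 Mbit
    return bits_in_flight / 8                   # convert bits to bytes


in_flight = bandwidth_delay_product_bytes(10e6, 0.100)
print(f"{in_flight / 1000:.0f} KB in flight")   # decimal KB, as in the example
```

With the numbers from the thread this gives 250 KB, matching the figure quoted above. A short burst that is much smaller than this value finishes before the path is ever full, which is the case the explicit-buffering argument is about.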