Re: RFC: Async query processing - Mailing list pgsql-hackers

From Claudio Freire
Subject Re: RFC: Async query processing
Msg-id CAGTBQpak7r6Woyc4FqK4C+Yh7=9w1OMBxETxivCP=sg_s35Ytg@mail.gmail.com
In response to Re: RFC: Async query processing  (Florian Weimer <fweimer@redhat.com>)
Responses Re: RFC: Async query processing
List pgsql-hackers
On Fri, Jan 3, 2014 at 10:22 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 01/02/2014 07:52 PM, Claudio Freire wrote:
>
>>> No, because this doesn't scale automatically with the bandwidth-delay
>>> product.  It also requires that the client buffers queries and their
>>> parameters even though the network has to do that anyway.
>>
>>
>> Why not? I'm talking about transport-level packets, btw, not libpq
>> frames/whatever.
>>
>> Yes, the network stack will sometimes do that. But it doesn't have
>> to do it. It does it sometimes, which is not the same.
>
>
> The network inevitably buffers because the speed of light is not infinite.
>
> Here's a concrete example.  Suppose the server is 100ms away, and you want
> to send data at a constant rate of 10 Mbps.  The server needs to acknowledge
> the data you sent, but this acknowledgment arrives after 200 ms.  As a
> result, you've sent 2 Mbits before the acknowledgment arrives, so the
> network appears to have buffered 250 KB.  This effect can actually be used
> for data storage, called "delay line memory", but it is somewhat out of
> fashion now.
...
>> So, trusting the network stack to do the quick start won't work. For
>> steady streams of queries, it will work. But not for short bursts,
>> which will be the most heavily used case I believe (most apps create
>> short bursts of inserts and not continuous streams at full bandwidth).
>
>
> Loading data into the database isn't such an uncommon task.  Not everything
> is OLTP.

True, but a sustained insert stream of 10 Mbps is well beyond what
most non-OLTP loads produce. That is a far more specific case than
non-OLTP in general.
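
For reference, the arithmetic in the quoted example can be checked with a few lines of Python. This is only a sketch: the function name and its parameters are made up for illustration, and it uses decimal units (1 KB = 1000 bytes), as the example does.

```python
def bytes_in_flight(rate_bps: int, rtt_ms: int) -> float:
    """Bandwidth-delay product: bytes sent before the first ACK returns."""
    # bits/s * ms / 1000 gives bits; dividing by 8 gives bytes.
    return rate_bps * rtt_ms / 8000


# 10 Mbps to a server 100 ms away, so a 200 ms round trip.
in_flight = bytes_in_flight(10_000_000, 200)
print(in_flight / 1000, "KB")
```

That matches the 250 KB figure, but only for a sender that actually keeps the pipe full for the whole round trip.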

Buffering will benefit the vast majority of applications, which don't
issue steady, sustained query streams. An ORM doing a flush falls into
this category, so it's an overwhelmingly common case.

>> And buffering algorithms are quite platform-dependent anyway, so it's
>> not the best idea to make libpq highly reliant on them.
>
>
> That is why I think libpq needs to keep sending until the first response
> from the server arrives.  Batching a fixed number of INSERTs together in a
> single conceptual query does not achieve auto-tuning to the buffering
> characteristics of the path.

Not on its own, but it does improve throughput during slow start, which
benefits OLTP, which is a hugely common use case. As you say, the
network will auto-tune once the query stream is consistent enough, so
what's the problem with explicitly buffering a little until then?
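
As a rough illustration of the short-burst case, here is a back-of-the-envelope model in Python. It rests on loud assumptions: the client waits for the response to each batch before sending the next one, every batch costs exactly one round trip, and transmission and server time are ignored. The function name and the burst and batch sizes are hypothetical, not anything libpq provides.

```python
def burst_wait_ms(n_queries: int, batch_size: int, rtt_ms: int) -> int:
    """Total round-trip wait for a burst sent in fixed-size batches."""
    # Ceiling division: a partial final batch still costs a round trip.
    batches = -(-n_queries // batch_size)
    return batches * rtt_ms


# Hypothetical burst: 50 INSERTs over a 200 ms round trip.
one_at_a_time = burst_wait_ms(50, 1, 200)
batched = burst_wait_ms(50, 10, 200)
print(one_at_a_time, "ms unbatched vs", batched, "ms in batches of 10")
```

Under those assumptions the unbatched burst waits 10000 ms and the batched one 1000 ms. The model says nothing about steady streams, where the network's own buffering takes over.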


