Re: Flushing large data immediately in pqcomm - Mailing list pgsql-hackers

From: Melih Mutlu
Subject: Re: Flushing large data immediately in pqcomm
Msg-id: CAGPVpCTfzhiOCWPwpRpvV6EZU0egJix4jNObp_OkhfZESdPbFQ@mail.gmail.com
In response to: Re: Flushing large data immediately in pqcomm (Heikki Linnakangas <hlinnaka@iki.fi>)
List: pgsql-hackers
Hi Heikki,
Heikki Linnakangas <hlinnaka@iki.fi> wrote on Mon, 29 Jan 2024 at 19:12:
> > Proposed change modifies socket_putmessage to send any data larger than
> > 8K immediately without copying it into the send buffer. Assuming that
> > the send buffer would be flushed anyway due to reaching its limit, the
> > patch just gets rid of the copy part which seems unnecessary and sends
> > data without waiting.
>
> If there's already some data in PqSendBuffer, I wonder if it would be
> better to fill it up with data, flush it, and then send the rest of the
> data directly, instead of flushing the partial data first. I'm afraid
> that you'll make a tiny call to secure_write(), followed by a large one,
> then a tiny one again, and so forth. Especially when socket_putmessage
> itself writes the msgtype and len, which are tiny, before the payload.
I agree that I could do better there and avoid flushing twice, once for PqSendBuffer and once for the input data. PqSendBuffer always has at least some data, however tiny, since msgtype and len are written into it first.
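Something like this is roughly what I have in mind (just a sketch to illustrate, not the actual patch: internal_putbytes_large is a made-up name and the retry/error handling the real flush code does is elided; PqSendBuffer, PqSendPointer, PqSendBufferSize, internal_flush() and secure_write() are the existing pqcomm.c machinery):

/*
 * Sketch: write "len" bytes, topping up PqSendBuffer first so that its
 * pending contents leave in a single flush, then sending the remainder
 * of a large payload directly without copying it into the buffer.
 */
static int
internal_putbytes_large(const char *s, size_t len)
{
    /* Fill whatever room is left so the pending bytes go out in one call. */
    if (PqSendPointer > 0)
    {
        size_t  avail = PqSendBufferSize - PqSendPointer;

        if (avail > len)
            avail = len;
        memcpy(PqSendBuffer + PqSendPointer, s, avail);
        PqSendPointer += avail;
        s += avail;
        len -= avail;
    }

    if (internal_flush())
        return EOF;

    /* Send the rest of the payload directly, skipping the copy. */
    while (len > 0)
    {
        ssize_t n = secure_write(MyProcPort, (void *) s, len);

        if (n <= 0)
            return EOF;     /* EINTR/busy-retry handling elided */
        s += n;
        len -= n;
    }
    return 0;
}

That way there is exactly one flush of the old buffer contents, and the bulk of the payload never takes the memcpy detour.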
> Perhaps we should invent a new pq_putmessage() function that would take
> an input buffer with 5 bytes of space reserved before the payload.
> pq_putmessage() could then fill in the msgtype and len bytes in the
> input buffer and send that directly. (Not wedded to that particular API,
> but something that would have the same effect)
I thought about doing this. The reason I didn't is that such a change would require adjusting the input buffers wherever pq_putmessage is called, and I did not want to touch that many different places. Then again, there may not be that many such call sites; I'm not sure.
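For the record, here is roughly how I understood that API suggestion (again only a sketch: the name pq_putmessage_reserved and its exact shape are my invention, not a settled design, and it leans on the internal_putbytes_large sketch above; pg_hton32() is the existing byte-swap helper):

/*
 * Sketch: the caller reserves 5 bytes in front of the payload; we fill
 * in the msgtype byte and the 4-byte length there and push header plus
 * payload to the socket as one contiguous write, with no extra copy.
 */
int
pq_putmessage_reserved(char msgtype, char *buf, size_t payload_len)
{
    uint32  n32;

    /* buf points at the 5 reserved bytes; payload starts at buf + 5 */
    buf[0] = msgtype;
    n32 = pg_hton32((uint32) (payload_len + 4));    /* len counts itself */
    memcpy(buf + 1, &n32, 4);

    /* Header and payload leave the backend together. */
    return internal_putbytes_large(buf, payload_len + 5);
}

Every caller would then have to build its message with those 5 bytes reserved up front, which is exactly the adjustment I was worried about.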
> > This change affects places where pq_putmessage is used such as
> > pg_basebackup, COPY TO, walsender etc.
> >
> > I did some experiments to see how the patch performs.
> > Firstly, I loaded ~5GB of data into a table [1], then ran "COPY test TO
> > STDOUT". Here are perf results of both the patch and HEAD
> > ...
> > The patch brings a ~5% gain in socket_putmessage.
> >
> > [1]
> > CREATE TABLE test(id int, name text, time TIMESTAMP);
> > INSERT INTO test (id, name, time) SELECT i AS id, repeat('dummy', 100)
> > AS name, NOW() AS time FROM generate_series(1, 100000000) AS i;
>
> I'm surprised by these results, because each row in that table is < 600
> bytes. PqSendBufferSize is 8kB, so the optimization shouldn't kick in in
> that test. Am I missing something?
You're absolutely right, I made a silly mistake there. I also think the way I did the perf analysis does not make much sense, even when a row of the table is greater than 8kB.
Here are some quick timing results taken after making sure the patch's optimization actually triggers. I need to think more about how to profile this with perf; I hope to share proper results soon.
I just added a few more zeros [1], so each row now carries a ~50kB text field and comfortably exceeds the 8kB buffer, and ran [2] (hopefully measuring the right thing):
HEAD:
real 2m48,938s
user 0m9,226s
sys 1m35,342s
Patch:
real 2m40,690s
user 0m8,492s
sys 1m31,001s
[1]
INSERT INTO test (id, name, time) SELECT i AS id, repeat('dummy', 10000) AS name, NOW() AS time FROM generate_series(1, 1000000) AS i;
[2]
rm /tmp/dummy && echo 3 | sudo tee /proc/sys/vm/drop_caches && time psql -d postgres -c "COPY test TO STDOUT;" > /tmp/dummy
Thanks,
Melih Mutlu
Microsoft