Re: Flushing large data immediately in pqcomm - Mailing list pgsql-hackers

From Melih Mutlu
Subject Re: Flushing large data immediately in pqcomm
Date
Msg-id CAGPVpCQ-265P-DY8hZADEKE5GO0-1NVB9kn7dH82BQgEUbdv1g@mail.gmail.com
In response to Re: Flushing large data immediately in pqcomm  (Jelte Fennema-Nio <postgres@jeltef.nl>)
Responses Re: Flushing large data immediately in pqcomm
List pgsql-hackers
Hi,

PSA v3.

On Thu, 21 Mar 2024 at 12:58, Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
On Thu, 21 Mar 2024 at 01:24, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> What if I do a simple comparison like PqSendStart == PqSendPointer instead of calling pq_is_send_pending()

Yeah, that sounds worth trying out. So the new suggestions to fix the
perf issues on small message sizes would be:

1. add "inline" to internal_flush function
2. replace pq_is_send_pending() with PqSendStart == PqSendPointer
3. (optional) swap the order of PqSendStart == PqSendPointer and len >= PqSendBufferSize

I made all of the above changes, and they seem to have resolved the regression.
Since the previous results were measured over Unix sockets, here are the v3 results over Unix sockets for comparison.
I'm sharing only the case where all messages are 100 bytes, since that is where the regression was most visible.

row size = 100 bytes, # of rows = 1000000
┌───────────┬────────────┬──────┬──────┬──────┬──────┬──────┐
│           │ 1400 bytes │ 2KB  │ 4KB  │ 8KB  │ 16KB │ 32KB │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ HEAD      │ 1106       │ 1006 │ 947  │ 920  │ 899  │ 888  │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ patch     │ 1094       │ 997  │ 943  │ 913  │ 894  │ 881  │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ no buffer │ 6389       │ 6195 │ 6214 │ 6271 │ 6325 │ 6211 │
└───────────┴────────────┴──────┴──────┴──────┴──────┴──────┘
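
For readers following along, the three changes discussed above can be sketched roughly as below. This is only an illustration under stated assumptions, not the v3 patch itself: the names PqSendStart, PqSendPointer, PqSendBufferSize, internal_flush and internal_flush_buffer come from this thread, while the fixed 8KB buffer, the bytes_sent/send_calls counters and the body of internal_flush_buffer (which pretends the socket accepts everything at once) are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEND_BUFFER_SIZE 8192	/* assumed size, for this sketch only */

static char PqSendBuffer[SEND_BUFFER_SIZE];
static size_t PqSendBufferSize = SEND_BUFFER_SIZE;
static size_t PqSendStart = 0;		/* next byte to send */
static size_t PqSendPointer = 0;	/* next free byte in the buffer */

/* Hypothetical counters standing in for real socket writes. */
static size_t bytes_sent = 0;
static size_t send_calls = 0;

/* Stand-in for the socket write: pretends [*start, *end) is sent at once. */
static inline int
internal_flush_buffer(const char *buf, size_t *start, size_t *end)
{
	(void) buf;
	bytes_sent += *end - *start;
	send_calls++;
	*start = *end = 0;
	return 0;
}

/* Suggestion 1: mark the small wrapper "inline". */
static inline int
internal_flush(void)
{
	return internal_flush_buffer(PqSendBuffer, &PqSendStart, &PqSendPointer);
}

static int
internal_putbytes(const char *s, size_t len)
{
	while (len > 0)
	{
		size_t		amount;

		assert(PqSendPointer <= PqSendBufferSize);

		/* Buffer full: flush it before doing anything else. */
		if (PqSendPointer >= PqSendBufferSize)
			if (internal_flush())
				return -1;

		/*
		 * Suggestions 2 and 3: test the cheap length condition first, and
		 * compare the two offsets directly instead of calling
		 * pq_is_send_pending().  If the buffer is empty and the data is at
		 * least one buffer long, send it without copying.
		 */
		if (len >= PqSendBufferSize && PqSendStart == PqSendPointer)
		{
			size_t		start = 0;
			size_t		end = len;

			if (internal_flush_buffer(s, &start, &end))
				return -1;
			return 0;
		}

		/* Otherwise copy as much as fits into the buffer. */
		amount = PqSendBufferSize - PqSendPointer;
		if (amount > len)
			amount = len;
		memcpy(PqSendBuffer + PqSendPointer, s, amount);
		PqSendPointer += amount;
		s += amount;
		len -= amount;
	}
	return 0;
}
```

In this sketch a 100-byte message is only copied into the buffer, while a message larger than the buffer first tops up and flushes whatever is pending and then goes out directly from the caller's memory.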

On Thu, 21 Mar 2024 at 00:57, David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 15 Mar 2024 at 01:46, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> - the "(int *) &len)" cast is not ok, and will break visibly on
> big-endian systems where sizeof(int) != sizeof(size_t).

I think fixing this requires adjusting the signature of
internal_flush_buffer() to use size_t instead of int.   That also
means that PqSendStart and PqSendPointer must also become size_t, or
internal_flush() must add local size_t variables to pass to
internal_flush_buffer and assign these back again to the global after
the call.  Upgrading the globals might be the cleaner option.

David

This is done too.
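
To illustrate why the cast Heikki pointed out is a problem, here is a small sketch, assuming a platform where size_t is wider than int. The helper names first_int_of_size_t, host_is_little_endian and flush_range are hypothetical and are not part of the patch. memcpy is used to read the leading bytes so that the example itself stays free of aliasing issues.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns the first sizeof(int) bytes of a size_t interpreted as an int,
 * which is what dereferencing "(int *) &len" would effectively read.
 */
static int
first_int_of_size_t(size_t len)
{
	int			v = 0;

	memcpy(&v, &len, sizeof(int));
	return v;
}

/* Returns 1 if the least significant byte is stored first. */
static int
host_is_little_endian(void)
{
	unsigned int x = 1;
	unsigned char b;

	memcpy(&b, &x, 1);
	return b == 1;
}

/*
 * With size_t pointers in the signature the caller can pass &len directly.
 * Stand-in body: pretends the whole range [*start, *end) was sent.
 */
static int
flush_range(const char *buf, size_t *start, size_t *end)
{
	(void) buf;
	*start = *end = 0;
	return 0;
}
```

Where size_t is 8 bytes and int is 4, the leading 4 bytes hold the low-order half of the value on little-endian machines and the high-order half on big-endian ones, so a small length such as 100 would be read as 0 on the latter. Using size_t throughout avoids the cast altogether.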

I actually tried to test this over a real network for a while. However, I couldn't get reliable enough numbers with either HEAD or the patch due to network-related issues.
I've decided to go with Jelte's suggestion [1], which is to decrease the MTU of the loopback interface to 1500 and use localhost.

Here are the results:

1- row size = 100 bytes, # of rows = 1000000
┌───────────┬────────────┬───────┬───────┬───────┬───────┬───────┐
│           │ 1400 bytes │ 2KB   │ 4KB   │ 8KB   │ 16KB  │ 32KB  │
├───────────┼────────────┼───────┼───────┼───────┼───────┼───────┤
│ HEAD      │ 1351       │ 1233  │ 1074  │ 988   │ 944   │ 916   │
├───────────┼────────────┼───────┼───────┼───────┼───────┼───────┤
│ patch     │ 1369       │ 1232  │ 1073  │ 981   │ 928   │ 907   │
├───────────┼────────────┼───────┼───────┼───────┼───────┼───────┤
│ no buffer │ 14949      │ 14533 │ 14791 │ 14864 │ 14612 │ 14751 │
└───────────┴────────────┴───────┴───────┴───────┴───────┴───────┘

2- row size = half of the rows are 1KB and the rest are 10KB, # of rows = 1000000
┌───────────┬────────────┬───────┬───────┬───────┬───────┬───────┐
│           │ 1400 bytes │ 2KB   │ 4KB   │ 8KB   │ 16KB  │ 32KB  │
├───────────┼────────────┼───────┼───────┼───────┼───────┼───────┤
│ HEAD      │ 37212      │ 31372 │ 25520 │ 21980 │ 20311 │ 18864 │
├───────────┼────────────┼───────┼───────┼───────┼───────┼───────┤
│ patch     │ 23006      │ 23127 │ 23147 │ 22229 │ 20367 │ 19155 │
├───────────┼────────────┼───────┼───────┼───────┼───────┼───────┤
│ no buffer │ 30725      │ 31090 │ 30917 │ 30796 │ 30984 │ 30813 │
└───────────┴────────────┴───────┴───────┴───────┴───────┴───────┘

3- row size = half of the rows are 1KB and the rest are 1MB, # of rows = 1000
┌───────────┬────────────┬──────┬──────┬──────┬──────┬──────┐
│           │ 1400 bytes │ 2KB  │ 4KB  │ 8KB  │ 16KB │ 32KB │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ HEAD      │ 4296       │ 3713 │ 3040 │ 2711 │ 2528 │ 2449 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ patch     │ 2401       │ 2411 │ 2404 │ 2374 │ 2395 │ 2408 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ no buffer │ 2399       │ 2403 │ 2408 │ 2389 │ 2402 │ 2403 │
└───────────┴────────────┴──────┴──────┴──────┴──────┴──────┘

4- row size = all rows are 1MB, # of rows = 1000
┌───────────┬────────────┬──────┬──────┬──────┬──────┬──────┐
│           │ 1400 bytes │ 2KB  │ 4KB  │ 8KB  │ 16KB │ 32KB │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ HEAD      │ 8335       │ 7370 │ 6017 │ 5368 │ 5009 │ 4843 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ patch     │ 4711       │ 4722 │ 4708 │ 4693 │ 4724 │ 4717 │
├───────────┼────────────┼──────┼──────┼──────┼──────┼──────┤
│ no buffer │ 4704       │ 4712 │ 4746 │ 4728 │ 4709 │ 4730 │
└───────────┴────────────┴──────┴──────┴──────┴──────┴──────┘
 

[1]

Thanks,
--
Melih Mutlu
Microsoft
