Thread: Proposal to provide the facility to set binary format output for specific OID's per session
From
Dave Cramer
Date:
Greetings,
Jack Christensen, the author of the Go pgx driver, suggested that default result formats should be settable per session; see "Default result formats should be settable per session", Discussion #5 in postgresql-interfaces/enhancement-ideas (github.com).
The JDBC driver has a similar problem and defers switching to binary format until a statement has been reused 5 times; at which point we create a named prepared statement and incur the overhead of an extra round trip for the DESCRIBE statement. Because the extra round trip generally negates any performance enhancements that receiving the data in binary format may provide, we avoid using binary and receive everything in text format until we are sure the extra trip is worth it.
Connection pools further complicate the issue: we can't use named statements with connection pools, since a connection is not bound to a particular client. For that reason, in the JDBC driver we recommend turning off the ability to create named statements, and with it binary formats.
As a proof of concept I provide the attached patch which implements the ability to specify which oids will be returned in binary format per session.
For example: set format_binary='20,21,25'
After that, the specified OIDs are returned in binary format even when there is no Describe step, and even when using the simple query protocol.
Both the JDBC driver and the go driver can exploit this change with no changes. I haven't confirmed if other drivers would work without changes.
Furthermore, the benchmark at jackc/postgresql_simple_protocol_binary_format_bench (github.com) suggests that there is a considerable performance benefit. To quote: 'At 100 rows the text format takes 48% longer than the binary format.'
Regards,
Dave Cramer
Attachment
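The behaviour described in the message above can be sketched roughly as follows. This is an illustration only, not the attached patch (which is C code in the backend): the function names `parse_format_binary` and `result_formats` are hypothetical, and the only details taken from the thread are the comma-separated OID list and the idea that listed types come back in binary while everything else stays text.

```python
TEXT_FORMAT = 0    # wire-protocol format code for text
BINARY_FORMAT = 1  # wire-protocol format code for binary


def parse_format_binary(value):
    """Parse a comma-separated OID list such as '20,21,25' (hypothetical helper)."""
    oids = set()
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty entries such as a trailing comma
        oid = int(item)  # raises ValueError on non-numeric input
        if not 0 < oid <= 0xFFFFFFFF:
            raise ValueError("OID out of range: %d" % oid)
        oids.add(oid)
    return frozenset(oids)


def result_formats(column_type_oids, binary_oids):
    """Pick a format code per result column from the session's OID set."""
    return [BINARY_FORMAT if oid in binary_oids else TEXT_FORMAT
            for oid in column_type_oids]
```

With the setting from the example, a result with columns of type OIDs 23, 20, 25 and 1184 would get binary output only for the second and third columns.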
Re: Proposal to provide the facility to set binary format output for specific OID's per session
From
Kyotaro Horiguchi
Date:
At Fri, 22 Jul 2022 11:00:18 -0400, Dave Cramer <davecramer@gmail.com> wrote in
> As a proof of concept I provide the attached patch which implements the
> ability to specify which oids will be returned in binary format per
> session.
...
> Both the JDBC driver and the go driver can exploit this change with no
> changes. I haven't confirmed if other drivers would work without changes.
I'm not sure about the needs of that, but binary exchange format is
not the one that can be turned on ignoring the peer's capability. If
JDBC driver wants some types be sent in binary format, it seems to be
able to be specified in bind message.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: Proposal to provide the facility to set binary format output for specific OID's per session
From
Dave Cramer
Date:
On Sun, 24 Jul 2022 at 23:02, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> At Fri, 22 Jul 2022 11:00:18 -0400, Dave Cramer <davecramer@gmail.com> wrote in
>> As a proof of concept I provide the attached patch which implements the
>> ability to specify which oids will be returned in binary format per
>> session.
> ...
>> Both the JDBC driver and the go driver can exploit this change with no
>> changes. I haven't confirmed if other drivers would work without changes.
> I'm not sure about the needs of that, but binary exchange format is
> not the one that can be turned on ignoring the peer's capability.
I'm not sure what this means. The client is specifying which types it wants in binary format.
> If
> JDBC driver wants some types be sent in binary format, it seems to be
> able to be specified in bind message.
To be clear it's not just the JDBC client; the original idea came from the author of go driver.
And yes you can specify it in the bind message but you have to specify it in *every* bind message which pretty much negates any advantage you might get out of binary format due to the extra round trip.
Regards,
Dave
> regards.
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
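To make the exchange above concrete, here is a minimal sketch of how result format codes travel in an extended-protocol Bind message. The layout follows the PostgreSQL frontend/backend protocol; the function name `build_bind` is made up for illustration, and parameters are assumed to be sent in text format. The point is that the format codes are a field of each individual Bind message rather than a property of the session.

```python
import struct


def build_bind(portal, statement, params, result_formats):
    """Assemble a Bind message (hypothetical helper; params are text-format bytes or None)."""
    body = portal + b"\x00" + statement + b"\x00"
    body += struct.pack("!h", 0)            # no parameter format codes: all text
    body += struct.pack("!h", len(params))  # number of parameter values
    for value in params:
        if value is None:
            body += struct.pack("!i", -1)   # -1 length marks NULL
        else:
            body += struct.pack("!i", len(value)) + value
    body += struct.pack("!h", len(result_formats))
    for code in result_formats:
        body += struct.pack("!h", code)     # 0 = text, 1 = binary
    # The length field counts itself (4 bytes) but not the type byte.
    return b"B" + struct.pack("!i", len(body) + 4) + body
```

A single result format code applies to every column. Requesting binary for only some types means listing one code per column, which requires knowing the result column types first; that is the information the Describe step supplies.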
Re: Proposal to provide the facility to set binary format output for specific OID's per session
From
Jack Christensen
Date:
On Mon, Jul 25, 2022 at 4:57 AM Dave Cramer <davecramer@gmail.com> wrote:
> On Sun, 24 Jul 2022 at 23:02, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>> At Fri, 22 Jul 2022 11:00:18 -0400, Dave Cramer <davecramer@gmail.com> wrote in
>>> As a proof of concept I provide the attached patch which implements the
>>> ability to specify which oids will be returned in binary format per
>>> session.
>> ...
>>> Both the JDBC driver and the go driver can exploit this change with no
>>> changes. I haven't confirmed if other drivers would work without changes.
>> I'm not sure about the needs of that, but binary exchange format is
>> not the one that can be turned on ignoring the peer's capability.
> I'm not sure what this means. The client is specifying which types it wants in binary format.
>> If
>> JDBC driver wants some types be sent in binary format, it seems to be
>> able to be specified in bind message.
> To be clear it's not just the JDBC client; the original idea came from the author of go driver.
> And yes you can specify it in the bind message but you have to specify it in *every* bind message which pretty much negates any advantage you might get out of binary format due to the extra round trip.
> Regards,
> Dave
>> regards.
>> --
>> Kyotaro Horiguchi
>> NTT Open Source Software Center
The advantage is to be able to use the binary format with only a single network round trip in cases where prepared statements are not possible. e.g. when using PgBouncer. Using the simple protocol with this patch lets users of pgx (the Go driver mentioned above) and PgBouncer use the binary format. The performance gains can be significant especially with types such as timestamptz that are very slow to parse.
As far as only sending binary types that the client can understand, the client driver would call `set format_binary` at the beginning of the session.
Jack Christensen
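As a rough illustration of the timestamptz point in the message above, the sketch below decodes the same value from both formats. The binary representation (a big-endian 64-bit count of microseconds since 2000-01-01 00:00:00 UTC) is PostgreSQL's; the function names are hypothetical, and the text decoder assumes ISO DateStyle, whole seconds, and an hour-only UTC offset such as `+00`. Infinite timestamps are ignored. It is not a benchmark and makes no claim about how pgx or the JDBC driver implement this.

```python
import struct
from datetime import datetime, timedelta, timezone

# PostgreSQL counts timestamps from this epoch rather than the Unix epoch.
PG_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)


def decode_timestamptz_binary(data):
    """Binary format: one fixed-width integer, no string parsing."""
    (micros,) = struct.unpack("!q", data)
    return PG_EPOCH + timedelta(microseconds=micros)


def decode_timestamptz_text(text):
    """Text format, e.g. '2022-07-25 10:07:00+00' (assumed shape, see above)."""
    # Pad the hour-only offset to '+HHMM' so that %z accepts it.
    return datetime.strptime(text + "00", "%Y-%m-%d %H:%M:%S%z")
```

The binary path is a single unpack plus an addition, while the text path has to tokenize and validate each date field, which is in line with the observation that such types are slow to parse.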
Re: Proposal to provide the facility to set binary format output for specific OID's per session
From
Joe Conway
Date:
On 7/25/22 10:07, Jack Christensen wrote:
> The advantage is to be able to use the binary format with only a single
> network round trip in cases where prepared statements are not possible.
> e.g. when using PgBouncer. Using the simple protocol with this patch
> lets users of pgx (the Go driver mentioned above) and PgBouncer use the
> binary format. The performance gains can be significant especially with
> types such as timestamptz that are very slow to parse.
>
> As far as only sending binary types that the client can understand, the
> client driver would call `set format_binary` at the beginning of the
> session.
+1 makes a lot of sense to me.
Dave please add this to the open commitfest (202209)
--
Joe Conway
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Proposal to provide the facility to set binary format output for specific OID's per session
From
Sehrope Sarkuni
Date:
Idea here makes sense and I've seen this brought up repeatedly on the JDBC lists.
Does the driver need to be aware that this SET command was executed? I'm wondering what happens if an end user executes this with an OID the driver does not actually know how to handle.
> + Oid *tmpOids = palloc(length+1);
> ...
> + tmpOids = repalloc(tmpOids, length+1);
These should be: sizeof(Oid) * (length + 1)
Also, I think you need to specify an explicit context via MemoryContextAlloc or the allocated memory will be in the default context and released at the end of the command.
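The size problem pointed out in the review above comes down to simple arithmetic: palloc and repalloc take a size in bytes, and a PostgreSQL Oid is a 4-byte unsigned integer. The sketch below only illustrates that arithmetic; the function names are hypothetical, and `ctypes.c_uint32` stands in for the C `Oid` type.

```python
import ctypes

# Stand-in for sizeof(Oid); Oid is a 4-byte unsigned integer in PostgreSQL.
OID_SIZE = ctypes.sizeof(ctypes.c_uint32)


def bytes_requested_by_patch(length):
    """What palloc(length+1) asks for: length+1 bytes, not elements."""
    return length + 1


def bytes_needed(length):
    """What an array of length+1 Oids needs: sizeof(Oid) * (length + 1)."""
    return OID_SIZE * (length + 1)
```

For the three OIDs in the example setting '20,21,25', the first version of the patch would request 4 bytes where 16 are needed, so writes past the first element would overrun the allocation.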
Re: Proposal to provide the facility to set binary format output for specific OID's per session
From
Dave Cramer
Date:
Hi Sehrope,
On Mon, 25 Jul 2022 at 17:22, Sehrope Sarkuni <sehrope@jackdb.com> wrote:
> Idea here makes sense and I've seen this brought up repeatedly on the JDBC lists.
> Does the driver need to be aware that this SET command was executed? I'm wondering what happens if an end user executes this with an OID the driver does not actually know how to handle.
I suppose there would be a failure to read the attribute correctly.
>> + Oid *tmpOids = palloc(length+1);
>> ...
>> + tmpOids = repalloc(tmpOids, length+1);
> These should be: sizeof(Oid) * (length + 1)
Yes they should, thanks!
> Also, I think you need to specify an explicit context via MemoryContextAlloc or the allocated memory will be in the default context and released at the end of the command.
Also good catch
Thanks,
Dave
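The failure mode discussed above (a binary value arriving for a type the driver cannot decode) might look like the following on the client side. This is a hypothetical sketch, not code from pgx or the JDBC driver: the decoder table and `decode_field` are invented for illustration, UTF-8 client encoding is assumed, and it is an assumption that the driver learns the format code of each column from the RowDescription message.

```python
import struct

# Binary decoders the imaginary driver knows about, keyed by type OID.
BINARY_DECODERS = {
    20: lambda b: struct.unpack("!q", b)[0],  # int8: 8-byte big-endian
    21: lambda b: struct.unpack("!h", b)[0],  # int2: 2-byte big-endian
    25: lambda b: b.decode("utf-8"),          # text: raw bytes (UTF-8 assumed)
}


def decode_field(type_oid, format_code, data):
    """Decode one column value given its type OID and format code."""
    if format_code == 0:
        # Text format: hand back the string and let the caller convert it.
        return data.decode("utf-8")
    decoder = BINARY_DECODERS.get(type_oid)
    if decoder is None:
        # An end user set format_binary for a type this driver cannot read.
        raise ValueError("no binary decoder for type OID %d" % type_oid)
    return decoder(data)
```

A driver that checks the format code this way can at least fail with a clear error instead of misreading the bytes as text.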
Re: Proposal to provide the facility to set binary format output for specific OID's per session
From
Dave Cramer
Date:
Hi Sehrope,
On Mon, 25 Jul 2022 at 17:53, Dave Cramer <davecramer@gmail.com> wrote:
> Hi Sehrope,
> On Mon, 25 Jul 2022 at 17:22, Sehrope Sarkuni <sehrope@jackdb.com> wrote:
>> Idea here makes sense and I've seen this brought up repeatedly on the JDBC lists.
>> Does the driver need to be aware that this SET command was executed? I'm wondering what happens if an end user executes this with an OID the driver does not actually know how to handle.
> I suppose there would be a failure to read the attribute correctly.
>>> + Oid *tmpOids = palloc(length+1);
>>> ...
>>> + tmpOids = repalloc(tmpOids, length+1);
>> These should be: sizeof(Oid) * (length + 1)
> Yes they should, thanks!
>> Also, I think you need to specify an explicit context via MemoryContextAlloc or the allocated memory will be in the default context and released at the end of the command.
> Also good catch
> Thanks,
Attached patch to correct these deficiencies.
Thanks again,
Dave
Attachment
Re: Proposal to provide the facility to set binary format output for specific OID's per session
From
Justin Pryzby
Date:
On Tue, Jul 26, 2022 at 08:11:04AM -0400, Dave Cramer wrote:
> Attached patch to correct these deficiencies.
You sent a patch to be applied on top of the first patch, but cfbot doesn't
know that, so it says the patch doesn't apply.
http://cfbot.cputube.org/dave-cramer.html
BTW, a previous discussion about this idea is here:
https://www.postgresql.org/message-id/flat/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com
--
Justin
Re: Proposal to provide the facility to set binary format output for specific OID's per session
From
Dave Cramer
Date:
On Fri, 5 Aug 2022 at 17:51, Justin Pryzby <pryzby@telsasoft.com> wrote:
> On Tue, Jul 26, 2022 at 08:11:04AM -0400, Dave Cramer wrote:
>> Attached patch to correct these deficiencies.
> You sent a patch to be applied on top of the first patch, but cfbot doesn't
> know that, so it says the patch doesn't apply.
> http://cfbot.cputube.org/dave-cramer.html
> BTW, a previous discussion about this idea is here:
> https://www.postgresql.org/message-id/flat/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com
squashed patch attached
Dave
Attachment
Re: Proposal to provide the facility to set binary format output for specific OID's per session
From
Ibrar Ahmed
Date:
On Fri, Aug 12, 2022 at 5:48 PM Dave Cramer <davecramer@gmail.com> wrote:
> On Fri, 5 Aug 2022 at 17:51, Justin Pryzby <pryzby@telsasoft.com> wrote:
>> On Tue, Jul 26, 2022 at 08:11:04AM -0400, Dave Cramer wrote:
>>> Attached patch to correct these deficiencies.
>> You sent a patch to be applied on top of the first patch, but cfbot doesn't
>> know that, so it says the patch doesn't apply.
>> http://cfbot.cputube.org/dave-cramer.html
>> BTW, a previous discussion about this idea is here:
>> https://www.postgresql.org/message-id/flat/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com
> squashed patch attached
> Dave
The patch does not apply successfully; a rebase is required.
=== applying patch ./0001-add-format_binary.patch
patching file src/backend/tcop/postgres.c
Hunk #1 succeeded at 97 (offset -8 lines).
patching file src/backend/tcop/pquery.c
patching file src/backend/utils/init/globals.c
patching file src/backend/utils/misc/guc.c
Hunk #1 succeeded at 144 (offset 1 line).
Hunk #2 succeeded at 244 with fuzz 2 (offset 1 line).
Hunk #3 succeeded at 4298 (offset -1 lines).
Hunk #4 FAILED at 12906.
1 out of 4 hunks FAILED -- saving rejects to file src/backend/utils/misc/guc.c.rej
patching file src/include/miscadmin.h
Ibrar Ahmed
Re: Proposal to provide the facility to set binary format output for specific OID's per session
From
Dave Cramer
Date:
On Tue, 6 Sept 2022 at 02:30, Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:
> On Fri, Aug 12, 2022 at 5:48 PM Dave Cramer <davecramer@gmail.com> wrote:
>> On Fri, 5 Aug 2022 at 17:51, Justin Pryzby <pryzby@telsasoft.com> wrote:
>>> On Tue, Jul 26, 2022 at 08:11:04AM -0400, Dave Cramer wrote:
>>>> Attached patch to correct these deficiencies.
>>> You sent a patch to be applied on top of the first patch, but cfbot doesn't
>>> know that, so it says the patch doesn't apply.
>>> http://cfbot.cputube.org/dave-cramer.html
>>> BTW, a previous discussion about this idea is here:
>>> https://www.postgresql.org/message-id/flat/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com
>> squashed patch attached
>> Dave
> The patch does not apply successfully; a rebase is required.
> === applying patch ./0001-add-format_binary.patch
> patching file src/backend/tcop/postgres.c
> Hunk #1 succeeded at 97 (offset -8 lines).
> patching file src/backend/tcop/pquery.c
> patching file src/backend/utils/init/globals.c
> patching file src/backend/utils/misc/guc.c
> Hunk #1 succeeded at 144 (offset 1 line).
> Hunk #2 succeeded at 244 with fuzz 2 (offset 1 line).
> Hunk #3 succeeded at 4298 (offset -1 lines).
> Hunk #4 FAILED at 12906.
> 1 out of 4 hunks FAILED -- saving rejects to file src/backend/utils/misc/guc.c.rej
> patching file src/include/miscadmin.h
Thanks,
New rebased patch attached
Dave
Attachment
Re: Proposal to provide the facility to set binary format output for specific OID's per session
From
Dave Cramer
Date:
Waiting on the author to do what? I'm waiting for a review.
Re: Proposal to provide the facility to set binary format output for specific OID's per session
From
Ian Lawrence Barwick
Date:
On Tue, 6 Sep 2022 at 21:32, Dave Cramer <davecramer@gmail.com> wrote:
> On Tue, 6 Sept 2022 at 02:30, Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:
>> On Fri, Aug 12, 2022 at 5:48 PM Dave Cramer <davecramer@gmail.com> wrote:
>>> On Fri, 5 Aug 2022 at 17:51, Justin Pryzby <pryzby@telsasoft.com> wrote:
>>>> On Tue, Jul 26, 2022 at 08:11:04AM -0400, Dave Cramer wrote:
>>>> > Attached patch to correct these deficiencies.
>>>>
>>>> You sent a patch to be applied on top of the first patch, but cfbot doesn't
>>>> know that, so it says the patch doesn't apply.
>>>> http://cfbot.cputube.org/dave-cramer.html
>>>>
>>>> BTW, a previous discussion about this idea is here:
>>>> https://www.postgresql.org/message-id/flat/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com
>>>
>>> squashed patch attached
>>>
>>> Dave
>>
>> The patch does not apply successfully; a rebase is required.
>>
>> === applying patch ./0001-add-format_binary.patch
>> patching file src/backend/tcop/postgres.c
>> Hunk #1 succeeded at 97 (offset -8 lines).
>> patching file src/backend/tcop/pquery.c
>> patching file src/backend/utils/init/globals.c
>> patching file src/backend/utils/misc/guc.c
>> Hunk #1 succeeded at 144 (offset 1 line).
>> Hunk #2 succeeded at 244 with fuzz 2 (offset 1 line).
>> Hunk #3 succeeded at 4298 (offset -1 lines).
>> Hunk #4 FAILED at 12906.
>> 1 out of 4 hunks FAILED -- saving rejects to file src/backend/utils/misc/guc.c.rej
>> patching file src/include/miscadmin.h
>
> Thanks,
>
> New rebased patch attached
Hi
cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is
currently underway, this would be an excellent time to update the patch again.
[1] http://cfbot.cputube.org/patch_40_3777.log
Thanks
Ian Barwick
Re: Proposal to provide the facility to set binary format output for specific OID's per session
From
Dave Cramer
Date:
Hi Ian,
Thanks, will do
Dave Cramer
On Thu, 3 Nov 2022 at 21:36, Ian Lawrence Barwick <barwick@gmail.com> wrote:
> Hi
> cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is
> currently underway, this would be an excellent time to update the patch again.
> [1] http://cfbot.cputube.org/patch_40_3777.log
> Thanks
> Ian Barwick