Thread: request for database identifier in the startup packet

request for database identifier in the startup packet

From

Dave Cramer

Date:

09 May 2024, 12:06:11

Greetings,

The JDBC driver is currently keeping a per connection cache of types in the driver. We are seeing cases where the number of columns is quite high. In one case (pgjdbc/pgjdbc issue #3241 on github.com, "Prevent fetchFieldMetaData() from being run when unnecessary") there were 2.6 million columns.

If we knew that we were connecting to the same database we could use a single cache across connections.

I think we would require a server/database identifier in the startup message.

Dave Cramer

Re: request for database identifier in the startup packet

From

"David G. Johnston"

Date:

09 May 2024, 13:55:27

On Thursday, May 9, 2024, Dave Cramer <davecramer@gmail.com> wrote:

> Greetings,
>
> The JDBC driver is currently keeping a per connection cache of types in the driver. We are seeing cases where the number of columns is quite high. In one case (pgjdbc/pgjdbc issue #3241 on github.com, "Prevent fetchFieldMetaData() from being run when unnecessary") there were 2.6 million columns.
>
> If we knew that we were connecting to the same database we could use a single cache across connections.
>
> I think we would require a server/database identifier in the startup message.

I feel like pgbouncer ruins this plan.

But maybe you can construct a lookup key from some combination of data provided by these functions:

https://www.postgresql.org/docs/current/functions-info.html#FUNCTIONS-INFO-SESSION

David J.
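
One way to read the lookup-key suggestion is sketched below. It assumes the driver has already fetched the results of `current_database()`, `inet_server_addr()` and `inet_server_port()` (all listed on the linked page) over the connection. The `DatabaseKey` class and its field names are hypothetical and not part of pgjdbc. Whether this combination identifies a database uniquely enough is an open question; for example, the two `inet_server_*` functions return NULL on Unix-domain socket connections.

```java
import java.util.Objects;

// Hypothetical value class: a cache lookup key assembled from values the
// server reports about itself. Not pgjdbc API.
final class DatabaseKey {
    private final String serverAddr;  // inet_server_addr(); null on Unix sockets
    private final Integer serverPort; // inet_server_port(); null on Unix sockets
    private final String database;    // current_database()

    DatabaseKey(String serverAddr, Integer serverPort, String database) {
        this.serverAddr = serverAddr;
        this.serverPort = serverPort;
        this.database = Objects.requireNonNull(database, "database");
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof DatabaseKey)) {
            return false;
        }
        DatabaseKey other = (DatabaseKey) o;
        return Objects.equals(serverAddr, other.serverAddr)
                && Objects.equals(serverPort, other.serverPort)
                && database.equals(other.database);
    }

    @Override
    public int hashCode() {
        return Objects.hash(serverAddr, serverPort, database);
    }

    @Override
    public String toString() {
        return serverAddr + ":" + serverPort + "/" + database;
    }
}
```

Two connections that report the same three values produce equal keys and can therefore look up the same cache entry.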

Re: request for database identifier in the startup packet

From

Robert Haas

Date:

09 May 2024, 16:22:37

On Thu, May 9, 2024 at 8:06 AM Dave Cramer <davecramer@gmail.com> wrote:
> The JDBC driver is currently keeping a per connection cache of types in the driver. We are seeing cases where the number of columns is quite high. In one case (pgjdbc/pgjdbc issue #3241 on github.com, "Prevent fetchFieldMetaData() from being run when unnecessary") there were 2.6 million columns.
>
> If we knew that we were connecting to the same database we could use a single cache across connections.
>
> I think we would require a server/database identifier in the startup message.

I understand the desire to share the cache, but not why that would
require any kind of change to the wire protocol.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: request for database identifier in the startup packet

From

Dave Cramer

Date:

09 May 2024, 18:20:49

On Thu, 9 May 2024 at 12:22, Robert Haas <robertmhaas@gmail.com> wrote:

> On Thu, May 9, 2024 at 8:06 AM Dave Cramer <davecramer@gmail.com> wrote:
> > The JDBC driver is currently keeping a per connection cache of types in the driver. We are seeing cases where the number of columns is quite high. In one case (pgjdbc/pgjdbc issue #3241 on github.com, "Prevent fetchFieldMetaData() from being run when unnecessary") there were 2.6 million columns.
> >
> > If we knew that we were connecting to the same database we could use a single cache across connections.
> >
> > I think we would require a server/database identifier in the startup message.
>
> I understand the desire to share the cache, but not why that would
> require any kind of change to the wire protocol.

The server identity is actually useful for many things, such as knowing which instance of a cluster you are connected to.

For the cache, however, we can't use the IP address to determine which server we are connected to, as we could be connected to a pooler.

Knowing exactly which server/database we are connected to makes it relatively easy to have a common cache across connections. The startup message seems like a good place to get that.

Dave
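
The common cache described above could be sketched roughly as follows, assuming some identifier string for the server/database is available by whatever means. The `SharedTypeCache` class, the `forDatabase` method and the OID-to-type-name map are illustrative assumptions, not the pgjdbc implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical process-wide registry: one type cache per database identifier,
// shared by every connection that reports the same identifier.
final class SharedTypeCache {
    private static final Map<String, Map<Integer, String>> CACHES =
            new ConcurrentHashMap<>();

    private SharedTypeCache() {
    }

    // Returns the cache (type OID -> type name) for the given identifier,
    // creating it atomically on first use.
    static Map<Integer, String> forDatabase(String databaseId) {
        return CACHES.computeIfAbsent(databaseId, id -> new ConcurrentHashMap<>());
    }
}
```

Every connection would call `forDatabase` once after it learns the identifier. Connections that go through a pooler to the same database then share one map, while connections to a different database get a separate one.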

Re: request for database identifier in the startup packet

From

Andres Freund

Date:

09 May 2024, 19:14:55

Hi,

On 2024-05-09 14:20:49 -0400, Dave Cramer wrote:
> On Thu, 9 May 2024 at 12:22, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Thu, May 9, 2024 at 8:06 AM Dave Cramer <davecramer@gmail.com> wrote:
> > > The JDBC driver is currently keeping a per connection cache of types in
> > > the driver. We are seeing cases where the number of columns is quite high.
> > > In one case (pgjdbc/pgjdbc issue #3241 on github.com, "Prevent
> > > fetchFieldMetaData() from being run when unnecessary") there were
> > > 2.6 million columns.
> > >
> > > If we knew that we were connecting to the same database we could use a
> > > single cache across connections.
> > >
> > > I think we would require a server/database identifier in the startup
> > > message.
> >
> > I understand the desire to share the cache, but not why that would
> > require any kind of change to the wire protocol.
> >
> The server identity is actually useful for many things such as knowing
> which instance of a cluster you are connected to.
> For the cache however we can't use the IP address to determine which server
> we are connected to as we could be connected to a pooler.
> Knowing exactly which server/database makes it relatively easy to have a
> common cache across connections. Getting that in the startup message seems
> like a good place

ISTM that you could just as well query the information you'd like after
connecting. And that's going to be a lot more flexible than having to have
precisely the right information in the startup message, and most clients not
needing it.

Greetings,

Andres Freund

Re: request for database identifier in the startup packet

From

Robert Haas

Date:

09 May 2024, 19:19:00

On Thu, May 9, 2024 at 3:14 PM Andres Freund <andres@anarazel.de> wrote:
> ISTM that you could just as well query the information you'd like after
> connecting. And that's going to be a lot more flexible than having to have
> precisely the right information in the startup message, and most clients not
> needing it.

I agree with this.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: request for database identifier in the startup packet

From

Dave Cramer

Date:

09 May 2024, 19:33:40

On Thu, 9 May 2024 at 15:19, Robert Haas <robertmhaas@gmail.com> wrote:

> On Thu, May 9, 2024 at 3:14 PM Andres Freund <andres@anarazel.de> wrote:
> > ISTM that you could just as well query the information you'd like after
> > connecting. And that's going to be a lot more flexible than having to have
> > precisely the right information in the startup message, and most clients not
> > needing it.
>
> I agree with this.

Well, other than the extra round trip.

Thanks,

Dave

Re: request for database identifier in the startup packet

From

Robert Haas

Date:

09 May 2024, 19:39:20

On Thu, May 9, 2024 at 3:33 PM Dave Cramer <davecramer@gmail.com> wrote:
> On Thu, 9 May 2024 at 15:19, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Thu, May 9, 2024 at 3:14 PM Andres Freund <andres@anarazel.de> wrote:
>> > ISTM that you could just as well query the information you'd like after
>> > connecting. And that's going to be a lot more flexible than having to have
>> > precisely the right information in the startup message, and most clients not
>> > needing it.
>>
>> I agree with this.
>>
> Well other than the extra round trip.

I mean, sure, but we can't avoid that for everyone for everything.
There might be some way of doing something like this with, for
example, the infrastructure that was proposed to dynamically add stuff
to the list of PGC_REPORT GUCs, if the values you need are GUCs
already, or were made so. But I think it's just not workable to
unconditionally add a bunch of things to the startup packet. It'll
just grow and grow.

--
Robert Haas
EDB: http://www.enterprisedb.com
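
For context on the reported-GUC route: values of reported parameters reach the client as ParameterStatus messages, which consist of a type byte `S`, a 4-byte big-endian length that counts itself but not the type byte, and two NUL-terminated strings (name, then value). A minimal sketch of reading one such message follows. The `ParameterStatusParser` class is hypothetical, UTF-8 is assumed for the strings, and the example in the test uses the existing `server_version` parameter. No database identifier parameter exists today.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical helper: decodes a single ParameterStatus message
// ('S', int32 length, name\0, value\0) from a buffer.
final class ParameterStatusParser {
    private ParameterStatusParser() {
    }

    static Map.Entry<String, String> parse(ByteBuffer buf) {
        byte type = buf.get();
        if (type != 'S') {
            throw new IllegalArgumentException("not a ParameterStatus message");
        }
        int lengthStart = buf.position();
        int length = buf.getInt();       // counts itself, not the type byte
        int end = lengthStart + length;  // first position after this message
        String name = readCString(buf, end);
        String value = readCString(buf, end);
        return new SimpleImmutableEntry<>(name, value);
    }

    // Reads bytes up to the next NUL terminator, assuming UTF-8 text.
    private static String readCString(ByteBuffer buf, int end) {
        int start = buf.position();
        while (buf.position() < end) {
            if (buf.get() == 0) {
                byte[] bytes = new byte[buf.position() - 1 - start];
                ByteBuffer copy = buf.duplicate();
                copy.position(start);
                copy.get(bytes);
                return new String(bytes, StandardCharsets.UTF_8);
            }
        }
        throw new IllegalArgumentException("unterminated string");
    }
}
```

If an identifier were ever exposed as a reported parameter, a driver could pick it up from these messages during connection startup without issuing an extra query.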

Re: request for database identifier in the startup packet

From

Dave Cramer

Date:

09 May 2024, 19:51:57

On Thu, 9 May 2024 at 15:39, Robert Haas <robertmhaas@gmail.com> wrote:

> On Thu, May 9, 2024 at 3:33 PM Dave Cramer <davecramer@gmail.com> wrote:
> > On Thu, 9 May 2024 at 15:19, Robert Haas <robertmhaas@gmail.com> wrote:
> >> On Thu, May 9, 2024 at 3:14 PM Andres Freund <andres@anarazel.de> wrote:
> >> > ISTM that you could just as well query the information you'd like after
> >> > connecting. And that's going to be a lot more flexible than having to have
> >> > precisely the right information in the startup message, and most clients not
> >> > needing it.
> >>
> >> I agree with this.
> >>
> > Well other than the extra round trip.
>
> I mean, sure, but we can't avoid that for everyone for everything.
> There might be some way of doing something like this with, for
> example, the infrastructure that was proposed to dynamically add stuff
> to the list of PGC_REPORT GUCs, if the values you need are GUCs
> already, or were made so. But I think it's just not workable to
> unconditionally add a bunch of things to the startup packet. It'll
> just grow and grow.

I don't think this is unconditional. These are real-world situations where having this information is useful.

That said, adding fields every time I ask for them would make the startup packet grow uncontrollably. This seems like a decent discussion to have with others.

Dave