Thread: Roadmap for FE/BE protocol redesign

Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
This is an attempt to lay out a road map for updating the frontend/backend
protocol in 7.4.  I don't at this point want to get into details on any
one of the TODO items, just get consensus that this is the set of tasks
to be tackled.  Are there any areas that I've missed (that require
protocol changes)?


* Extend ERROR and NOTICE messages to carry multiple fields, including
(as appropriate) a spec-compliant error code, a textual position in the
original query string, information about the source-code location where
the error was reported, etc.

* Consider extending NOTIFY messages to allow a parameter to be carried.

* Fix COPY protocol to allow graceful error recovery (more graceful than
aborting the connection, at least) and to support COPY BINARY to/from
frontend.

* Redesign fastpath function call protocol to eliminate the problems cited
in the source code comments (src/backend/tcop/fastpath.c), and to
eliminate the security hole of accepting unchecked internal representation
from frontend.  Also consider a fastpath for execution of PREPAREd queries.

* Re-institute type-specific send/receive conversion routines to allow
some modest amount of architecture independence for binary data.  This'd
provide a place to check for bogus internal representation during fastpath
input and COPY BINARY IN, too, thus alleviating security concerns.

* Get rid of hardwired field sizes in StartupPacket --- use
variable-length null-terminated strings.  Fixes problem with usernames
being limited to 32 characters, gets around unreasonable limitation on
PGOPTIONS length.  Also can remove unused fields.

* Backend should pass its version number, database encoding, default
client encoding, and possibly other data (any ideas?) to frontend during
startup, to avoid need for explicit queries to get this info.  We could
also consider eliminating SET commands sent by libpq in favor of adding
variable settings to startup packet's PGOPTIONS field.  Ideally we could
get back to the point where a standard connection startup takes only one
packet in each direction.

* Backend's ReadyForQuery message (Z message) should carry an indication
of current transaction status (idle/in transaction/in aborted transaction)
so that frontend need not guess at state.  Perhaps also indicate
autocommit status.  (Is there anything else that frontends would Really
Like To Know?)

* XML support?  If we do anything, I'd want some extensible solution to
allowing multiple query-result output formats from the backend, not an
XML-specific hack.  For one thing, that would allow the actual appearance
of any XML support to happen later.


One of the $64 questions that has to be answered is how much work we're
willing to expend on backwards compatibility.  The path of least
resistance would be to handle it the same way we've done protocol
revisions in the past: the backend will be able to handle both old and new
protocols (so it can talk to old clients) but libpq would be revised to
speak only the new protocol (so new/recompiled clients couldn't talk to
old backends).  We've gotten away with this approach in the past, but the
last time was release 6.4.  I fully expect to hear more complaints now.

One way to tamp down expectations of client backwards compatibility
would be to call the release 8.0 instead of 7.4 ;-)

Comments?
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Justin Clift
Date:
Tom Lane wrote:
<snip>
> One way to tamp down expectations of client backwards compatibility
> would be to call the release 8.0 instead of 7.4 ;-)
> 
> Comments?

Actually, I've been thinking about the numbering of the next PostgreSQL 
version for a few days now.

The scenario that's appealing to me the most is this for the next release:

PostgreSQL 8.0
**************

+ Includes PITR and the Win32 port

+ Not sure where Satoshi is up to with his 2 phase commit proposal, but 
that might make sense to incorporate into a wire protocol revision. From memory he received funding to work on it, so
itmight be coming 
 
along nicely.

+ Other things optional of course.


Personally, I'd rather we go for PostgreSQL 8.0, waiting a while extra 
for PITR and Win32 if needed, and also properly co-ordinate all of the 
release process information (website updates, package builds, Announce 
to the mailing lists and news sources).

Regards and best wishes,

Justin Clift

-- 
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi



Re: Roadmap for FE/BE protocol redesign

From
Rod Taylor
Date:
> + Not sure where Satoshi is up to with his 2 phase commit proposal, but
> that might make sense to incorporate into a wire protocol revision.
>  From memory he received funding to work on it, so it might be coming
> along nicely.

One should note that his protocol changes had absolutely nothing to do
with 2 phase commits -- but were used as a marker to direct replication.

We may want to consider leaving some space for a server / server style
communication (Cluster ID, etc.)

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

Re: Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Tom Lane wrote:
> * Backend should pass its version number, database encoding, default
> client encoding, and possibly other data (any ideas?) to frontend during
> startup, to avoid need for explicit queries to get this info.  We could
> also consider eliminating SET commands sent by libpq in favor of adding
> variable settings to startup packet's PGOPTIONS field.  Ideally we could
> get back to the point where a standard connection startup takes only one
> packet in each direction.

Should we pass this in a way where we can add stuff later, like passing
it as a simple NULL-terminated string that can get split up on the
client end.

> One of the $64 questions that has to be answered is how much work we're
> willing to expend on backwards compatibility.  The path of least
> resistance would be to handle it the same way we've done protocol
> revisions in the past: the backend will be able to handle both old and new
> protocols (so it can talk to old clients) but libpq would be revised to
> speak only the new protocol (so new/recompiled clients couldn't talk to
> old backends).  We've gotten away with this approach in the past, but the
> last time was release 6.4.  I fully expect to hear more complaints now.

I think such compatibility is sufficient.  We can mention in the
releases notes that they should upgrade there servers before their
clients.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Justin Clift <justin@postgresql.org> writes:
> The scenario that's appealing to me the most is this for the next release:
> PostgreSQL 8.0
> + Includes PITR and the Win32 port

If the folks doing those things can get done in time, great.  I'm even
willing to push out the release schedule (now, not later) to make it
more likely they can get done.  What I'm not willing to do is define
the release in terms of "it happens when these things are done".  We
learned the folly of that approach in 7.1 and 7.2.  Setting a target
date and sticking to it works *much* better.

> + Not sure where Satoshi is up to with his 2 phase commit proposal, but 
> that might make sense to incorporate into a wire protocol revision. 

I can't see any need for protocol-level support for such a thing.
Why wouldn't it just be some more SQL commands?

(Not that I believe in 2PC as a real-world solution anyway, but that's
a different argument...)
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Rod Taylor
Date:
> * Backend's ReadyForQuery message (Z message) should carry an indication
> of current transaction status (idle/in transaction/in aborted transaction)
> so that frontend need not guess at state.  Perhaps also indicate
> autocommit status.  (Is there anything else that frontends would Really
> Like To Know?)

Could it include transaction depth with the assumption nested
transactions will arrive at some point?

> * XML support?  If we do anything, I'd want some extensible solution to
> allowing multiple query-result output formats from the backend, not an
> XML-specific hack.  For one thing, that would allow the actual appearance
> of any XML support to happen later.

> One of the $64 questions that has to be answered is how much work we're
> willing to expend on backwards compatibility.  The path of least
> resistance would be to handle it the same way we've done protocol
> revisions in the past: the backend will be able to handle both old and new
> protocols (so it can talk to old clients) but libpq would be revised to
> speak only the new protocol (so new/recompiled clients couldn't talk to
> old backends).  We've gotten away with this approach in the past, but the
> last time was release 6.4.  I fully expect to hear more complaints now.

I wouldn't worry about backward compatibility complaints too much BUT
I'd be tempted to make a startup packet that will allow libpq to revert
back to old protocols easily enough for the future so that we can do
incremental changes to the protocol.

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> * Backend should pass its version number, database encoding, default
>> client encoding, and possibly other data (any ideas?) to frontend during
>> startup, to avoid need for explicit queries to get this info.

> Should we pass this in a way where we can add stuff later, like passing
> it as a simple NULL-terminated string that can get split up on the
> client end.

Yeah, I was envisioning something with multiple labeled fields so that
more stuff can be added later without a protocol change (likewise for
StartupPacket and ErrorMessage).  But again, I don't want this thread to
get into any details about specific tasks --- let's try to get a view of
the forest before we start hewing down individual trees...


>> We've gotten away with this approach in the past, but the
>> last time was release 6.4.  I fully expect to hear more complaints now.

> I think such compatibility is sufficient.  We can mention in the
> releases notes that they should upgrade there servers before their
> clients.

I'd be really happy if we can make that stick.  There's enough work to
be done here without trying to develop a multiprotocol version of
libpq.

It would be good to hear some words from the JDBC and ODBC developers
about what sort of plans they'd have for updating those interfaces.
        regards, tom lane


Re: 7.4 vs 8.0 WAS Roadmap for FE/BE protocol redesign

From
Robert Treat
Date:
On Mon, 2003-03-10 at 14:05, Justin Clift wrote: 
> Tom Lane wrote:
> <snip>
> > One way to tamp down expectations of client backwards compatibility
> > would be to call the release 8.0 instead of 7.4 ;-)
> > 
> > Comments?
> 
> Actually, I've been thinking about the numbering of the next PostgreSQL 
> version for a few days now.
> 
> The scenario that's appealing to me the most is this for the next release:
> 
> PostgreSQL 8.0
> **************
> 
> + Includes PITR and the Win32 port
> 
> + Not sure where Satoshi is up to with his 2 phase commit proposal, but 
> that might make sense to incorporate into a wire protocol revision. 
>  From memory he received funding to work on it, so it might be coming 
> along nicely.
> 
> + Other things optional of course.
> 
> 
> Personally, I'd rather we go for PostgreSQL 8.0, waiting a while extra 
> for PITR and Win32 if needed, and also properly co-ordinate all of the 
> release process information (website updates, package builds, Announce 
> to the mailing lists and news sources).
> 
I don't think PITR or Win32 (or even replication) warrant an 8.0, since
none of those should effect client/server interaction and/or backward
compatibility. (Or at least not as much as schema support did, which
required most "adminy" apps to be worked over) 

A protocol change however, would warrant a version number bump IMHO. I
would guess that by the time all of the protocol changes could be
completed, we'd have win32 or pitr, so it will hopefully be moot. 

Robert Treat 




Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Rod Taylor <rbt@rbt.ca> writes:
> I'd be tempted to make a startup packet that will allow libpq to revert
> back to old protocols easily enough for the future so that we can do=20
> incremental changes to the protocol.

We already have that: you send a startup packet with a version less than
the latest, and the backend speaks that version to you.

One thing I want to do though is relax the protocol-level constraints
on certain message contents: for example, if ErrorMessage becomes a
collection of labeled fields, it should be possible to add new field
types without calling it a protocol revision.  The protocol need only
specify "ignore any fields whose label you do not recognize".
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Rod Taylor
Date:
On Mon, 2003-03-10 at 14:30, Tom Lane wrote:
> Rod Taylor <rbt@rbt.ca> writes:
> > I'd be tempted to make a startup packet that will allow libpq to revert
> > back to old protocols easily enough for the future so that we can do=20
> > incremental changes to the protocol.
>
> We already have that: you send a startup packet with a version less than
> the latest, and the backend speaks that version to you.

Yes, but that requires you know the backend is less than the latest.

If you send version A, and the backend responds don't know A, but I know
A - 2, then libpq may want to try speaking A - 2.

> types without calling it a protocol revision.  The protocol need only
> specify "ignore any fields whose label you do not recognize".

This is probably just as good, if it's done for both sides.

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

Re: Roadmap for FE/BE protocol redesign

From
"Merlin Moncure"
Date:
Justin Clift wrote:
>
> PostgreSQL 8.0
> **************
>
> + Includes PITR and the Win32 port
*snip*

I feel like the upcoming 7.4 is the most important release since the
introduction of toast, maybe even since the introduction of the sql
language.  I wholeheartedly agree with your proposition.

Merlin


Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Rod Taylor <rbt@rbt.ca> writes:
>> We already have that: you send a startup packet with a version less than
>> the latest, and the backend speaks that version to you.

> Yes, but that requires you know the backend is less than the latest.

As opposed to knowing what?  You send the version number you wish to speak;
either the backend can handle it, or not.
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Ashley Cambrell
Date:
Tom Lane wrote:

>This is an attempt to lay out a road map for updating the frontend/backend
>protocol in 7.4.  I don't at this point want to get into details on any
>one of the TODO items, just get consensus that this is the set of tasks
>to be tackled.  Are there any areas that I've missed (that require
>protocol changes)?
>
>  
>
What about binding variables ala oracle's ociparse -> ocibindbyname -> 
ociexecute -> ocifetch ?  I know you can do most of it via SQL 
(PREPARE/EXECUTE) but you can't do 'RETURN x INTO :x' as it stands.  
This would also get around the problem of getting values from newly 
inserted rows (eg PKs) without resorting to OIDs.  Not entirely a FE/BE 
issue... but worth considering in any redesign.

Ashley Cambrell




Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Bruce Badger
Date:
On Tue, 2003-03-11 at 06:52, Tom Lane wrote:
> Rod Taylor <rbt@rbt.ca> writes:
> >> We already have that: you send a startup packet with a version less than
> >> the latest, and the backend speaks that version to you.
> 
> > Yes, but that requires you know the backend is less than the latest.
> 
> As opposed to knowing what?  You send the version number you wish to speak;
> either the backend can handle it, or not.

If the backend can not handle the version I request, but can handle a
prior version, I'd like to know.  I am planning on having handlers for
multiple protocol versions in the same memory space (I'm using
Smalltalk, BTW) so that one application can talk to various databases of
various vintages.

I suppose that the client can just keep retrying the connection with
different versions until it gets a match, though.



Re: Roadmap for FE/BE protocol redesign

From
Rod Taylor
Date:
On Mon, 2003-03-10 at 14:52, Tom Lane wrote:
> Rod Taylor <rbt@rbt.ca> writes:
> >> We already have that: you send a startup packet with a version less than
> >> the latest, and the backend speaks that version to you.
>
> > Yes, but that requires you know the backend is less than the latest.
>
> As opposed to knowing what?  You send the version number you wish to speak;
> either the backend can handle it, or not.

At some point PostgreSQL will have enough users that changing it will
piss them off.  If the backend cannot handle whats been requested, we
may want to consider negotiating a protocol that both can handle.

Anyway, it doesn't really affect me one way or the other.  So whatever
you like is probably fine.
--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Bruce Badger <bruce_badger@badgerse.com> writes:
> I suppose that the client can just keep retrying the connection with
> different versions until it gets a match, though.

Right now, the backend just barfs withFATAL: unsupported frontend protocol
so you have to do a blind search to see what it will take.

It would probably be a good idea to improve that message to mention
exactly what range of protocol versions the backend does support.
However, this will do you little good for talking to existing backends
:-(
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Neil Conway
Date:
On Mon, 2003-03-10 at 16:37, Ashley Cambrell wrote:
> This would also get around the problem of getting values from newly 
> inserted rows (eg PKs) without resorting to OIDs.

That's not a problem: ensure that the newly inserted row has a SERIAL
column, and use currval().

Cheers,

Neil

-- 
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC





Re: Roadmap for FE/BE protocol redesign

From
Ashley Cambrell
Date:
Neil Conway wrote:<br /><blockquote cite="mid1047333552.357.286.camel@tokyo" type="cite"><pre wrap="">On Mon,
2003-03-10at 16:37, Ashley Cambrell wrote: </pre><blockquote type="cite"><pre wrap="">This would also get around the
problemof getting values from newly 
 
inserted rows (eg PKs) without resorting to OIDs.   </pre></blockquote><pre wrap="">
That's not a problem: ensure that the newly inserted row has a SERIAL
column, and use currval(). </pre></blockquote> Ok.  I forget about that.  Even so, it would still be nice to have a
bindlike interface... :-)<br /><br /> Ashley Cambrell<br /><br /> 

Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> On Mon, 2003-03-10 at 16:37, Ashley Cambrell wrote:
>> This would also get around the problem of getting values from newly 
>> inserted rows (eg PKs) without resorting to OIDs.

> That's not a problem: ensure that the newly inserted row has a SERIAL
> column, and use currval().

There was some talk awhile back of inventing INSERT ... RETURNING and
UPDATE ... RETURNING commands so that you could pass back computed
values to the frontend without an extra query.  It doesn't seem to have
gotten further than a TODO item yet, though.  AFAICS this does not need
a protocol extension, anyway --- it'd look just the same as a SELECT
at the protocol level.
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Tom Lane wrote:
> Justin Clift <justin@postgresql.org> writes:
> > The scenario that's appealing to me the most is this for the next release:
> > PostgreSQL 8.0
> > + Includes PITR and the Win32 port
> 
> If the folks doing those things can get done in time, great.  I'm even
> willing to push out the release schedule (now, not later) to make it
> more likely they can get done.  What I'm not willing to do is define
> the release in terms of "it happens when these things are done".  We
> learned the folly of that approach in 7.1 and 7.2.  Setting a target
> date and sticking to it works *much* better.

Well, we had that 7.4 discussion a few days ago, and only had two people
comment on our scheduling.  At that time, the discussion was for 7.4
beta starting May 1 vs June 1.  June 1 would be roughly six months from
our 7.3 final release, which is typical.

I agree with Tom that we should decided sooner rather than later on a
beta date.  Right now I don't think we have enough to must-have features
to justify a release, and as everyone knows, an upgrade isn't easy for
our users.

I also agree with Tom that we shouldn't peg our beta schedule on specific
features.

So, what should we do?  Should we go another month or two and just wait
until we have enough must-have features?  While not waiting on specific
features, it _is_ waiting for something to warrant a release.  I guess
the big question is whether we release on a scheduled-basis or a
enough-features-basis.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Beta Schedule (was Re: Roadmap for FE/BE protocol redesign)

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> So, what should we do?  Should we go another month or two and just wait
> until we have enough must-have features?  While not waiting on specific
> features, it _is_ waiting for something to warrant a release.  I guess
> the big question is whether we release on a scheduled-basis or a
> enough-features-basis.

"Enough features" is such a judgment call that no one can predict what
the schedule will be, if that's part of the decision.

I had been leaning to May 1 beta, but am happy to switch to June 1 if
you feel that makes an improvement in the odds of completing the Windows
port.  (I think it will also improve the odds of finishing this protocol
stuff I've taken on...)  I don't want to see it pushed further than that
without good concrete arguments for doing so.
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] 
> Sent: 10 March 2003 18:51
> To: pgsql-hackers@postgresql.org; pgsql-interfaces@postgresql.org
> Subject: [HACKERS] Roadmap for FE/BE protocol redesign
> 
> 
> This is an attempt to lay out a road map for updating the 
> frontend/backend protocol in 7.4.  I don't at this point want 
> to get into details on any one of the TODO items, just get 
> consensus that this is the set of tasks to be tackled.  Are 
> there any areas that I've missed (that require protocol changes)?

What about the addition of pg_attribute.attrelid &
pg_attribute.attname/attnum in RowDesription messages to identify the
underlying attribute (where appropriate)?

I just looked at an issue in the ODBC driver where this would help - it
seems we return incorrect column metadata from query results, I guess (I
haven't look at the code yet) because it's somewhat tricky to figure
out.

Regards, Dave.

Re: Beta Schedule (was Re: Roadmap for FE/BE protocol redesign)

From
Joe Conway
Date:
Tom Lane wrote:
> "Enough features" is such a judgment call that no one can predict what
> the schedule will be, if that's part of the decision.
> 
> I had been leaning to May 1 beta, but am happy to switch to June 1 if
> you feel that makes an improvement in the odds of completing the Windows
> port.  (I think it will also improve the odds of finishing this protocol
> stuff I've taken on...)  I don't want to see it pushed further than that
> without good concrete arguments for doing so.
> 

FWIW, if we're voting, I'd say:

1. on "firm date" vs "when feature x is ready": I vote "firm date" -- I 
know that last August the firm beta start date motivated me to get some 
things done that I would have dragged out without the scheduled cutoff.

2. June 1 beta cutoff sounds about right.

Joe



Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
"Dave Page" <dpage@vale-housing.co.uk> writes:
> What about the addition of pg_attribute.attrelid &
> pg_attribute.attname/attnum in RowDesription messages to identify the
> underlying attribute (where appropriate)?

Well, we can talk about it, but I still think that any frontend that
relies on such information is broken by design.  (And if that means the
JDBC spec is broken, then the JDBC spec is broken.)

Just to start with, if I do "SELECT * FROM view", am I going to see the
info associated with the view column, or with the hypothetical
underlying table column?  (Actually, didn't I already make a list of a
bunch of ways in which this concept is underspecified?  AFAIR, you
didn't suggest answers to any of those questions ... but we need answers
to all of them if we are going to implement the feature.)
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Tom Lane wrote:
> "Dave Page" <dpage@vale-housing.co.uk> writes:
> > What about the addition of pg_attribute.attrelid &
> > pg_attribute.attname/attnum in RowDesription messages to identify the
> > underlying attribute (where appropriate)?
> 
> Well, we can talk about it, but I still think that any frontend that
> relies on such information is broken by design.  (And if that means the
> JDBC spec is broken, then the JDBC spec is broken.)
> 
> Just to start with, if I do "SELECT * FROM view", am I going to see the
> info associated with the view column, or with the hypothetical
> underlying table column?  (Actually, didn't I already make a list of a
> bunch of ways in which this concept is underspecified?  AFAIR, you
> didn't suggest answers to any of those questions ... but we need answers
> to all of them if we are going to implement the feature.)

I was willing to add a hack to enable default column labels to be
"table.column" --- that seemed like the least obtrusive.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I was willing to add a hack to enable default column labels to be
> "table.column" --- that seemed like the least obtrusive.

Most of the definitional issues still apply: which table name are you
going to insert, and under what conditions?

If we're revising the protocol, there's no reason to hack up the column
label to carry two pieces of info; it'd be cleaner to provide a separate
slot in the T message to carry the table name.  I just want to see a
reasonably complete spec for what the feature is supposed to do, before
we buy into it ...
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Justin Clift
Date:
Bruce Momjian wrote:
<snip>
> So, what should we do?  Should we go another month or two and just wait
> until we have enough must-have features?  While not waiting on specific
> features, it _is_ waiting for something to warrant a release.  I guess
> the big question is whether we release on a scheduled-basis or a
> enough-features-basis.

Hmmm, I feel we should decide on features that will make an 8.0 release 
meaningful, and *somehow* work to making sure they are ready for the 
release.

With 7.1/7.2, Tom mentioned us being delayed because specific features 
we were waiting for became dependant on one person.

Would it be feasible to investigate approaches for having the Win32 and 
PITR work be shared amongst a few very-interested volunteers, so that 
people can cover for each other's downtime?  Not sure of the 
confidentiality level of the Win32/PITR patches at present, but I'd 
guess there would be at least a few solid volunteers willing to 
contribute to the Win32/PITR ports if we asked for people to step forwards.

:-)

Regards and best wishes,

Justin Clift

-- 
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi



Re: Beta Schedule (was Re: Roadmap for FE/BE protocol redesign)

From
"Christopher Kings-Lynne"
Date:
> I had been leaning to May 1 beta, but am happy to switch to June 1 if
> you feel that makes an improvement in the odds of completing the Windows
> port.  (I think it will also improve the odds of finishing this protocol
> stuff I've taken on...)  I don't want to see it pushed further than that
> without good concrete arguments for doing so.

There really is no rush...

I'm well-versed in PostgreSQL, and even I haven't upgraded some of our
production servers to even 7.3 yet (thanks to pg_dump dependency
nightmare)..

BTW, so no-one conflicts, I'm doing up pg_get_triggerdef(oid) at the moment.
Once that's done, I'll be able to submit a redone psql \d output that
includes the trigger definition.

Cheers,

Chris



Re: Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I was willing to add a hack to enable default column labels to be
> > "table.column" --- that seemed like the least obtrusive.
> 
> Most of the definitional issues still apply: which table name are you
> going to insert, and under what conditions?
> 
> If we're revising the protocol, there's no reason to hack up the column
> label to carry two pieces of info; it'd be cleaner to provide a separate
> slot in the T message to carry the table name.  I just want to see a
> reasonably complete spec for what the feature is supposed to do, before
> we buy into it ...

I don't think we can get a complete spec, and hence the _hack_ idea.  :-)

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Roadmap for FE/BE protocol redesign

From
Justin Clift
Date:
Tom Lane wrote:
> "Dave Page" <dpage@vale-housing.co.uk> writes:
> 
>>What about the addition of pg_attribute.attrelid &
>>pg_attribute.attname/attnum in RowDesription messages to identify the
>>underlying attribute (where appropriate)?
> 
> Well, we can talk about it, but I still think that any frontend that
> relies on such information is broken by design.  (And if that means the
> JDBC spec is broken, then the JDBC spec is broken.)
> 
> Just to start with, if I do "SELECT * FROM view", am I going to see the
> info associated with the view column, or with the hypothetical
> underlying table column?  (Actually, didn't I already make a list of a
> bunch of ways in which this concept is underspecified?  AFAIR, you
> didn't suggest answers to any of those questions ... but we need answers
> to all of them if we are going to implement the feature.)

The problem Dave is suggesting this as a first attempt at a solution for 
is that with ODBC, a frontend (i.e. OpenOffice) asks the ODBC driver 
which columns are NULLable, etc.  And the ODBC driver is getting the 
info wrong, then passing back the incorrect info.

So, when a person goes to insert a row into a table with a 
SERIAL/SEQUENCE based column, OpenOffice has been told the column isn't 
NULLable and forces the user to enter a value.  Voila, it doesn't work 
with sequences.  :(

It's likely possible to add to the ODBC driver some way of getting the 
info right, but Dave is also looking for a way of making this easier 
into the future for similar problems.  i.e. Let the database explicitly 
have info about what each column can do.

That's my understanding of it anyway.

:-)

Regards and best wishes,

Justin Clift


>             regards, tom lane


-- 
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi



Re: Roadmap for FE/BE protocol redesign

From
Justin Clift
Date:
Hi guys,

As a thought, has anyone considered if it's worth doing data compression 
of the "new proposed" protocol for PostgreSQL 8.0/7.4?  It was suggested 
a long time ago by Joshua Drake (and his version was well accepted by 
his customers from what I heard), so might this be worth adding too?

:-)

Regards and best wishes,

Justin Clift

-- 
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi



Re: Beta schedule (was Roadmap for FE/BE protocol redesign)

From
Tom Lane
Date:
Justin Clift <justin@postgresql.org> writes:
> With 7.1/7.2, Tom mentioned us being delayed because specific features 
> we were waiting for became dependant on one person.

> Would it be feasible to investigate approaches for having the Win32 and 
> PITR work be shared amongst a few very-interested volunteers, so that 
> people can cover for each other's downtime?

It would certainly be good to bring as much manpower to bear on those
problems as we can.  But that doesn't really address my concern: if the
schedule is defined as "we go beta when feature X is done", then no one
who's working on stuff other than feature X knows how to plan their
time.  The only fair way to run the project is "we go beta at time T";
that way everyone knows what they need to shoot for and can plan
accordingly.

I don't mind setting the planned time T on the basis of what we think
it will take for certain popular feature X's to be done.  But if the
guys working on X aren't done at T, it's not fair to everyone else to
hold our breaths waiting for them to be done at T-plus-who-knows-what.

I don't really have any sympathy for the argument that "it won't be a
compelling release if we don't have feature X".  If the release isn't
compelling for someone, they don't have to upgrade; they can wait for
the next release.  The folks who *are* eager for what's been gotten done
will be glad of having a release now rather than N months from now.
And do I need to point out that "it runs on Windoze" is not of
earth-shattering importance for everyone?
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Justin Clift
Date:
Tom Lane wrote:
> Justin Clift <justin@postgresql.org> writes:
> 
>>The scenario that's appealing to me the most is this for the next release:
>>PostgreSQL 8.0
>>+ Includes PITR and the Win32 port
> 
> If the folks doing those things can get done in time, great.  I'm even
> willing to push out the release schedule (now, not later) to make it
> more likely they can get done.  What I'm not willing to do is define
> the release in terms of "it happens when these things are done".  We
> learned the folly of that approach in 7.1 and 7.2.  Setting a target
> date and sticking to it works *much* better.

Yep, we both seem to be saying that we'd like these features, but we 
don't want to see them become delay-points.


>>+ Not sure where Satoshi is up to with his 2 phase commit proposal, but 
>>that might make sense to incorporate into a wire protocol revision. 
> 
> I can't see any need for protocol-level support for such a thing.
> Why wouldn't it just be some more SQL commands?

Not sure.  It seems like 2PC will be required/desirable within the year 
for better support of some clustering scenarios, so we "might as well 
look at it now".  When I was reading Satoshi's stuff a while ago I 
thought it was a protcol level thing, not a SQL command level thing, but 
don't really care either way.  :)

Regards and best wishes,

Justin Clift


> (Not that I believe in 2PC as a real-world solution anyway, but that's
> a different argument...)
>
>             regards, tom lane


-- 
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi



Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:
It's rumoured that Tom Lane once said:
> "Dave Page" <dpage@vale-housing.co.uk> writes:
>> What about the addition of pg_attribute.attrelid &
>> pg_attribute.attname/attnum in RowDesription messages to identify the
>> underlying attribute (where appropriate)?
>
> Well, we can talk about it, but I still think that any frontend that
> relies on such information is broken by design.  (And if that means the
> JDBC spec is broken, then the JDBC spec is broken.)

I don't know about JDBC, but ODBC could use it, and it would save a heck
of a lot of pain in apps like pgAdmin that need to figure out if a column
in an arbitrary resultset might be updateable.
At the moment there is some nasty code in pgAdmin II that attempts to
parse the SQL statement to figure out if the the resultset is updateable
by trying to figure out the number of relations in the query, whether any
of them is a view or sequence, whether there are any function calls or
expressions in the attribute list and so on. It then has to try to figure
out if there is a complete pkey in the resultset that can be used for the
update, or whether it should attempt an update based on all existing
values. That code is just plain nasty in VB. In pgAdmin III we've already
mentioned stealing bits of the PostgreSQL parser.
The addition of the base column identifier (the pg_attribute.oid would
have sufficed, but I can live with attrelid, attname, or even nspnam,
relname & attname or similar) would make this trivil, and allow interfaces
like ODBC, JDBC, OLE-DB and Npgsql to gain easy access to any metadata
they might require.
> Just to start with, if I do "SELECT * FROM view", am I going to see the
> info associated with the view column, or with the hypothetical
> underlying table column?

The view. We don't care where the data originally came from, only where it
came from as far as the query creating the resultset is concerned. Of
course, updateable views would make this irrelevant anyway...
(Actually, didn't I already make a list of a
> bunch of ways in which this concept is underspecified?  AFAIR, you
> didn't suggest answers to any of those questions ... but we need
> answers to all of them if we are going to implement the feature.)

I guess I must have missed that thread.

Regards, Dave




Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:
It's rumoured that Bruce Momjian once said:
> Tom Lane wrote:
>> "Dave Page" <dpage@vale-housing.co.uk> writes:
>> > What about the addition of pg_attribute.attrelid &
>> > pg_attribute.attname/attnum in RowDesription messages to identify
>> > the underlying attribute (where appropriate)?
>>
>> Well, we can talk about it, but I still think that any frontend that
>> relies on such information is broken by design.  (And if that means
>> the JDBC spec is broken, then the JDBC spec is broken.)
>>
>> Just to start with, if I do "SELECT * FROM view", am I going to see
>> the info associated with the view column, or with the hypothetical
>> underlying table column?  (Actually, didn't I already make a list of a
>> bunch of ways in which this concept is underspecified?  AFAIR, you
>> didn't suggest answers to any of those questions ... but we need
>> answers to all of them if we are going to implement the feature.)
>
> I was willing to add a hack to enable default column labels to be
> "table.column" --- that seemed like the least obtrusive.

That would help, but not in the cases that cause the most grief - for
example when the column has been aliased in the original query - that
should override the label of course, but then we still need to parse the
SQL at the client to figure out what's going on.
Regards, Dave.




Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:
It's rumoured that Bruce Momjian once said:
> Tom Lane wrote:
>> Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> > I was willing to add a hack to enable default column labels to be
>> > "table.column" --- that seemed like the least obtrusive.
>>
>> Most of the definitional issues still apply: which table name are you
>> going to insert, and under what conditions?
>>
>> If we're revising the protocol, there's no reason to hack up the
>> column label to carry two pieces of info; it'd be cleaner to provide a
>> separate slot in the T message to carry the table name.  I just want
>> to see a reasonably complete spec for what the feature is supposed to
>> do, before we buy into it ...
>
> I don't think we can get a complete spec, and hence the _hack_ idea.
> :-)

Well, what would constitute a complete spec? I think I've told the group
what I would like to be able to do, what unanswered questions can I
(hopefully :-) ) answer?
Regards, Dave.




Re: Roadmap for FE/BE protocol redesign

From
Hiroshi Inoue
Date:
Tom Lane wrote:
> 
> > I think such compatibility is sufficient.  We can mention in the
> > releases notes that they should upgrade there servers before their
> > clients.
> 
> I'd be really happy if we can make that stick.  There's enough work to
> be done here without trying to develop a multiprotocol version of
> libpq.
> 
> It would be good to hear some words from the JDBC and ODBC developers
> about what sort of plans they'd have for updating those interfaces.

Psqlodbc driver couldn't use the library unless
the library could handle multiple protocol.

What the driver has suffered from is to get the
fields' info of a query result or the parameters'
info of a statement. The info is needed even before
the execution of the statement(i.e it's only prepared).

regards,
Hiroshi Inouehttp://www.geocities.jp/inocchichichi/psqlodbc/


Re: Beta schedule (was Roadmap for FE/BE protocol redesign)

From
Bruce Momjian
Date:
I agree, let's not wait for specific features.  The issue was whether we
had enough significant features done for a release --- I didn't think we
did, so I am saying, let's get more features, rather than let's get
feature X.

As we fill in missing features, there will be less must-have features to
add, so we are left with continuing with our present release pace or
releasing less frequently with the same number of feature additions.

---------------------------------------------------------------------------

Tom Lane wrote:
> Justin Clift <justin@postgresql.org> writes:
> > With 7.1/7.2, Tom mentioned us being delayed because specific features 
> > we were waiting for became dependant on one person.
> 
> > Would it be feasible to investigate approaches for having the Win32 and 
> > PITR work be shared amongst a few very-interested volunteers, so that 
> > people can cover for each other's downtime?
> 
> It would certainly be good to bring as much manpower to bear on those
> problems as we can.  But that doesn't really address my concern: if the
> schedule is defined as "we go beta when feature X is done", then no one
> who's working on stuff other than feature X knows how to plan their
> time.  The only fair way to run the project is "we go beta at time T";
> that way everyone knows what they need to shoot for and can plan
> accordingly.
> 
> I don't mind setting the planned time T on the basis of what we think
> it will take for certain popular feature X's to be done.  But if the
> guys working on X aren't done at T, it's not fair to everyone else to
> hold our breaths waiting for them to be done at T-plus-who-knows-what.
> 
> I don't really have any sympathy for the argument that "it won't be a
> compelling release if we don't have feature X".  If the release isn't
> compelling for someone, they don't have to upgrade; they can wait for
> the next release.  The folks who *are* eager for what's been gotten done
> will be glad of having a release now rather than N months from now.
> And do I need to point out that "it runs on Windoze" is not of
> earth-shattering importance for everyone?
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
> 
> http://archives.postgresql.org
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Roadmap for FE/BE protocol redesign

From
"Merlin Moncure"
Date:
Justin Clift wrote:
> confidentiality level of the Win32/PITR patches at present, but I'd
> guess there would be at least a few solid volunteers willing to
> contribute to the Win32/PITR ports if we asked for people to step
> forwards.

I'd like to help.  I've been following the list for several months now.
I'd like to submit a patch or two and get a little active if I can cut
the mustard, so to speak.  If not, I can always supply a little testing
and benchmarking.  I have some experience writing backends, GIS, xml,
and all manner of database.  Unfortunately I have very low familiarity
with unix tools, except cvs.

Is there a non cygwin version of the source that will compile (or not)
on a win32 machine?  That would be a great place to start.

Merlin




Re: Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Sure, Neil Conway updated Jan's patches for 7.3.  It is in:
ftp://candle.pha.pa.us/pub/postgresql/mypatches/

---------------------------------------------------------------------------

Merlin Moncure wrote:
> Justin Clift wrote:
> > confidentiality level of the Win32/PITR patches at present, but I'd
> > guess there would be at least a few solid volunteers willing to
> > contribute to the Win32/PITR ports if we asked for people to step
> > forwards.
> 
> I'd like to help.  I've been following the list for several months now.
> I'd like to submit a patch or two and get a little active if I can cut
> the mustard, so to speak.  If not, I can always supply a little testing
> and benchmarking.  I have some experience writing backends, GIS, xml,
> and all manner of database.  Unfortunately I have very low familiarity
> with unix tools, except cvs.
> 
> Is there a non cygwin version of the source that will compile (or not)
> on a win32 machine?  That would be a great place to start.
> 
> Merlin
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> What the driver has suffered from is to get the
> fields' info of a query result or the parameters'
> info of a statement. The info is needed even before
> the execution of the statement(i.e it's only prepared).

Hm.  Are you saying that you would like PREPARE to send back a
RowDescription ('T') message?  Or is there more to it than that?
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
"Dave Page" <dpage@vale-housing.co.uk> writes:
> Well, what would constitute a complete spec? I think I've told the group
> what I would like to be able to do, what unanswered questions can I
> (hopefully :-) ) answer?

I'm still unclear on exactly what your needs are.  In the first place,
are you expecting to obtain data from arbitrary SELECT statements, or
only from statements of the form "SELECT * FROM single_table"?  You've
also been confusing as to whether you want transparency of views (ie,
does a select from a view return data about the view's nominal columns
or about the underlying base table columns?).  What about cases
involving aggregates or grouping --- there may be simple Vars in the
target list, but they can hardly be thought to represent updatable values.

Also, you muttered something about inferring primary keys and number of
relations in query; seems like this feature isn't solving your problem
as far as that goes, because the set of attributes visible in the target
list isn't much help for either.
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
"Andrew Dunstan"
Date:
If there's a build of this available we'd love to test it in a major project
we're working on. The project is currently using the 7.2 build that was made
available, but we had to work around the lack of schema support by kludging
table names as "namespace_table", so a 7.3 build would be great, with or
without installer.

So far, the 7.2 build has worked without a hitch, although our Db is *very*
simple, so the feature set has hardly been stressed.

FWIW, all access in our app is via JDBC.

One of the big things holding us back from using Pg in our shipped product
is lack of a released native Windows port, so I'm very keen to see this
progress.

andrew

----- Original Message -----
From: "Bruce Momjian" <pgman@candle.pha.pa.us>
>
> Sure, Neil Conway updated Jan's patches for 7.3.  It is in:
>
> ftp://candle.pha.pa.us/pub/postgresql/mypatches/
>
> --------------------------------------------------------------------------
-
>
> Merlin Moncure wrote:
> >
> > Is there a non cygwin version of the source that will compile (or not)
> > on a win32 machine?  That would be a great place to start.
> >
> > Merlin
> >



Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] 
> Sent: 11 March 2003 15:06
> To: Dave Page
> Cc: pgman@candle.pha.pa.us; pgsql-hackers@postgresql.org; 
> pgsql-interfaces@postgresql.org
> Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign 
> 
> 
> "Dave Page" <dpage@vale-housing.co.uk> writes:
> > Well, what would constitute a complete spec? I think I've told the 
> > group what I would like to be able to do, what unanswered questions 
> > can I (hopefully :-) ) answer?
> 
> I'm still unclear on exactly what your needs are.  In the 
> first place, are you expecting to obtain data from arbitrary 
> SELECT statements, or only from statements of the form 
> "SELECT * FROM single_table"?  

From any statement that returns a resultset.

> You've also been confusing as 
> to whether you want transparency of views (ie, does a select 
> from a view return data about the view's nominal columns or 
> about the underlying base table columns?).  

The view's nominal columns. Resolving updates back to the base tables
will (hopefully) be the job of updateable views.

> What about cases 
> involving aggregates or grouping --- there may be simple Vars 
> in the target list, but they can hardly be thought to 
> represent updatable values.

Omit the information. That will tell the client that the data did not
come directly from an underlying relation.

> Also, you muttered something about inferring primary keys and 
> number of relations in query; seems like this feature isn't 
> solving your problem as far as that goes, because the set of 
> attributes visible in the target list isn't much help for either.

That one is my problem. If I can identify the source attributes as I
would like, then I can work out those details easily. For aggregates,
functions, expressions etc. non-existant meta data will also tell me
what I need to know. The attributes in the target list do help with
this, because if I know I have a complete primary key, then I know I can
build an update query for any given row.

As I see it, the tidy/simple answer is to add nspname, relname and
attname information to each attribute in the row description, and if any
one of those cannot be easily determined, or the value doesn't come
directly from there (ie an aggregate), it is simply omitted or left
empty. It could be done more efficiently with attnum/attrelid, but I
wonder if that might cause problems in the future if attnum gets hacked
about.

Regards, Dave.

Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
"Dave Page" <dpage@vale-housing.co.uk> writes:
> As I see it, the tidy/simple answer is to add nspname, relname and
> attname information to each attribute in the row description, and if any
> one of those cannot be easily determined, or the value doesn't come
> directly from there (ie an aggregate), it is simply omitted or left
> empty. It could be done more efficiently with attnum/attrelid, but I
> wonder if that might cause problems in the future if attnum gets hacked
> about.

I'd prefer to stick with rel OID and attnum; those are the keys you are
going to need for looking in the system catalogs, anyway, and they're
surely a lot more compact to transmit than three Names (potentially 180
bytes or so).

So let me get this straight: we want the attribute ID information for
any SELECT output column that is a simple Var (no expressions, and no
aggregated cases either), treating view output columns as Vars belonging
to the view.  What about:

* Joins.  Does an aliased JOIN output variable get resolved to the
underlying table variable?  What about common columns in USING/NATURAL
joins?

* Sub-selects.  Are you expectingSELECT x FROM (SELECT x FROM foo) ss
to produce a reference to foo.x?  (If so, why is this different from the
case where ss is a view?)
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] 
> Sent: 11 March 2003 18:39
> To: Dave Page
> Cc: pgman@candle.pha.pa.us; pgsql-hackers@postgresql.org; 
> pgsql-interfaces@postgresql.org
> Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign 
> 
> 
> "Dave Page" <dpage@vale-housing.co.uk> writes:
> >> What about:
> >> 
> >> * Joins.  Does an aliased JOIN output variable get resolved
> >> to the underlying table variable?  What about common columns 
> >> in USING/NATURAL joins?
> 
> > Sorry, not sure what you mean - can you give me an example?
> 
> Sure, consider table1 with columns a,b,c and table2 with a,x,y:
> 
>     SELECT * FROM table1 NATURAL JOIN table2
> 
> has output columns a,b,c,x,y (notice only one column a).  How 
> do you expect these to be labeled?  Does the answer change if 
> the join is OUTER?  For example, in a LEFT JOIN x and y might 
> return NULLs that have nothing to do with any table2 row; is 
> it still appropriate to label them as table2 outputs?  And 
> what do we do with column a in each outer-join case?

Oh, I see. Well, we cannot resolve those cases unless we decide in
advance that we will return metadata from either the left or right
table. That sings future FAQ to me so I would be inclined to omit the
additional info as we would for an aggregate. 

> Also, how about an aliased join
> 
>     SELECT * FROM (table1 NATURAL JOIN table2) AS q(r,s,t,u,v)
> 
> In this case the output columns are named q.r, q.s, q.t, q.u, 
> q.v as far as the SELECT list can tell --- it's actually 
> illegal to refer to
> table1.* or table2.* in the SELECT list, IIRC.  Do we drill 
> down into the JOIN?  (Probably so, if you expect the thing to 
> drill down into sub-SELECTs, but I want clarity ...)

Likewise, if we can't figure out the info accurately, simply don't
report it.

In these, and any other similar cases, we will be no worse off with no
underlying attribute data than we are now, and in the vast majority of
cases we will benefit.

Regards, Dave.

Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
"Dave Page" <dpage@vale-housing.co.uk> writes:
>> What about:
>> 
>> * Joins.  Does an aliased JOIN output variable get resolved 
>> to the underlying table variable?  What about common columns 
>> in USING/NATURAL joins?

> Sorry, not sure what you mean - can you give me an example? 

Sure, consider table1 with columns a,b,c and table2 with a,x,y:
SELECT * FROM table1 NATURAL JOIN table2

has output columns a,b,c,x,y (notice only one column a).  How do you
expect these to be labeled?  Does the answer change if the join is
OUTER?  For example, in a LEFT JOIN x and y might return NULLs that
have nothing to do with any table2 row; is it still appropriate to
label them as table2 outputs?  And what do we do with column a in
each outer-join case?

Also, how about an aliased join
SELECT * FROM (table1 NATURAL JOIN table2) AS q(r,s,t,u,v)

In this case the output columns are named q.r, q.s, q.t, q.u, q.v as
far as the SELECT list can tell --- it's actually illegal to refer to
table1.* or table2.* in the SELECT list, IIRC.  Do we drill down into
the JOIN?  (Probably so, if you expect the thing to drill down into
sub-SELECTs, but I want clarity ...)

>> * Sub-selects.  Are you expecting
>> SELECT x FROM (SELECT x FROM foo) ss
>> to produce a reference to foo.x?  (If so, why is this 
>> different from the case where ss is a view?)

> Yes, I would expect that reference. I don't think it's different from a
> view - we are trying to deciper the SQL itself, not what a view actually
> represents so I would expect to see references to whatever relation is
> in the original SQL, whether that relation is the result of a subselect
> from another table/view, or a view. 

Okay, that clarifies the intent quite a bit.  So we need to do the labeling
processing before any view rewriting occurs.  That probably means
sticking the info into Resdom nodes during the parser, rather than
extracting it on-the-fly from the planned query.
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] 
> Sent: 11 March 2003 15:55
> To: Dave Page
> Cc: pgman@candle.pha.pa.us; pgsql-hackers@postgresql.org; 
> pgsql-interfaces@postgresql.org
> Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign 
> 
> 
> I'd prefer to stick with rel OID and attnum; those are the 
> keys you are going to need for looking in the system 
> catalogs, anyway, and they're surely a lot more compact to 
> transmit than three Names (potentially 180 bytes or so).

OK.

> So let me get this straight: we want the attribute ID 
> information for any SELECT output column that is a simple Var 
> (no expressions, and no aggregated cases either), treating 
> view output columns as Vars belonging to the view.  

Yes. We could include attrelid for aggregates I suppose, but obviously
not attnum. Nothing at all for expressions of course.

> What about:
> 
> * Joins.  Does an aliased JOIN output variable get resolved 
> to the underlying table variable?  What about common columns 
> in USING/NATURAL joins?

Sorry, not sure what you mean - can you give me an example? 

> * Sub-selects.  Are you expecting
>     SELECT x FROM (SELECT x FROM foo) ss
> to produce a reference to foo.x?  (If so, why is this 
> different from the case where ss is a view?)

Yes, I would expect that reference. I don't think it's different from a
view - we are trying to deciper the SQL itself, not what a view actually
represents so I would expect to see references to whatever relation is
in the original SQL, whether that relation is the result of a subselect
from another table/view, or a view. 

Regards, Dave.

Re: Roadmap for FE/BE protocol redesign

From
"Merlin Moncure"
Date:
Bruce Momjian [mailto:pgman@candle.pha.pa.us] wrote:
> Sure, Neil Conway updated Jan's patches for 7.3.  It is in:
>
>     ftp://candle.pha.pa.us/pub/postgresql/mypatches/

I have not been able to run the patch against either the 7.2.1 source or
the 7.3.2 source.  It always quits with assertion failure after trying
to patch refint.c

To run the patch I am typing (from inside the postgresql-7.x.x folder)
patch -p1 < win32-port-changes-2.diff

Running (from root)
patch -p0 < win32-port-changes-2.diff
does the same.

Looking inside the patch, I see its looking for the 7.2.1 source tree.
Tried running gnu patch and mingw patch without success.

Should I be patching the head or am I doing something wrong?

Merlin


Re: Roadmap for FE/BE protocol redesign

From
Kevin Brown
Date:
Tom Lane wrote:
> "Dave Page" <dpage@vale-housing.co.uk> writes:
> > Well, what would constitute a complete spec? I think I've told the group
> > what I would like to be able to do, what unanswered questions can I
> > (hopefully :-) ) answer?
> 
> I'm still unclear on exactly what your needs are.  In the first place,
> are you expecting to obtain data from arbitrary SELECT statements, or
> only from statements of the form "SELECT * FROM single_table"?  You've
> also been confusing as to whether you want transparency of views (ie,
> does a select from a view return data about the view's nominal columns
> or about the underlying base table columns?).  What about cases
> involving aggregates or grouping --- there may be simple Vars in the
> target list, but they can hardly be thought to represent updatable values.

These questions can't possibly be unique to PG -- other database
vendors must have answered these questions for their implementations
of ODBC/JDBC too, or their databases would give ODBC and JDBC client
authors the same kinds of trouble.

So ... how have these questions been answered by other database
vendors?

The spec may be underspecified on these points, but since the entire
purpose of ODBC and JDBC is to provide a database-independent
framework for application authors to use, it follows that
compatibility with other implementations is highly desirable.  So
doing more or less what other database vendors have done in response
to these underspecified issues is probably a sensible course of action
when there's no other obviously better answer.


-- 
Kevin Brown                          kevin@sysexperts.com


Re: Roadmap for FE/BE protocol redesign

From
Peter Eisentraut
Date:
Dave Page writes:

> I don't know about JDBC, but ODBC could use it, and it would save a heck
> of a lot of pain in apps like pgAdmin that need to figure out if a column
> in an arbitrary resultset might be updateable.

Strictly speaking, result sets are never updatable, because there's no way
you can refer to a result set and tell the system to update it.  So let's
hear what you *really* need and then consider interfaces for *that*.
Maybe updatable views or updatable cursors?

-- 
Peter Eisentraut   peter_e@gmx.net



Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
"Marc G. Fournier"
Date:
On Mon, 10 Mar 2003, Tom Lane wrote:

> One of the $64 questions that has to be answered is how much work we're
> willing to expend on backwards compatibility.  The path of least
> resistance would be to handle it the same way we've done protocol
> revisions in the past: the backend will be able to handle both old and
> new protocols (so it can talk to old clients) but libpq would be revised
> to speak only the new protocol (so new/recompiled clients couldn't talk
> to old backends).  We've gotten away with this approach in the past, but
> the last time was release 6.4.  I fully expect to hear more complaints
> now.

Personally ... as long as a v8.x client can talk to a v7.x backend, you
have my vote ... I'm more apt to upgrade my clients before my servers
anyway ...


Re: Roadmap for FE/BE protocol redesign

From
"Marc G. Fournier"
Date:
On Mon, 10 Mar 2003, Tom Lane wrote:

> Justin Clift <justin@postgresql.org> writes:
> > The scenario that's appealing to me the most is this for the next release:
> > PostgreSQL 8.0
> > + Includes PITR and the Win32 port
>
> If the folks doing those things can get done in time, great.  I'm even
> willing to push out the release schedule (now, not later) to make it
> more likely they can get done.  What I'm not willing to do is define
> the release in terms of "it happens when these things are done".  We
> learned the folly of that approach in 7.1 and 7.2.  Setting a target
> date and sticking to it works *much* better.

The thing is, IMHO, everyone knew the release scheduale for v7.4, so if
they aren't ready now, I can't really see justifying pushing things back
in hopes that they will be ready then ...




Re: Roadmap for FE/BE protocol redesign

From
"Marc G. Fournier"
Date:
On Mon, 10 Mar 2003, Bruce Momjian wrote:

> Tom Lane wrote:
> > Justin Clift <justin@postgresql.org> writes:
> > > The scenario that's appealing to me the most is this for the next release:
> > > PostgreSQL 8.0
> > > + Includes PITR and the Win32 port
> >
> > If the folks doing those things can get done in time, great.  I'm even
> > willing to push out the release schedule (now, not later) to make it
> > more likely they can get done.  What I'm not willing to do is define
> > the release in terms of "it happens when these things are done".  We
> > learned the folly of that approach in 7.1 and 7.2.  Setting a target
> > date and sticking to it works *much* better.
>
> Well, we had that 7.4 discussion a few days ago, and only had two people
> comment on our scheduling.  At that time, the discussion was for 7.4
> beta starting May 1 vs June 1.  June 1 would be roughly six months from
> our 7.3 final release, which is typical.
>
> I agree with Tom that we should decided sooner rather than later on a
> beta date.  Right now I don't think we have enough to must-have features
> to justify a release, and as everyone knows, an upgrade isn't easy for
> our users.
>
> I also agree with Tom that we shouldn't peg our beta schedule on specific
> features.
>
> So, what should we do?  Should we go another month or two and just wait
> until we have enough must-have features?  While not waiting on specific
> features, it _is_ waiting for something to warrant a release.  I guess
> the big question is whether we release on a scheduled-basis or a
> enough-features-basis.

Schedualed basis ... if we released on an 'enough features basis', I could
see alot longer then 6 mos between releases happening very quickly ... we
have enough problems staying within the scheduale as it is, let alot
moving it to a 'sliding scale' ...



Re: Roadmap for FE/BE protocol redesign

From
"Marc G. Fournier"
Date:
On Tue, 11 Mar 2003, Justin Clift wrote:

> Bruce Momjian wrote:
> <snip>
> > So, what should we do?  Should we go another month or two and just wait
> > until we have enough must-have features?  While not waiting on specific
> > features, it _is_ waiting for something to warrant a release.  I guess
> > the big question is whether we release on a scheduled-basis or a
> > enough-features-basis.
>
> Hmmm, I feel we should decide on features that will make an 8.0 release
> meaningful, and *somehow* work to making sure they are ready for the
> release.
>
> With 7.1/7.2, Tom mentioned us being delayed because specific features
> we were waiting for became dependant on one person.
>
> Would it be feasible to investigate approaches for having the Win32 and
> PITR work be shared amongst a few very-interested volunteers, so that
> people can cover for each other's downtime?  Not sure of the
> confidentiality level of the Win32/PITR patches at present, but I'd
> guess there would be at least a few solid volunteers willing to
> contribute to the Win32/PITR ports if we asked for people to step
> forwards.

Why should we be the ones to ask for ppl to step forward to volunteer to
help?  Shouldn't it be the responsibility of the developer working on it
to admit that there is no way they will make the scheduale and call for
help?



Re: Roadmap for FE/BE protocol redesign

From
Hiroshi Inoue
Date:
Tom Lane wrote:
> 
> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > What the driver has suffered from is to get the
> > fields' info of a query result or the parameters'
> > info of a statement. The info is needed even before
> > the execution of the statement(i.e it's only prepared).
> 
> Hm.  Are you saying that you would like PREPARE to send back a
> RowDescription ('T') message?

I'm not sure if PREPARE should return the info directly.
Maybe it should be returned only when it is requested.

> Or is there more to it than that?

More detailed field info is needed anyway.
For example the RowDescription contains neither 
the base column name, the table name nor the
schema name currently and so the current odbc
driver couldn't return the info correctly.
The ODBC function SQLDescribeCol or SQLColAttribute
requires various kind of fields' info.

It's almost impossible to get parameters' info
currently. The ODBC function SQLDescribeParam
isn't implemented yet....

regards,
Hiroshi Inouehttp://www.geocities.jp/inocchichichi/psqlodbc/


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Marc G. Fournier wrote:
> On Mon, 10 Mar 2003, Tom Lane wrote:
> 
> > One of the $64 questions that has to be answered is how much work we're
> > willing to expend on backwards compatibility.  The path of least
> > resistance would be to handle it the same way we've done protocol
> > revisions in the past: the backend will be able to handle both old and
> > new protocols (so it can talk to old clients) but libpq would be revised
> > to speak only the new protocol (so new/recompiled clients couldn't talk
> > to old backends).  We've gotten away with this approach in the past, but
> > the last time was release 6.4.  I fully expect to hear more complaints
> > now.
> 
> Personally ... as long as a v8.x client can talk to a v7.x backend, you
> have my vote ... I'm more apt to upgrade my clients before my servers
> anyway ...

Actually, it is usually the opposite, where old clients can talk to
newer servers, but not the reverse.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
"Marc G. Fournier"
Date:
On Tue, 11 Mar 2003, Bruce Momjian wrote:

> Marc G. Fournier wrote:
> > On Mon, 10 Mar 2003, Tom Lane wrote:
> >
> > > One of the $64 questions that has to be answered is how much work we're
> > > willing to expend on backwards compatibility.  The path of least
> > > resistance would be to handle it the same way we've done protocol
> > > revisions in the past: the backend will be able to handle both old and
> > > new protocols (so it can talk to old clients) but libpq would be revised
> > > to speak only the new protocol (so new/recompiled clients couldn't talk
> > > to old backends).  We've gotten away with this approach in the past, but
> > > the last time was release 6.4.  I fully expect to hear more complaints
> > > now.
> >
> > Personally ... as long as a v8.x client can talk to a v7.x backend, you
> > have my vote ... I'm more apt to upgrade my clients before my servers
> > anyway ...
>
> Actually, it is usually the opposite, where old clients can talk to
> newer servers, but not the reverse.

D'oh, mis-read Tom's ... you are correct, and it does make sense to do so
... its not like old libraries aren't available if someone wanted to make
a pre-compiled version that is 'backwards compatible' ...




Re: Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Marc G. Fournier wrote:
> > So, what should we do?  Should we go another month or two and just wait
> > until we have enough must-have features?  While not waiting on specific
> > features, it _is_ waiting for something to warrant a release.  I guess
> > the big question is whether we release on a scheduled-basis or a
> > enough-features-basis.
> 
> Schedualed basis ... if we released on an 'enough features basis', I could
> see alot longer then 6 mos between releases happening very quickly ... we
> have enough problems staying within the scheduale as it is, let alot
> moving it to a 'sliding scale' ...

I guess the big question is that if we can't get enough big features in
6 months, do we still stay on the 6 month schedule?  I know Tom said
folks don't have to upgrade --- that is true, but our releases do seem a
little lighter lately.

Six months would be June 1 beta, so maybe that is still a good target. 
I agree we should not hold up beta for any feature.  So maybe the plan
is June 1 beta, and we don't care if we have enough big features or not
--- does that sound good to everyone?  Or should we be looking at May 1
as Tom originally suggested?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Roadmap for FE/BE protocol redesign

From
Neil Conway
Date:
On Tue, 2003-03-11 at 15:38, Merlin Moncure wrote:
> Bruce Momjian [mailto:pgman@candle.pha.pa.us] wrote:
> > Sure, Neil Conway updated Jan's patches for 7.3

Erm, not exactly. I just removed some of the changes in Jan's 7.2.1
patch that had already been made in CVS HEAD (mostly trivial stuff like
constant renaming) -- since that made the patch a fair bit smaller, I
figured it was worth sending along to pgsql-patches.

> I have not been able to run the patch against either the 7.2.1 source or
> the 7.3.2 source.  It always quits with assertion failure after trying
> to patch refint.c

Yeah -- the patch is certainly not in a state that can be applied
against CVS HEAD (and I don't see the point in bothering with applying
it to older versions).

Rather than expecting the whole thing to apply cleanly to CVS HEAD
(which would probably leave you with little work to do, as that's close
to the end goal), my suggestion would be to read through the patch
itself. Find a set of related changes made by the patch (e.g. libpq
changes, FSM changes, etc.), and merge those into CVS HEAD. Then submit
a patch to pgsql-patches.

Cheers,

Neil

-- 
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC





Re: Roadmap for FE/BE protocol redesign

From
Justin Clift
Date:
Marc G. Fournier wrote:
<snip>>>Would it be feasible to investigate approaches for having the Win32 and>>PITR work be shared amongst a few
very-interestedvolunteers, so that>>people can cover for each other's downtime?  Not sure of the>>confidentiality level
ofthe Win32/PITR patches at present, but I'd>>guess there would be at least a few solid volunteers willing
to>>contributeto the Win32/PITR ports if we asked for people to step>>forwards.>> Why should we be the ones to ask for
pplto step forward to volunteer to> help?  Shouldn't it be the responsibility of the developer working on it> to admit
thatthere is no way they will make the scheduale and call for> help?
 

Hadn't thought of that.  Um.. how about "whatever works"?  It's sounds 
like the kind of thing where some people may be offended if we suddenly 
started asking for extra volunteers for their bits, and others in the 
same situation wouldn't.

Now that you mention it, it's probably a matter of getting that bit 
correct I suppose.  :)

Regards and best wishes,

Justin Clift

-- 
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi



Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Kevin Brown <kevin@sysexperts.com> writes:
> doing more or less what other database vendors have done in response
> to these underspecified issues is probably a sensible course of action
> when there's no other obviously better answer.

A good point, indeed.  Who wants to do the legwork to check up on this?
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Hiroshi Inoue
Date:
Peter Eisentraut wrote:
> 
> Dave Page writes:
> 
> > I don't know about JDBC, but ODBC could use it,
> > and it would save a heck of a lot of pain in apps
> > like pgAdmin that need to figure out if a column
> > in an arbitrary resultset might be updateable.
> 
> Strictly speaking, result sets are never updatable,
> because there's no way you can refer to a result set
> and tell the system to update it.
> So let's hear what you *really* need and then consider
> interfaces for *that*. Maybe updatable views or
> updatable cursors?

The word *result set* is used variously
and *updatable cursors* has its meaning
in some middleware applications.

The word *cursor* is used variously too.
For example, ODBC cursors don't necessarily
mean(correspond to) the dbms cursors.
If there's a suitable cursor, the ODBC driver
may conveniently use it to implement ODBC
cursors. Otherwise the ODBC driver may implement
ODBC cursors by itself. PostgreSQL has paid
attention to the cursor support little and I
don't expect much of server-side cursors.

regards,
Hiroshi Inouehttp://www.geocities.jp/inocchichichi/psqlodbc/


Re: Roadmap for FE/BE protocol redesign

From
Barry Lind
Date:
> * Backend should pass its version number, database encoding, default
> client encoding, and possibly other data (any ideas?) to frontend during
> startup, to avoid need for explicit queries to get this info.  We could
> also consider eliminating SET commands sent by libpq in favor of adding
> variable settings to startup packet's PGOPTIONS field.  Ideally we could
> get back to the point where a standard connection startup takes only one
> packet in each direction.

This handles the JDBC needs (currently on startup the jdbc driver 
selects the database encoding and version number and sets the datestyle 
and autocommit parameters).

One addition I would personally like to see (it comes up in my apps 
code) is the ability to detect wheather the server is big endian or 
little endian.  When using binary cursors this is necessary in order to 
read int data.  Currently I issue a 'select 1' statement at connection 
startup to determine what format the server is using.


Other things I would like to see to help jdbc:

1) More information about the attributes selected in a query (I see 
there is an entire thread on this already) to minimize the work 
necessary to implement updateable result sets as defined by the jdbc spec.

2) Better support for domains.  Currently the jdbc driver is broken with 
regards to domains (although no one has reported this yet).  The driver 
will treat a datatype that is a domain as an unknown/unsupported 
datatype.  It would be great if the T response included the 'base' 
datatype for a domain attribute so that the driver would know what 
parsing routines to call to convert to/from the text representation the 
backend expects.

3) Protocol level support for CURSORs.  It would be nice if cursor 
support was done at the protocol level and not as a SQL command.  The 
current default behavior of returning all results from a query in the 
query response message is often a problem (can easily lead to out of 
memory problems for poorly written queries).  So it is desirable to use 
cursors.  But with the current implementation in SQL, cursors are not 
the appropriate choice if a query is only going to return one or a few 
rows.  The reason is that using a cursor requires a minimum of three SQL 
statements:  DECLARE, FETCH, CLOSE.  The jdbc driver issues the DECLARE 
and FETCH in one server call, but the CLOSE needs to be a second call. 
Thus for simple one row selects (which in many cases are the majority of 
selects issued) using CURSORS requires two roundtrips to the server vs. 
one for the nonCursor case.
This leaves me with a problem in the jdbc driver, I can either use 
standard fast/performant queries for single row selects that blowup with 
out of memory errors for large results, or I can use cursors and avoid 
large memory usage but hurt overall performance.  What I have currently 
done is require that the developer call an extra method to turn on the 
use of cursors when they know that the cursor is going to return a large 
number of rows and leave the default be the non-cursor case.  This works 
but requires that developers who are writing code to interact with 
multiple different databases, code differently for the postgres jdbc 
driver.  And this is a problem since one of the goals of jdbc is to be 
able to write code that works against multiple different databases.
So I would request the ability of the client to set a max rows parameter   for query results.  If a query were to
returnmore than the max 
 
number of rows, the client would be given a handle (essentially a cursor 
name) that it could use to fetch additional sets of rows.

4) Protocol level support of PREPARE.  In jdbc and most other 
interfaces, there is support for parameterized SQL.  If you want to take 
advantage of the performance benefits of reusing parsed plans you have 
to use the PREPARE SQL statement.  My complaint on doing this at the SQL 
level vs the protocol level is similar to the problem with cursors 
above.  To use prepare you need to issue three SQL statements: PREPARE, 
EXCECUTE, DEALLOCATE.  If you know ahead of time that you are going to 
reuse a statement many times doing PREPARE, EXECUTE, EXECUTE, ..., 
DEALLOCATE makes sense and can be a big win in performance.  However if 
you only ever execute the statement once then you need to use two round 
trips (one for the PREPARE, EXECUTE and another for the DEALLOCATE) 
versus one round trip to execute the statement 'normally'.  So it 
decreases performance to use prepares for all parameterized sql 
statements.  So the current implementation in jdbc requires the user to 
issue a postgres specific call to turn on the use of prepared statements 
for those cases the developer knows will be a performance win.  But this 
requires coding differently for postgres jdbc than for other databases.
So being better able to handle this in the protocol would be nice.

5) Better support for "large values".  Generally I recommend that users 
of jdbc use bytea to store large binary values.  I generally tell people 
to avoid using LOs (Large Objects).  The reason for this is that LOs 
have two significant problems:  1) security - any user on the database 
can access all LOs even though they may not be able to access the row 
that contains the LO reference, 2) cleanup - deleting the row containing 
the LO reference doesn't delete the LO requireing extra code or triggers 
to behave like a regular value in a regular column.  Bytea works OK for 
small to medium sized values, but doesn't work for very large values, 
where by very large I mean over a few Megabytes.  The reason very large 
values are a problem is memory usage.  There is no way to 'stream' bytea 
values from the server like you can do with LOs, so the driver ends up 
storeing the entire value in memory as it reads the result from the 
backend for a query.  And if the query returns multiple rows each with a 
large value you quickly run out of memory.
So what I would like to see is the ability for the client to set a MAX 
VALUE size parameter.  The server would send up to this amount of data 
for any column.  If the value was longer than MAX VALUE, the server 
would respond with a handle that the client could use to get the rest of 
the value (in chunks of MAX VALUE) if it wanted to.  This would allow 
the client to get the entire result set which could contain perhaps many 
large bytea values, but not use a lot of memory up front.  Then fetch 
the entire values only when/if the application asked for them and stream 
the result to the application and never bring the entire contents of the 
column into memory at once.  (There are probably a number of different 
implementation posibilities so use this one as a suggestion to explain 
what I would like to see not necessarily how it should be implemented).

6)  Better over the wire support for bytea.  The current encoding of 
binary data \000 results in a significant expansion in the size of data 
transmitted.  It would be nice if bytea data didn't result in 2 or 3 
times data expansion. (and all the cpu cycles to convert to/from the 
escaped format).  This may not be a protocol issue, but IMHO the best 
way to fix this would be in the protocol.

thanks,
--Barry





Re: Roadmap for FE/BE protocol redesign

From
Barry Lind
Date:

Dave Page wrote:
> I don't know about JDBC, but ODBC could use it, and it would save a heck
> of a lot of pain in apps like pgAdmin that need to figure out if a column
> in an arbitrary resultset might be updateable.
> At the moment there is some nasty code in pgAdmin II that attempts to
> parse the SQL statement to figure out if the the resultset is updateable
> by trying to figure out the number of relations in the query, whether any
> of them is a view or sequence, whether there are any function calls or
> expressions in the attribute list and so on. It then has to try to figure
> out if there is a complete pkey in the resultset that can be used for the
> update, or whether it should attempt an update based on all existing
> values. That code is just plain nasty in VB. In pgAdmin III we've already
> mentioned stealing bits of the PostgreSQL parser.

I will just add a "me to" here.  This would be very useful for JDBC as 
well.  We go through the same hoops to support the jdbc spec that Dave 
does.  The jdbc spec has two features that require this level of 
information:

1) For every result set you can ask for a ResultSetMetaData object. 
This object provides you with the following methods:

getColumnCount()
isAutoIncrement(int column)
isCaseSensitive(int column)
isSearchable(int column)
isNullable(int column)
getColumnDisplaySize(int column)
getColumnLabel(int column)
getColumnName(int column)
getSchemaName(int column)
getPrecision(int column)
getScale(int column)
getTableName(int column)
getColumnTypeName(int column)
isReadOnly(int column)
isWritable(int column)
isDefinitelyWritable(int column)

Now one can state the spec is broken and it doesn't make sense to ask 
this type of information about a query (and frankly I would agree with 
you), but that doesn't mean that I don't need to support it anyway.  So 
anything that the server can do to make this easier is greatly 
appreciated.  And I believe ODBC has almost the exact same issue since 
in general the JDBC spec was copied from the ODBC spec.

2) Updateable result sets.  The jdbc spec allows the user to declare any 
select statement to be updateable.  This means that as you scroll 
through the result (the ResultSet object) you can issue modify the data 
and expect the jdbc driver to reflect that change back to the base 
tables.  The following if from the JDBC API doc:
 * A set of updater methods were added to this interface * in the JDBC 2.0 API (JavaTM 2 SDK, * Standard Edition,
version1.2). The comments regarding parameters * to the getter methods also apply to parameters to the * updater
methods.* * The updater methods may be used in two ways: * * to update a column value in the current row.  In a
scrollable*     ResultSet object, the cursor can be moved backwards *     and forwards, to an absolute position, or to
aposition *     relative to the current row. *     The following code fragment updates the NAME column *     in the
fifthrow of the ResultSet object *     rs and then uses the method updateRow *     to update the data source table from
whichrs was derived. * * *       rs.absolute(5); // moves the cursor to the fifth row of rs *
rs.updateString("NAME","AINSWORTH"); // updates the *          // NAME column of row 5 to be AINSWORTH *
rs.updateRow();// updates the row in the data source * * * to insert column values into the insert row.  An updatable *
   ResultSet object has a special row associated with *     it that serves as a staging area for building a row to be
inserted.*     The following code fragment moves the cursor to the insert row, 
 
builds *     a three-column row, and inserts it into rs and into *     the data source table using the method
insertRow.* * *       rs.moveToInsertRow(); // moves cursor to the insert row *       rs.updateString(1, "AINSWORTH");
//updates the *          // first column of the insert row to be AINSWORTH *       rs.updateInt(2,35); // updates the
secondcolumn to be 35 *       rs.updateBoolean(3, true); // updates the third row to true *       rs.insertRow(); *
 rs.moveToCurrentRow(); *
 

Now application developers love this functionality.  It allows them to 
implement fairly complex apps with very little sql knowledge.  They only 
need to know how to do a simple select statement and that is it.  The 
jdbc driver handles the rest for them automatically (updates, inserts, 
deletes).  As a jdbc maintainer I personally hate this functionality as 
it is a real pain to implement, and can't work in any but the most 
simple select statements.  But is is part of the spec and needs to be 
supported in the best manner possible.

thanks,
--Barry




Re: Roadmap for FE/BE protocol redesign

From
"Zeugswetter Andreas SB SD"
Date:
Hiroshi Inoue wrote:
> Tom Lane wrote:
> > 
> > Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > > What the driver has suffered from is to get the
> > > fields' info of a query result or the parameters'
> > > info of a statement. The info is needed even before
> > > the execution of the statement(i.e it's only prepared).
> > 
> > Hm.  Are you saying that you would like PREPARE to send back a
> > RowDescription ('T') message?
> 
> I'm not sure if PREPARE should return the info directly.
> Maybe it should be returned only when it is requested.
> 
> > Or is there more to it than that?
> 
> More detailed field info is needed anyway.
> For example the RowDescription contains neither 
> the base column name, the table name nor the
> schema name currently and so the current odbc
> driver couldn't return the info correctly.

But it only needs that info to do system catalog lookups for type, length etc.
Imho the target should be to not need catalog lookups, but get all needed info 
from the 'T' message, no?

> The ODBC function SQLDescribeCol or SQLColAttribute
> requires various kind of fields' info.

RETCODE SQL_API SQLDescribeCol(   HSTMT       hstmt,   UWORD       icol,   UCHAR  FAR *szColName,   SWORD
cbColNameMax,  SWORD  FAR *pcbColName,   SWORD  FAR *pfSqlType,   UDWORD FAR *pcbColDef,   SWORD  FAR *pibScale,
SWORD FAR *pfNullable);
 

So you see it does not need tablename or schema.

> 
> It's almost impossible to get parameters' info
> currently. The ODBC function SQLDescribeParam
> isn't implemented yet....

Would be same as above, the 'T' message should describe input 
and output columns.

Andreas


Re: Roadmap for FE/BE protocol redesign

From
"Zeugswetter Andreas SB SD"
Date:
> > I'm still unclear on exactly what your needs are.  In the first place,
> > are you expecting to obtain data from arbitrary SELECT statements, or
> > only from statements of the form "SELECT * FROM single_table"?  You've
> > also been confusing as to whether you want transparency of views (ie,
> > does a select from a view return data about the view's nominal columns
> > or about the underlying base table columns?).  What about cases
> > involving aggregates or grouping --- there may be simple Vars in the
> > target list, but they can hardly be thought to represent
> updatable values.
>
> These questions can't possibly be unique to PG -- other database
> vendors must have answered these questions for their implementations
> of ODBC/JDBC too, or their databases would give ODBC and JDBC client
> authors the same kinds of trouble.
>
> So ... how have these questions been answered by other database
> vendors?

They return type info (RowDescription ('T') message) for "prepare", like Tom suggested.
See the more or less standardized ESQL/C data structure sqlda on what is needed
(label, type, length + precision, nullability, pointer to data (for binding host variables), ...). Note that this
structis used for both directions (so when inserting from host variables the interface knows what conversions need to
beapplied).  

Andreas


Re: Roadmap for FE/BE protocol redesign

From
Hannu Krosing
Date:
Barry Lind kirjutas K, 12.03.2003 kell 10:43:
> Dave Page wrote:
> > I don't know about JDBC, but ODBC could use it, and it would save a heck
> > of a lot of pain in apps like pgAdmin that need to figure out if a column
> > in an arbitrary resultset might be updateable.
> > At the moment there is some nasty code in pgAdmin II that attempts to
> > parse the SQL statement to figure out if the the resultset is updateable
> > by trying to figure out the number of relations in the query, whether any
> > of them is a view or sequence, whether there are any function calls or
> > expressions in the attribute list and so on. It then has to try to figure
> > out if there is a complete pkey in the resultset that can be used for the
> > update, or whether it should attempt an update based on all existing
> > values. That code is just plain nasty in VB. In pgAdmin III we've already
> > mentioned stealing bits of the PostgreSQL parser.
> 
> I will just add a "me to" here.  This would be very useful for JDBC as 
> well.  We go through the same hoops to support the jdbc spec that Dave 
> does.  The jdbc spec has two features that require this level of 
> information:
...

> Now one can state the spec is broken and it doesn't make sense to ask 
> this type of information about a query (and frankly I would agree with 
> you), but that doesn't mean that I don't need to support it anyway.  So 
> anything that the server can do to make this easier is greatly 
> appreciated.  And I believe ODBC has almost the exact same issue since 
> in general the JDBC spec was copied from the ODBC spec.

And IIRC they both are mostly the same as ANSI/ISO SQL/CLI spec. The CLI
being Call Level Interface.

> 2) Updateable result sets.  The jdbc spec allows the user to declare any 
> select statement to be updateable.  This means that as you scroll 
> through the result (the ResultSet object) you can issue modify the data 
> and expect the jdbc driver to reflect that change back to the base 
> tables.  The following if from the JDBC API doc:
...
> Now application developers love this functionality.  It allows them to 
> implement fairly complex apps with very little sql knowledge.  They only 
> need to know how to do a simple select statement and that is it.  The 
> jdbc driver handles the rest for them automatically (updates, inserts, 
> deletes).  As a jdbc maintainer I personally hate this functionality as 
> it is a real pain to implement, and can't work in any but the most 
> simple select statements.  But is is part of the spec and needs to be 
> supported in the best manner possible.

Updatable cursors should probably come out as a side benefit when
backend gains the ability to have updatable views. If you have done some
work towards implementing it in JDBC driver, some it will probably be
handy for anyone who is going to implement updatable views.

-----------------
Hannu



Re: Roadmap for FE/BE protocol redesign

From
"Merlin Moncure"
Date:
Dave Page wrote:
> I don't know about JDBC, but ODBC could use it, and it would save a
heck
> of a lot of pain in apps like pgAdmin that need to figure out if a
column
> in an arbitrary resultset might be updateable.

I use an open source toolkit, zeos components for postgresql, that make
this possible.  You can run simple queries (even with some joins) and it
allows updating of the result set *sometimes*.  This can be done inline
with a grid control.

These components work with Delphi (object pascal), Kylix, and C++
Builder.  They are the fastest middleware I have ever seen for any
database, period.  They also already have a built in SQL parser and are
open source.  The project is hosted in sourceforge and I believe it to
be GPL.

> values. That code is just plain nasty in VB. In pgAdmin III we've
already
> mentioned stealing bits of the PostgreSQL parser.

Do not be swayed by the dark side.  In my spare time I threw together a
'proof of concept' replacement for the technology used in pgAdmin.   It
is written in C++.  In three seconds I can query a table and put 100000
records in a grid.  I plan to finish it and release it.  The main reason
not to use it is that it relies on commercial tools to build.

If you or anybody else is intereted, let me know and I'll send it your
way.

Merlin






Re: Roadmap for FE/BE protocol redesign

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Zeugswetter Andreas SB SD [mailto:ZeugswetterA@spardat.at] 
> 
> Hiroshi Inoue wrote:
> > Tom Lane wrote:
> > > 
> > > Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > > > What the driver has suffered from is to get the
> > > > fields' info of a query result or the parameters'
> > > > info of a statement. The info is needed even before
> > > > the execution of the statement(i.e it's only prepared).
> > > 
> > > Hm.  Are you saying that you would like PREPARE to send back a
> > > RowDescription ('T') message?
> > 
> > I'm not sure if PREPARE should return the info directly.
> > Maybe it should be returned only when it is requested.
> > 
> > > Or is there more to it than that?
> > 
> > More detailed field info is needed anyway.
> > For example the RowDescription contains neither 
> > the base column name, the table name nor the
> > schema name currently and so the current odbc
> > driver couldn't return the info correctly.
> 
> But it only needs that info to do system catalog lookups for 
> type, length etc.
> Imho the target should be to not need catalog lookups, but 
> get all needed info 
> from the 'T' message, no?

Maybe.

> 
> > The ODBC function SQLDescribeCol or SQLColAttribute
> > requires various kind of fields' info.
> 
> RETCODE SQL_API SQLDescribeCol(
>     HSTMT       hstmt,
>     UWORD       icol,
>     UCHAR  FAR *szColName,
>     SWORD       cbColNameMax,
>     SWORD  FAR *pcbColName,
>     SWORD  FAR *pfSqlType,
>     UDWORD FAR *pcbColDef,
>     SWORD  FAR *pibScale,
>     SWORD  FAR *pfNullable);
> 
> So you see it does not need tablename or schema.

Oh yes but SQLColAtrribute should return
such info if requested.
regards,
Hiroshi Inoue



Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:

> -----Original Message-----
> From: Peter Eisentraut [mailto:peter_e@gmx.net] 
> Sent: 12 March 2003 00:34
> To: Dave Page
> Cc: Tom Lane; PostgreSQL Development
> Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign
> 
> 
> Dave Page writes:
> 
> > I don't know about JDBC, but ODBC could use it, and it would save a 
> > heck of a lot of pain in apps like pgAdmin that need to 
> figure out if 
> > a column in an arbitrary resultset might be updateable.
> 
> Strictly speaking, result sets are never updatable, because 
> there's no way you can refer to a result set and tell the 
> system to update it.  So let's hear what you *really* need 
> and then consider interfaces for *that*. Maybe updatable 
> views or updatable cursors?

Well what I *really* need has been made quite clear in other posts, but,
when I say resultset in the same sentence as pgAdmin, I'm referring to
the ability to enter an arbitrary SQL query, have the results displayed
in a grid, which can then be editted. To do this pgAdmin needs to be
able to figure out enough info about the source of the data to generate
the required insert/update/delete statements.

It also happens that ODBC, JDBC etc, also need the same information to
meet their specs.

Regards, Dave.

Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:

> -----Original Message-----
> From: Zeugswetter Andreas SB SD [mailto:ZeugswetterA@spardat.at] 
> Sent: 12 March 2003 09:50
> To: Hiroshi Inoue; Tom Lane
> Cc: Bruce Momjian; pgsql-hackers@postgresql.org; 
> pgsql-interfaces@postgresql.org
> Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign
> 
> > The ODBC function SQLDescribeCol or SQLColAttribute
> > requires various kind of fields' info.
> 
> RETCODE SQL_API SQLDescribeCol(
>     HSTMT       hstmt,
>     UWORD       icol,
>     UCHAR  FAR *szColName,
>     SWORD       cbColNameMax,
>     SWORD  FAR *pcbColName,
>     SWORD  FAR *pfSqlType,
>     UDWORD FAR *pcbColDef,
>     SWORD  FAR *pibScale,
>     SWORD  FAR *pfNullable);
> 
> So you see it does not need tablename or schema.

No, but with them we can avoid cluttering the wire protocol with fields
for all this, and the JDBC required data. With 2 numeric columns
(attrelid, attnum), any application/interface can query the system
catalogs easily for whatever extra info they like.

Regards, Dave.

Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:

> -----Original Message-----
> From: Merlin Moncure [mailto:merlin.moncure@rcsonline.com] 
> Sent: 12 March 2003 13:35
> To: Dave Page
> Cc: pgsql-hackers@postgresql.org
> Subject: RE: [HACKERS] Roadmap for FE/BE protocol redesign
> 
> 
> > values. That code is just plain nasty in VB. In pgAdmin III we've
> already
> > mentioned stealing bits of the PostgreSQL parser.
> 
> Do not be swayed by the dark side.  In my spare time I threw 
> together a
> 'proof of concept' replacement for the technology used in 
> pgAdmin.   It
> is written in C++.  In three seconds I can query a table and 
> put 100000 records in a grid.  I plan to finish it and 
> release it.  The main reason not to use it is that it relies 
> on commercial tools to build.  
> 
> If you or anybody else is intereted, let me know and I'll 
> send it your way.

You do realise I'm the pgAdmin project lead? Anyway, the 'technology' in
pgAdmin II is ADO/ODBC which I quite agree doesn't play well with that
many records - but then, not many humans do either.

In pgAdmin III we use libpq directly from C++. If you can speed up that
then I'm sure there will be more people than just me that are interested
:-)

The code I'm actually referring too in this thread, is that which
decides whether and how the results from a given query can be updated.
That would be vastly simplified is I could easily locate the
pg_attribute row for each column.

Regards, Dave.

Re: Roadmap for FE/BE protocol redesign

From
Dave Cramer
Date:
This may be problematic in the future if we change attrelid, attnum.

My preference would be to be able to query the backend for the
information if I need it. It is rarely required.

ie give me the meta data for the last resultset.

Dave

On Wed, 2003-03-12 at 09:49, Dave Page wrote:
> > -----Original Message-----
> > From: Zeugswetter Andreas SB SD [mailto:ZeugswetterA@spardat.at] 
> > Sent: 12 March 2003 09:50
> > To: Hiroshi Inoue; Tom Lane
> > Cc: Bruce Momjian; pgsql-hackers@postgresql.org; 
> > pgsql-interfaces@postgresql.org
> > Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign
> > 
> > > The ODBC function SQLDescribeCol or SQLColAttribute
> > > requires various kind of fields' info.
> > 
> > RETCODE SQL_API SQLDescribeCol(
> >     HSTMT       hstmt,
> >     UWORD       icol,
> >     UCHAR  FAR *szColName,
> >     SWORD       cbColNameMax,
> >     SWORD  FAR *pcbColName,
> >     SWORD  FAR *pfSqlType,
> >     UDWORD FAR *pcbColDef,
> >     SWORD  FAR *pibScale,
> >     SWORD  FAR *pfNullable);
> > 
> > So you see it does not need tablename or schema.
> 
> No, but with them we can avoid cluttering the wire protocol with fields
> for all this, and the JDBC required data. With 2 numeric columns
> (attrelid, attnum), any application/interface can query the system
> catalogs easily for whatever extra info they like.
> 
> Regards, Dave.
-- 
Dave Cramer <Dave@micro-automation.net>
-- 
Dave Cramer <dave@fastcrypt.com>
Cramer Consulting



Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:

> -----Original Message-----
> From: Dave Cramer [mailto:Dave@micro-automation.net] 
> Sent: 12 March 2003 14:55
> To: Dave Page
> Cc: Zeugswetter Andreas SB SD; Hiroshi Inoue; Tom Lane; Bruce 
> Momjian; pgsql-hackers@postgresql.org; pgsql-interfaces@postgresql.org
> Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign
> 
> 
> This may be problematic in the future if we change attrelid, attnum.

I suggested using names to Tom for this reason, but he preferred to use
attrelid/attnum.

> My preference would be to be able to query the backend for 
> the information if I need it. It is rarely required.

Rarely required by most applications, but most interfaces do need this
ability, and so do apps like pgAdmin and OpenOffice that allow the
execution of arbitrary queries and subsequent updating of the results.

Regards, Dave.

Re: Roadmap for FE/BE protocol redesign

From
Alvaro Herrera
Date:
On Wed, Mar 12, 2003 at 03:02:23PM -0000, Dave Page wrote:

> > This may be problematic in the future if we change attrelid, attnum.
> 
> I suggested using names to Tom for this reason, but he preferred to use
> attrelid/attnum.

Oh, and what happenned to the attlognum idea?  If something that needs
it is going to be implemented the column should probably be added now
and used instead of attnum.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"In a specialized industrial society, it would be a disaster
to have kids running around loose." (Paul Graham)


Re: Roadmap for FE/BE protocol redesign

From
Dave Cramer
Date:
On Wed, 2003-03-12 at 10:02, Dave Page wrote:
> > -----Original Message-----
> > From: Dave Cramer [mailto:Dave@micro-automation.net] 
> > Sent: 12 March 2003 14:55
> > To: Dave Page
> > Cc: Zeugswetter Andreas SB SD; Hiroshi Inoue; Tom Lane; Bruce 
> > Momjian; pgsql-hackers@postgresql.org; pgsql-interfaces@postgresql.org
> > Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign
> > 
> > 
> > This may be problematic in the future if we change attrelid, attnum.
> 
> I suggested using names to Tom for this reason, but he preferred to use
> attrelid/attnum.

Ok, but I am willing to bet that this will be a problem later. However
we can deal with it later, as well.

> 
> > My preference would be to be able to query the backend for 
> > the information if I need it. It is rarely required.
> 
> Rarely required by most applications, but most interfaces do need this
> ability, and so do apps like pgAdmin and OpenOffice that allow the
> execution of arbitrary queries and subsequent updating of the results.

Yes, the interface requires it but how often? If I just do a select *
from foo, I don't need all the metadata, it's only if I do getColumnName
or something that I need metadata.

However qualified column names would be nice, as in the instance when
two column names are the same from different tables.
> 
> Regards, Dave.
-- 
Dave Cramer <dave@fastcrypt.com>
Cramer Consulting



Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
"Dave Page" <dpage@vale-housing.co.uk> writes:
> No, but with them we can avoid cluttering the wire protocol with fields
> for all this, and the JDBC required data. With 2 numeric columns
> (attrelid, attnum), any application/interface can query the system
> catalogs easily for whatever extra info they like.

This is my feeling also.  We shouldn't try to guess in the protocol
exactly what set of information will be wanted by a frontend; we should
just provide the catalog keys needed to look up whatever is wanted.

I like attrelid/attnum because (a) it's compact --- not much added
overhead for apps that don't care; and (b) it's a primary key, which
names are not.  The relid/attnum identity of a column will remain the
same as long as it exists.  A name-based API breaks the moment some
other user renames things; similar objections could be raised against
attlognum, if we had it.

The point about apps breaking if the system catalog layout changes is a
fair one, but that situation would be no worse than it's ever been; ODBC
and JDBC have coped well enough so far.  In any case the best solution
would be to offer functions that hide the details of where to find any
given bit of info.  We've got a few like that already (format_type() for
example) and I have no problem with adding more.
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Barry Lind <blind@xythos.com> writes:
> One addition I would personally like to see (it comes up in my apps 
> code) is the ability to detect wheather the server is big endian or 
> little endian.  When using binary cursors this is necessary in order to 
> read int data.

Actually, my hope is to eliminate that business entirely by
standardizing the on-the-wire representation for binary data; note the
reference to send/receive routines in the original message.  For integer
data this is simple enough: network byte order will be it.  I'm not sure
yet what to do about float data.

> 2) Better support for domains.  Currently the jdbc driver is broken with 
> regards to domains (although no one has reported this yet).  The driver 
> will treat a datatype that is a domain as an unknown/unsupported 
> datatype.  It would be great if the T response included the 'base' 
> datatype for a domain attribute so that the driver would know what 
> parsing routines to call to convert to/from the text representation the 
> backend expects.

I'm unconvinced that we need do this in the protocol, as opposed to
letting the client figure it out with metadata inquiries.  If we should,
I'd be inclined to just replace the typeid field with the base typeid,
and not mention the domain to the frontend at all.  Comments?

> So I would request the ability of the client to set a max rows parameter 
>    for query results.  If a query were to return more than the max 
> number of rows, the client would be given a handle (essentially a cursor 
> name) that it could use to fetch additional sets of rows.

How about simply erroring out if the query returns more than X rows?

> 4) Protocol level support of PREPARE.  In jdbc and most other 
> interfaces, there is support for parameterized SQL.  If you want to take 
> advantage of the performance benefits of reusing parsed plans you have 
> to use the PREPARE SQL statement.

This argument seems self-contradictory to me.  There is no such benefit
unless you're going to re-use the statement many times.  Nor do I see
how pushing PREPARE down to the protocol level will create any
improvement in its performance.

> So what I would like to see is the ability for the client to set a MAX 
> VALUE size parameter.  The server would send up to this amount of data 
> for any column.  If the value was longer than MAX VALUE, the server 
> would respond with a handle that the client could use to get the rest of 
> the value (in chunks of MAX VALUE) if it wanted to.

I don't think I want to embed this in the protocol, either; especially
not when we don't have even the beginnings of backend support for it.
I think such a feature should be implemented and proven as callable
functions first, and then we could think about pushing it down into the
protocol.

> 6)  Better over the wire support for bytea.  The current encoding of 
> binary data \000 results in a significant expansion in the size of data 
> transmitted.  It would be nice if bytea data didn't result in 2 or 3 
> times data expansion.

AFAICS the only context where this could make sense is binary
transmission of parameters for a previously-prepared statement.  We do
have all the pieces for that on the roadmap.
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Peter Eisentraut
Date:
Dave Page writes:

> Well what I *really* need has been made quite clear in other posts, but,
> when I say resultset in the same sentence as pgAdmin, I'm referring to
> the ability to enter an arbitrary SQL query, have the results displayed
> in a grid, which can then be editted. To do this pgAdmin needs to be
> able to figure out enough info about the source of the data to generate
> the required insert/update/delete statements.

Right.  But since you can't really write a literal SQL statement that does
an update that refers to a previous query, you are already doing a fair
amount of internal magic anyway, so if the meta-data is determined by
magic as well, that seems consistent.

What you need is an updateable cursor on the server side.  It has all the
facilities you need, including standardized ways to find out the
updatability metadata.  Please concentrate on that and do not attempt to
clutter the wire protocol with data that will not withstand a throrough
investigation of semantics.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> What you need is an updateable cursor on the server side.  It has all the
> facilities you need, including standardized ways to find out the
> updatability metadata.  Please concentrate on that and do not attempt to
> clutter the wire protocol with data that will not withstand a throrough
> investigation of semantics.

So are you voting against adding any attribute-ID info to
RowDescription?  While I'm not that thrilled with it myself, it seems
relatively harmless as long as we can keep the overhead down.  I'm
okay with attrelid/attnum, but would gripe about including more than that.

You're probably right that updatable cursors would be a nicer solution
in the long run, but I don't really see that happening in the next
release or two.
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Dave Cramer
Date:
On Wed, 2003-03-12 at 12:46, Peter Eisentraut wrote:
> Dave Page writes:
> 
> > Well what I *really* need has been made quite clear in other posts, but,
> > when I say resultset in the same sentence as pgAdmin, I'm referring to
> > the ability to enter an arbitrary SQL query, have the results displayed
> > in a grid, which can then be editted. To do this pgAdmin needs to be
> > able to figure out enough info about the source of the data to generate
> > the required insert/update/delete statements.
> 
> Right.  But since you can't really write a literal SQL statement that does
> an update that refers to a previous query, you are already doing a fair
> amount of internal magic anyway, so if the meta-data is determined by
> magic as well, that seems consistent.
> 
> What you need is an updateable cursor on the server side.  It has all the
> facilities you need, including standardized ways to find out the
> updatability metadata.  Please concentrate on that and do not attempt to
> clutter the wire protocol with data that will not withstand a throrough
> investigation of semantics.

And I have offered to pay for this work to be done. Someone?
-- 
Dave Cramer <dave@fastcrypt.com>
Cramer Consulting



Re: Roadmap for FE/BE protocol redesign

From
"Merlin Moncure"
Date:
Peter Eisentraut writes:
> Dave Page writes:
>
> > Well what I *really* need has been made quite clear in other posts,
but,
> > when I say resultset in the same sentence as pgAdmin, I'm referring
to
> > the ability to enter an arbitrary SQL query, have the results
displayed
> > in a grid, which can then be editted. To do this pgAdmin needs to be
> > able to figure out enough info about the source of the data to
generate
> > the required insert/update/delete statements.
>
> Right.  But since you can't really write a literal SQL statement that
does
> an update that refers to a previous query, you are already doing a
fair
> amount of internal magic anyway, so if the meta-data is determined by
> magic as well, that seems consistent.

While this may be true, it is possible to build a client side system
that can do this for you.  Views and cursors are great, but they are not
always the best tool for the job.
>
> What you need is an updateable cursor on the server side.  It has all
the
> facilities you need, including standardized ways to find out the
> updatability metadata.  Please concentrate on that and do not attempt
to
> clutter the wire protocol with data that will not withstand a
throrough
> investigation of semantics.

It's not foolproof and may even be foolhardy, but there are certain
advantages to client-side decision making.  A couple of integers or so
for each attribute is not a terribly high price to pay.  If a compelling
case can be made that it can be put to good use, why not do it?

Merlin


Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:

> -----Original Message-----
> From: Peter Eisentraut [mailto:peter_e@gmx.net] 
> Sent: 12 March 2003 17:46
> To: Dave Page
> Cc: Tom Lane; PostgreSQL Development
> Subject: RE: [HACKERS] Roadmap for FE/BE protocol redesign
> 
> 
> What you need is an updateable cursor on the server side.  It 
> has all the facilities you need, including standardized ways 
> to find out the updatability metadata.  Please concentrate on 
> that and do not attempt to clutter the wire protocol with 
> data that will not withstand a throrough investigation of semantics.

That may be true for pgAdmin III, (though it probably won't help through
an ODBC/ADO interface as pgAdmin II uses), but it doesn't help ODBC,
JDBC, dotNET DP etc. provide the metadata on resultsets that their specs
require them to.

What prompted me to think about this issue (again), is that the ODBC
driver incorrectly reports columns in a resultset as nullable, because
it has no way of telling otherwise. This issue breaks OpenOffice which
is using perfectly valid apis. pgAdmin III just happens to be able to
benefit from such an update as well.

Regards, Dave.

Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Greg Stark
Date:
"Marc G. Fournier" <scrappy@hub.org> writes:

> Personally ... as long as a v8.x client can talk to a v7.x backend, you
> have my vote ... I'm more apt to upgrade my clients before my servers
> anyway ...

Surely that's not true for a production environment. You have one database but
potentially dozens of various programs around that access it. The main
application, some backend scripts for batch jobs, your backup process, your
monitoring systems... Not all of these are necessarily on the same machine.

It's upgrading the database that's likely to be the driving motivation for new
sql or storage features. People usually don't get excited about upgrading the
client libraries :)

--
greg



Re: Roadmap for FE/BE protocol redesign

From
Hiroshi Inoue
Date:
Peter Eisentraut wrote:
> 
> Dave Page writes:
> 
> > Well what I *really* need has been made quite clear in other
> > posts, but, when I say resultset in the same sentence as
> > pgAdmin, I'm referring to the ability to enter an arbitrary
> > SQL query, have the results displayed in a grid, which can
> > then be editted. To do this pgAdmin needs to be able to
> > figure out enough info about the source of the data to generate
> > the required insert/update/delete statements.
> 
> Right.  But since you can't really write a literal SQL statement
> that does an update that refers to a previous query, you are
> already doing a fair amount of internal magic anyway, so if the
> meta-data is determined by magic as well, that seems consistent.

Psqlodbc driver has to parse the queries in order to
implement driver side updatable cursors unwillingly.
I'm very suspicios if it should be the driver's job
because it's very hard and ineffective to parse and
analyze the queries in the same way as the backend does.
> What you need is an updateable cursor on the server side.
> It has all the facilities you need,

Really ? How did you confirm it ?

> including standardized ways to find out the
> updatability metadata.  Please concentrate on that and do not attempt to
> clutter the wire protocol with data that will not withstand a throrough
> investigation of semantics.

regards,
Hiroshi Inouehttp://www.geocities.jp/inocchichichi/psqlodbc/


Re: Roadmap for FE/BE protocol redesign

From
"Christopher Kings-Lynne"
Date:
> > I suggested using names to Tom for this reason, but he preferred to use
> > attrelid/attnum.
>
> Oh, and what happenned to the attlognum idea?  If something that needs
> it is going to be implemented the column should probably be added now
> and used instead of attnum.

Weeeell, it'd be nice, but I can see ways of doing ALTER COLUMN/SET TYPE
without it, so...

(My work on that has been stalled for a while BTW, due to sheer difficulty!)

Chris



Re: Roadmap for FE/BE protocol redesign

From
"Christopher Kings-Lynne"
Date:
> > One addition I would personally like to see (it comes up in my apps
> > code) is the ability to detect wheather the server is big endian or
> > little endian.  When using binary cursors this is necessary in order to
> > read int data.
>
> Actually, my hope is to eliminate that business entirely by
> standardizing the on-the-wire representation for binary data; note the
> reference to send/receive routines in the original message.  For integer
> data this is simple enough: network byte order will be it.  I'm not sure
> yet what to do about float data.

Is that something that the 'XDR' spec deals with?  (Thinking back to 3rd
year networking unit)...

Chris



Re: Roadmap for FE/BE protocol redesign

From
Dave Cramer
Date:
On Wed, 2003-03-12 at 20:45, Hiroshi Inoue wrote:
> Peter Eisentraut wrote:
> > 
> > Dave Page writes:
> > 
> > > Well what I *really* need has been made quite clear in other
> > > posts, but, when I say resultset in the same sentence as
> > > pgAdmin, I'm referring to the ability to enter an arbitrary
> > > SQL query, have the results displayed in a grid, which can
> > > then be editted. To do this pgAdmin needs to be able to
> > > figure out enough info about the source of the data to generate
> > > the required insert/update/delete statements.
> > 
> > Right.  But since you can't really write a literal SQL statement
> > that does an update that refers to a previous query, you are
> > already doing a fair amount of internal magic anyway, so if the
> > meta-data is determined by magic as well, that seems consistent.
> 
> Psqlodbc driver has to parse the queries in order to
> implement driver side updatable cursors unwillingly.
> I'm very suspicios if it should be the driver's job
> because it's very hard and ineffective to parse and
> analyze the queries in the same way as the backend does.

jdbc has to do this too, and the backend is in a much better position to
do the parsing IMO as well.
>  
> > What you need is an updateable cursor on the server side.
> > It has all the facilities you need,
> 
> Really ? How did you confirm it ?
> 
> > including standardized ways to find out the
> > updatability metadata.  Please concentrate on that and do not attempt to
> > clutter the wire protocol with data that will not withstand a throrough
> > investigation of semantics.
> 
> regards,
> Hiroshi Inoue
>     http://www.geocities.jp/inocchichichi/psqlodbc/
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
> 
> http://www.postgresql.org/docs/faqs/FAQ.html
-- 
Dave Cramer <dave@fastcrypt.com>
Cramer Consulting



Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> It's upgrading the database that's likely to be the driving motivation
> for new sql or storage features. People usually don't get excited
> about upgrading the client libraries :)

Usually not.  This cycle might be different though, if we are able to
finish the proposed improvements in error reporting and other issues
that are handicapping clients.  None of that work will help un-upgraded
clients...
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Hiroshi Inoue
Date:
Tom Lane wrote:
> 
> "Dave Page" <dpage@vale-housing.co.uk> writes:
> > No, but with them we can avoid cluttering the wire protocol with fields
> > for all this, and the JDBC required data. With 2 numeric columns
> > (attrelid, attnum), any application/interface can query the system
> > catalogs easily for whatever extra info they like.
> 
> This is my feeling also.  We shouldn't try to guess in the protocol
> exactly what set of information will be wanted by a frontend; we should
> just provide the catalog keys needed to look up whatever is wanted.

Does looking up by the catalog keys take no cost ?

regards,
Hiroshi Inouehttp://www.geocities.jp/inocchichichi/psqlodbc/


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Hannu Krosing
Date:
Greg Stark kirjutas K, 12.03.2003 kell 07:10:
> "Marc G. Fournier" <scrappy@hub.org> writes:
> 
> > Personally ... as long as a v8.x client can talk to a v7.x backend, you
> > have my vote ... I'm more apt to upgrade my clients before my servers
> > anyway ...
> 
> Surely that's not true for a production environment. You have one database but
> potentially dozens of various programs around that access it. The main
> application, some backend scripts for batch jobs, your backup process, your
> monitoring systems... Not all of these are necessarily on the same machine.

For more radical protocol changes a viable approach could be "protocol
proxies", i.e. set up a _separate_ daemon which listens on a separate
port and translates v7.x wire protocol to v8.x of the database proper.

Then those needing it can keep it around and those who need it not don't
get the overhead. It could also be maintained by inerested parties long
after being dropped by core developers.

> It's upgrading the database that's likely to be the driving motivation for new
> sql or storage features. People usually don't get excited about upgrading the
> client libraries :)

But our SQL itself is slowly drifting towards ANSI/ISO compliance and
that has often brought subtle changes that break _applications_. It is
not a big issue to changes libraries if you have to change the
application anyway.

-----------------
Hannu



Re: Roadmap for FE/BE protocol redesign

From
Hannu Krosing
Date:
Tom Lane kirjutas K, 12.03.2003 kell 18:19:
> Barry Lind <blind@xythos.com> writes:
> > One addition I would personally like to see (it comes up in my apps 
> > code) is the ability to detect wheather the server is big endian or 
> > little endian.  When using binary cursors this is necessary in order to 
> > read int data.
> 
> Actually, my hope is to eliminate that business entirely by
> standardizing the on-the-wire representation for binary data; note the
> reference to send/receive routines in the original message.  For integer
> data this is simple enough: network byte order will be it.  I'm not sure
> yet what to do about float data.

Use IEEE floats or just report the representation in startup packet.

the X11 protocol does this for all data, even integers - the client
expresses a wish what it wants and the server tells it what it gets (so
two intel boxes need not to convert to "network byte order" at both
ends).

> > 2) Better support for domains.  Currently the jdbc driver is broken with 
> > regards to domains (although no one has reported this yet).  The driver 
> > will treat a datatype that is a domain as an unknown/unsupported 
> > datatype.  It would be great if the T response included the 'base' 
> > datatype for a domain attribute so that the driver would know what 
> > parsing routines to call to convert to/from the text representation the 
> > backend expects.
> 
> I'm unconvinced that we need do this in the protocol, as opposed to
> letting the client figure it out with metadata inquiries.  If we should,
> I'd be inclined to just replace the typeid field with the base typeid,
> and not mention the domain to the frontend at all.  Comments?
> 
> > So I would request the ability of the client to set a max rows parameter 
> >    for query results.  If a query were to return more than the max 
> > number of rows, the client would be given a handle (essentially a cursor 
> > name) that it could use to fetch additional sets of rows.
> 
> How about simply erroring out if the query returns more than X rows?

Or just using prepare/execute - fetch - fetch - fetch ...

> > 4) Protocol level support of PREPARE.  In jdbc and most other 
> > interfaces, there is support for parameterized SQL.  If you want to take 
> > advantage of the performance benefits of reusing parsed plans you have 
> > to use the PREPARE SQL statement.
> 
> This argument seems self-contradictory to me.  There is no such benefit
> unless you're going to re-use the statement many times.  Nor do I see
> how pushing PREPARE down to the protocol level will create any
> improvement in its performance.

I suspect that he actually means support for binary transmission of
parameters for a previously-prepared statement here.

> > So what I would like to see is the ability for the client to set a MAX 
> > VALUE size parameter.  The server would send up to this amount of data 
> > for any column.  If the value was longer than MAX VALUE, the server 
> > would respond with a handle that the client could use to get the rest of 
> > the value (in chunks of MAX VALUE) if it wanted to.
>
> I don't think I want to embed this in the protocol, either; especially
> not when we don't have even the beginnings of backend support for it.
> I think such a feature should be implemented and proven as callable
> functions first, and then we could think about pushing it down into the
> protocol.

IIRC, Oracle has such a feature in its support for Large Objects (LONG
datatype). If the object data is longer than xxx bytes you will need
special ized access to it.

also when stepping with single fetches, you will always get handles for
LONG objects, if fetching more than one row you'll get raw data. 

BTW, I'm not advocating such behaviour .


----------------
Hannu



Re: Roadmap for FE/BE protocol redesign

From
Hannu Krosing
Date:
Alvaro Herrera kirjutas K, 12.03.2003 kell 17:10:
> On Wed, Mar 12, 2003 at 03:02:23PM -0000, Dave Page wrote:
> 
> > > This may be problematic in the future if we change attrelid, attnum.
> > 
> > I suggested using names to Tom for this reason, but he preferred to use
> > attrelid/attnum.
> 
> Oh, and what happenned to the attlognum idea?  If something that needs
> it is going to be implemented the column should probably be added now
> and used instead of attnum.

I guess that attlognum is _not_ needed here. The only need for attlognum
would be for automatic column ordering when doing "select * " and 
"\d table", for all other purposes attnum is much more reliable.

-----------------
Hannu



Re: Roadmap for FE/BE protocol redesign

From
Sean Chittenden
Date:
> > One addition I would personally like to see (it comes up in my
> > apps code) is the ability to detect wheather the server is big
> > endian or little endian.  When using binary cursors this is
> > necessary in order to read int data.
>
> Actually, my hope is to eliminate that business entirely by
> standardizing the on-the-wire representation for binary data; note
> the reference to send/receive routines in the original message.  For
> integer data this is simple enough: network byte order will be it.
> I'm not sure yet what to do about float data.

When were talking sending data across the wire, are we talking about a
format that would let the server use sendfile() for sending the data
to the client? Having a database that can send data to the client
efficiently would be a nice change of pace given most databases since
RDBMSs are notoriously slow (slower than NFS) at sending files to
clients.  -sc

--
Sean Chittenden

Re: Roadmap for FE/BE protocol redesign

From
Mike Mascari
Date:
Hannu Krosing wrote:
> Tom Lane kirjutas K, 12.03.2003 kell 18:19:
>
>>Actually, my hope is to eliminate that business entirely by
>>standardizing the on-the-wire representation for binary data; note the
>>reference to send/receive routines in the original message.  For integer
>>data this is simple enough: network byte order will be it.  I'm not sure
>>yet what to do about float data.
> 
> 
> Use IEEE floats or just report the representation in startup packet.
> 
> the X11 protocol does this for all data, even integers - the client
> expresses a wish what it wants and the server tells it what it gets (so
> two intel boxes need not to convert to "network byte order" at both
> ends).

IIOP/CDR behaves similarly for performance reasons- "receiver 
makes it right". It also defines a representation for all of the 
CORBA idl basic types, wide characters, fixed-point types, 
structures, etc. A far-reaching, wild suggestion would be to 
replace the postmaster with a CORBA-based server process with a 
well defined interface. At a minimum, if a binary protocol is 
the ultimate destination, perhaps some of the mapping of various 
types could be borrowed from the specs.

Mike Mascari
mascarm@mascari.com




Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:
It's rumoured that Hiroshi Inoue once said:
> Tom Lane wrote:
>>
>> "Dave Page" <dpage@vale-housing.co.uk> writes:
>> > No, but with them we can avoid cluttering the wire protocol with
>> > fields for all this, and the JDBC required data. With 2 numeric
>> > columns (attrelid, attnum), any application/interface can query the
>> > system catalogs easily for whatever extra info they like.
>>
>> This is my feeling also.  We shouldn't try to guess in the protocol
>> exactly what set of information will be wanted by a frontend; we
>> should just provide the catalog keys needed to look up whatever is
>> wanted.
>
> Does looking up by the catalog keys take no cost ?

Obviously there is cost, but doing a lookup only on demand, has got to be
cheaper in the long run than including the entire column definition in the
message whether it's wanted or not?
Regards, Dave.




Re: Roadmap for FE/BE protocol redesign

From
Hiroshi Inoue
Date:
Dave Page wrote:
> 
> It's rumoured that Hiroshi Inoue once said:
> > Tom Lane wrote:
> >>
> >> "Dave Page" <dpage@vale-housing.co.uk> writes:
> >> > No, but with them we can avoid cluttering the wire protocol with
> >> > fields for all this, and the JDBC required data. With 2 numeric
> >> > columns (attrelid, attnum), any application/interface can query the
> >> > system catalogs easily for whatever extra info they like.
> >>
> >> This is my feeling also.  We shouldn't try to guess in the protocol
> >> exactly what set of information will be wanted by a frontend; we
> >> should just provide the catalog keys needed to look up whatever is
> >> wanted.
> >
> > Does looking up by the catalog keys take no cost ?
> 
> Obviously there is cost, but doing a lookup only on demand, has got to be
> cheaper in the long run than including the entire column definition in the
> message whether it's wanted or not?

So if there are 100 fields, should we ask the backend 
the column name 100 times ?

regards,
Hiroshi Inouehttp://www.geocities.jp/inocchichichi/psqlodbc/


Re: Roadmap for FE/BE protocol redesign

From
Christopher Kings-Lynne
Date:
> > Obviously there is cost, but doing a lookup only on demand, has got to be
> > cheaper in the long run than including the entire column definition in the
> > message whether it's wanted or not?
>
> So if there are 100 fields, should we ask the backend
> the column name 100 times ?

No, you do a single select that returns 100 results...

Then you cache them in your frontend...

Chris




Re: Roadmap for FE/BE protocol redesign

From
Hannu Krosing
Date:
Hiroshi Inoue kirjutas N, 13.03.2003 kell 12:03:
> Dave Page wrote:
> > 
> > > Does looking up by the catalog keys take no cost ?
> > 
> > Obviously there is cost, but doing a lookup only on demand, has got to be
> > cheaper in the long run than including the entire column definition in the
> > message whether it's wanted or not?
> 
> So if there are 100 fields, should we ask the backend 
> the column name 100 times ?

or once for all columns if you prefer using IN.

> regards,
> Hiroshi Inoue
>     http://www.geocities.jp/inocchichichi/psqlodbc/
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org


Re: Roadmap for FE/BE protocol redesign

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Christopher Kings-Lynne [mailto:chriskl@familyhealth.com.au] 
> 
> > > Obviously there is cost, but doing a lookup only on 
> demand, has got to be
> > > cheaper in the long run than including the entire column 
> definition in the
> > > message whether it's wanted or not?
> >
> > So if there are 100 fields, should we ask the backend
> > the column name 100 times ?
> 
> No, you do a single select that returns 100 results...

Does a single select mean a fast select ?

regards,
Hiroshi Inoue 



Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
"Dave Page" <dpage@vale-housing.co.uk> writes:
> It's rumoured that Hiroshi Inoue once said:
>> Does looking up by the catalog keys take no cost ?

> Obviously there is cost, but doing a lookup only on demand, has got to be
> cheaper in the long run than including the entire column definition in the
> message whether it's wanted or not?

More to the point, the cost is paid by applications that want the
functionality, and not by those that don't.

It'd probably be reasonable for client libraries to maintain a cache
of column info, so that they only have to query the backend about a
particular column ID once per connection.  (This is one reason why
I want attrelid/attnum to be the info given on the wire: it's stable
enough to be used over a whole session to index such a cache.  Names
could not safely be used that way.)
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Barry Lind <blind@xythos.com> writes:
> 
> > 4) Protocol level support of PREPARE.  In jdbc and most other 
> > interfaces, there is support for parameterized SQL.  If you want to take 
> > advantage of the performance benefits of reusing parsed plans you have 
> > to use the PREPARE SQL statement.
> 
> This argument seems self-contradictory to me.  There is no such benefit
> unless you're going to re-use the statement many times.  Nor do I see
> how pushing PREPARE down to the protocol level will create any
> improvement in its performance.

"you're going to re-use the statement many times" is true (or should be true)
for every statement in every web site and other OLTP system. Even if the query
appears on only a single web page and is executed only once on that web page,
the nature of high volume web sites is that that page will be executed
hundreds or thousands of times per minute.

This is why the Perl DBI, for example, has a prepare_cached() which provides a
automatic caching of prepared handles. With Oracle I was able to use this
exclusively on a large high volume web site to keep thousands of prepared
handles. Every query was prepared only once per apache process.

There is a performance benefit to using placeholders and prepared queries in
that the plan doesn't need to be regenerated repeatedly. Ideally every query
should either be a big DSS query where the time spent in the optimizer is
irrelevant, or an OLTP transaction using placeholders where again the time
spent in the optimizer is irrelevant because it only needs to be run once.

This would allow the optimizer to grow in complexity. For example it could
explore both sides of the decision tree in places where now we have heuristics
to pick the probable better plan. Postgres's optimizer is pretty impressive
currently, but the constant attention to avoiding high cost optimizations
limits it.

There is also a security benefit. The idea of mixing parameters into the
queries even at the driver level gives me the willies. The database then has
to parse them back out of the query string. If there's a bug in the driver or
any kind of mismatch between the backend parser and the driver quoting then
there could be security holes.

--
greg



Re: Roadmap for FE/BE protocol redesign

From
cbbrowne@cbbrowne.com
Date:
> > No, you do a single select that returns 100 results...
> 
> Does a single select mean a fast select ?

Provably always?  Perhaps not.

But requesting data via a single select statement returning 100 rows
means that you only have to submit the request once, and it means that
the result set only has to be set up once, which is quite likely to save
some work.

If you're going to read 100 lines of data from a file, would it make
more sense to:
a) Open the file 100 times, fseek() to the right spot, and then close   the file, or
b) Open the file once, and do 100 fseek() calls?

If you're just opening a file, there's not a huge amount of overhead
associated with that.  Setting up a DBMS query is quite a bit more
costly, so that the effort looks like:
a) Open a session connection, submit a query for one record, and   process it, then close the connection, and do this
100times,    versus
 
b) Open a session connection, submit a query returning 100 records,   and process them all.

The overhead surrounding submitting and processing the query is quite a
bit more than that involved in opening/closing files, so cutting down on
it should be pretty worthwhile...
--
(reverse (concatenate 'string "gro.gultn@" "enworbbc"))
http://cbbrowne.com/info/nonrdbms.html
"When we understand knowledge-based systems, it will be as before 
-- except our fingertips will have been singed."  -- Alan Perlis


Re: Roadmap for FE/BE protocol redesign

From
Christof Petig
Date:
Barry Lind wrote:
> 3) Protocol level support for CURSORs.  It would be nice if cursor 
> support was done at the protocol level and not as a SQL command.

I want to second this proposal. Currently I avoid using cursors in my 
programs since
a) they need much more logic and _string_concatenation_ to be handled 
transparently by a library (prepend the query with DECLARE X CURSOR 
FOR), then (FETCH n FROM X), then (CLOSE X). That's inefficient.
b) I have a really bad feeling to have the backend parse (FETCH FROM X) 
every time I ask for a (single) row
c) I hate that the backend retransmits column names etc. for every fetch 
I issue. This information is mostly unneeded but the backend cannot know 
better

Of course these issues can be addressed by using FETCH n (n>10) but this  kludge is only needed because the FETCH
protocolis so inefficient. 
 
Think about the amount of bytes transferred for "select 2000 lines of 
integers" with and without declare/fetch/close. Imagine a result set of 
1 to 20000 integers given back (depending on parameters) for an 
interactive program (e.g. browsing a customer list by initials). Prefer 
a cursor (much more constant overhead even for single results) or all in 
one (and wait longer for a first result)?

I'd love to tell the backend to give a "descriptor" for this query back 
and use it efficiently to get data and/or metadata (see ODBC, JDBC, 
sqlda or dynamic sql). Perhaps it's most efficient to ask for N initial 
results (which are instantly returned).
   Christof (who implemented dynamic sql for ecpg)

PS: perhaps this protocol infrastructure is also well suited to return 
large bytea values (<M bytes : return inline, > return a descriptor). 
[Also proposed by Barry Lind.]

PPS: I'm perfectly fine with returning attrelid/attnum. Then the client 
can control how many effort is spent for determining only the asked for 
metadata.




Re: Roadmap for FE/BE protocol redesign

From
"Marc G. Fournier"
Date:
On Tue, 11 Mar 2003, Bruce Momjian wrote:

> Six months would be June 1 beta, so maybe that is still a good target.

We released v7.3 just before Dec 1st, so six months is May 1st, not June
1st ...



Re: Roadmap for FE/BE protocol redesign

From
Barry Lind
Date:

Tom Lane wrote:
> Barry Lind <blind@xythos.com> writes:
> 
>>One addition I would personally like to see (it comes up in my apps 
>>code) is the ability to detect wheather the server is big endian or 
>>little endian.  When using binary cursors this is necessary in order to 
>>read int data.
> 
> 
> Actually, my hope is to eliminate that business entirely by
> standardizing the on-the-wire representation for binary data; note the
> reference to send/receive routines in the original message.  For integer
> data this is simple enough: network byte order will be it.  I'm not sure
> yet what to do about float data.
> 

Great.

> 
>>2) Better support for domains.  Currently the jdbc driver is broken with 
>>regards to domains (although no one has reported this yet).  The driver 
>>will treat a datatype that is a domain as an unknown/unsupported 
>>datatype.  It would be great if the T response included the 'base' 
>>datatype for a domain attribute so that the driver would know what 
>>parsing routines to call to convert to/from the text representation the 
>>backend expects.
> 
> 
> I'm unconvinced that we need do this in the protocol, as opposed to
> letting the client figure it out with metadata inquiries.  If we should,
> I'd be inclined to just replace the typeid field with the base typeid,
> and not mention the domain to the frontend at all.  Comments?
> 

I don't have a strong opinion on this one.  I can live with current 
functionality.  It isn't too much work to look up the base type.

> 
>>So I would request the ability of the client to set a max rows parameter 
>>   for query results.  If a query were to return more than the max 
>>number of rows, the client would be given a handle (essentially a cursor 
>>name) that it could use to fetch additional sets of rows.
> 
> 
> How about simply erroring out if the query returns more than X rows?
> 
This shouldn't be an error condition.  I want to fetch all of the rows, 
I just don't want to have to buffer them all in memory.  Consider the 
following example.  Select statement #1 is 'select id from foo', 
statement #2 is 'update bar set x = y where foo_id = ?'.  The program 
logic issues statement #1 and then starts iterating through the results 
and the issues statement #2 for some of those results.  If statement #1 
returns a large number of rows the program can run out of memory if all 
the rows from #1 need to be buffered in memory.  What would be nice is 
if the protocol allowed getting some rows from #1 but not all so that 
the connection could be used to issue some #2 statements.

> 
>>4) Protocol level support of PREPARE.  In jdbc and most other 
>>interfaces, there is support for parameterized SQL.  If you want to take 
>>advantage of the performance benefits of reusing parsed plans you have 
>>to use the PREPARE SQL statement.
> 
> 
> This argument seems self-contradictory to me.  There is no such benefit
> unless you're going to re-use the statement many times.  Nor do I see
> how pushing PREPARE down to the protocol level will create any
> improvement in its performance.
> 
There is a benefit if you do reuse the statement multiple times.  The 
performance problem is the two round trips minimum to the server that 
are required.  A protocol solution to this would be to allow the client 
to send multiple requests at one time to the server.  But as I type that 
I realize that can already be done, by having multiple semi-colon 
separated SQL commands sent at once.  So I probably have everything I 
need for this already.  I can just cue up the 'deallocate' calls and 
piggyback them on to the next real call to the server.

> 
>>So what I would like to see is the ability for the client to set a MAX 
>>VALUE size parameter.  The server would send up to this amount of data 
>>for any column.  If the value was longer than MAX VALUE, the server 
>>would respond with a handle that the client could use to get the rest of 
>>the value (in chunks of MAX VALUE) if it wanted to.
> 
> 
> I don't think I want to embed this in the protocol, either; especially
> not when we don't have even the beginnings of backend support for it.
> I think such a feature should be implemented and proven as callable
> functions first, and then we could think about pushing it down into the
> protocol.
> 

That is fine.

> 
>>6)  Better over the wire support for bytea.  The current encoding of 
>>binary data \000 results in a significant expansion in the size of data 
>>transmitted.  It would be nice if bytea data didn't result in 2 or 3 
>>times data expansion.
> 
> 
> AFAICS the only context where this could make sense is binary
> transmission of parameters for a previously-prepared statement.  We do
> have all the pieces for that on the roadmap.
> 
Actually it is the select of binary data that I was refering to.  Are 
you suggesting that the over the wire format for bytea in a query result 
will be binary (instead of the ascii encoded text format as it currently 
exists)?

>             regards, tom lane
> 

I am looking forward to all of the protocol changes.

thanks,
--Barry





Re: Roadmap for FE/BE protocol redesign

From
"Marc G. Fournier"
Date:
On Wed, 12 Mar 2003, Justin Clift wrote:

> Marc G. Fournier wrote:
> <snip>
>  >>Would it be feasible to investigate approaches for having the Win32 and
>  >>PITR work be shared amongst a few very-interested volunteers, so that
>  >>people can cover for each other's downtime?  Not sure of the
>  >>confidentiality level of the Win32/PITR patches at present, but I'd
>  >>guess there would be at least a few solid volunteers willing to
>  >>contribute to the Win32/PITR ports if we asked for people to step
>  >>forwards.
>  >
>  > Why should we be the ones to ask for ppl to step forward to volunteer to
>  > help?  Shouldn't it be the responsibility of the developer working on it
>  > to admit that there is no way they will make the scheduale and call for
>  > help?
>
> Hadn't thought of that.  Um.. how about "whatever works"?  It's sounds
> like the kind of thing where some people may be offended if we suddenly
> started asking for extra volunteers for their bits, and others in the
> same situation wouldn't.

The thing is, its not often, but ppl have been pop'ng up on the list
asking for things to do ... we have the TODO list that we point ppl to,
but if someone is working on a project and needs help, that is the perfect
time for them to pop up and try and lure the person over :)

But it also doesn't negate someone from asking for help as well ... hell,
how many ppl are working on something that someone else is working on from
a different angle, that would end up stronger by working together?



Re: Roadmap for FE/BE protocol redesign

From
"Zeugswetter Andreas SB SD"
Date:
> > > Does looking up by the catalog keys take no cost ?
> > 
> > Obviously there is cost, but doing a lookup only on demand, has got to be
> > cheaper in the long run than including the entire column definition in the
> > message whether it's wanted or not?
> 
> So if there are 100 fields, should we ask the backend 
> the column name 100 times ?

Also doesn't the planner/executor already have all needed info available ?
Thus answering from prepare would be a lot cheaper than selecting the descriptor
after prepare (more network roundtrips, parse, plan and execute the descriptor lookup).

If this where at the protocol level, the client could flag what info is wanted,
and thus avoid all unwanted overhead. 

Andreas


Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
> Also doesn't the planner/executor already have all needed info available ?

Not directly, and not necessarily in the form that the client would want
it in (eg, converting type OID to type name isn't free).  I don't care
to load either the backend or the protocol down with the responsibility
for offering every piece of column data that a client could possibly
want as part of RowDescription.

Besides, elsewhere in this thread we were hearing about how
RowDescription is already too much overhead for some people ;-)

To my mind, the argument in favor of this feature is essentially that
it saves ODBC/JDBC from needing to duplicate the backend's SQL parser;
which is a legitimate concern.  But that doesn't translate to saying
that we should push functionality out of the clients and into the
backend when it wouldn't be in the backend otherwise.  That's just
moving code around on the basis of some rather-shaky arguments about
performance.  And what happens when your client wants something
different from the exact functionality that was pushed to the backend?
You're back to square one.
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Barry Lind <blind@xythos.com> writes:
>> AFAICS the only context where this could make sense is binary
>> transmission of parameters for a previously-prepared statement.  We do
>> have all the pieces for that on the roadmap.
>> 
> Actually it is the select of binary data that I was refering to.  Are 
> you suggesting that the over the wire format for bytea in a query result 
> will be binary (instead of the ascii encoded text format as it currently 
> exists)?

See binary cursors ...
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:

> -----Original Message-----
> From: Hiroshi Inoue [mailto:Inoue@tpf.co.jp] 
> Sent: 13 March 2003 10:04
> To: Dave Page
> Cc: tgl@sss.pgh.pa.us; ZeugswetterA@spardat.at; 
> pgman@candle.pha.pa.us; pgsql-hackers@postgresql.org; 
> pgsql-interfaces@postgresql.org
> Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign
> 
> 
> Dave Page wrote:
> > 
> > It's rumoured that Hiroshi Inoue once said:
> > >
> > > Does looking up by the catalog keys take no cost ?
> > 
> > Obviously there is cost, but doing a lookup only on demand, 
> has got to 
> > be cheaper in the long run than including the entire column 
> definition 
> > in the message whether it's wanted or not?
> 
> So if there are 100 fields, should we ask the backend 
> the column name 100 times ?

No, the column name is there already. This is useful for additional info
such as uniqueness, nullability, default value, contraints and so on.
Surely you don't want to get the entire relevant bits of the system
catalogues with every query unless we specifically request it?

Regards, Dave.

Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:

> -----Original Message-----
> From: Zeugswetter Andreas SB SD [mailto:ZeugswetterA@spardat.at] 
> Sent: 13 March 2003 17:07
> To: Hiroshi Inoue; Dave Page
> Cc: tgl@sss.pgh.pa.us; pgman@candle.pha.pa.us; 
> pgsql-hackers@postgresql.org
> Subject: RE: [HACKERS] Roadmap for FE/BE protocol redesign
>
> 
> If this where at the protocol level, the client could flag 
> what info is wanted, and thus avoid all unwanted overhead. 

But the client usually won't know at that point. For example, an ODBC
app executes a query and reads the tuples returned. After that has
occured, the app calls SQLDescribeCol to describe a column in the
resultset...

Regards, Dave.


Re: Roadmap for FE/BE protocol redesign

From
"Andrew Dunstan"
Date:
Couldn't it be done optionally, so the clients that want the info pay the
price and those that don't want it get the speed and lower bandwidth?

Just a thought

andrew

----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>

> "Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
> > Also doesn't the planner/executor already have all needed info available
?
>
> Not directly, and not necessarily in the form that the client would want
> it in (eg, converting type OID to type name isn't free).  I don't care
> to load either the backend or the protocol down with the responsibility
> for offering every piece of column data that a client could possibly
> want as part of RowDescription.
>
> Besides, elsewhere in this thread we were hearing about how
> RowDescription is already too much overhead for some people ;-)
>
> To my mind, the argument in favor of this feature is essentially that
> it saves ODBC/JDBC from needing to duplicate the backend's SQL parser;
> which is a legitimate concern.  But that doesn't translate to saying
> that we should push functionality out of the clients and into the
> backend when it wouldn't be in the backend otherwise.  That's just
> moving code around on the basis of some rather-shaky arguments about
> performance.  And what happens when your client wants something
> different from the exact functionality that was pushed to the backend?
> You're back to square one.
>
> regards, tom lane



Re: Roadmap for FE/BE protocol redesign

From
Barry Lind
Date:

Tom Lane wrote:
> Barry Lind <blind@xythos.com> writes:
> 
>>>AFAICS the only context where this could make sense is binary
>>>transmission of parameters for a previously-prepared statement.  We do
>>>have all the pieces for that on the roadmap.
>>>
>>
>>Actually it is the select of binary data that I was refering to.  Are 
>>you suggesting that the over the wire format for bytea in a query result 
>>will be binary (instead of the ascii encoded text format as it currently 
>>exists)?
> 
> 
> See binary cursors ...

Generally that is not an option.  It either requires users to code to 
postgresql specific sql syntax, or requires the driver to do it 
magically for them.  The later runs into all the issues that I raised on 
cursor support.

In general the jdbc driver is expected to execute arbitrary sql 
statements any application might want to send it.  The driver is 
handicaped because it doesn't know really anything about that sql 
statement (other than it is a select vs an update or delete). 
Specifically it doesn't know what tables or columns that SQL will access  or how many rows a select will return.  All
ofthis knowledge is in 
 
the backend, and short of implementing a full sql parser in java this 
knowledge will never exist in the front end.  Many of the things I put 
on my wish list for the protocol stem from this.

Where there are two ways to do something (use cursors or not, use 
prepared statements or not, use binary cursors or not) the driver either 
needs to a) choose one way and always use it, b) infer from the sql 
statement which way will be better, or c) require the user to tell us. 
The problem with a) is that it may not always be the correct choice. 
The problem with b) is that generally this isn't possible and the 
problem with c) is it requires that the user write code that isn't 
portable across different databases.

I would like to simply do a) in all cases.  But that means that one of 
the two options should always (or almost always) be the best choice.  So 
in the case of "use cursors or not", it would be nice if using cursors 
added little or no overhead such that it could always be used.  In the 
case of "use prepared statements vs not", it would be nice if prepared 
statements added little or no overhead so that they could always be 
used.  And finally in the case of "use binary or regular cursors" it 
would be nice if binary cursors could always be used.

The Oracle SQLNet protocol supports most of this.  Though it has been a 
few years since I worked with it, the oracle protocol has many of the 
features I am looking for (and perhaps the reason I am looking for them, 
is that I have seen them used there before).  Essentially the Oracle 
protocol lets you do the following operations:  open, parse, describe, 
bind, execute, fetch, close.  A request from the client to the server 
specifies what operations it wants to perform on a sql statement.  So a 
client could request to do all seven operations (which is essentially 
what the current postgres protocol does today).  Or it could issue an 
open,parse call which essentially is that same thing as the PREPARE sql 
statement, followed by a describe,bind,execute,fetch which is similar to 
an EXECUTE and FETCH sql statement and finally a close which is similar 
to a CLOSE and DEALLOCATE sql.  The describe request is generally only 
done once even though you may do multiple fetchs (unlike todays protocol 
which includes the describe information on every fetch, even if you are 
fetching one row at a time).  The oracle approach gives the client 
complete flexibility to do a lot, without requiring that the client 
start parsing sql statements and doing things like appending on DECLARE 
CURSOR, or FETCH in order to reformate the applications sql statement 
into the postgresql sql way of doing this.

--Barry




Re: Roadmap for FE/BE protocol redesign

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Dave Page
> 
> > -----Original Message-----
> > From: Hiroshi Inoue [mailto:Inoue@tpf.co.jp] 
> > Sent: 13 March 2003 10:04
> > To: Dave Page
> > Cc: tgl@sss.pgh.pa.us; ZeugswetterA@spardat.at; 
> > pgman@candle.pha.pa.us; pgsql-hackers@postgresql.org; 
> > pgsql-interfaces@postgresql.org
> > Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign
> > 
> > 
> > Dave Page wrote:
> > > 
> > > It's rumoured that Hiroshi Inoue once said:
> > > >
> > > > Does looking up by the catalog keys take no cost ?
> > > 
> > > Obviously there is cost, but doing a lookup only on demand, 
> > has got to 
> > > be cheaper in the long run than including the entire column 
> > definition 
> > > in the message whether it's wanted or not?
> > 
> > So if there are 100 fields, should we ask the backend 
> > the column name 100 times ?
> 
> No, the column name is there already. 

The column name isn't there. If a field has its alias name the alias
is there.

regards,
Hiroshi Inoue 



Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Barry Lind <blind@xythos.com> writes:
> Tom Lane wrote:
>> See binary cursors ...

> Generally that is not an option.  It either requires users to code to 
> postgresql specific sql syntax, or requires the driver to do it 
> magically for them.

Fair enough.  I don't see anything much wrong with a GUC option that
says "send SELECT output in binary format".  This is not really a
protocol issue since the ASCII and BINARY choices both exist at the
protocol level --- there is nothing in the protocol saying binary data
can only be returned by FETCH and not by SELECT.  The main problem with
it in present releases is that binary data is architecture-dependent and
so encouraging its general use seems like a really bad idea.  But if we
manage to get send/receive conversion routines in there, most of that
issue would go away.

> The describe request is generally only 
> done once even though you may do multiple fetchs (unlike todays protocol 
> which includes the describe information on every fetch, even if you are 
> fetching one row at a time).

I'm less than excited about changing that, because it breaks clients
that don't want to remember past RowDescriptions (libpq being the
front-line victim), and it guarantees loss-of-synchronization failures
anytime the client misassociates rowdescription with query.  In exchange
for that, we get what exactly?  Fetching one row at a time is
*guaranteed* to be inefficient.  The correct response if that bothers
you is to fetch multiple rows at a time, not to make a less robust
protocol.
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Hiroshi Inoue
Date:
Tom Lane wrote:
> 
> "Dave Page" <dpage@vale-housing.co.uk> writes:
> > It's rumoured that Hiroshi Inoue once said:
> >> Does looking up by the catalog keys take no cost ?
> 
> > Obviously there is cost, but doing a lookup only on demand, has got to be
> > cheaper in the long run than including the entire column definition in the
> > message whether it's wanted or not?
> 
> More to the point, the cost is paid by applications that want the
> functionality, and not by those that don't.
> 
> It'd probably be reasonable for client libraries to maintain a cache
> of column info, so that they only have to query the backend about a
> particular column ID once per connection.

Is it a kind of thing that the server forces the clients
easily ?
Hmm as for PREPAREd statements, it seems much better to
implement functions which returns fields info for the
statement than relying on such a protocol level change.
regards,
Hiroshi Inouehttp://www.geocities.jp/inocchichichi/psqlodbc/


Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> Hmm as for PREPAREd statements, it seems much better to
> implement functions which returns fields info for the
> statement than relying on such a protocol level change.

Well, we're changing the protocol anyway for other purposes, so the
extra burden of a change here doesn't seem large.  I think it's more
useful to worry about what functionality we want than to worry about
whether it's changed...
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Barry Lind
Date:

Tom Lane wrote:
> Barry Lind <blind@xythos.com> writes:
> 
>>Tom Lane wrote:
>>
>>>See binary cursors ...
> 
> 
>>Generally that is not an option.  It either requires users to code to 
>>postgresql specific sql syntax, or requires the driver to do it 
>>magically for them.
> 
> 
> Fair enough.  I don't see anything much wrong with a GUC option that
> says "send SELECT output in binary format".  This is not really a
> protocol issue since the ASCII and BINARY choices both exist at the
> protocol level --- there is nothing in the protocol saying binary data
> can only be returned by FETCH and not by SELECT.  The main problem with
> it in present releases is that binary data is architecture-dependent and
> so encouraging its general use seems like a really bad idea.  But if we
> manage to get send/receive conversion routines in there, most of that
> issue would go away.
> 
That would be great.

> 
>>The describe request is generally only 
>>done once even though you may do multiple fetchs (unlike todays protocol 
>>which includes the describe information on every fetch, even if you are 
>>fetching one row at a time).
> 
> 
> I'm less than excited about changing that, because it breaks clients
> that don't want to remember past RowDescriptions (libpq being the
> front-line victim), and it guarantees loss-of-synchronization failures
> anytime the client misassociates rowdescription with query.  In exchange
> for that, we get what exactly?  Fetching one row at a time is
> *guaranteed* to be inefficient.  The correct response if that bothers
> you is to fetch multiple rows at a time, not to make a less robust
> protocol.
I don't feel strongly either way on this one, but IIRC the SQL standard 
for cursors only specifies fetching one record at a time (at least that 
is how MSSQL and DB2 implement it).  Thus portable code is likely to 
only fetch one record at a time.  The current row description isn't too 
big, but with the changes being suggested it might become so.

thanks,
--Barry





Re: Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Marc G. Fournier wrote:
> On Tue, 11 Mar 2003, Bruce Momjian wrote:
> 
> > Six months would be June 1 beta, so maybe that is still a good target.
> 
> We released v7.3 just before Dec 1st, so six months is May 1st, not June
> 1st ...

Six months is June 1 --- December (1), January-May (5) == 6.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Roadmap for FE/BE protocol redesign

From
Larry Rosenman
Date:

--On Thursday, March 13, 2003 21:32:36 -0500 Bruce Momjian 
<pgman@candle.pha.pa.us> wrote:

> Marc G. Fournier wrote:
>> On Tue, 11 Mar 2003, Bruce Momjian wrote:
>>
>> > Six months would be June 1 beta, so maybe that is still a good target.
>>
>> We released v7.3 just before Dec 1st, so six months is May 1st, not June
>> 1st ...
>
> Six months is June 1 --- December (1), January-May (5) == 6.
>
PostgreSQL agrees:

ler=# select date('2002-12-01') +
ler-# '6 months'::interval;     ?column?
---------------------2003-06-01 00:00:00
(1 row)

ler=#


-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749





Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:
It's rumoured that Hiroshi Inoue once said:
>> -----Original Message-----
>> From: Dave Page
>>
>> > -----Original Message-----
>> > From: Hiroshi Inoue [mailto:Inoue@tpf.co.jp]
>> > Sent: 13 March 2003 10:04
>> > To: Dave Page
>> > Cc: tgl@sss.pgh.pa.us; ZeugswetterA@spardat.at;
>> > pgman@candle.pha.pa.us; pgsql-hackers@postgresql.org;
>> > pgsql-interfaces@postgresql.org
>> > Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign
>> >
>> >
>> > Dave Page wrote:
>> > >
>> > > It's rumoured that Hiroshi Inoue once said:
>> > > >
>> > > > Does looking up by the catalog keys take no cost ?
>> > >
>> > > Obviously there is cost, but doing a lookup only on demand,
>> > has got to
>> > > be cheaper in the long run than including the entire column
>> > definition
>> > > in the message whether it's wanted or not?
>> >
>> > So if there are 100 fields, should we ask the backend
>> > the column name 100 times ?
>>
>> No, the column name is there already.
>
> The column name isn't there. If a field has its alias name the alias is
> there.

Ahh yes, you're right of course. The docs are a little sketchy on that
detail
(http://www.postgresql.org/docs/view.php?version=7.3&idoc=1&file=protocol-message-formats.html).
Regards, Dave




Re: Roadmap for FE/BE protocol redesign

From
Christof Petig
Date:
Tom Lane wrote:
> Barry Lind <blind@xythos.com> writes:
>>The describe request is generally only 
>>done once even though you may do multiple fetchs (unlike todays protocol 
>>which includes the describe information on every fetch, even if you are 
>>fetching one row at a time).
> 
> 
> I'm less than excited about changing that, because it breaks clients
> that don't want to remember past RowDescriptions (libpq being the
> front-line victim), and it guarantees loss-of-synchronization failures
> anytime the client misassociates rowdescription with query.  In exchange
> for that, we get what exactly?  Fetching one row at a time is
> *guaranteed* to be inefficient.  The correct response if that bothers
> you is to fetch multiple rows at a time, not to make a less robust
> protocol.

I don't think that protocol support for cursors should change the 
behavior of executing all seven stages by default. A "FETCH ..." 
commmand would get processed like any other (e.g. "SELECT ...") and 
metadata is sent back, too (which corresponds to decribe stage IIRC).

New programs have the option to use the backwards compatible high level 
access via PQexec(c,"FETCH FROM X") which does all seven steps at once, 
or use the new low level way e.g. PQexec_new(c,"SELECT ...", 
query_parameter_descriptor, what_to_do (*), 
lines_to_return_without_cursor_overhead) which should return at most the 
specified lines and (if needed) a cursor descriptor (most likely an int) 
for subsequent PQfetch and PQclose calls.

I really like the idea of PGresult as an argument (cursor descriptor) 
for PQfetch (instead of an int) because it may even copy the metadata to 
the new PGresult, or perhaps replace the values in the original PGresult 
(if we decide to go this way). [proposed signature: PGresult 
*PQfetch(PGresult*result_of_the_select, how_many_lines, 
perhaps_even_offset/position)]

Additional there should be a PQclose and perhaps a PQprocess(PGresult *, 
things_to_do (*)) if we want to be able to separate every step.

If you know you are never interested in metadata, you can omit the 
describe flag at all. [null indication and type specification is of 
course always needed to access the actual data]
   Christof

*) open, parse, describe, bind, execute, fetch, close

PS: If we decide to omit the lines_to_return_without_cursor_overhead 
optimization, the new architecture would still be a big win for *DBC.

This optimization can not get a GUC variable instead of a protocol 
parameter since this would break clients: should they specify 
fetch+close to enable it? If yes, there's no easy way to implement the 
old behavior (all seven stages, no limit on returned lines). If no, the 
client cannot specify to omit the fetch without changing it (limit 0).

PPS: Query parameter passing is another topic, but I tend to propose a 
PGresult variant for specifying them (of course each with its type).



Re: Roadmap for FE/BE protocol redesign

From
Christof Petig
Date:
Christof Petig wrote:
> If you know you are never interested in metadata, you can omit the 
> describe flag at all. [null indication and type specification is of 
> course always needed to access the actual data]

More exactly they are sent separately:
null indication is per row 'D'/'B' and type specification is per query 'T'.

If the client does not ask for metadata one might omit attrelid,attnum 
(*) and field name in the 'T' packet. One might argue whether this small 
win per query times column rectifies to implement the feature. But then 
we'd need a method to query them lateron (otherwise *DBC could never 
omit them at first).
  Christof

*) they are not there, yet ;-)



Re: Roadmap for FE/BE protocol redesign

From
"Zeugswetter Andreas SB SD"
Date:
> > for that, we get what exactly?  Fetching one row at a time is
> > *guaranteed* to be inefficient.  The correct response if that bothers
> > you is to fetch multiple rows at a time, not to make a less robust
> > protocol.
> I don't feel strongly either way on this one, but IIRC the SQL standard
> for cursors only specifies fetching one record at a time (at least that
> is how MSSQL and DB2 implement it).  Thus portable code is likely to
> only fetch one record at a time.  The current row description isn't too
> big, but with the changes being suggested it might become so.

Yes, I think it should be the responsibility of the higer level interfaces
(ODBC, esql/c, ...) to fetch rows in optimal packets (e.g. 4-16k at a time).

This is what other db's interfaces do for you. The first fetch gets a packet
from the db, subsequent fetches are answered from the client side cache until
it is empty, then the next fetch gets the next packet from the db ...

The trick would be to ask for n bytes of row data (maybe a range), which might
be 1 - n rows (or maybe even only part of one row if it is very wide ?).
I think this is one of the things the new protocol should be able to cope with.

Andreas


Re: Roadmap for FE/BE protocol redesign

From
Greg Stark
Date:
So, just to throw out a wild idea: If you're talking about making large
changes to the on-the-wire protocol. Have you considered using an existing
database protocol? This would avoid having to reinvent the wheel every time
postgres implements a new feature like prepared queries, bind arrays, or
metadata information.

There is a free implementation of the TDS (Tabular DataStream) protocol used
by Sybase and MSSQL. I don't know how much of it would be interesting for
postgres and how much is Sybase/MSSQL-specific.

It would be pretty neat if postgres could use precisely the same on-the-wire
protocol as other major databases, just requiring a separate high level driver
to interpret the semantic meaning of the data.

At the very least it sounds like interesting to do a compare and contrast as
far as understanding what advantages the features TDS has have and what
disadvantages, before postgres possibly misses good ideas or makes the same
mistakes.


-- 
greg



Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> So, just to throw out a wild idea: If you're talking about making large
> changes to the on-the-wire protocol. Have you considered using an existing
> database protocol?

Yeah, I have.  Didn't look promising --- there's no percentage unless
we're 100% compatible, which creates a lot of problems (eg, can't ship
type OIDs to frontend anymore).

What I actually looked into was RDA, but I doubt that TDS would be any
closer to our needs...
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Christopher Browne
Date:
A long time ago, in a galaxy far, far away, gsstark@mit.edu (Greg Stark) wrote:
> So, just to throw out a wild idea: If you're talking about making large
> changes to the on-the-wire protocol. Have you considered using an existing
> database protocol? This would avoid having to reinvent the wheel every time
> postgres implements a new feature like prepared queries, bind arrays, or
> metadata information.
>
> There is a free implementation of the TDS (Tabular DataStream) protocol used
> by Sybase and MSSQL. I don't know how much of it would be interesting for
> postgres and how much is Sybase/MSSQL-specific.
>
> It would be pretty neat if postgres could use precisely the same on-the-wire
> protocol as other major databases, just requiring a separate high level driver
> to interpret the semantic meaning of the data.
>
> At the very least it sounds like interesting to do a compare and contrast as
> far as understanding what advantages the features TDS has have and what
> disadvantages, before postgres possibly misses good ideas or makes the same
> mistakes.

Let me take the liberty of pointing people to documentation to the TDS
protocol: <http://www.freetds.org/tds.html>

I agree that there are far worse ideas, when opening up the DB
protocol, than to look at some existing protocols, and TDS would seem
a reasonable one.  It doesn't look overly efficient, but it's not
overtly gratuitously inefficient...
-- 
(reverse (concatenate 'string "moc.enworbbc@" "enworbbc"))
http://www.ntlug.org/~cbbrowne/linuxxian.html
DOS: n.,  A small annoying  boot virus that causes  random spontaneous
system crashes, usually just  before saving a massive project.  Easily
cured  by Unix.   See also  MS-DOS,  IBM-DOS, DR-DOS.   
-- from  David Vicker's .plan


Re: Roadmap for FE/BE protocol redesign

From
Peter Eisentraut
Date:
Tom Lane writes:

> So are you voting against adding any attribute-ID info to
> RowDescription?  While I'm not that thrilled with it myself, it seems
> relatively harmless as long as we can keep the overhead down.  I'm
> okay with attrelid/attnum, but would gripe about including more than that.

At the beginning of this thread you raised a number of points where the
identity of the column of origin is not well-defined.  I haven't seen an
answer to that.  Whether the identity of the column is provided through
numbers or, as was originally requested, through names seems to be
irrelevant for those questions.

Now assume someone wanted to define a method to identify the column of
origin for a limited set of query types.  Would you align the definition
with what the current planner and executor structures can easily give you
or would you use a more "mathematical" definition?  And assuming it's the
latter, do you feel confident that that definition will not constrain
development of the planner and executor structures in the future?

-- 
Peter Eisentraut   peter_e@gmx.net



Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> At the beginning of this thread you raised a number of points where the
> identity of the column of origin is not well-defined.  I haven't seen an
> answer to that.

Yes, Dave did answer --- basically, he's happy with not providing any
column identity data in the cases where it's not obvious what the answer
should be.  And in particular he doesn't want the mechanism to drill
down into view definitions (which is less than obviously right to me,
but if that's what he wants it sure eliminates a lot of definitional
issues).

Given that agreement I don't have a problem with providing the
functionality.  It will take a few more lines in the parser, but not an
unreasonable amount I think.

> Would you align the definition
> with what the current planner and executor structures can easily give you
> or would you use a more "mathematical" definition?  And assuming it's the
> latter, do you feel confident that that definition will not constrain
> development of the planner and executor structures in the future?

I'm not too concerned about it given the before-view-expansion proviso.
Once the rewriter and planner go into action, the contents of the query
tree do start to look rather implementation-dependent, but what the
parser does is pretty well constrained by the SQL spec.
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Emmanuel Charpentier
Date:
Justin Clift wrote:

[ ... ]

> The problem Dave is suggesting this as a first attempt at a solution for 
> is that with ODBC, a frontend (i.e. OpenOffice) asks the ODBC driver 
> which columns are NULLable, etc.  And the ODBC driver is getting the 
> info wrong, then passing back the incorrect info.

And that's probably why OpenOffice doesn't allow you to update a view, 
event if there are rules allowing this from psql ...

This, in my book is an EPITA for end users ...

> So, when a person goes to insert a row into a table with a 
> SERIAL/SEQUENCE based column, OpenOffice has been told the column isn't 
> NULLable and forces the user to enter a value.  Voila, it doesn't work 
> with sequences.  :(
> 
> It's likely possible to add to the ODBC driver some way of getting the 
> info right, but Dave is also looking for a way of making this easier 
> into the future for similar problems.  i.e. Let the database explicitly 
> have info about what each column can do.

I'd second that, even if it takes time ...
                Emmanuel Charpentier

-- 
Emmanuel Charpentier



Re: Roadmap for FE/BE protocol redesign

From
Hannu Krosing
Date:
Tom Lane kirjutas R, 14.03.2003 kell 19:15:
> Greg Stark <gsstark@mit.edu> writes:
> > So, just to throw out a wild idea: If you're talking about making large
> > changes to the on-the-wire protocol. Have you considered using an existing
> > database protocol?
> 
> Yeah, I have.  Didn't look promising --- there's no percentage unless
> we're 100% compatible, which creates a lot of problems (eg, can't ship
> type OIDs to frontend anymore).

Surely there is a way to ship type info, even for UDT's (user defined
types) as nowadays most big databases support those.

> What I actually looked into was RDA, but I doubt that TDS would be any
> closer to our needs...

I remember someone started cleaning up IO in order to move it into a
separate module with the aim of making multiple implementations (RDA,
TDS, XML, native JDBC wire protocol if it ever becomes a reality, etc.)
possible.

While not exactly pertinent to new wire protocol this effort if
completed would make it much easier to have backwards compatibility on
the wire level.

------------
Hannu


PS. Another feature I'd like is individually turning off warnings and
notices. 

------------
Hannu



Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Peter Eisentraut
Date:
Tom Lane writes:

> * Backend's ReadyForQuery message (Z message) should carry an indication
> of current transaction status (idle/in transaction/in aborted transaction)
> so that frontend need not guess at state.  Perhaps also indicate
> autocommit status.

If we do this, could we get rid of the messy autocommit GUC option and
handle autocommit in the client?  Before sending a command, the client
could check the transaction status and automatically choose to start a new
transaction.  That way we could get rid of the current mess that every
client needs to send a SET autocommit command before it can safely do
anything.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: Roadmap for FE/BE protocol redesign

From
Peter Eisentraut
Date:
Tom Lane writes:

> Yes, Dave did answer --- basically, he's happy with not providing any
> column identity data in the cases where it's not obvious what the answer
> should be.  And in particular he doesn't want the mechanism to drill
> down into view definitions (which is less than obviously right to me,
> but if that's what he wants it sure eliminates a lot of definitional
> issues).

I don't get it.  Say I execute SELECT a, b, c FROM foo;.  In order to
update that query, the application needs to create some update statement,
say UPDATE foo SET a = entered_value;.  So the application already knows
that "foo" is the table and "a" is the column.  So if the application
wants to know about details on the column "a", it can execute

SELECT whatever FROM pg_attribute, pg_class WHERE relname = 'foo' AND attname = 'a';

With this proposed change, it can replace that with

SELECT whatever FROM pg_attribute, pg_class WHERE oid = X AND attnum = Y;

With the difference that the first version always works and the second
version sometimes works, and when that sometimes is is determined by the
rule that it should be "obvious".  That doesn't seem right to me.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Tom Lane writes:
> 
> > * Backend's ReadyForQuery message (Z message) should carry an indication
> > of current transaction status (idle/in transaction/in aborted transaction)
> > so that frontend need not guess at state.  Perhaps also indicate
> > autocommit status.
> 
> If we do this, could we get rid of the messy autocommit GUC option and
> handle autocommit in the client?  Before sending a command, the client
> could check the transaction status and automatically choose to start a new
> transaction.  That way we could get rid of the current mess that every
> client needs to send a SET autocommit command before it can safely do
> anything.

What if folks want all their connections autocommit off.  Seems it is
best in the server.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> * Backend's ReadyForQuery message (Z message) should carry an indication
>> of current transaction status (idle/in transaction/in aborted transaction)
>> so that frontend need not guess at state.  Perhaps also indicate
>> autocommit status.

> If we do this, could we get rid of the messy autocommit GUC option and
> handle autocommit in the client?

Hmm ... that's a thought ... not very backwards-compatible with 7.3,
but I think I like it better than continuing on with the GUC option.
As you say, that path is looking messier all the time.

Comments anyone?
        regards, tom lane


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Neil Conway
Date:
On Mon, 2003-03-17 at 20:48, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > If we do this, could we get rid of the messy autocommit GUC option and
> > handle autocommit in the client?
> 
> Hmm ... that's a thought ... not very backwards-compatible with 7.3,
> but I think I like it better than continuing on with the GUC option.
> As you say, that path is looking messier all the time.

Yes, definitely -- replacing the 7.3 GUC option would be great, that's a
pretty bad hack IMHO.

Cheers,

Neil

-- 
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC





Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> ... So the application already knows
> that "foo" is the table and "a" is the column.  So if the application
> wants to know about details on the column "a", it can execute
> SELECT whatever FROM pg_attribute, pg_class WHERE relname = 'foo' AND attname = 'a';
> With this proposed change, it can replace that with
> SELECT whatever FROM pg_attribute, pg_class WHERE oid = X AND attnum = Y;

Dave will correct me if I'm wrong --- but I think the issue here is that
the client-side library (think ODBC or JDBC) needs to gain this level of
understanding of a query that is presented to it as an SQL-source
string.  So no, it doesn't already know that "foo" is the table and "a"
is the column.  To find that out, it has to duplicate a lot of backend
code.
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Christof Petig
Date:
Peter Eisentraut wrote:
> I don't get it.  Say I execute SELECT a, b, c FROM foo;.  In order to
> update that query, the application needs to create some update statement,
> say UPDATE foo SET a = entered_value;.  So the application already knows
> that "foo" is the table and "a" is the column.  So if the application
> wants to know about details on the column "a", it can execute
> 
> SELECT whatever FROM pg_attribute, pg_class WHERE relname = 'foo' AND attname = 'a';

Once the statement uses aliases for the result columns this can go 
wrong. And to determine the correct table is difficult at best.

> With this proposed change, it can replace that with
> 
> SELECT whatever FROM pg_attribute, pg_class WHERE oid = X AND attnum = Y;
> 
> With the difference that the first version always works and the second
> version sometimes works, and when that sometimes is is determined by the
> rule that it should be "obvious".  That doesn't seem right to me.

I have the impression that the first solution sometimes works (and other 
times gets it really wrong, you can't tell without parsing the query) 
and the second solution works when possible (and gives no answer 
elsewhere). The only cases I expect the second one to fail are: natural 
joins (where the table is ambigious), expressions and function calls. Of 
course you can't expect a sane result in these cases.

On a second thought I do not want the column name (or alias) to get 
discarded (because otherwise you can't get the alias), but *DBC 
desperately needs attrelid and attnum.
   Christof



Re: Roadmap for FE/BE protocol redesign

From
"Dave Page"
Date:

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us] 
> Sent: 18 March 2003 06:26
> To: Peter Eisentraut
> Cc: Dave Page; PostgreSQL Development
> Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign 
> 
> 
> Peter Eisentraut <peter_e@gmx.net> writes:
> > ... So the application already knows
> > that "foo" is the table and "a" is the column.  So if the 
> application 
> > wants to know about details on the column "a", it can 
> execute SELECT 
> > whatever FROM pg_attribute, pg_class WHERE relname = 'foo' 
> AND attname 
> > = 'a'; With this proposed change, it can replace that with SELECT 
> > whatever FROM pg_attribute, pg_class WHERE oid = X AND attnum = Y;
> 
> Dave will correct me if I'm wrong --- but I think the issue 
> here is that the client-side library (think ODBC or JDBC) 
> needs to gain this level of understanding of a query that is 
> presented to it as an SQL-source string.  So no, it doesn't 
> already know that "foo" is the table and "a" is the column.  
> To find that out, it has to duplicate a lot of backend code.

That's basically it. pgAdmin jumps through hoops trying to parse the SQL
the user enters to figure out tables and columns etc, and currently the
ODBC driver doesn't even bother in many cases (it does have a 'Parse
Statements' option but that rarely helps) and ends up returning
incorrect info. I don't know much about the JDBC driver, but Barry did
say this fix would save him exactly the same problems.

Regards, Dave.

Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Peter Eisentraut
Date:
Bruce Momjian writes:

> What if folks want all their connections autocommit off.

For interactive use, people can record their preferred setting in
~/.psqlrc or something like that.

But any non-interactive program is written with a specific autocommit
setting in mind.  Either it assumes that single statements are
automatically committed or it issues explicit COMMIT commands.  That
means, the choice of autocommit mode is always ultimately determined by
the client, never by the server or its administrator.

For that very reason most standard interfaces define in their
specification whether applications must assume autocommit behavior or must
not do so.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Bruce Momjian writes:
> 
> > What if folks want all their connections autocommit off.
> 
> For interactive use, people can record their preferred setting in
> ~/.psqlrc or something like that.

But that only works for psql, right?  How would this be done at the
libpq level?  Environment variables?  GUC seems a whole lot cleaner.

> But any non-interactive program is written with a specific autocommit
> setting in mind.  Either it assumes that single statements are
> automatically committed or it issues explicit COMMIT commands.  That
> means, the choice of autocommit mode is always ultimately determined by
> the client, never by the server or its administrator.
> 
> For that very reason most standard interfaces define in their
> specification whether applications must assume autocommit behavior or must
> not do so.

I understand.  I just don't see any value in pushing that logic into
each client when we can do it centrally in the server.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Alvaro Herrera
Date:
On Tue, Mar 18, 2003 at 05:18:02PM -0500, Bruce Momjian wrote:
> Peter Eisentraut wrote:
> > Bruce Momjian writes:
> > 
> > > What if folks want all their connections autocommit off.
> > 
> > For interactive use, people can record their preferred setting in
> > ~/.psqlrc or something like that.
> 
> But that only works for psql, right?  How would this be done at the
> libpq level?  Environment variables?  GUC seems a whole lot cleaner.

I think an environment variable would be right.  The current method is
not clean in the sense that a client cannot decide what she wants; she
just accepts the decision from the DBA.  Thus, an application can't be
written with a certain value in mind, because the DBA can change the
setting at any time.

Client-side decision is the wiser proposal, I think.

> I understand.  I just don't see any value in pushing that logic into
> each client when we can do it centrally in the server.

The server doesn't know beforehand what the client wants.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"No renuncies a nada. No te aferres a nada"


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> On Tue, Mar 18, 2003 at 05:18:02PM -0500, Bruce Momjian wrote:
> > Peter Eisentraut wrote:
> > > Bruce Momjian writes:
> > > 
> > > > What if folks want all their connections autocommit off.
> > > 
> > > For interactive use, people can record their preferred setting in
> > > ~/.psqlrc or something like that.
> > 
> > But that only works for psql, right?  How would this be done at the
> > libpq level?  Environment variables?  GUC seems a whole lot cleaner.
> 
> I think an environment variable would be right.  The current method is
> not clean in the sense that a client cannot decide what she wants; she
> just accepts the decision from the DBA.  Thus, an application can't be
> written with a certain value in mind, because the DBA can change the
> setting at any time.

The client can say "SET autocommit to off" or on.  It can use SHOW to
to see the setting.  Environment variables are used mostly for
connecting, and once connected, we use GUC.

In fact, an environment variable seems wrong because it isn't integrated
into the client, like a SET command is.

> Client-side decision is the wiser proposal, I think.
> 
> > I understand.  I just don't see any value in pushing that logic into
> > each client when we can do it centrally in the server.
> 
> The server doesn't know beforehand what the client wants.

True, but GUC seems like the way to go, and we have per-user/db settings
for GUC.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Roadmap for FE/BE protocol redesign

From
Brian Bruns
Date:
On 16 Mar 2003, Hannu Krosing wrote:

> Tom Lane kirjutas R, 14.03.2003 kell 19:15:
> > Greg Stark <gsstark@mit.edu> writes:
> > > So, just to throw out a wild idea: If you're talking about making large
> > > changes to the on-the-wire protocol. Have you considered using an existing
> > > database protocol?
> > 

> > What I actually looked into was RDA, but I doubt that TDS would be any
> > closer to our needs...
> 
> I remember someone started cleaning up IO in order to move it into a
> separate module with the aim of making multiple implementations (RDA,
> TDS, XML, native JDBC wire protocol if it ever becomes a reality, etc.)
> possible.

That was me, I did an initial cut of separating the FE/BE code from the 
rest, but ran short on time.  Hoping to get back to it one of these days.  
My primary interest was in getting the DRDA protocol supported in a clean 
fashion.  For those mentioning RDA, I believe that standard is pushing up 
the daisys.  DRDA is about the only standards game in town at this point, 
it has client side support from just about every vendor (IBM obviously, 
Oracle, Sybase, MS) and server side support of some sort from DB2 and a 
couple others (MS SNA gateway, for example is/has a DRDA server).  Mostly 
through gateways and add on products, but it's a far cry better than any 
other effort I'm aware of.

> While not exactly pertinent to new wire protocol this effort if
> completed would make it much easier to have backwards compatibility on
> the wire level.

I think this would be a good idea all around, and would make future 
changes/replacements to FE/BE protocol a lot cleaner. 

> ------------
> Hannu

Brian



Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Peter Eisentraut
Date:
Bruce Momjian writes:

> True, but GUC seems like the way to go, and we have per-user/db settings
> for GUC.

But the required autocommit setting depends neither on the user nor the
database, it depends on the identity of the client application.  That type
of granularity is not offered by GUC.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Bruce Momjian writes:
> 
> > True, but GUC seems like the way to go, and we have per-user/db settings
> > for GUC.
> 
> But the required autocommit setting depends neither on the user nor the
> database, it depends on the identity of the client application.  That type
> of granularity is not offered by GUC.

True, but the standard also requires autocommit off, so I can imagine
folks wanting it off by default for various users/database.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Roadmap for FE/BE protocol redesign

From
Jeff Davis
Date:
On Monday 10 March 2003 10:51 am, Tom Lane wrote:
>
> * XML support?  If we do anything, I'd want some extensible solution to
> allowing multiple query-result output formats from the backend, not an
> XML-specific hack.  For one thing, that would allow the actual appearance
> of any XML support to happen later.

It seems this would also be a good solution to a previous discussion about 
boolean representation.

The postgres output of t/f is perfectly resonable, but can be somewhat 
confusing to someone that relies on PHP's typecasting. In the discussion, 
someone mentioned that if you take in a variable directly from the database 
and cast it to boolean, both 't' and 'f' will cast to true. It turned out to 
be even more confusing because MySQL uses 0/1 which cast properly. 

If I remember correctly, there was even talk of adding a run-time parameter 
similar to the datestyle. If it were all handled in the query-result output 
formatting functions like you suggest, that would seem like a much cleaner 
solution. 

Regards,Jeff Davis


Re: Roadmap for FE/BE protocol redesign

From
Dave Cramer
Date:
I am wondering if it would be possible to return the value of the first
serial field in the row that was incremented on an insert ?

-- 
Dave Cramer <dave@fastcrypt.com>
Cramer Consulting



Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Dave Cramer <dave@fastcrypt.com> writes:
> I am wondering if it would be possible to return the value of the first
> serial field in the row that was incremented on an insert ?

That's a language issue, not a protocol issue, and no I'm not taking it
on for 7.4.
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Tom Lane wrote:
> Dave Cramer <dave@fastcrypt.com> writes:
> > I am wondering if it would be possible to return the value of the first
> > serial field in the row that was incremented on an insert ?
> 
> That's a language issue, not a protocol issue, and no I'm not taking it
> on for 7.4.

What I was wondering is if we could create a currval() call that takes
no arguments, and returns the most recent sequence id assigned.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Roadmap for FE/BE protocol redesign

From
Dave Cramer
Date:
Tom,

Forgive me for being dense, but how do you delineate this as being a
"language issue"?  

Dave
On Fri, 2003-03-21 at 10:54, Bruce Momjian wrote:
> Tom Lane wrote:
> > Dave Cramer <dave@fastcrypt.com> writes:
> > > I am wondering if it would be possible to return the value of the first
> > > serial field in the row that was incremented on an insert ?
> > 
> > That's a language issue, not a protocol issue, and no I'm not taking it
> > on for 7.4.
> 
> What I was wondering is if we could create a currval() call that takes
> no arguments, and returns the most recent sequence id assigned.
-- 
Dave Cramer <dave@fastcrypt.com>
Cramer Consulting



Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> What I was wondering is if we could create a currval() call that takes
> no arguments, and returns the most recent sequence id assigned.

Why?  That's still an extra query that the client has to issue, and
currval in that form would be an amazingly fragile programming tool.
(What if some trigger causes an autoincrement on some other sequence
than the one you are thinking about?)

I liked the INSERT ... RETURNING and UPDATE ... RETURNING syntax
extensions that Philip Warner (IIRC) proposed awhile back.  Those would
get the job done much more flexibly than anything else that's been
suggested.  That's why I think it's a language problem and not a
protocol problem.
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Sounds good.

---------------------------------------------------------------------------

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > What I was wondering is if we could create a currval() call that takes
> > no arguments, and returns the most recent sequence id assigned.
> 
> Why?  That's still an extra query that the client has to issue, and
> currval in that form would be an amazingly fragile programming tool.
> (What if some trigger causes an autoincrement on some other sequence
> than the one you are thinking about?)
> 
> I liked the INSERT ... RETURNING and UPDATE ... RETURNING syntax
> extensions that Philip Warner (IIRC) proposed awhile back.  Those would
> get the job done much more flexibly than anything else that's been
> suggested.  That's why I think it's a language problem and not a
> protocol problem.
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
> 
> http://www.postgresql.org/docs/faqs/FAQ.html
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Peter Eisentraut
Date:
Bruce Momjian writes:

> True, but the standard also requires autocommit off, so I can imagine
> folks wanting it off by default for various users/database.

The standard only covers embedded C programs.  Other interfaces that are
covered by other standards require other settings.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Bruce Momjian writes:
> 
> > True, but the standard also requires autocommit off, so I can imagine
> > folks wanting it off by default for various users/database.
> 
> The standard only covers embedded C programs.  Other interfaces that are
> covered by other standards require other settings.

My point was that _the_ standard requires autocommit off, not that
everyone is using autocommit off.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Peter Eisentraut wrote:
>> The standard only covers embedded C programs.  Other interfaces that are
>> covered by other standards require other settings.

> My point was that _the_ standard requires autocommit off, not that
> everyone is using autocommit off.

Peter's point is good: the only interface that is actually subject to the
part of the spec you are quoting is ecpg.  It could reasonably be argued
that libpq should only support autocommit-on, because that's the
historical behavior that programs written atop libpq expect.  The other
client libraries have their own specs to adhere to.
        regards, tom lane


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Peter Eisentraut wrote:
> >> The standard only covers embedded C programs.  Other interfaces that are
> >> covered by other standards require other settings.
> 
> > My point was that _the_ standard requires autocommit off, not that
> > everyone is using autocommit off.
> 
> Peter's point is good: the only interface that is actually subject to the
> part of the spec you are quoting is ecpg.  It could reasonably be argued
> that libpq should only support autocommit-on, because that's the
> historical behavior that programs written atop libpq expect.  The other
> client libraries have their own specs to adhere to.

But isn't that like saying that the spec doesn't apply to libpq at all. 
Why would autocommit not apply but other queries specification apply?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> But isn't that like saying that the spec doesn't apply to libpq at all. 
> Why would autocommit not apply but other queries specification apply?

No, you're missing the point.  Essentially all the client libraries
offer their own autocommit behavior on top of what the spec says.
libpq is alone in not having any such client-side logic.  So it's not
a foregone conclusion that we should implement autocommit on/off logic
on the server side as a substitute for adding it to libpq.  We have
now tried doing it on the server side, and we are finding that we don't
like the side-effects; so it's time to revisit that decision.
        regards, tom lane


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > But isn't that like saying that the spec doesn't apply to libpq at all. 
> > Why would autocommit not apply but other queries specification apply?
> 
> No, you're missing the point.  Essentially all the client libraries
> offer their own autocommit behavior on top of what the spec says.

Well, which ones?  jdbc (which prevers the server autocommit), ecpg,
odbc (?), and Perl.  Anyone else?

> libpq is alone in not having any such client-side logic.  So it's not
> a foregone conclusion that we should implement autocommit on/off logic
> on the server side as a substitute for adding it to libpq.  We have
> now tried doing it on the server side, and we are finding that we don't
> like the side-effects; so it's time to revisit that decision.

My concern is bloating the API for all languages based on libpq, and
psql and stuff like that.  Heck, even pgadmin would have to have a
button for it because a SET couldn't control it.

I know the server-side isn't optimal, but is the alternative worse?

Also, if Peter wants clients to be able to just issue a query and know
it commits, should we just disable setting autocommit in postgresql.conf
and per-user/db, and force applications to turn it on explicitly?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> My concern is bloating the API for all languages based on libpq, and
> psql and stuff like that.  Heck, even pgadmin would have to have a
> button for it because a SET couldn't control it.

Peter's point, AIUI, is that that is a good thing.

The problem with SET for autocommit is that every client program has to
*know*, without question, which setting it is using.  Autocommit is just
about as dangerous as a GUC variable that, say, silently swaps the
meanings of INSERT and DELETE --- if you don't know which setting you
are using then you will get the wrong results.

Thus it is not "convenient" to allow autocommit to be set in all the
weird and wonderful ways that we allow GUC variables to be set; it's
more like handing a razor blade to a baby.  We've already found out that
all client-side apps have to explicitly force autocommit to the setting
they want, or they break.  How is that a good thing?

I think we *should* go back to the assumption that libpq-based programs
see autocommit-on behavior unless they take specific action to prevent
it.  And that means that the client program has to take action to select
autocommit off, not that the admin can flick a switch remotely that will
affect clients.

The real point is that both the client application and the client
library need to know what the autocommit behavior is.  This is why
adding autocommit to the library APIs is the right thing, not the wrong
thing.  When there are ways other than a library API call to set the
autocommit behavior, then one party or the other is out of the loop
about what the current behavior is, and that gets us right back into the
same mess.

Basically I think that Peter is arguing that autocommit as a GUC
variable is a wrong and dangerous idea.  And I'm forced to agree.
I was wrong to put it in, and I'm now willing to take it out again.
At the time it seemed like a reasonable shortcut around changing the
protocol --- but now that we're changing the protocol anyway, it's
better to get rid of the shortcut.
        regards, tom lane


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Barry Lind
Date:
In general I agree with Tom.  GUC is the wrong mechanism for autocommit.  The reason being that it isn't a system
administratorsdecision what 
 
value autocommit should have.  It is generally dictated by the client 
protocol or application.

Now that being said, it is nice for the client to be able to simply tell 
the server "you are in autocommit mode until I tell you otherwise". 
Instead of having to worry about trapping each commit and rollback 
request to make sure you insert to proper begin.

The current GUC autocommit is nice in that it makes it easier for the 
cleint to simply turn on or off the state.  It is a problem because it 
is a GUC parameter and therefore can be changed by the admin (thus you 
don't know what your initial state is without asking the server) or the 
user (via sql SET, thus you don't know that it has changed).

As I said in my earlier mail note from a jdbc perspective I can get it 
to work which ever way is decided (in fact the jdbc driver will probably 
need to support all of the ways, depending on if it is talking to a 7.2, 
7.3 or 7.4 backend).

My preference (given that I am detecting a willingness to make more 
significant changes in this area that I was expecting) would be to drop 
the GUC autocommit parameter.  Replacing that functionality with the 
ability to set and discover the autocommit state via the FE/BE protocol.

thanks,
--Barry

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> 
>>My concern is bloating the API for all languages based on libpq, and
>>psql and stuff like that.  Heck, even pgadmin would have to have a
>>button for it because a SET couldn't control it.
> 
> 
> Peter's point, AIUI, is that that is a good thing.
> 
> The problem with SET for autocommit is that every client program has to
> *know*, without question, which setting it is using.  Autocommit is just
> about as dangerous as a GUC variable that, say, silently swaps the
> meanings of INSERT and DELETE --- if you don't know which setting you
> are using then you will get the wrong results.
> 
> Thus it is not "convenient" to allow autocommit to be set in all the
> weird and wonderful ways that we allow GUC variables to be set; it's
> more like handing a razor blade to a baby.  We've already found out that
> all client-side apps have to explicitly force autocommit to the setting
> they want, or they break.  How is that a good thing?
> 
> I think we *should* go back to the assumption that libpq-based programs
> see autocommit-on behavior unless they take specific action to prevent
> it.  And that means that the client program has to take action to select
> autocommit off, not that the admin can flick a switch remotely that will
> affect clients.
> 
> The real point is that both the client application and the client
> library need to know what the autocommit behavior is.  This is why
> adding autocommit to the library APIs is the right thing, not the wrong
> thing.  When there are ways other than a library API call to set the
> autocommit behavior, then one party or the other is out of the loop
> about what the current behavior is, and that gets us right back into the
> same mess.
> 
> Basically I think that Peter is arguing that autocommit as a GUC
> variable is a wrong and dangerous idea.  And I'm forced to agree.
> I was wrong to put it in, and I'm now willing to take it out again.
> At the time it seemed like a reasonable shortcut around changing the
> protocol --- but now that we're changing the protocol anyway, it's
> better to get rid of the shortcut.
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
> 




Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Man, I lost another vote!  :-)

---------------------------------------------------------------------------

Barry Lind wrote:
> In general I agree with Tom.  GUC is the wrong mechanism for autocommit. 
>   The reason being that it isn't a system administrators decision what 
> value autocommit should have.  It is generally dictated by the client 
> protocol or application.
> 
> Now that being said, it is nice for the client to be able to simply tell 
> the server "you are in autocommit mode until I tell you otherwise". 
> Instead of having to worry about trapping each commit and rollback 
> request to make sure you insert to proper begin.
> 
> The current GUC autocommit is nice in that it makes it easier for the 
> cleint to simply turn on or off the state.  It is a problem because it 
> is a GUC parameter and therefore can be changed by the admin (thus you 
> don't know what your initial state is without asking the server) or the 
> user (via sql SET, thus you don't know that it has changed).
> 
> As I said in my earlier mail note from a jdbc perspective I can get it 
> to work which ever way is decided (in fact the jdbc driver will probably 
> need to support all of the ways, depending on if it is talking to a 7.2, 
> 7.3 or 7.4 backend).
> 
> My preference (given that I am detecting a willingness to make more 
> significant changes in this area that I was expecting) would be to drop 
> the GUC autocommit parameter.  Replacing that functionality with the 
> ability to set and discover the autocommit state via the FE/BE protocol.
> 
> thanks,
> --Barry
> 
> Tom Lane wrote:
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > 
> >>My concern is bloating the API for all languages based on libpq, and
> >>psql and stuff like that.  Heck, even pgadmin would have to have a
> >>button for it because a SET couldn't control it.
> > 
> > 
> > Peter's point, AIUI, is that that is a good thing.
> > 
> > The problem with SET for autocommit is that every client program has to
> > *know*, without question, which setting it is using.  Autocommit is just
> > about as dangerous as a GUC variable that, say, silently swaps the
> > meanings of INSERT and DELETE --- if you don't know which setting you
> > are using then you will get the wrong results.
> > 
> > Thus it is not "convenient" to allow autocommit to be set in all the
> > weird and wonderful ways that we allow GUC variables to be set; it's
> > more like handing a razor blade to a baby.  We've already found out that
> > all client-side apps have to explicitly force autocommit to the setting
> > they want, or they break.  How is that a good thing?
> > 
> > I think we *should* go back to the assumption that libpq-based programs
> > see autocommit-on behavior unless they take specific action to prevent
> > it.  And that means that the client program has to take action to select
> > autocommit off, not that the admin can flick a switch remotely that will
> > affect clients.
> > 
> > The real point is that both the client application and the client
> > library need to know what the autocommit behavior is.  This is why
> > adding autocommit to the library APIs is the right thing, not the wrong
> > thing.  When there are ways other than a library API call to set the
> > autocommit behavior, then one party or the other is out of the loop
> > about what the current behavior is, and that gets us right back into the
> > same mess.
> > 
> > Basically I think that Peter is arguing that autocommit as a GUC
> > variable is a wrong and dangerous idea.  And I'm forced to agree.
> > I was wrong to put it in, and I'm now willing to take it out again.
> > At the time it seemed like a reasonable shortcut around changing the
> > protocol --- but now that we're changing the protocol anyway, it's
> > better to get rid of the shortcut.
> > 
> >             regards, tom lane
> > 
> > ---------------------------(end of broadcast)---------------------------
> > TIP 4: Don't 'kill -9' the postmaster
> > 
> 
> 
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Man, I lost another vote!  :-)

Happens to us all ;-)

But this discussion is far from over ... not that many people have
weighed in yet.  Also, even if autocommit's fate is sealed, we still
have to think about how to handle the other variables Barry identified
as trouble spots.
        regards, tom lane


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
I was thinking if we had a SET PERMANENT that couldn't be changed,
interfaces that want to control variables could set them perminantly,
and others could let users control it.

Actually, if we reported change to the client, the interface could
override any change the user made --- just an idea.

---------------------------------------------------------------------------

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Man, I lost another vote!  :-)
> 
> Happens to us all ;-)
> 
> But this discussion is far from over ... not that many people have
> weighed in yet.  Also, even if autocommit's fate is sealed, we still
> have to think about how to handle the other variables Barry identified
> as trouble spots.
> 
>             regards, tom lane
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Barry Lind
Date:

Bruce Momjian wrote:
> I was thinking if we had a SET PERMANENT that couldn't be changed,
> interfaces that want to control variables could set them perminantly,
> and others could let users control it.

SET PERMANENT only works for those variables that can only have one 
value for a given client protocol (for example datestyle in jdbc). 
Whereas autocommit needs to be changeable by the jdbc driver since the 
jdbc spec allows the user to call an API method to change the setting.

> 
> Actually, if we reported change to the client, the interface could
> override any change the user made --- just an idea.

Reporting the change to the client seems sufficient for the client to 
take whatever actions are necessary.

--Barry

> 
> ---------------------------------------------------------------------------
> 
> Tom Lane wrote:
> 
>>Bruce Momjian <pgman@candle.pha.pa.us> writes:
>>
>>>Man, I lost another vote!  :-)
>>
>>Happens to us all ;-)
>>
>>But this discussion is far from over ... not that many people have
>>weighed in yet.  Also, even if autocommit's fate is sealed, we still
>>have to think about how to handle the other variables Barry identified
>>as trouble spots.
>>
>>            regards, tom lane
>>
> 
> 




Re: [INTERFACES] Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Barry Lind wrote:
> 
> 
> Bruce Momjian wrote:
> > I was thinking if we had a SET PERMANENT that couldn't be changed,
> > interfaces that want to control variables could set them perminantly,
> > and others could let users control it.
> 
> SET PERMANENT only works for those variables that can only have one 
> value for a given client protocol (for example datestyle in jdbc). 
> Whereas autocommit needs to be changeable by the jdbc driver since the 
> jdbc spec allows the user to call an API method to change the setting.

Oh, good point.

> 
> > 
> > Actually, if we reported change to the client, the interface could
> > override any change the user made --- just an idea.
> 
> Reporting the change to the client seems sufficient for the client to 
> take whatever actions are necessary.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 



Re: Roadmap for FE/BE protocol redesign

From
Kevin Brown
Date:
Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > ... So the application already knows
> > that "foo" is the table and "a" is the column.  So if the application
> > wants to know about details on the column "a", it can execute
> > SELECT whatever FROM pg_attribute, pg_class WHERE relname = 'foo' AND attname = 'a';
> > With this proposed change, it can replace that with
> > SELECT whatever FROM pg_attribute, pg_class WHERE oid = X AND attnum = Y;
> 
> Dave will correct me if I'm wrong --- but I think the issue here is that
> the client-side library (think ODBC or JDBC) needs to gain this level of
> understanding of a query that is presented to it as an SQL-source
> string.  So no, it doesn't already know that "foo" is the table and "a"
> is the column.  To find that out, it has to duplicate a lot of backend
> code.

Perhaps, rather than changing the protocol to include attrelid/attnum
information for the query, we should instead implement a command that
would yield the query's result information directly:

fileinfo=> QUERY RESULTS SELECT * from files;classname |  attname   |         atttype          | classid  | typeid |
typemod
 
-----------+------------+--------------------------+----------+--------+---------files     | filename   | character
varying(1024) | 59422343 |   1043 |    1028files     | mode       | bit(32)                  | 59422343 |   1560 |
32files    | size       | bigint                   | 59422343 |     20 |      -1files     | uid        | integer
         | 59422343 |     23 |      -1files     | gid        | integer                  | 59422343 |     23 |
-1files    | createtime | timestamp with time zone | 59422343 |   1184 |      -1files     | modtime    | timestamp with
timezone | 59422343 |   1184 |      -1files     | device     | integer                  | 59422343 |     23 |
-1files    | inode      | integer                  | 59422343 |     23 |      -1files     | nlinks     | integer
         | 59422343 |     23 |      -1
 
(10 rows)

Each tuple result of the QUERY RESULTS command (some other name for it
could be selected, this is just an example) would describe a column
that would be returned by the query being examined, and the tuples
would be sent in the left-to-right order that the columns they
describe would appear (or, alternatively, another column like attnum
could be sent that numbers the columns, starting with 1).

When a particular piece of information is unavailable, a NULL is sent
in its place -- just as you'd expect.  An example of such a column
would be:

fileinfo=> QUERY RESULTS SELECT CAST(1 AS integer), CAST(2 AS bigint);classname | attname | atttype | classid | typeid
|typemod 
 
-----------+---------+---------+---------+--------+---------          | int4    | integer |         |     23 |      -1
       | int8    | bigint  |         |     20 |      -1
 
(2 rows)


(psql shows NULLs as no value, so that's what I'm showing above).



Anyway, it's just a thought, but it's something that could be used by
literally everything.  And, of course, QUERY RESULTS should be able to
operate recursively, thus "QUERY RESULTS QUERY RESULTS ... SELECT ..."
(which could be made a special case if necessary).

The downside of this is that client libraries that wanted information
about what a query would return would have to send two queries through
the parser.  But the upside is that you take that hit only if you need
the information.  And if you plan to issue a particular query a lot,
you can issue the above command once and you're done.

I have no idea how hard this would be to implement.  I'm assuming that
EXPLAIN goes through a lot of the same code paths that this does, so
it may make sense to make this a variant of EXPLAIN (e.g., EXPLAIN
RESULTS SELECT...).




-- 
Kevin Brown                          kevin@sysexperts.com



Re: Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
I like this idea because it used our existing query API to return result
information.

Added to TODO:

* Allow clients to get data types, typmod, schema.table.column names from result sets, either via the backend protocol
ora new QUERYINFO command
 



---------------------------------------------------------------------------

Kevin Brown wrote:
> Tom Lane wrote:
> > Peter Eisentraut <peter_e@gmx.net> writes:
> > > ... So the application already knows
> > > that "foo" is the table and "a" is the column.  So if the application
> > > wants to know about details on the column "a", it can execute
> > > SELECT whatever FROM pg_attribute, pg_class WHERE relname = 'foo' AND attname = 'a';
> > > With this proposed change, it can replace that with
> > > SELECT whatever FROM pg_attribute, pg_class WHERE oid = X AND attnum = Y;
> > 
> > Dave will correct me if I'm wrong --- but I think the issue here is that
> > the client-side library (think ODBC or JDBC) needs to gain this level of
> > understanding of a query that is presented to it as an SQL-source
> > string.  So no, it doesn't already know that "foo" is the table and "a"
> > is the column.  To find that out, it has to duplicate a lot of backend
> > code.
> 
> Perhaps, rather than changing the protocol to include attrelid/attnum
> information for the query, we should instead implement a command that
> would yield the query's result information directly:
> 
> fileinfo=> QUERY RESULTS SELECT * from files;
>  classname |  attname   |         atttype          | classid  | typeid | typemod 
> -----------+------------+--------------------------+----------+--------+---------
>  files     | filename   | character varying(1024)  | 59422343 |   1043 |    1028
>  files     | mode       | bit(32)                  | 59422343 |   1560 |      32
>  files     | size       | bigint                   | 59422343 |     20 |      -1
>  files     | uid        | integer                  | 59422343 |     23 |      -1
>  files     | gid        | integer                  | 59422343 |     23 |      -1
>  files     | createtime | timestamp with time zone | 59422343 |   1184 |      -1
>  files     | modtime    | timestamp with time zone | 59422343 |   1184 |      -1
>  files     | device     | integer                  | 59422343 |     23 |      -1
>  files     | inode      | integer                  | 59422343 |     23 |      -1
>  files     | nlinks     | integer                  | 59422343 |     23 |      -1
> (10 rows)
> 
> Each tuple result of the QUERY RESULTS command (some other name for it
> could be selected, this is just an example) would describe a column
> that would be returned by the query being examined, and the tuples
> would be sent in the left-to-right order that the columns they
> describe would appear (or, alternatively, another column like attnum
> could be sent that numbers the columns, starting with 1).
> 
> When a particular piece of information is unavailable, a NULL is sent
> in its place -- just as you'd expect.  An example of such a column
> would be:
> 
> fileinfo=> QUERY RESULTS SELECT CAST(1 AS integer), CAST(2 AS bigint);
>  classname | attname | atttype | classid | typeid | typemod 
> -----------+---------+---------+---------+--------+---------
>            | int4    | integer |         |     23 |      -1
>            | int8    | bigint  |         |     20 |      -1
> (2 rows)
> 
> 
> (psql shows NULLs as no value, so that's what I'm showing above).
> 
> 
> 
> Anyway, it's just a thought, but it's something that could be used by
> literally everything.  And, of course, QUERY RESULTS should be able to
> operate recursively, thus "QUERY RESULTS QUERY RESULTS ... SELECT ..."
> (which could be made a special case if necessary).
> 
> The downside of this is that client libraries that wanted information
> about what a query would return would have to send two queries through
> the parser.  But the upside is that you take that hit only if you need
> the information.  And if you plan to issue a particular query a lot,
> you can issue the above command once and you're done.
> 
> I have no idea how hard this would be to implement.  I'm assuming that
> EXPLAIN goes through a lot of the same code paths that this does, so
> it may make sense to make this a variant of EXPLAIN (e.g., EXPLAIN
> RESULTS SELECT...).
> 
> 
> 
> 
> -- 
> Kevin Brown                          kevin@sysexperts.com
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Added to TODO:

> * Allow clients to get data types, typmod, schema.table.column names from
>   result sets, either via the backend protocol or a new QUERYINFO command

Uh ... this is already done, at least as far as the protocol is
concerned.  We do need to backfill support in libpq, etc.
        regards, tom lane


Re: Roadmap for FE/BE protocol redesign

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Added to TODO:
> 
> > * Allow clients to get data types, typmod, schema.table.column names from
> >   result sets, either via the backend protocol or a new QUERYINFO command
> 
> Uh ... this is already done, at least as far as the protocol is
> concerned.  We do need to backfill support in libpq, etc.

Oh, really.  That is great.  I didn't realize you had handled that. I
will update the TODO:

* Update clients to use data types, typmod, schema.table.column names of result sets using new query protocol

I assume we will need new libpq functions?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Roadmap for FE/BE protocol redesign

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> Uh ... this is already done, at least as far as the protocol is
>> concerned.  We do need to backfill support in libpq, etc.

> I assume we will need new libpq functions?

Yeah.
        regards, tom lane