Thread: Roadmap for FE/BE protocol redesign
This is an attempt to lay out a road map for updating the frontend/backend protocol in 7.4. I don't at this point want to get into details on any one of the TODO items, just get consensus that this is the set of tasks to be tackled. Are there any areas that I've missed (that require protocol changes)?

* Extend ERROR and NOTICE messages to carry multiple fields, including (as appropriate) a spec-compliant error code, a textual position in the original query string, information about the source-code location where the error was reported, etc.

* Consider extending NOTIFY messages to allow a parameter to be carried.

* Fix COPY protocol to allow graceful error recovery (more graceful than aborting the connection, at least) and to support COPY BINARY to/from frontend.

* Redesign fastpath function call protocol to eliminate the problems cited in the source code comments (src/backend/tcop/fastpath.c), and to eliminate the security hole of accepting unchecked internal representation from frontend. Also consider a fastpath for execution of PREPAREd queries.

* Re-institute type-specific send/receive conversion routines to allow some modest amount of architecture independence for binary data. This'd provide a place to check for bogus internal representation during fastpath input and COPY BINARY IN, too, thus alleviating security concerns.

* Get rid of hardwired field sizes in StartupPacket --- use variable-length null-terminated strings (a sketch follows this message). Fixes problem with usernames being limited to 32 characters, gets around unreasonable limitation on PGOPTIONS length. Also can remove unused fields.

* Backend should pass its version number, database encoding, default client encoding, and possibly other data (any ideas?) to frontend during startup, to avoid need for explicit queries to get this info. We could also consider eliminating SET commands sent by libpq in favor of adding variable settings to startup packet's PGOPTIONS field. Ideally we could get back to the point where a standard connection startup takes only one packet in each direction.

* Backend's ReadyForQuery message (Z message) should carry an indication of current transaction status (idle/in transaction/in aborted transaction) so that frontend need not guess at state. Perhaps also indicate autocommit status. (Is there anything else that frontends would Really Like To Know?)

* XML support? If we do anything, I'd want some extensible solution to allowing multiple query-result output formats from the backend, not an XML-specific hack. For one thing, that would allow the actual appearance of any XML support to happen later.

One of the $64 questions that has to be answered is how much work we're willing to expend on backwards compatibility. The path of least resistance would be to handle it the same way we've done protocol revisions in the past: the backend will be able to handle both old and new protocols (so it can talk to old clients) but libpq would be revised to speak only the new protocol (so new/recompiled clients couldn't talk to old backends). We've gotten away with this approach in the past, but the last time was release 6.4. I fully expect to hear more complaints now.

One way to tamp down expectations of client backwards compatibility would be to call the release 8.0 instead of 7.4 ;-)

Comments?

regards, tom lane
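As a concrete illustration of the StartupPacket item above, here is a minimal C sketch of a variable-length startup packet built from null-terminated key/value pairs. The key names, the trailing empty key, the version number, and the overall layout are illustrative assumptions, not a settled wire format:

    /* Sketch: build a variable-length startup packet as a length-prefixed
     * block of NUL-terminated key/value pairs, ended by an empty key.
     * The caller's buffer is assumed large enough; the layout and the key
     * names ("user", "database", "options") are illustrative assumptions. */
    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>

    static size_t put_str(char *buf, size_t off, const char *s)
    {
        size_t n = strlen(s) + 1;          /* include the terminating NUL */
        memcpy(buf + off, s, n);
        return off + n;
    }

    size_t build_startup_packet(char *buf, const char *user,
                                const char *database, const char *options)
    {
        size_t off = 8;                    /* length + version come first */

        off = put_str(buf, off, "user");     off = put_str(buf, off, user);
        off = put_str(buf, off, "database"); off = put_str(buf, off, database);
        off = put_str(buf, off, "options");  off = put_str(buf, off, options);
        buf[off++] = '\0';                 /* empty key ends the list */

        uint32_t len = htonl((uint32_t) off);   /* self-inclusive length */
        uint32_t ver = htonl(3 << 16);          /* hypothetical version 3.0 */
        memcpy(buf, &len, 4);
        memcpy(buf + 4, &ver, 4);
        return off;
    }

Because each field is self-delimiting, usernames and PGOPTIONS strings of any length fit without changing the packet format, and new keys can be added later without touching old code.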
Tom Lane wrote:
<snip>
> One way to tamp down expectations of client backwards compatibility
> would be to call the release 8.0 instead of 7.4 ;-)
>
> Comments?

Actually, I've been thinking about the numbering of the next PostgreSQL version for a few days now. The scenario that's appealing to me the most is this for the next release:

PostgreSQL 8.0
**************

+ Includes PITR and the Win32 port

+ Not sure where Satoshi is up to with his 2 phase commit proposal, but that might make sense to incorporate into a wire protocol revision. From memory he received funding to work on it, so it might be coming along nicely.

+ Other things optional of course.

Personally, I'd rather we go for PostgreSQL 8.0, waiting a while extra for PITR and Win32 if needed, and also properly co-ordinate all of the release process information (website updates, package builds, announcements to the mailing lists and news sources).

Regards and best wishes,

Justin Clift

--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi
> + Not sure where Satoshi is up to with his 2 phase commit proposal, but
> that might make sense to incorporate into a wire protocol revision.
> From memory he received funding to work on it, so it might be coming
> along nicely.

One should note that his protocol changes had absolutely nothing to do with 2 phase commits -- but were used as a marker to direct replication.

We may want to consider leaving some space for a server / server style communication (Cluster ID, etc.)

--
Rod Taylor <rbt@rbt.ca>
PGP Key: http://www.rbt.ca/rbtpub.asc
Tom Lane wrote:
> * Backend should pass its version number, database encoding, default
> client encoding, and possibly other data (any ideas?) to frontend during
> startup, to avoid need for explicit queries to get this info. We could
> also consider eliminating SET commands sent by libpq in favor of adding
> variable settings to startup packet's PGOPTIONS field. Ideally we could
> get back to the point where a standard connection startup takes only one
> packet in each direction.

Should we pass this in a way where we can add stuff later, like passing it as a simple NULL-terminated string that can get split up on the client end?

> One of the $64 questions that has to be answered is how much work we're
> willing to expend on backwards compatibility. The path of least
> resistance would be to handle it the same way we've done protocol
> revisions in the past: the backend will be able to handle both old and new
> protocols (so it can talk to old clients) but libpq would be revised to
> speak only the new protocol (so new/recompiled clients couldn't talk to
> old backends). We've gotten away with this approach in the past, but the
> last time was release 6.4. I fully expect to hear more complaints now.

I think such compatibility is sufficient. We can mention in the release notes that they should upgrade their servers before their clients.

--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Justin Clift <justin@postgresql.org> writes:
> The scenario that's appealing to me the most is this for the next release:
> PostgreSQL 8.0
> + Includes PITR and the Win32 port

If the folks doing those things can get done in time, great. I'm even willing to push out the release schedule (now, not later) to make it more likely they can get done. What I'm not willing to do is define the release in terms of "it happens when these things are done". We learned the folly of that approach in 7.1 and 7.2. Setting a target date and sticking to it works *much* better.

> + Not sure where Satoshi is up to with his 2 phase commit proposal, but
> that might make sense to incorporate into a wire protocol revision.

I can't see any need for protocol-level support for such a thing. Why wouldn't it just be some more SQL commands?

(Not that I believe in 2PC as a real-world solution anyway, but that's a different argument...)

regards, tom lane
> * Backend's ReadyForQuery message (Z message) should carry an indication
> of current transaction status (idle/in transaction/in aborted transaction)
> so that frontend need not guess at state. Perhaps also indicate
> autocommit status. (Is there anything else that frontends would Really
> Like To Know?)

Could it include transaction depth, with the assumption that nested transactions will arrive at some point?

> * XML support? If we do anything, I'd want some extensible solution to
> allowing multiple query-result output formats from the backend, not an
> XML-specific hack. For one thing, that would allow the actual appearance
> of any XML support to happen later.

> One of the $64 questions that has to be answered is how much work we're
> willing to expend on backwards compatibility. The path of least
> resistance would be to handle it the same way we've done protocol
> revisions in the past: the backend will be able to handle both old and new
> protocols (so it can talk to old clients) but libpq would be revised to
> speak only the new protocol (so new/recompiled clients couldn't talk to
> old backends). We've gotten away with this approach in the past, but the
> last time was release 6.4. I fully expect to hear more complaints now.

I wouldn't worry about backward compatibility complaints too much, BUT I'd be tempted to make a startup packet that will allow libpq to revert back to old protocols easily enough in the future, so that we can do incremental changes to the protocol.

--
Rod Taylor <rbt@rbt.ca>
PGP Key: http://www.rbt.ca/rbtpub.asc
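For illustration, one minimal way the Z message could carry such a status. The particular one-byte codes, and the idea that a nesting depth could be added later under an extension rule, are assumptions for the sketch, not a settled design:

    /* Sketch: possible transaction-status codes carried in a one-byte
     * field of ReadyForQuery.  The specific codes are illustrative. */
    typedef enum XactStatus
    {
        XACT_IDLE = 'I',                  /* not in a transaction block */
        XACT_IN_TRANSACTION = 'T',        /* in a transaction block */
        XACT_IN_ABORTED_TRANSACTION = 'E' /* in a failed transaction block */
    } XactStatus;

    /* A client could then decide, without guessing, whether a ROLLBACK
     * is needed before issuing its next command: */
    static int needs_rollback(XactStatus s)
    {
        return s == XACT_IN_ABORTED_TRANSACTION;
    }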
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> * Backend should pass its version number, database encoding, default
>> client encoding, and possibly other data (any ideas?) to frontend during
>> startup, to avoid need for explicit queries to get this info.

> Should we pass this in a way where we can add stuff later, like passing
> it as a simple NULL-terminated string that can get split up on the
> client end?

Yeah, I was envisioning something with multiple labeled fields so that more stuff can be added later without a protocol change (likewise for StartupPacket and ErrorMessage). But again, I don't want this thread to get into any details about specific tasks --- let's try to get a view of the forest before we start hewing down individual trees...

>> We've gotten away with this approach in the past, but the
>> last time was release 6.4. I fully expect to hear more complaints now.

> I think such compatibility is sufficient. We can mention in the
> release notes that they should upgrade their servers before their
> clients.

I'd be really happy if we can make that stick. There's enough work to be done here without trying to develop a multiprotocol version of libpq.

It would be good to hear some words from the JDBC and ODBC developers about what sort of plans they'd have for updating those interfaces.

regards, tom lane
On Mon, 2003-03-10 at 14:05, Justin Clift wrote:
> Tom Lane wrote:
> <snip>
> > One way to tamp down expectations of client backwards compatibility
> > would be to call the release 8.0 instead of 7.4 ;-)
> >
> > Comments?
>
> Actually, I've been thinking about the numbering of the next PostgreSQL
> version for a few days now.
>
> The scenario that's appealing to me the most is this for the next release:
>
> PostgreSQL 8.0
> **************
>
> + Includes PITR and the Win32 port
>
> + Not sure where Satoshi is up to with his 2 phase commit proposal, but
> that might make sense to incorporate into a wire protocol revision.
> From memory he received funding to work on it, so it might be coming
> along nicely.
>
> + Other things optional of course.
>
> Personally, I'd rather we go for PostgreSQL 8.0, waiting a while extra
> for PITR and Win32 if needed, and also properly co-ordinate all of the
> release process information (website updates, package builds, announcements
> to the mailing lists and news sources).

I don't think PITR or Win32 (or even replication) warrant an 8.0, since none of those should affect client/server interaction and/or backward compatibility (or at least not as much as schema support did, which required most "adminy" apps to be worked over). A protocol change, however, would warrant a version number bump IMHO.

I would guess that by the time all of the protocol changes could be completed, we'd have Win32 or PITR, so it will hopefully be moot.

Robert Treat
Rod Taylor <rbt@rbt.ca> writes:
> I'd be tempted to make a startup packet that will allow libpq to revert
> back to old protocols easily enough for the future so that we can do
> incremental changes to the protocol.

We already have that: you send a startup packet with a version less than the latest, and the backend speaks that version to you.

One thing I want to do though is relax the protocol-level constraints on certain message contents: for example, if ErrorMessage becomes a collection of labeled fields, it should be possible to add new field types without calling it a protocol revision. The protocol need only specify "ignore any fields whose label you do not recognize".

regards, tom lane
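To make the "ignore unknown labels" rule concrete, here is a minimal C sketch of a client-side parser for such a labeled-field error message. It assumes each field is a one-byte label followed by a null-terminated string, with a zero byte ending the list; the specific labels 'M' and 'C' are examples, not an agreed assignment:

    /* Sketch: parse an error message made of labeled fields, ignoring
     * any label the client does not recognize.  Field layout and the
     * label letters are illustrative assumptions. */
    #include <stdio.h>
    #include <string.h>

    void parse_error_fields(const char *buf, size_t len)
    {
        const char *p = buf;
        const char *end = buf + len;

        while (p < end && *p != '\0')       /* zero byte ends the list */
        {
            char label = *p++;
            const char *value = p;

            p += strnlen(p, (size_t) (end - p)) + 1;  /* skip value + NUL */

            switch (label)
            {
                case 'M':                   /* example: primary message */
                    printf("message: %s\n", value);
                    break;
                case 'C':                   /* example: SQLSTATE code */
                    printf("code: %s\n", value);
                    break;
                default:
                    /* Unknown label: per the proposed rule, just skip it,
                     * so new fields don't force a protocol version bump. */
                    break;
            }
        }
    }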
On Mon, 2003-03-10 at 14:30, Tom Lane wrote:
> Rod Taylor <rbt@rbt.ca> writes:
> > I'd be tempted to make a startup packet that will allow libpq to revert
> > back to old protocols easily enough for the future so that we can do
> > incremental changes to the protocol.
>
> We already have that: you send a startup packet with a version less than
> the latest, and the backend speaks that version to you.

Yes, but that requires you to know the backend is less than the latest. If you send version A, and the backend responds "don't know A, but I know A - 2", then libpq may want to try speaking A - 2.

> types without calling it a protocol revision. The protocol need only
> specify "ignore any fields whose label you do not recognize".

This is probably just as good, if it's done for both sides.

--
Rod Taylor <rbt@rbt.ca>
PGP Key: http://www.rbt.ca/rbtpub.asc
Justin Clift wrote:
>
> PostgreSQL 8.0
> **************
>
> + Includes PITR and the Win32 port

*snip*

I feel like the upcoming 7.4 is the most important release since the introduction of TOAST, maybe even since the introduction of the SQL language. I wholeheartedly agree with your proposition.

Merlin
Rod Taylor <rbt@rbt.ca> writes:
>> We already have that: you send a startup packet with a version less than
>> the latest, and the backend speaks that version to you.

> Yes, but that requires you know the backend is less than the latest.

As opposed to knowing what? You send the version number you wish to speak; either the backend can handle it, or not.

regards, tom lane
Tom Lane wrote:
> This is an attempt to lay out a road map for updating the frontend/backend
> protocol in 7.4. I don't at this point want to get into details on any
> one of the TODO items, just get consensus that this is the set of tasks
> to be tackled. Are there any areas that I've missed (that require
> protocol changes)?

What about binding variables, a la Oracle's ociparse -> ocibindbyname -> ociexecute -> ocifetch? I know you can do most of it via SQL (PREPARE/EXECUTE) but you can't do 'RETURN x INTO :x' as it stands.

This would also get around the problem of getting values from newly inserted rows (eg PKs) without resorting to OIDs.

Not entirely a FE/BE issue... but worth considering in any redesign.

Ashley Cambrell
On Tue, 2003-03-11 at 06:52, Tom Lane wrote:
> Rod Taylor <rbt@rbt.ca> writes:
> >> We already have that: you send a startup packet with a version less than
> >> the latest, and the backend speaks that version to you.
>
> > Yes, but that requires you know the backend is less than the latest.
>
> As opposed to knowing what? You send the version number you wish to speak;
> either the backend can handle it, or not.

If the backend can not handle the version I request, but can handle a prior version, I'd like to know. I am planning on having handlers for multiple protocol versions in the same memory space (I'm using Smalltalk, BTW) so that one application can talk to various databases of various vintages.

I suppose that the client can just keep retrying the connection with different versions until it gets a match, though.
On Mon, 2003-03-10 at 14:52, Tom Lane wrote:
> Rod Taylor <rbt@rbt.ca> writes:
> >> We already have that: you send a startup packet with a version less than
> >> the latest, and the backend speaks that version to you.
>
> > Yes, but that requires you know the backend is less than the latest.
>
> As opposed to knowing what? You send the version number you wish to speak;
> either the backend can handle it, or not.

At some point PostgreSQL will have enough users that changing it will piss them off. If the backend cannot handle what's been requested, we may want to consider negotiating a protocol that both can handle.

Anyway, it doesn't really affect me one way or the other. So whatever you like is probably fine.

--
Rod Taylor <rbt@rbt.ca>
PGP Key: http://www.rbt.ca/rbtpub.asc
Bruce Badger <bruce_badger@badgerse.com> writes:
> I suppose that the client can just keep retrying the connection with
> different versions until it gets a match, though.

Right now, the backend just barfs with

    FATAL: unsupported frontend protocol

so you have to do a blind search to see what it will take. It would probably be a good idea to improve that message to mention exactly what range of protocol versions the backend does support. However, this will do you little good for talking to existing backends :-(

regards, tom lane
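Lacking such a message, a client's only option against existing backends is the blind search Tom describes, along these lines. try_connect() here is a hypothetical helper, not a libpq function, standing in for "send a startup packet requesting this version and see whether the backend accepts it":

    /* Sketch: blind client-side fallback across protocol versions.
     * try_connect() is hypothetical: it returns 0 if the backend accepts
     * the requested version, nonzero if it rejects it. */
    int try_connect(int major, int minor);

    int negotiate_protocol(void)
    {
        /* Newest first; fall back until the backend accepts one. */
        static const int versions[][2] = { {3, 0}, {2, 0}, {1, 0} };

        for (int i = 0; i < 3; i++)
            if (try_connect(versions[i][0], versions[i][1]) == 0)
                return (versions[i][0] << 16) | versions[i][1];

        return -1;      /* no mutually supported protocol version */
    }

An error message that named the supported range would let the client skip straight to a workable version instead of looping.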
On Mon, 2003-03-10 at 16:37, Ashley Cambrell wrote:
> This would also get around the problem of getting values from newly
> inserted rows (eg PKs) without resorting to OIDs.

That's not a problem: ensure that the newly inserted row has a SERIAL column, and use currval().

Cheers,

Neil

--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
Neil Conway wrote:
> On Mon, 2003-03-10 at 16:37, Ashley Cambrell wrote:
>> This would also get around the problem of getting values from newly
>> inserted rows (eg PKs) without resorting to OIDs.
>
> That's not a problem: ensure that the newly inserted row has a SERIAL
> column, and use currval().

Ok. I forgot about that. Even so, it would still be nice to have a bind-like interface... :-)

Ashley Cambrell
Neil Conway <neilc@samurai.com> writes:
> On Mon, 2003-03-10 at 16:37, Ashley Cambrell wrote:
>> This would also get around the problem of getting values from newly
>> inserted rows (eg PKs) without resorting to OIDs.

> That's not a problem: ensure that the newly inserted row has a SERIAL
> column, and use currval().

There was some talk awhile back of inventing INSERT ... RETURNING and UPDATE ... RETURNING commands so that you could pass back computed values to the frontend without an extra query. It doesn't seem to have gotten further than a TODO item yet, though.

AFAICS this does not need a protocol extension, anyway --- it'd look just the same as a SELECT at the protocol level.

regards, tom lane
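For reference, the currval() idiom Neil describes needs nothing beyond the existing libpq API. A minimal sketch, with the table name, sequence name, and connection string made up for the example and error handling abbreviated:

    /* Sketch: retrieve a newly assigned SERIAL value with currval().
     * "items" and "items_id_seq" are illustrative names only. */
    #include <stdio.h>
    #include <libpq-fe.h>

    int main(void)
    {
        PGconn *conn = PQconnectdb("dbname=test");
        if (PQstatus(conn) != CONNECTION_OK)
            return 1;

        PQclear(PQexec(conn, "INSERT INTO items (name) VALUES ('widget')"));

        /* currval() is session-local, so this is safe under concurrency. */
        PGresult *res = PQexec(conn, "SELECT currval('items_id_seq')");
        if (PQresultStatus(res) == PGRES_TUPLES_OK)
            printf("new id: %s\n", PQgetvalue(res, 0, 0));

        PQclear(res);
        PQfinish(conn);
        return 0;
    }

INSERT ... RETURNING would collapse the two round trips into one, but as Tom notes the wire traffic would look just like a SELECT either way.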
Tom Lane wrote:
> Justin Clift <justin@postgresql.org> writes:
> > The scenario that's appealing to me the most is this for the next release:
> > PostgreSQL 8.0
> > + Includes PITR and the Win32 port
>
> If the folks doing those things can get done in time, great. I'm even
> willing to push out the release schedule (now, not later) to make it
> more likely they can get done. What I'm not willing to do is define
> the release in terms of "it happens when these things are done". We
> learned the folly of that approach in 7.1 and 7.2. Setting a target
> date and sticking to it works *much* better.

Well, we had that 7.4 discussion a few days ago, and only had two people comment on our scheduling. At that time, the discussion was for 7.4 beta starting May 1 vs. June 1. June 1 would be roughly six months from our 7.3 final release, which is typical.

I agree with Tom that we should decide sooner rather than later on a beta date. Right now I don't think we have enough must-have features to justify a release, and as everyone knows, an upgrade isn't easy for our users.

I also agree with Tom that we shouldn't peg our beta schedule on specific features.

So, what should we do? Should we go another month or two and just wait until we have enough must-have features? While not waiting on specific features, it _is_ waiting for something to warrant a release. I guess the big question is whether we release on a scheduled basis or an enough-features basis.

--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> So, what should we do? Should we go another month or two and just wait
> until we have enough must-have features? While not waiting on specific
> features, it _is_ waiting for something to warrant a release. I guess
> the big question is whether we release on a scheduled basis or an
> enough-features basis.

"Enough features" is such a judgment call that no one can predict what the schedule will be, if that's part of the decision.

I had been leaning to May 1 beta, but am happy to switch to June 1 if you feel that makes an improvement in the odds of completing the Windows port. (I think it will also improve the odds of finishing this protocol stuff I've taken on...) I don't want to see it pushed further than that without good concrete arguments for doing so.

regards, tom lane
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 10 March 2003 18:51
> To: pgsql-hackers@postgresql.org; pgsql-interfaces@postgresql.org
> Subject: [HACKERS] Roadmap for FE/BE protocol redesign
>
> This is an attempt to lay out a road map for updating the
> frontend/backend protocol in 7.4. I don't at this point want
> to get into details on any one of the TODO items, just get
> consensus that this is the set of tasks to be tackled. Are
> there any areas that I've missed (that require protocol changes)?

What about the addition of pg_attribute.attrelid & pg_attribute.attname/attnum in RowDescription messages to identify the underlying attribute (where appropriate)?

I just looked at an issue in the ODBC driver where this would help - it seems we return incorrect column metadata from query results, I guess (I haven't looked at the code yet) because it's somewhat tricky to figure out.

Regards, Dave.
Tom Lane wrote: > "Enough features" is such a judgment call that no one can predict what > the schedule will be, if that's part of the decision. > > I had been leaning to May 1 beta, but am happy to switch to June 1 if > you feel that makes an improvement in the odds of completing the Windows > port. (I think it will also improve the odds of finishing this protocol > stuff I've taken on...) I don't want to see it pushed further than that > without good concrete arguments for doing so. > FWIW, if we're voting, I'd say: 1. on "firm date" vs "when feature x is ready": I vote "firm date" -- I know that last August the firm beta start date motivated me to get some things done that I would have dragged out without the scheduled cutoff. 2. June 1 beta cutoff sounds about right. Joe
"Dave Page" <dpage@vale-housing.co.uk> writes: > What about the addition of pg_attribute.attrelid & > pg_attribute.attname/attnum in RowDesription messages to identify the > underlying attribute (where appropriate)? Well, we can talk about it, but I still think that any frontend that relies on such information is broken by design. (And if that means the JDBC spec is broken, then the JDBC spec is broken.) Just to start with, if I do "SELECT * FROM view", am I going to see the info associated with the view column, or with the hypothetical underlying table column? (Actually, didn't I already make a list of a bunch of ways in which this concept is underspecified? AFAIR, you didn't suggest answers to any of those questions ... but we need answers to all of them if we are going to implement the feature.) regards, tom lane
Tom Lane wrote: > "Dave Page" <dpage@vale-housing.co.uk> writes: > > What about the addition of pg_attribute.attrelid & > > pg_attribute.attname/attnum in RowDesription messages to identify the > > underlying attribute (where appropriate)? > > Well, we can talk about it, but I still think that any frontend that > relies on such information is broken by design. (And if that means the > JDBC spec is broken, then the JDBC spec is broken.) > > Just to start with, if I do "SELECT * FROM view", am I going to see the > info associated with the view column, or with the hypothetical > underlying table column? (Actually, didn't I already make a list of a > bunch of ways in which this concept is underspecified? AFAIR, you > didn't suggest answers to any of those questions ... but we need answers > to all of them if we are going to implement the feature.) I was willing to add a hack to enable default column labels to be "table.column" --- that seemed like the least obtrusive. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I was willing to add a hack to enable default column labels to be
> "table.column" --- that seemed like the least obtrusive.

Most of the definitional issues still apply: which table name are you going to insert, and under what conditions?

If we're revising the protocol, there's no reason to hack up the column label to carry two pieces of info; it'd be cleaner to provide a separate slot in the T message to carry the table name. I just want to see a reasonably complete spec for what the feature is supposed to do, before we buy into it ...

regards, tom lane
Bruce Momjian wrote:
<snip>
> So, what should we do? Should we go another month or two and just wait
> until we have enough must-have features? While not waiting on specific
> features, it _is_ waiting for something to warrant a release. I guess
> the big question is whether we release on a scheduled basis or an
> enough-features basis.

Hmmm, I feel we should decide on features that will make an 8.0 release meaningful, and *somehow* work to making sure they are ready for the release.

With 7.1/7.2, Tom mentioned us being delayed because specific features we were waiting for became dependent on one person. Would it be feasible to investigate approaches for having the Win32 and PITR work be shared amongst a few very-interested volunteers, so that people can cover for each other's downtime?

Not sure of the confidentiality level of the Win32/PITR patches at present, but I'd guess there would be at least a few solid volunteers willing to contribute to the Win32/PITR ports if we asked for people to step forward. :-)

Regards and best wishes,

Justin Clift

--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi
Subject: Re: Beta Schedule (was Re: Roadmap for FE/BE protocol redesign)
From: "Christopher Kings-Lynne"
> I had been leaning to May 1 beta, but am happy to switch to June 1 if
> you feel that makes an improvement in the odds of completing the Windows
> port. (I think it will also improve the odds of finishing this protocol
> stuff I've taken on...) I don't want to see it pushed further than that
> without good concrete arguments for doing so.

There really is no rush... I'm well-versed in PostgreSQL, and even I haven't upgraded some of our production servers to even 7.3 yet (thanks to the pg_dump dependency nightmare).

BTW, so no-one conflicts, I'm doing up pg_get_triggerdef(oid) at the moment. Once that's done, I'll be able to submit a redone psql \d output that includes the trigger definition.

Cheers,

Chris
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I was willing to add a hack to enable default column labels to be
> > "table.column" --- that seemed like the least obtrusive.
>
> Most of the definitional issues still apply: which table name are you
> going to insert, and under what conditions?
>
> If we're revising the protocol, there's no reason to hack up the column
> label to carry two pieces of info; it'd be cleaner to provide a separate
> slot in the T message to carry the table name. I just want to see a
> reasonably complete spec for what the feature is supposed to do, before
> we buy into it ...

I don't think we can get a complete spec, and hence the _hack_ idea. :-)

--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Tom Lane wrote: > "Dave Page" <dpage@vale-housing.co.uk> writes: > >>What about the addition of pg_attribute.attrelid & >>pg_attribute.attname/attnum in RowDesription messages to identify the >>underlying attribute (where appropriate)? > > Well, we can talk about it, but I still think that any frontend that > relies on such information is broken by design. (And if that means the > JDBC spec is broken, then the JDBC spec is broken.) > > Just to start with, if I do "SELECT * FROM view", am I going to see the > info associated with the view column, or with the hypothetical > underlying table column? (Actually, didn't I already make a list of a > bunch of ways in which this concept is underspecified? AFAIR, you > didn't suggest answers to any of those questions ... but we need answers > to all of them if we are going to implement the feature.) The problem Dave is suggesting this as a first attempt at a solution for is that with ODBC, a frontend (i.e. OpenOffice) asks the ODBC driver which columns are NULLable, etc. And the ODBC driver is getting the info wrong, then passing back the incorrect info. So, when a person goes to insert a row into a table with a SERIAL/SEQUENCE based column, OpenOffice has been told the column isn't NULLable and forces the user to enter a value. Voila, it doesn't work with sequences. :( It's likely possible to add to the ODBC driver some way of getting the info right, but Dave is also looking for a way of making this easier into the future for similar problems. i.e. Let the database explicitly have info about what each column can do. That's my understanding of it anyway. :-) Regards and best wishes, Justin Clift > regards, tom lane -- "My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there." - Indira Gandhi
Hi guys,

As a thought, has anyone considered if it's worth doing data compression of the "new proposed" protocol for PostgreSQL 8.0/7.4?

It was suggested a long time ago by Joshua Drake (and his version was well accepted by his customers from what I heard), so might this be worth adding too?

:-)

Regards and best wishes,

Justin Clift

--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi
Justin Clift <justin@postgresql.org> writes:
> With 7.1/7.2, Tom mentioned us being delayed because specific features
> we were waiting for became dependent on one person.

> Would it be feasible to investigate approaches for having the Win32 and
> PITR work be shared amongst a few very-interested volunteers, so that
> people can cover for each other's downtime?

It would certainly be good to bring as much manpower to bear on those problems as we can. But that doesn't really address my concern: if the schedule is defined as "we go beta when feature X is done", then no one who's working on stuff other than feature X knows how to plan their time. The only fair way to run the project is "we go beta at time T"; that way everyone knows what they need to shoot for and can plan accordingly.

I don't mind setting the planned time T on the basis of what we think it will take for certain popular feature X's to be done. But if the guys working on X aren't done at T, it's not fair to everyone else to hold our breaths waiting for them to be done at T-plus-who-knows-what.

I don't really have any sympathy for the argument that "it won't be a compelling release if we don't have feature X". If the release isn't compelling for someone, they don't have to upgrade; they can wait for the next release. The folks who *are* eager for what's been gotten done will be glad of having a release now rather than N months from now. And do I need to point out that "it runs on Windoze" is not of earth-shattering importance for everyone?

regards, tom lane
Tom Lane wrote:
> Justin Clift <justin@postgresql.org> writes:
>> The scenario that's appealing to me the most is this for the next release:
>> PostgreSQL 8.0
>> + Includes PITR and the Win32 port
>
> If the folks doing those things can get done in time, great. I'm even
> willing to push out the release schedule (now, not later) to make it
> more likely they can get done. What I'm not willing to do is define
> the release in terms of "it happens when these things are done". We
> learned the folly of that approach in 7.1 and 7.2. Setting a target
> date and sticking to it works *much* better.

Yep, we both seem to be saying that we'd like these features, but we don't want to see them become delay-points.

>> + Not sure where Satoshi is up to with his 2 phase commit proposal, but
>> that might make sense to incorporate into a wire protocol revision.
>
> I can't see any need for protocol-level support for such a thing.
> Why wouldn't it just be some more SQL commands?

Not sure. It seems like 2PC will be required/desirable within the year for better support of some clustering scenarios, so we "might as well look at it now". When I was reading Satoshi's stuff a while ago I thought it was a protocol level thing, not a SQL command level thing, but don't really care either way. :)

Regards and best wishes,

Justin Clift

> (Not that I believe in 2PC as a real-world solution anyway, but that's
> a different argument...)
>
> regards, tom lane

--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi
It's rumoured that Tom Lane once said:
> "Dave Page" <dpage@vale-housing.co.uk> writes:
>> What about the addition of pg_attribute.attrelid &
>> pg_attribute.attname/attnum in RowDescription messages to identify the
>> underlying attribute (where appropriate)?
>
> Well, we can talk about it, but I still think that any frontend that
> relies on such information is broken by design. (And if that means the
> JDBC spec is broken, then the JDBC spec is broken.)

I don't know about JDBC, but ODBC could use it, and it would save a heck of a lot of pain in apps like pgAdmin that need to figure out if a column in an arbitrary resultset might be updateable.

At the moment there is some nasty code in pgAdmin II that attempts to parse the SQL statement to figure out if the resultset is updateable by trying to figure out the number of relations in the query, whether any of them is a view or sequence, whether there are any function calls or expressions in the attribute list and so on. It then has to try to figure out if there is a complete pkey in the resultset that can be used for the update, or whether it should attempt an update based on all existing values. That code is just plain nasty in VB. In pgAdmin III we've already mentioned stealing bits of the PostgreSQL parser.

The addition of the base column identifier (the pg_attribute.oid would have sufficed, but I can live with attrelid, attname, or even nspname, relname & attname or similar) would make this trivial, and allow interfaces like ODBC, JDBC, OLE-DB and Npgsql to gain easy access to any metadata they might require.

> Just to start with, if I do "SELECT * FROM view", am I going to see the
> info associated with the view column, or with the hypothetical
> underlying table column?

The view. We don't care where the data originally came from, only where it came from as far as the query creating the resultset is concerned. Of course, updateable views would make this irrelevant anyway...

> (Actually, didn't I already make a list of a
> bunch of ways in which this concept is underspecified? AFAIR, you
> didn't suggest answers to any of those questions ... but we need
> answers to all of them if we are going to implement the feature.)

I guess I must have missed that thread.

Regards, Dave
It's rumoured that Bruce Momjian once said:
> Tom Lane wrote:
>> "Dave Page" <dpage@vale-housing.co.uk> writes:
>> > What about the addition of pg_attribute.attrelid &
>> > pg_attribute.attname/attnum in RowDescription messages to identify
>> > the underlying attribute (where appropriate)?
>>
>> Well, we can talk about it, but I still think that any frontend that
>> relies on such information is broken by design. (And if that means
>> the JDBC spec is broken, then the JDBC spec is broken.)
>>
>> Just to start with, if I do "SELECT * FROM view", am I going to see
>> the info associated with the view column, or with the hypothetical
>> underlying table column? (Actually, didn't I already make a list of a
>> bunch of ways in which this concept is underspecified? AFAIR, you
>> didn't suggest answers to any of those questions ... but we need
>> answers to all of them if we are going to implement the feature.)
>
> I was willing to add a hack to enable default column labels to be
> "table.column" --- that seemed like the least obtrusive.

That would help, but not in the cases that cause the most grief - for example when the column has been aliased in the original query - that should override the label of course, but then we still need to parse the SQL at the client to figure out what's going on.

Regards, Dave.
It's rumoured that Bruce Momjian once said:
> Tom Lane wrote:
>> Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> > I was willing to add a hack to enable default column labels to be
>> > "table.column" --- that seemed like the least obtrusive.
>>
>> Most of the definitional issues still apply: which table name are you
>> going to insert, and under what conditions?
>>
>> If we're revising the protocol, there's no reason to hack up the
>> column label to carry two pieces of info; it'd be cleaner to provide a
>> separate slot in the T message to carry the table name. I just want
>> to see a reasonably complete spec for what the feature is supposed to
>> do, before we buy into it ...
>
> I don't think we can get a complete spec, and hence the _hack_ idea.
> :-)

Well, what would constitute a complete spec? I think I've told the group what I would like to be able to do; what unanswered questions can I (hopefully :-) ) answer?

Regards, Dave.
Tom Lane wrote: > > > I think such compatibility is sufficient. We can mention in the > > releases notes that they should upgrade there servers before their > > clients. > > I'd be really happy if we can make that stick. There's enough work to > be done here without trying to develop a multiprotocol version of > libpq. > > It would be good to hear some words from the JDBC and ODBC developers > about what sort of plans they'd have for updating those interfaces. Psqlodbc driver couldn't use the library unless the library could handle multiple protocol. What the driver has suffered from is to get the fields' info of a query result or the parameters' info of a statement. The info is needed even before the execution of the statement(i.e it's only prepared). regards, Hiroshi Inouehttp://www.geocities.jp/inocchichichi/psqlodbc/
I agree, let's not wait for specific features. The issue was whether we had enough significant features done for a release --- I didn't think we did, so I am saying, let's get more features, rather than let's get feature X.

As we fill in missing features, there will be fewer must-have features to add, so we are left with continuing with our present release pace or releasing less frequently with the same number of feature additions.

---------------------------------------------------------------------------

Tom Lane wrote:
> Justin Clift <justin@postgresql.org> writes:
> > With 7.1/7.2, Tom mentioned us being delayed because specific features
> > we were waiting for became dependent on one person.
>
> > Would it be feasible to investigate approaches for having the Win32 and
> > PITR work be shared amongst a few very-interested volunteers, so that
> > people can cover for each other's downtime?
>
> It would certainly be good to bring as much manpower to bear on those
> problems as we can. But that doesn't really address my concern: if the
> schedule is defined as "we go beta when feature X is done", then no one
> who's working on stuff other than feature X knows how to plan their
> time. The only fair way to run the project is "we go beta at time T";
> that way everyone knows what they need to shoot for and can plan
> accordingly.
>
> I don't mind setting the planned time T on the basis of what we think
> it will take for certain popular feature X's to be done. But if the
> guys working on X aren't done at T, it's not fair to everyone else to
> hold our breaths waiting for them to be done at T-plus-who-knows-what.
>
> I don't really have any sympathy for the argument that "it won't be a
> compelling release if we don't have feature X". If the release isn't
> compelling for someone, they don't have to upgrade; they can wait for
> the next release. The folks who *are* eager for what's been gotten done
> will be glad of having a release now rather than N months from now.
> And do I need to point out that "it runs on Windoze" is not of
> earth-shattering importance for everyone?
>
> regards, tom lane

--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Justin Clift wrote:
> confidentiality level of the Win32/PITR patches at present, but I'd
> guess there would be at least a few solid volunteers willing to
> contribute to the Win32/PITR ports if we asked for people to step
> forward.

I'd like to help. I've been following the list for several months now. I'd like to submit a patch or two and get a little active if I can cut the mustard, so to speak. If not, I can always supply a little testing and benchmarking. I have some experience writing backends, GIS, XML, and all manner of database. Unfortunately I have very low familiarity with Unix tools, except cvs.

Is there a non-Cygwin version of the source that will compile (or not) on a win32 machine? That would be a great place to start.

Merlin
Sure, Neil Conway updated Jan's patches for 7.3. It is in:

ftp://candle.pha.pa.us/pub/postgresql/mypatches/

---------------------------------------------------------------------------

Merlin Moncure wrote:
> Justin Clift wrote:
> > confidentiality level of the Win32/PITR patches at present, but I'd
> > guess there would be at least a few solid volunteers willing to
> > contribute to the Win32/PITR ports if we asked for people to step
> > forward.
>
> I'd like to help. I've been following the list for several months now.
> I'd like to submit a patch or two and get a little active if I can cut
> the mustard, so to speak. If not, I can always supply a little testing
> and benchmarking. I have some experience writing backends, GIS, XML,
> and all manner of database. Unfortunately I have very low familiarity
> with Unix tools, except cvs.
>
> Is there a non-Cygwin version of the source that will compile (or not)
> on a win32 machine? That would be a great place to start.
>
> Merlin

--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> What the driver has suffered from is getting the
> fields' info of a query result or the parameters'
> info of a statement. The info is needed even before
> the execution of the statement (i.e. when it's only prepared).

Hm. Are you saying that you would like PREPARE to send back a RowDescription ('T') message? Or is there more to it than that?

regards, tom lane
"Dave Page" <dpage@vale-housing.co.uk> writes: > Well, what would constitute a complete spec? I think I've told the group > what I would like to be able to do, what unanswered questions can I > (hopefully :-) ) answer? I'm still unclear on exactly what your needs are. In the first place, are you expecting to obtain data from arbitrary SELECT statements, or only from statements of the form "SELECT * FROM single_table"? You've also been confusing as to whether you want transparency of views (ie, does a select from a view return data about the view's nominal columns or about the underlying base table columns?). What about cases involving aggregates or grouping --- there may be simple Vars in the target list, but they can hardly be thought to represent updatable values. Also, you muttered something about inferring primary keys and number of relations in query; seems like this feature isn't solving your problem as far as that goes, because the set of attributes visible in the target list isn't much help for either. regards, tom lane
If there's a build of this available we'd love to test it in a major project we're working on. The project is currently using the 7.2 build that was made available, but we had to work around the lack of schema support by kludging table names as "namespace_table", so a 7.3 build would be great, with or without installer.

So far, the 7.2 build has worked without a hitch, although our Db is *very* simple, so the feature set has hardly been stressed. FWIW, all access in our app is via JDBC.

One of the big things holding us back from using Pg in our shipped product is lack of a released native Windows port, so I'm very keen to see this progress.

andrew

----- Original Message -----
From: "Bruce Momjian" <pgman@candle.pha.pa.us>
> Sure, Neil Conway updated Jan's patches for 7.3. It is in:
>
> ftp://candle.pha.pa.us/pub/postgresql/mypatches/
>
> Merlin Moncure wrote:
> >
> > Is there a non-Cygwin version of the source that will compile (or not)
> > on a win32 machine? That would be a great place to start.
> >
> > Merlin
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 11 March 2003 15:06
> To: Dave Page
> Cc: pgman@candle.pha.pa.us; pgsql-hackers@postgresql.org;
> pgsql-interfaces@postgresql.org
> Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign
>
> "Dave Page" <dpage@vale-housing.co.uk> writes:
> > Well, what would constitute a complete spec? I think I've told the
> > group what I would like to be able to do; what unanswered questions
> > can I (hopefully :-) ) answer?
>
> I'm still unclear on exactly what your needs are. In the
> first place, are you expecting to obtain data from arbitrary
> SELECT statements, or only from statements of the form
> "SELECT * FROM single_table"?

From any statement that returns a resultset.

> You've also been confusing as
> to whether you want transparency of views (ie, does a select
> from a view return data about the view's nominal columns or
> about the underlying base table columns?).

The view's nominal columns. Resolving updates back to the base tables will (hopefully) be the job of updateable views.

> What about cases
> involving aggregates or grouping --- there may be simple Vars
> in the target list, but they can hardly be thought to
> represent updatable values.

Omit the information. That will tell the client that the data did not come directly from an underlying relation.

> Also, you muttered something about inferring primary keys and
> number of relations in query; seems like this feature isn't
> solving your problem as far as that goes, because the set of
> attributes visible in the target list isn't much help for either.

That one is my problem. If I can identify the source attributes as I would like, then I can work out those details easily. For aggregates, functions, expressions etc. non-existent metadata will also tell me what I need to know. The attributes in the target list do help with this, because if I know I have a complete primary key, then I know I can build an update query for any given row.

As I see it, the tidy/simple answer is to add nspname, relname and attname information to each attribute in the row description, and if any one of those cannot be easily determined, or the value doesn't come directly from there (ie an aggregate), it is simply omitted or left empty. It could be done more efficiently with attnum/attrelid, but I wonder if that might cause problems in the future if attnum gets hacked about.

Regards, Dave.
"Dave Page" <dpage@vale-housing.co.uk> writes: > As I see it, the tidy/simple answer is to add nspname, relname and > attname information to each attribute in the row description, and if any > one of those cannot be easily determined, or the value doesn't come > directly from there (ie an aggregate), it is simply omitted or left > empty. It could be done more efficiently with attnum/attrelid, but I > wonder if that might cause problems in the future if attnum gets hacked > about. I'd prefer to stick with rel OID and attnum; those are the keys you are going to need for looking in the system catalogs, anyway, and they're surely a lot more compact to transmit than three Names (potentially 180 bytes or so). So let me get this straight: we want the attribute ID information for any SELECT output column that is a simple Var (no expressions, and no aggregated cases either), treating view output columns as Vars belonging to the view. What about: * Joins. Does an aliased JOIN output variable get resolved to the underlying table variable? What about common columns in USING/NATURAL joins? * Sub-selects. Are you expectingSELECT x FROM (SELECT x FROM foo) ss to produce a reference to foo.x? (If so, why is this different from the case where ss is a view?) regards, tom lane
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 11 March 2003 18:39
> To: Dave Page
> Cc: pgman@candle.pha.pa.us; pgsql-hackers@postgresql.org;
> pgsql-interfaces@postgresql.org
> Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign
>
> "Dave Page" <dpage@vale-housing.co.uk> writes:
> >> What about:
> >>
> >> * Joins. Does an aliased JOIN output variable get resolved
> >> to the underlying table variable? What about common columns
> >> in USING/NATURAL joins?
>
> > Sorry, not sure what you mean - can you give me an example?
>
> Sure, consider table1 with columns a,b,c and table2 with a,x,y:
>
> SELECT * FROM table1 NATURAL JOIN table2
>
> has output columns a,b,c,x,y (notice only one column a). How
> do you expect these to be labeled? Does the answer change if
> the join is OUTER? For example, in a LEFT JOIN x and y might
> return NULLs that have nothing to do with any table2 row; is
> it still appropriate to label them as table2 outputs? And
> what do we do with column a in each outer-join case?

Oh, I see. Well, we cannot resolve those cases unless we decide in advance that we will return metadata from either the left or right table. That sings "future FAQ" to me, so I would be inclined to omit the additional info as we would for an aggregate.

> Also, how about an aliased join
>
> SELECT * FROM (table1 NATURAL JOIN table2) AS q(r,s,t,u,v)
>
> In this case the output columns are named q.r, q.s, q.t, q.u,
> q.v as far as the SELECT list can tell --- it's actually
> illegal to refer to
> table1.* or table2.* in the SELECT list, IIRC. Do we drill
> down into the JOIN? (Probably so, if you expect the thing to
> drill down into sub-SELECTs, but I want clarity ...)

Likewise, if we can't figure out the info accurately, simply don't report it. In these, and any other similar cases, we will be no worse off with no underlying attribute data than we are now, and in the vast majority of cases we will benefit.

Regards, Dave.
"Dave Page" <dpage@vale-housing.co.uk> writes: >> What about: >> >> * Joins. Does an aliased JOIN output variable get resolved >> to the underlying table variable? What about common columns >> in USING/NATURAL joins? > Sorry, not sure what you mean - can you give me an example? Sure, consider table1 with columns a,b,c and table2 with a,x,y: SELECT * FROM table1 NATURAL JOIN table2 has output columns a,b,c,x,y (notice only one column a). How do you expect these to be labeled? Does the answer change if the join is OUTER? For example, in a LEFT JOIN x and y might return NULLs that have nothing to do with any table2 row; is it still appropriate to label them as table2 outputs? And what do we do with column a in each outer-join case? Also, how about an aliased join SELECT * FROM (table1 NATURAL JOIN table2) AS q(r,s,t,u,v) In this case the output columns are named q.r, q.s, q.t, q.u, q.v as far as the SELECT list can tell --- it's actually illegal to refer to table1.* or table2.* in the SELECT list, IIRC. Do we drill down into the JOIN? (Probably so, if you expect the thing to drill down into sub-SELECTs, but I want clarity ...) >> * Sub-selects. Are you expecting >> SELECT x FROM (SELECT x FROM foo) ss >> to produce a reference to foo.x? (If so, why is this >> different from the case where ss is a view?) > Yes, I would expect that reference. I don't think it's different from a > view - we are trying to deciper the SQL itself, not what a view actually > represents so I would expect to see references to whatever relation is > in the original SQL, whether that relation is the result of a subselect > from another table/view, or a view. Okay, that clarifies the intent quite a bit. So we need to do the labeling processing before any view rewriting occurs. That probably means sticking the info into Resdom nodes during the parser, rather than extracting it on-the-fly from the planned query. regards, tom lane
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 11 March 2003 15:55
> To: Dave Page
> Cc: pgman@candle.pha.pa.us; pgsql-hackers@postgresql.org;
> pgsql-interfaces@postgresql.org
> Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign
>
> I'd prefer to stick with rel OID and attnum; those are the
> keys you are going to need for looking in the system
> catalogs, anyway, and they're surely a lot more compact to
> transmit than three Names (potentially 180 bytes or so).

OK.

> So let me get this straight: we want the attribute ID
> information for any SELECT output column that is a simple Var
> (no expressions, and no aggregated cases either), treating
> view output columns as Vars belonging to the view.

Yes. We could include attrelid for aggregates I suppose, but obviously not attnum. Nothing at all for expressions of course.

> What about:
>
> * Joins. Does an aliased JOIN output variable get resolved
> to the underlying table variable? What about common columns
> in USING/NATURAL joins?

Sorry, not sure what you mean - can you give me an example?

> * Sub-selects. Are you expecting
> SELECT x FROM (SELECT x FROM foo) ss
> to produce a reference to foo.x? (If so, why is this
> different from the case where ss is a view?)

Yes, I would expect that reference. I don't think it's different from a view - we are trying to decipher the SQL itself, not what a view actually represents, so I would expect to see references to whatever relation is in the original SQL, whether that relation is the result of a subselect from another table/view, or a view.

Regards, Dave.
Bruce Momjian [mailto:pgman@candle.pha.pa.us] wrote:
> Sure, Neil Conway updated Jan's patches for 7.3. It is in:
>
> ftp://candle.pha.pa.us/pub/postgresql/mypatches/

I have not been able to run the patch against either the 7.2.1 source or the 7.3.2 source. It always quits with an assertion failure after trying to patch refint.c.

To run the patch I am typing (from inside the postgresql-7.x.x folder):

    patch -p1 < win32-port-changes-2.diff

Running (from root):

    patch -p0 < win32-port-changes-2.diff

does the same. Looking inside the patch, I see it's looking for the 7.2.1 source tree. Tried running GNU patch and MinGW patch without success.

Should I be patching the head or am I doing something wrong?

Merlin
Tom Lane wrote: > "Dave Page" <dpage@vale-housing.co.uk> writes: > > Well, what would constitute a complete spec? I think I've told the group > > what I would like to be able to do, what unanswered questions can I > > (hopefully :-) ) answer? > > I'm still unclear on exactly what your needs are. In the first place, > are you expecting to obtain data from arbitrary SELECT statements, or > only from statements of the form "SELECT * FROM single_table"? You've > also been confusing as to whether you want transparency of views (ie, > does a select from a view return data about the view's nominal columns > or about the underlying base table columns?). What about cases > involving aggregates or grouping --- there may be simple Vars in the > target list, but they can hardly be thought to represent updatable values. These questions can't possibly be unique to PG -- other database vendors must have answered these questions for their implementations of ODBC/JDBC too, or their databases would give ODBC and JDBC client authors the same kinds of trouble. So ... how have these questions been answered by other database vendors? The spec may be underspecified on these points, but since the entire purpose of ODBC and JDBC is to provide a database-independent framework for application authors to use, it follows that compatibility with other implementations is highly desirable. So doing more or less what other database vendors have done in response to these underspecified issues is probably a sensible course of action when there's no other obviously better answer. -- Kevin Brown kevin@sysexperts.com
Dave Page writes: > I don't know about JDBC, but ODBC could use it, and it would save a heck > of a lot of pain in apps like pgAdmin that need to figure out if a column > in an arbitrary resultset might be updateable. Strictly speaking, result sets are never updatable, because there's no way you can refer to a result set and tell the system to update it. So let's hear what you *really* need and then consider interfaces for *that*. Maybe updatable views or updatable cursors? -- Peter Eisentraut peter_e@gmx.net
On Mon, 10 Mar 2003, Tom Lane wrote: > One of the $64 questions that has to be answered is how much work we're > willing to expend on backwards compatibility. The path of least > resistance would be to handle it the same way we've done protocol > revisions in the past: the backend will be able to handle both old and > new protocols (so it can talk to old clients) but libpq would be revised > to speak only the new protocol (so new/recompiled clients couldn't talk > to old backends). We've gotten away with this approach in the past, but > the last time was release 6.4. I fully expect to hear more complaints > now. Personally ... as long as a v8.x client can talk to a v7.x backend, you have my vote ... I'm more apt to upgrade my clients before my servers anyway ...
On Mon, 10 Mar 2003, Tom Lane wrote: > Justin Clift <justin@postgresql.org> writes: > > The scenario that's appealing to me the most is this for the next release: > > PostgreSQL 8.0 > > + Includes PITR and the Win32 port > > If the folks doing those things can get done in time, great. I'm even > willing to push out the release schedule (now, not later) to make it > more likely they can get done. What I'm not willing to do is define > the release in terms of "it happens when these things are done". We > learned the folly of that approach in 7.1 and 7.2. Setting a target > date and sticking to it works *much* better. The thing is, IMHO, everyone knew the release schedule for v7.4, so if they aren't ready now, I can't really see justifying pushing things back in hopes that they will be ready then ...
On Mon, 10 Mar 2003, Bruce Momjian wrote: > Tom Lane wrote: > > Justin Clift <justin@postgresql.org> writes: > > > The scenario that's appealing to me the most is this for the next release: > > > PostgreSQL 8.0 > > > + Includes PITR and the Win32 port > > > > If the folks doing those things can get done in time, great. I'm even > > willing to push out the release schedule (now, not later) to make it > > more likely they can get done. What I'm not willing to do is define > > the release in terms of "it happens when these things are done". We > > learned the folly of that approach in 7.1 and 7.2. Setting a target > > date and sticking to it works *much* better. > > Well, we had that 7.4 discussion a few days ago, and only had two people > comment on our scheduling. At that time, the discussion was for 7.4 > beta starting May 1 vs June 1. June 1 would be roughly six months from > our 7.3 final release, which is typical. > > I agree with Tom that we should decide sooner rather than later on a > beta date. Right now I don't think we have enough must-have features > to justify a release, and as everyone knows, an upgrade isn't easy for > our users. > > I also agree with Tom that we shouldn't peg our beta schedule on specific > features. > > So, what should we do? Should we go another month or two and just wait > until we have enough must-have features? While not waiting on specific > features, it _is_ waiting for something to warrant a release. I guess > the big question is whether we release on a scheduled basis or an > enough-features basis. Scheduled basis ... if we released on an 'enough features' basis, I could see a lot longer than 6 months between releases happening very quickly ... we have enough problems staying within the schedule as it is, let alone moving it to a 'sliding scale' ...
On Tue, 11 Mar 2003, Justin Clift wrote: > Bruce Momjian wrote: > <snip> > > So, what should we do? Should we go another month or two and just wait > > until we have enough must-have features? While not waiting on specific > > features, it _is_ waiting for something to warrant a release. I guess > > the big question is whether we release on a scheduled basis or an > > enough-features basis. > > Hmmm, I feel we should decide on features that will make an 8.0 release > meaningful, and *somehow* work towards making sure they are ready for the > release. > > With 7.1/7.2, Tom mentioned us being delayed because specific features > we were waiting for became dependent on one person. > > Would it be feasible to investigate approaches for having the Win32 and > PITR work be shared amongst a few very-interested volunteers, so that > people can cover for each other's downtime? Not sure of the > confidentiality level of the Win32/PITR patches at present, but I'd > guess there would be at least a few solid volunteers willing to > contribute to the Win32/PITR ports if we asked for people to step > forward. Why should we be the ones to ask for people to step forward to volunteer to help? Shouldn't it be the responsibility of the developer working on it to admit that there is no way they will make the schedule and call for help?
Tom Lane wrote: > > Hiroshi Inoue <Inoue@tpf.co.jp> writes: > > What the driver has suffered from is to get the > > fields' info of a query result or the parameters' > > info of a statement. The info is needed even before > > the execution of the statement (i.e. it's only prepared). > > Hm. Are you saying that you would like PREPARE to send back a > RowDescription ('T') message? I'm not sure if PREPARE should return the info directly. Maybe it should be returned only when it is requested. > Or is there more to it than that? More detailed field info is needed anyway. For example the RowDescription contains neither the base column name, the table name nor the schema name currently, and so the current ODBC driver couldn't return the info correctly. The ODBC function SQLDescribeCol or SQLColAttribute requires various kinds of fields' info. It's almost impossible to get parameters' info currently. The ODBC function SQLDescribeParam isn't implemented yet.... regards, Hiroshi Inoue http://www.geocities.jp/inocchichichi/psqlodbc/
Marc G. Fournier wrote: > On Mon, 10 Mar 2003, Tom Lane wrote: > > > One of the $64 questions that has to be answered is how much work we're > > willing to expend on backwards compatibility. The path of least > > resistance would be to handle it the same way we've done protocol > > revisions in the past: the backend will be able to handle both old and > > new protocols (so it can talk to old clients) but libpq would be revised > > to speak only the new protocol (so new/recompiled clients couldn't talk > > to old backends). We've gotten away with this approach in the past, but > > the last time was release 6.4. I fully expect to hear more complaints > > now. > > Personally ... as long as a v8.x client can talk to a v7.x backend, you > have my vote ... I'm more apt to upgrade my clients before my servers > anyway ... Actually, it is usually the opposite, where old clients can talk to newer servers, but not the reverse. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Tue, 11 Mar 2003, Bruce Momjian wrote: > Marc G. Fournier wrote: > > On Mon, 10 Mar 2003, Tom Lane wrote: > > > > > One of the $64 questions that has to be answered is how much work we're > > > willing to expend on backwards compatibility. The path of least > > > resistance would be to handle it the same way we've done protocol > > > revisions in the past: the backend will be able to handle both old and > > > new protocols (so it can talk to old clients) but libpq would be revised > > > to speak only the new protocol (so new/recompiled clients couldn't talk > > > to old backends). We've gotten away with this approach in the past, but > > > the last time was release 6.4. I fully expect to hear more complaints > > > now. > > > > Personally ... as long as a v8.x client can talk to a v7.x backend, you > > have my vote ... I'm more apt to upgrade my clients before my servers > > anyway ... > > Actually, it is usually the opposite, where old clients can talk to > newer servers, but not the reverse. D'oh, mis-read Tom's ... you are correct, and it does make sense to do so ... it's not like old libraries aren't available if someone wanted to make a pre-compiled version that is 'backwards compatible' ...
Marc G. Fournier wrote: > > So, what should we do? Should we go another month or two and just wait > > until we have enough must-have features? While not waiting on specific > > features, it _is_ waiting for something to warrant a release. I guess > > the big question is whether we release on a scheduled basis or an > > enough-features basis. > > Scheduled basis ... if we released on an 'enough features' basis, I could > see a lot longer than 6 months between releases happening very quickly ... we > have enough problems staying within the schedule as it is, let alone > moving it to a 'sliding scale' ... I guess the big question is that if we can't get enough big features in 6 months, do we still stay on the 6 month schedule? I know Tom said folks don't have to upgrade --- that is true, but our releases do seem a little lighter lately. Six months would be June 1 beta, so maybe that is still a good target. I agree we should not hold up beta for any feature. So maybe the plan is June 1 beta, and we don't care if we have enough big features or not --- does that sound good to everyone? Or should we be looking at May 1 as Tom originally suggested? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Tue, 2003-03-11 at 15:38, Merlin Moncure wrote: > Bruce Momjian [mailto:pgman@candle.pha.pa.us] wrote: > > Sure, Neil Conway updated Jan's patches for 7.3 Erm, not exactly. I just removed some of the changes in Jan's 7.2.1 patch that had already been made in CVS HEAD (mostly trivial stuff like constant renaming) -- since that made the patch a fair bit smaller, I figured it was worth sending along to pgsql-patches. > I have not been able to run the patch against either the 7.2.1 source or > the 7.3.2 source. It always quits with assertion failure after trying > to patch refint.c Yeah -- the patch is certainly not in a state that can be applied against CVS HEAD (and I don't see the point in bothering with applying it to older versions). Rather than expecting the whole thing to apply cleanly to CVS HEAD (which would probably leave you with little work to do, as that's close to the end goal), my suggestion would be to read through the patch itself. Find a set of related changes made by the patch (e.g. libpq changes, FSM changes, etc.), and merge those into CVS HEAD. Then submit a patch to pgsql-patches. Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
Marc G. Fournier wrote:
<snip>
> > Would it be feasible to investigate approaches for having the Win32 and
> > PITR work be shared amongst a few very-interested volunteers, so that
> > people can cover for each other's downtime? Not sure of the
> > confidentiality level of the Win32/PITR patches at present, but I'd
> > guess there would be at least a few solid volunteers willing to
> > contribute to the Win32/PITR ports if we asked for people to step
> > forward.
>
> Why should we be the ones to ask for people to step forward to volunteer
> to help? Shouldn't it be the responsibility of the developer working on
> it to admit that there is no way they will make the schedule and call
> for help?

Hadn't thought of that. Um.. how about "whatever works"? It sounds like the kind of thing where some people may be offended if we suddenly started asking for extra volunteers for their bits, and others in the same situation wouldn't. Now that you mention it, it's probably a matter of getting that bit correct I suppose. :) Regards and best wishes, Justin Clift -- "My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there." - Indira Gandhi
Kevin Brown <kevin@sysexperts.com> writes: > doing more or less what other database vendors have done in response > to these underspecified issues is probably a sensible course of action > when there's no other obviously better answer. A good point, indeed. Who wants to do the legwork to check up on this? regards, tom lane
Peter Eisentraut wrote: > > Dave Page writes: > > > I don't know about JDBC, but ODBC could use it, > > and it would save a heck of a lot of pain in apps > > like pgAdmin that need to figure out if a column > > in an arbitrary resultset might be updateable. > > Strictly speaking, result sets are never updatable, > because there's no way you can refer to a result set > and tell the system to update it. > So let's hear what you *really* need and then consider > interfaces for *that*. Maybe updatable views or > updatable cursors? The term *result set* is used in various ways, and *updatable cursors* has its own meaning in some middleware applications. The term *cursor* is used in various ways too. For example, ODBC cursors don't necessarily mean (correspond to) the DBMS's cursors. If there's a suitable cursor, the ODBC driver may conveniently use it to implement ODBC cursors. Otherwise the ODBC driver may implement ODBC cursors by itself. PostgreSQL has paid little attention to cursor support and I don't expect much of server-side cursors. regards, Hiroshi Inoue http://www.geocities.jp/inocchichichi/psqlodbc/
> * Backend should pass its version number, database encoding, default > client encoding, and possibly other data (any ideas?) to frontend during > startup, to avoid need for explicit queries to get this info. We could > also consider eliminating SET commands sent by libpq in favor of adding > variable settings to startup packet's PGOPTIONS field. Ideally we could > get back to the point where a standard connection startup takes only one > packet in each direction. This handles the JDBC needs (currently on startup the jdbc driver selects the database encoding and version number and sets the datestyle and autocommit parameters). One addition I would personally like to see (it comes up in my apps code) is the ability to detect whether the server is big endian or little endian. When using binary cursors this is necessary in order to read int data. Currently I issue a 'select 1' statement at connection startup to determine what format the server is using. Other things I would like to see to help jdbc: 1) More information about the attributes selected in a query (I see there is an entire thread on this already) to minimize the work necessary to implement updateable result sets as defined by the jdbc spec. 2) Better support for domains. Currently the jdbc driver is broken with regards to domains (although no one has reported this yet). The driver will treat a datatype that is a domain as an unknown/unsupported datatype. It would be great if the T response included the 'base' datatype for a domain attribute so that the driver would know what parsing routines to call to convert to/from the text representation the backend expects. 3) Protocol level support for CURSORs. It would be nice if cursor support was done at the protocol level and not as a SQL command. The current default behavior of returning all results from a query in the query response message is often a problem (it can easily lead to out-of-memory problems for poorly written queries). So it is desirable to use cursors. But with the current implementation in SQL, cursors are not the appropriate choice if a query is only going to return one or a few rows. The reason is that using a cursor requires a minimum of three SQL statements: DECLARE, FETCH, CLOSE. The jdbc driver issues the DECLARE and FETCH in one server call, but the CLOSE needs to be a second call. Thus for simple one-row selects (which in many cases are the majority of selects issued) using cursors requires two round trips to the server vs. one for the non-cursor case (see the sketch after this message). This leaves me with a problem in the jdbc driver: I can either use standard fast/performant queries for single-row selects that blow up with out-of-memory errors for large results, or I can use cursors and avoid large memory usage but hurt overall performance. What I have currently done is require that the developer call an extra method to turn on the use of cursors when they know that the cursor is going to return a large number of rows, and leave the default be the non-cursor case. This works but requires that developers who are writing code to interact with multiple different databases code differently for the postgres jdbc driver. And this is a problem since one of the goals of jdbc is to be able to write code that works against multiple different databases. So I would request the ability of the client to set a max rows parameter for query results.
If a query were to return more than the max number of rows, the client would be given a handle (essentially a cursor name) that it could use to fetch additional sets of rows. 4) Protocol level support of PREPARE. In jdbc and most other interfaces, there is support for parameterized SQL. If you want to take advantage of the performance benefits of reusing parsed plans you have to use the PREPARE SQL statement. My complaint on doing this at the SQL level vs the protocol level is similar to the problem with cursors above. To use prepare you need to issue three SQL statements: PREPARE, EXECUTE, DEALLOCATE. If you know ahead of time that you are going to reuse a statement many times, doing PREPARE, EXECUTE, EXECUTE, ..., DEALLOCATE makes sense and can be a big win in performance. However if you only ever execute the statement once then you need to use two round trips (one for the PREPARE, EXECUTE and another for the DEALLOCATE) versus one round trip to execute the statement 'normally' (again, see the sketch after this message). So it decreases performance to use prepares for all parameterized sql statements. So the current implementation in jdbc requires the user to issue a postgres-specific call to turn on the use of prepared statements for those cases the developer knows will be a performance win. But this requires coding differently for postgres jdbc than for other databases. So being better able to handle this in the protocol would be nice. 5) Better support for "large values". Generally I recommend that users of jdbc use bytea to store large binary values. I generally tell people to avoid using LOs (Large Objects). The reason for this is that LOs have two significant problems: 1) security - any user on the database can access all LOs even though they may not be able to access the row that contains the LO reference, 2) cleanup - deleting the row containing the LO reference doesn't delete the LO, requiring extra code or triggers to make it behave like a regular value in a regular column. Bytea works OK for small to medium sized values, but doesn't work for very large values, where by very large I mean over a few megabytes. The reason very large values are a problem is memory usage. There is no way to 'stream' bytea values from the server like you can do with LOs, so the driver ends up storing the entire value in memory as it reads the result from the backend for a query. And if the query returns multiple rows each with a large value you quickly run out of memory. So what I would like to see is the ability for the client to set a MAX VALUE size parameter. The server would send up to this amount of data for any column. If the value was longer than MAX VALUE, the server would respond with a handle that the client could use to get the rest of the value (in chunks of MAX VALUE) if it wanted to. This would allow the client to get the entire result set, which could contain perhaps many large bytea values, but not use a lot of memory up front. Then fetch the entire values only when/if the application asked for them, and stream the result to the application, never bringing the entire contents of the column into memory at once. (There are probably a number of different implementation possibilities, so use this one as a suggestion to explain what I would like to see, not necessarily how it should be implemented.) 6) Better over the wire support for bytea. The current encoding of binary data \000 results in a significant expansion in the size of data transmitted. It would be nice if bytea data didn't result in 2 or 3 times data expansion (and all the CPU cycles to convert to/from the escaped format). This may not be a protocol issue, but IMHO the best way to fix this would be in the protocol. thanks, --Barry
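To make the round-trip arithmetic in point 3 concrete, here is a minimal libpq sketch of the cursor pattern Barry describes (the table, column, and cursor names are hypothetical, and error handling is trimmed). DECLARE and FETCH can be batched into a single PQexec call, but the CLOSE unavoidably costs a second trip:

    #include <stdio.h>
    #include "libpq-fe.h"

    /* Round trip 1: begin a transaction, declare the cursor, and fetch the
     * first row, all in one PQexec call.  With multiple statements in one
     * string, PQexec returns the result of the last one (the FETCH). */
    static void fetch_one_row(PGconn *conn)
    {
        PGresult *res = PQexec(conn,
                               "BEGIN; "
                               "DECLARE c CURSOR FOR SELECT id FROM foo; "
                               "FETCH 1 FROM c");
        if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) == 1)
            printf("id = %s\n", PQgetvalue(res, 0, 0));
        PQclear(res);

        /* Round trip 2: exists only to clean up -- the overhead that
         * protocol-level cursor support could fold away. */
        res = PQexec(conn, "CLOSE c; COMMIT");
        PQclear(res);
    }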
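The PREPARE lifecycle in point 4 has the same shape (again only a sketch; the statement name and query are made up). For a statement executed just once, the DEALLOCATE is a whole round trip spent on bookkeeping:

    #include "libpq-fe.h"

    /* Round trip 1: prepare and execute together in one PQexec call. */
    static void run_once(PGconn *conn)
    {
        PGresult *res = PQexec(conn,
                               "PREPARE s (int) AS SELECT * FROM foo WHERE id = $1; "
                               "EXECUTE s (42)");
        PQclear(res);

        /* Round trip 2: pure cleanup for a statement that is never reused. */
        res = PQexec(conn, "DEALLOCATE s");
        PQclear(res);
    }

Protocol-level prepare/execute support would let a driver prepare, execute, and discard in a single exchange instead.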
Dave Page wrote:
> I don't know about JDBC, but ODBC could use it, and it would save a heck
> of a lot of pain in apps like pgAdmin that need to figure out if a column
> in an arbitrary resultset might be updateable.
> At the moment there is some nasty code in pgAdmin II that attempts to
> parse the SQL statement to figure out if the resultset is updateable
> by trying to figure out the number of relations in the query, whether any
> of them is a view or sequence, whether there are any function calls or
> expressions in the attribute list and so on. It then has to try to figure
> out if there is a complete pkey in the resultset that can be used for the
> update, or whether it should attempt an update based on all existing
> values. That code is just plain nasty in VB. In pgAdmin III we've already
> mentioned stealing bits of the PostgreSQL parser.

I will just add a "me too" here. This would be very useful for JDBC as well. We go through the same hoops to support the jdbc spec that Dave does. The jdbc spec has two features that require this level of information:

1) For every result set you can ask for a ResultSetMetaData object. This object provides you with the following methods:

    getColumnCount()
    isAutoIncrement(int column)
    isCaseSensitive(int column)
    isSearchable(int column)
    isNullable(int column)
    getColumnDisplaySize(int column)
    getColumnLabel(int column)
    getColumnName(int column)
    getSchemaName(int column)
    getPrecision(int column)
    getScale(int column)
    getTableName(int column)
    getColumnTypeName(int column)
    isReadOnly(int column)
    isWritable(int column)
    isDefinitelyWritable(int column)

Now one can state the spec is broken and it doesn't make sense to ask this type of information about a query (and frankly I would agree with you), but that doesn't mean that I don't need to support it anyway. So anything that the server can do to make this easier is greatly appreciated. And I believe ODBC has almost the exact same issue, since in general the JDBC spec was copied from the ODBC spec.

2) Updateable result sets. The jdbc spec allows the user to declare any select statement to be updateable. This means that as you scroll through the result (the ResultSet object) you can modify the data and expect the jdbc driver to reflect that change back to the base tables. The following is from the JDBC API doc:

 * A set of updater methods were added to this interface in the JDBC 2.0
 * API (Java(TM) 2 SDK, Standard Edition, version 1.2). The comments
 * regarding parameters to the getter methods also apply to parameters
 * to the updater methods.
 *
 * The updater methods may be used in two ways:
 *
 * 1. to update a column value in the current row. In a scrollable
 *    ResultSet object, the cursor can be moved backwards and forwards,
 *    to an absolute position, or to a position relative to the current
 *    row. The following code fragment updates the NAME column in the
 *    fifth row of the ResultSet object rs and then uses the method
 *    updateRow to update the data source table from which rs was derived.
 *
 *      rs.absolute(5); // moves the cursor to the fifth row of rs
 *      rs.updateString("NAME", "AINSWORTH"); // updates the NAME column
 *                                            // of row 5 to be AINSWORTH
 *      rs.updateRow(); // updates the row in the data source
 *
 * 2. to insert column values into the insert row. An updatable ResultSet
 *    object has a special row associated with it that serves as a staging
 *    area for building a row to be inserted. The following code fragment
 *    moves the cursor to the insert row, builds a three-column row, and
 *    inserts it into rs and into the data source table using the method
 *    insertRow.
 *
 *      rs.moveToInsertRow(); // moves cursor to the insert row
 *      rs.updateString(1, "AINSWORTH"); // updates the first column of the
 *                                       // insert row to be AINSWORTH
 *      rs.updateInt(2, 35); // updates the second column to be 35
 *      rs.updateBoolean(3, true); // updates the third row to true
 *      rs.insertRow();
 *      rs.moveToCurrentRow();

Now application developers love this functionality. It allows them to implement fairly complex apps with very little sql knowledge. They only need to know how to do a simple select statement and that is it. The jdbc driver handles the rest for them automatically (updates, inserts, deletes). As a jdbc maintainer I personally hate this functionality as it is a real pain to implement, and can't work in any but the most simple select statements. But it is part of the spec and needs to be supported in the best manner possible. thanks, --Barry
Hiroshi Inoue wrote: > Tom Lane wrote: > > > > Hiroshi Inoue <Inoue@tpf.co.jp> writes: > > > What the driver has suffered from is to get the > > > fields' info of a query result or the parameters' > > > info of a statement. The info is needed even before > > > the execution of the statement (i.e. it's only prepared). > > > > Hm. Are you saying that you would like PREPARE to send back a > > RowDescription ('T') message? > > I'm not sure if PREPARE should return the info directly. > Maybe it should be returned only when it is requested. > > > Or is there more to it than that? > > More detailed field info is needed anyway. > For example the RowDescription contains neither > the base column name, the table name nor the > schema name currently and so the current ODBC > driver couldn't return the info correctly. But it only needs that info to do system catalog lookups for type, length etc. Imho the target should be to not need catalog lookups, but get all needed info from the 'T' message, no? > The ODBC function SQLDescribeCol or SQLColAttribute > requires various kinds of fields' info.

    RETCODE SQL_API SQLDescribeCol(
        HSTMT hstmt,
        UWORD icol,
        UCHAR FAR *szColName,
        SWORD cbColNameMax,
        SWORD FAR *pcbColName,
        SWORD FAR *pfSqlType,
        UDWORD FAR *pcbColDef,
        SWORD FAR *pibScale,
        SWORD FAR *pfNullable);

So you see it does not need tablename or schema. > It's almost impossible to get parameters' info > currently. The ODBC function SQLDescribeParam > isn't implemented yet.... Would be same as above, the 'T' message should describe input and output columns. Andreas
> > I'm still unclear on exactly what your needs are. In the first place, > > are you expecting to obtain data from arbitrary SELECT statements, or > > only from statements of the form "SELECT * FROM single_table"? You've > > also been confusing as to whether you want transparency of views (ie, > > does a select from a view return data about the view's nominal columns > > or about the underlying base table columns?). What about cases > > involving aggregates or grouping --- there may be simple Vars in the > > target list, but they can hardly be thought to represent > updatable values. > > These questions can't possibly be unique to PG -- other database > vendors must have answered these questions for their implementations > of ODBC/JDBC too, or their databases would give ODBC and JDBC client > authors the same kinds of trouble. > > So ... how have these questions been answered by other database > vendors? They return type info (RowDescription ('T') message) for "prepare", like Tom suggested. See the more or less standardized ESQL/C data structure sqlda for what is needed (label, type, length + precision, nullability, pointer to data (for binding host variables), ...). Note that this struct is used for both directions (so when inserting from host variables the interface knows what conversions need to be applied). Andreas
Barry Lind wrote on Wed, 12 Mar 2003 at 10:43: > Dave Page wrote: > > I don't know about JDBC, but ODBC could use it, and it would save a heck > > of a lot of pain in apps like pgAdmin that need to figure out if a column > > in an arbitrary resultset might be updateable. > > At the moment there is some nasty code in pgAdmin II that attempts to > > parse the SQL statement to figure out if the resultset is updateable > > by trying to figure out the number of relations in the query, whether any > > of them is a view or sequence, whether there are any function calls or > > expressions in the attribute list and so on. It then has to try to figure > > out if there is a complete pkey in the resultset that can be used for the > > update, or whether it should attempt an update based on all existing > > values. That code is just plain nasty in VB. In pgAdmin III we've already > > mentioned stealing bits of the PostgreSQL parser. > > I will just add a "me too" here. This would be very useful for JDBC as > well. We go through the same hoops to support the jdbc spec that Dave > does. The jdbc spec has two features that require this level of > information: ... > Now one can state the spec is broken and it doesn't make sense to ask > this type of information about a query (and frankly I would agree with > you), but that doesn't mean that I don't need to support it anyway. So > anything that the server can do to make this easier is greatly > appreciated. And I believe ODBC has almost the exact same issue, since > in general the JDBC spec was copied from the ODBC spec. And IIRC they both are mostly the same as the ANSI/ISO SQL/CLI spec. The CLI being Call Level Interface. > 2) Updateable result sets. The jdbc spec allows the user to declare any > select statement to be updateable. This means that as you scroll > through the result (the ResultSet object) you can modify the data > and expect the jdbc driver to reflect that change back to the base > tables. The following is from the JDBC API doc: ... > Now application developers love this functionality. It allows them to > implement fairly complex apps with very little sql knowledge. They only > need to know how to do a simple select statement and that is it. The > jdbc driver handles the rest for them automatically (updates, inserts, > deletes). As a jdbc maintainer I personally hate this functionality as > it is a real pain to implement, and can't work in any but the most > simple select statements. But it is part of the spec and needs to be > supported in the best manner possible. Updatable cursors should probably come out as a side benefit when the backend gains the ability to have updatable views. If you have done some work towards implementing it in the JDBC driver, some of it will probably be handy for anyone who is going to implement updatable views. ----------------- Hannu
Dave Page wrote: > I don't know about JDBC, but ODBC could use it, and it would save a heck > of a lot of pain in apps like pgAdmin that need to figure out if a column > in an arbitrary resultset might be updateable. I use an open source toolkit, zeos components for postgresql, that makes this possible. You can run simple queries (even with some joins) and it allows updating of the result set *sometimes*. This can be done inline with a grid control. These components work with Delphi (object pascal), Kylix, and C++ Builder. They are the fastest middleware I have ever seen for any database, period. They also already have a built-in SQL parser and are open source. The project is hosted on sourceforge and I believe it to be GPL. > values. That code is just plain nasty in VB. In pgAdmin III we've already > mentioned stealing bits of the PostgreSQL parser. Do not be swayed by the dark side. In my spare time I threw together a 'proof of concept' replacement for the technology used in pgAdmin. It is written in C++. In three seconds I can query a table and put 100000 records in a grid. I plan to finish it and release it. The main reason not to use it is that it relies on commercial tools to build. If you or anybody else is interested, let me know and I'll send it your way. Merlin
> -----Original Message----- > From: Zeugswetter Andreas SB SD [mailto:ZeugswetterA@spardat.at] > > Hiroshi Inoue wrote: > > Tom Lane wrote: > > > > > > Hiroshi Inoue <Inoue@tpf.co.jp> writes: > > > > What the driver has suffered from is to get the > > > > fields' info of a query result or the parameters' > > > > info of a statement. The info is needed even before > > > > the execution of the statement (i.e. it's only prepared). > > > > > > Hm. Are you saying that you would like PREPARE to send back a > > > RowDescription ('T') message? > > > > I'm not sure if PREPARE should return the info directly. > > Maybe it should be returned only when it is requested. > > > > > Or is there more to it than that? > > > > More detailed field info is needed anyway. > > For example the RowDescription contains neither > > the base column name, the table name nor the > > schema name currently and so the current ODBC > > driver couldn't return the info correctly. > > But it only needs that info to do system catalog lookups for > type, length etc. > Imho the target should be to not need catalog lookups, but > get all needed info > from the 'T' message, no? Maybe. > > > The ODBC function SQLDescribeCol or SQLColAttribute > > requires various kinds of fields' info. >
> RETCODE SQL_API SQLDescribeCol(
>     HSTMT hstmt,
>     UWORD icol,
>     UCHAR FAR *szColName,
>     SWORD cbColNameMax,
>     SWORD FAR *pcbColName,
>     SWORD FAR *pfSqlType,
>     UDWORD FAR *pcbColDef,
>     SWORD FAR *pibScale,
>     SWORD FAR *pfNullable);
>
> So you see it does not need tablename or schema. Oh yes, but SQLColAttribute should return such info if requested. regards, Hiroshi Inoue
> -----Original Message----- > From: Peter Eisentraut [mailto:peter_e@gmx.net] > Sent: 12 March 2003 00:34 > To: Dave Page > Cc: Tom Lane; PostgreSQL Development > Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign > > > Dave Page writes: > > > I don't know about JDBC, but ODBC could use it, and it would save a > > heck of a lot of pain in apps like pgAdmin that need to > figure out if > > a column in an arbitrary resultset might be updateable. > > Strictly speaking, result sets are never updatable, because > there's no way you can refer to a result set and tell the > system to update it. So let's hear what you *really* need > and then consider interfaces for *that*. Maybe updatable > views or updatable cursors? Well what I *really* need has been made quite clear in other posts, but, when I say resultset in the same sentence as pgAdmin, I'm referring to the ability to enter an arbitrary SQL query, have the results displayed in a grid, which can then be edited. To do this pgAdmin needs to be able to figure out enough info about the source of the data to generate the required insert/update/delete statements. It also happens that ODBC, JDBC, etc. also need the same information to meet their specs. Regards, Dave.
> -----Original Message----- > From: Zeugswetter Andreas SB SD [mailto:ZeugswetterA@spardat.at] > Sent: 12 March 2003 09:50 > To: Hiroshi Inoue; Tom Lane > Cc: Bruce Momjian; pgsql-hackers@postgresql.org; > pgsql-interfaces@postgresql.org > Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign > > > The ODBC function SQLDescribeCol or SQLColAttribute > > requires various kinds of fields' info. >
> RETCODE SQL_API SQLDescribeCol(
>     HSTMT hstmt,
>     UWORD icol,
>     UCHAR FAR *szColName,
>     SWORD cbColNameMax,
>     SWORD FAR *pcbColName,
>     SWORD FAR *pfSqlType,
>     UDWORD FAR *pcbColDef,
>     SWORD FAR *pibScale,
>     SWORD FAR *pfNullable);
>
> So you see it does not need tablename or schema. No, but with them we can avoid cluttering the wire protocol with fields for all this, and the JDBC required data. With 2 numeric columns (attrelid, attnum), any application/interface can query the system catalogs easily for whatever extra info they like. Regards, Dave.
> -----Original Message----- > From: Merlin Moncure [mailto:merlin.moncure@rcsonline.com] > Sent: 12 March 2003 13:35 > To: Dave Page > Cc: pgsql-hackers@postgresql.org > Subject: RE: [HACKERS] Roadmap for FE/BE protocol redesign > > > > values. That code is just plain nasty in VB. In pgAdmin III we've > already > > mentioned stealing bits of the PostgreSQL parser. > > Do not be swayed by the dark side. In my spare time I threw > together a > 'proof of concept' replacement for the technology used in > pgAdmin. It > is written in C++. In three seconds I can query a table and > put 100000 records in a grid. I plan to finish it and > release it. The main reason not to use it is that it relies > on commercial tools to build. > > If you or anybody else is interested, let me know and I'll > send it your way. You do realise I'm the pgAdmin project lead? Anyway, the 'technology' in pgAdmin II is ADO/ODBC which I quite agree doesn't play well with that many records - but then, not many humans do either. In pgAdmin III we use libpq directly from C++. If you can speed that up then I'm sure there will be more people than just me that are interested :-) The code I'm actually referring to in this thread is that which decides whether and how the results from a given query can be updated. That would be vastly simplified if I could easily locate the pg_attribute row for each column. Regards, Dave.
This may be problematic in the future if we change attrelid, attnum. My preference would be to be able to query the backend for the information if I need it. It is rarely required, i.e. give me the metadata for the last resultset. Dave On Wed, 2003-03-12 at 09:49, Dave Page wrote: > > -----Original Message----- > > From: Zeugswetter Andreas SB SD [mailto:ZeugswetterA@spardat.at] > > Sent: 12 March 2003 09:50 > > To: Hiroshi Inoue; Tom Lane > > Cc: Bruce Momjian; pgsql-hackers@postgresql.org; > > pgsql-interfaces@postgresql.org > > Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign > > > > > The ODBC function SQLDescribeCol or SQLColAttribute > > > requires various kinds of fields' info. > >
> > RETCODE SQL_API SQLDescribeCol(
> >     HSTMT hstmt,
> >     UWORD icol,
> >     UCHAR FAR *szColName,
> >     SWORD cbColNameMax,
> >     SWORD FAR *pcbColName,
> >     SWORD FAR *pfSqlType,
> >     UDWORD FAR *pcbColDef,
> >     SWORD FAR *pibScale,
> >     SWORD FAR *pfNullable);
> >
> > So you see it does not need tablename or schema. > > No, but with them we can avoid cluttering the wire protocol with fields > for all this, and the JDBC required data. With 2 numeric columns > (attrelid, attnum), any application/interface can query the system > catalogs easily for whatever extra info they like. > > Regards, Dave. -- Dave Cramer <dave@fastcrypt.com> Cramer Consulting
> -----Original Message----- > From: Dave Cramer [mailto:Dave@micro-automation.net] > Sent: 12 March 2003 14:55 > To: Dave Page > Cc: Zeugswetter Andreas SB SD; Hiroshi Inoue; Tom Lane; Bruce > Momjian; pgsql-hackers@postgresql.org; pgsql-interfaces@postgresql.org > Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign > > > This may be problematic in the future if we change attrelid, attnum. I suggested using names to Tom for this reason, but he preferred to use attrelid/attnum. > My preference would be to be able to query the backend for > the information if I need it. It is rarely required. Rarely required by most applications, but most interfaces do need this ability, and so do apps like pgAdmin and OpenOffice that allow the execution of arbitrary queries and subsequent updating of the results. Regards, Dave.
On Wed, Mar 12, 2003 at 03:02:23PM -0000, Dave Page wrote: > > This may be problematic in the future if we change attrelid, attnum. > > I suggested using names to Tom for this reason, but he preferred to use > attrelid/attnum. Oh, and what happened to the attlognum idea? If something that needs it is going to be implemented, the column should probably be added now and used instead of attnum. -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "In a specialized industrial society, it would be a disaster to have kids running around loose." (Paul Graham)
On Wed, 2003-03-12 at 10:02, Dave Page wrote: > > -----Original Message----- > > From: Dave Cramer [mailto:Dave@micro-automation.net] > > Sent: 12 March 2003 14:55 > > To: Dave Page > > Cc: Zeugswetter Andreas SB SD; Hiroshi Inoue; Tom Lane; Bruce > > Momjian; pgsql-hackers@postgresql.org; pgsql-interfaces@postgresql.org > > Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign > > > > > > This may be problematic in the future if we change attrelid, attnum. > > I suggested using names to Tom for this reason, but he preferred to use > attrelid/attnum. Ok, but I am willing to bet that this will be a problem later. However we can deal with it later, as well. > > > My preference would be to be able to query the backend for > > the information if I need it. It is rarely required. > > Rarely required by most applications, but most interfaces do need this > ability, and so do apps like pgAdmin and OpenOffice that allow the > execution of arbitrary queries and subsequent updating of the results. Yes, the interface requires it but how often? If I just do a select * from foo, I don't need all the metadata, it's only if I do getColumnName or something that I need metadata. However qualified column names would be nice, as in the instance when two column names are the same from different tables. > > Regards, Dave. -- Dave Cramer <dave@fastcrypt.com> Cramer Consulting
"Dave Page" <dpage@vale-housing.co.uk> writes: > No, but with them we can avoid cluttering the wire protocol with fields > for all this, and the JDBC required data. With 2 numeric columns > (attrelid, attnum), any application/interface can query the system > catalogs easily for whatever extra info they like. This is my feeling also. We shouldn't try to guess in the protocol exactly what set of information will be wanted by a frontend; we should just provide the catalog keys needed to look up whatever is wanted. I like attrelid/attnum because (a) it's compact --- not much added overhead for apps that don't care; and (b) it's a primary key, which names are not. The relid/attnum identity of a column will remain the same as long as it exists. A name-based API breaks the moment some other user renames things; similar objections could be raised against attlognum, if we had it. The point about apps breaking if the system catalog layout changes is a fair one, but that situation would be no worse than it's ever been; ODBC and JDBC have coped well enough so far. In any case the best solution would be to offer functions that hide the details of where to find any given bit of info. We've got a few like that already (format_type() for example) and I have no problem with adding more. regards, tom lane
Barry Lind <blind@xythos.com> writes: > One addition I would personally like to see (it comes up in my apps > code) is the ability to detect whether the server is big endian or > little endian. When using binary cursors this is necessary in order to > read int data. Actually, my hope is to eliminate that business entirely by standardizing the on-the-wire representation for binary data; note the reference to send/receive routines in the original message. For integer data this is simple enough: network byte order will be it. I'm not sure yet what to do about float data. > 2) Better support for domains. Currently the jdbc driver is broken with > regards to domains (although no one has reported this yet). The driver > will treat a datatype that is a domain as an unknown/unsupported > datatype. It would be great if the T response included the 'base' > datatype for a domain attribute so that the driver would know what > parsing routines to call to convert to/from the text representation the > backend expects. I'm unconvinced that we need do this in the protocol, as opposed to letting the client figure it out with metadata inquiries. If we should, I'd be inclined to just replace the typeid field with the base typeid, and not mention the domain to the frontend at all. Comments? > So I would request the ability of the client to set a max rows parameter > for query results. If a query were to return more than the max > number of rows, the client would be given a handle (essentially a cursor > name) that it could use to fetch additional sets of rows. How about simply erroring out if the query returns more than X rows? > 4) Protocol level support of PREPARE. In jdbc and most other > interfaces, there is support for parameterized SQL. If you want to take > advantage of the performance benefits of reusing parsed plans you have > to use the PREPARE SQL statement. This argument seems self-contradictory to me. There is no such benefit unless you're going to re-use the statement many times. Nor do I see how pushing PREPARE down to the protocol level will create any improvement in its performance. > So what I would like to see is the ability for the client to set a MAX > VALUE size parameter. The server would send up to this amount of data > for any column. If the value was longer than MAX VALUE, the server > would respond with a handle that the client could use to get the rest of > the value (in chunks of MAX VALUE) if it wanted to. I don't think I want to embed this in the protocol, either; especially not when we don't have even the beginnings of backend support for it. I think such a feature should be implemented and proven as callable functions first, and then we could think about pushing it down into the protocol. > 6) Better over the wire support for bytea. The current encoding of > binary data \000 results in a significant expansion in the size of data > transmitted. It would be nice if bytea data didn't result in 2 or 3 > times data expansion. AFAICS the only context where this could make sense is binary transmission of parameters for a previously-prepared statement. We do have all the pieces for that on the roadmap. regards, tom lane
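For integer data, at least, the proposed rule makes the client side trivial. A sketch of what a portable decoder could look like (decode_int4 is a hypothetical helper, not an agreed API), which would also make Barry's 'select 1' endianness probe unnecessary:

    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>          /* ntohl */

    /* Decode a 4-byte integer field from binary result data, assuming the
     * proposed on-the-wire standard: integers travel in network byte order. */
    static int32_t decode_int4(const char *buf)
    {
        uint32_t net;

        memcpy(&net, buf, sizeof(net));  /* copy out to avoid unaligned access */
        return (int32_t) ntohl(net);     /* same result on any host endianness */
    }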
Dave Page writes: > Well what I *really* need has been made quite clear in other posts, but, > when I say resultset in the same sentence as pgAdmin, I'm referring to > the ability to enter an arbitrary SQL query, have the results displayed > in a grid, which can then be edited. To do this pgAdmin needs to be > able to figure out enough info about the source of the data to generate > the required insert/update/delete statements. Right. But since you can't really write a literal SQL statement that does an update that refers to a previous query, you are already doing a fair amount of internal magic anyway, so if the meta-data is determined by magic as well, that seems consistent. What you need is an updateable cursor on the server side. It has all the facilities you need, including standardized ways to find out the updatability metadata. Please concentrate on that and do not attempt to clutter the wire protocol with data that will not withstand a thorough investigation of semantics. -- Peter Eisentraut peter_e@gmx.net
Peter Eisentraut <peter_e@gmx.net> writes: > What you need is an updateable cursor on the server side. It has all the > facilities you need, including standardized ways to find out the > updatability metadata. Please concentrate on that and do not attempt to > clutter the wire protocol with data that will not withstand a thorough > investigation of semantics. So are you voting against adding any attribute-ID info to RowDescription? While I'm not that thrilled with it myself, it seems relatively harmless as long as we can keep the overhead down. I'm okay with attrelid/attnum, but would gripe about including more than that. You're probably right that updatable cursors would be a nicer solution in the long run, but I don't really see that happening in the next release or two. regards, tom lane
On Wed, 2003-03-12 at 12:46, Peter Eisentraut wrote: > Dave Page writes: > > > Well what I *really* need has been made quite clear in other posts, but, > > when I say resultset in the same sentence as pgAdmin, I'm referring to > > the ability to enter an arbitrary SQL query, have the results displayed > > in a grid, which can then be edited. To do this pgAdmin needs to be > > able to figure out enough info about the source of the data to generate > > the required insert/update/delete statements. > > Right. But since you can't really write a literal SQL statement that does > an update that refers to a previous query, you are already doing a fair > amount of internal magic anyway, so if the meta-data is determined by > magic as well, that seems consistent. > > What you need is an updateable cursor on the server side. It has all the > facilities you need, including standardized ways to find out the > updatability metadata. Please concentrate on that and do not attempt to > clutter the wire protocol with data that will not withstand a thorough > investigation of semantics. And I have offered to pay for this work to be done. Someone? -- Dave Cramer <dave@fastcrypt.com> Cramer Consulting
Peter Eisentraut writes: > Dave Page writes: > > > Well what I *really* need has been made quite clear in other posts, but, > > when I say resultset in the same sentence as pgAdmin, I'm referring to > > the ability to enter an arbitrary SQL query, have the results displayed > > in a grid, which can then be edited. To do this pgAdmin needs to be > > able to figure out enough info about the source of the data to generate > > the required insert/update/delete statements. > > Right. But since you can't really write a literal SQL statement that does > an update that refers to a previous query, you are already doing a fair > amount of internal magic anyway, so if the meta-data is determined by > magic as well, that seems consistent. While this may be true, it is possible to build a client side system that can do this for you. Views and cursors are great, but they are not always the best tool for the job. > > What you need is an updateable cursor on the server side. It has all the > facilities you need, including standardized ways to find out the > updatability metadata. Please concentrate on that and do not attempt to > clutter the wire protocol with data that will not withstand a thorough > investigation of semantics. It's not foolproof and may even be foolhardy, but there are certain advantages to client-side decision making. A couple of integers or so for each attribute is not a terribly high price to pay. If a compelling case can be made that it can be put to good use, why not do it? Merlin
> -----Original Message----- > From: Peter Eisentraut [mailto:peter_e@gmx.net] > Sent: 12 March 2003 17:46 > To: Dave Page > Cc: Tom Lane; PostgreSQL Development > Subject: RE: [HACKERS] Roadmap for FE/BE protocol redesign > > > What you need is an updateable cursor on the server side. It > has all the facilities you need, including standardized ways > to find out the updatability metadata. Please concentrate on > that and do not attempt to clutter the wire protocol with > data that will not withstand a thorough investigation of semantics. That may be true for pgAdmin III (though it probably won't help through an ODBC/ADO interface such as pgAdmin II uses), but it doesn't help ODBC, JDBC, dotNET DP etc. provide the metadata on resultsets that their specs require them to provide. What prompted me to think about this issue (again) is that the ODBC driver incorrectly reports columns in a resultset as nullable, because it has no way of telling otherwise. This issue breaks OpenOffice, which is using perfectly valid APIs. pgAdmin III just happens to be able to benefit from such an update as well. Regards, Dave.
"Marc G. Fournier" <scrappy@hub.org> writes: > Personally ... as long as a v8.x client can talk to a v7.x backend, you > have my vote ... I'm more apt to upgrade my clients before my servers > anyway ... Surely that's not true for a production environment. You have one database but potentially dozens of various programs around that access it. The main application, some backend scripts for batch jobs, your backup process, your monitoring systems... Not all of these are necessarily on the same machine. It's upgrading the database that's likely to be the driving motivation for new sql or storage features. People usually don't get excited about upgrading the client libraries :) -- greg
Peter Eisentraut wrote: > > Dave Page writes: > > > Well what I *really* need has been made quite clear in other > > posts, but, when I say resultset in the same sentence as > > pgAdmin, I'm referring to the ability to enter an arbitrary > > SQL query, have the results displayed in a grid, which can > > then be edited. To do this pgAdmin needs to be able to > > figure out enough info about the source of the data to generate > > the required insert/update/delete statements. > > Right. But since you can't really write a literal SQL statement > that does an update that refers to a previous query, you are > already doing a fair amount of internal magic anyway, so if the > meta-data is determined by magic as well, that seems consistent. The psqlodbc driver unwillingly has to parse the queries in order to implement driver-side updatable cursors. I'm very suspicious of whether it should be the driver's job, because it's very hard and inefficient to parse and analyze the queries in the same way the backend does. > What you need is an updateable cursor on the server side. > It has all the facilities you need, Really? How did you confirm it? > including standardized ways to find out the > updatability metadata. Please concentrate on that and do not attempt to > clutter the wire protocol with data that will not withstand a thorough > investigation of semantics. regards, Hiroshi Inoue http://www.geocities.jp/inocchichichi/psqlodbc/
> > I suggested using names to Tom for this reason, but he preferred to use > > attrelid/attnum. > > Oh, and what happenned to the attlognum idea? If something that needs > it is going to be implemented the column should probably be added now > and used instead of attnum. Weeeell, it'd be nice, but I can see ways of doing ALTER COLUMN/SET TYPE without it, so... (My work on that has been stalled for a while BTW, due to sheer difficulty!) Chris
> > One addition I would personally like to see (it comes up in my apps > > code) is the ability to detect whether the server is big endian or > > little endian. When using binary cursors this is necessary in order to > > read int data. > > Actually, my hope is to eliminate that business entirely by > standardizing the on-the-wire representation for binary data; note the > reference to send/receive routines in the original message. For integer > data this is simple enough: network byte order will be it. I'm not sure > yet what to do about float data. Is that something that the 'XDR' spec deals with? (Thinking back to 3rd year networking unit)... Chris
On Wed, 2003-03-12 at 20:45, Hiroshi Inoue wrote: > Peter Eisentraut wrote: > > > > Dave Page writes: > > > > > Well what I *really* need has been made quite clear in other > > > posts, but, when I say resultset in the same sentence as > > > pgAdmin, I'm referring to the ability to enter an arbitrary > > > SQL query, have the results displayed in a grid, which can > > > then be edited. To do this pgAdmin needs to be able to > > > figure out enough info about the source of the data to generate > > > the required insert/update/delete statements. > > > > Right. But since you can't really write a literal SQL statement > > that does an update that refers to a previous query, you are > > already doing a fair amount of internal magic anyway, so if the > > meta-data is determined by magic as well, that seems consistent. > > The psqlodbc driver unwillingly has to parse the queries in order to > implement driver-side updatable cursors. > I'm very suspicious of whether it should be the driver's job, > because it's very hard and inefficient to parse and > analyze the queries in the same way the backend does. JDBC has to do this too, and the backend is in a much better position to do the parsing IMO as well. > > > What you need is an updateable cursor on the server side. > > It has all the facilities you need, > > Really? How did you confirm it? > > > including standardized ways to find out the > > updatability metadata. Please concentrate on that and do not attempt to > > clutter the wire protocol with data that will not withstand a thorough > > investigation of semantics. > > regards, > Hiroshi Inoue > http://www.geocities.jp/inocchichichi/psqlodbc/ -- Dave Cramer <dave@fastcrypt.com> Cramer Consulting
Greg Stark <gsstark@mit.edu> writes: > It's upgrading the database that's likely to be the driving motivation > for new sql or storage features. People usually don't get excited > about upgrading the client libraries :) Usually not. This cycle might be different though, if we are able to finish the proposed improvements in error reporting and other issues that are handicapping clients. None of that work will help un-upgraded clients... regards, tom lane
Tom Lane wrote: > > "Dave Page" <dpage@vale-housing.co.uk> writes: > > No, but with them we can avoid cluttering the wire protocol with fields > > for all this, and the JDBC required data. With 2 numeric columns > > (attrelid, attnum), any application/interface can query the system > > catalogs easily for whatever extra info they like. > > This is my feeling also. We shouldn't try to guess in the protocol > exactly what set of information will be wanted by a frontend; we should > just provide the catalog keys needed to look up whatever is wanted. Does looking up by the catalog keys take no cost? regards, Hiroshi Inoue http://www.geocities.jp/inocchichichi/psqlodbc/
Greg Stark wrote on Wed, 12.03.2003 at 07:10: > "Marc G. Fournier" <scrappy@hub.org> writes: > > > Personally ... as long as a v8.x client can talk to a v7.x backend, you > > have my vote ... I'm more apt to upgrade my clients before my servers > > anyway ... > > Surely that's not true for a production environment. You have one database but > potentially dozens of various programs around that access it. The main > application, some backend scripts for batch jobs, your backup process, your > monitoring systems... Not all of these are necessarily on the same machine. For more radical protocol changes a viable approach could be "protocol proxies", i.e. set up a _separate_ daemon which listens on a separate port and translates the v7.x wire protocol to v8.x for the database proper. Then those needing it can keep it around, and those who don't need it avoid the overhead. It could also be maintained by interested parties long after being dropped by core developers. > It's upgrading the database that's likely to be the driving motivation for new > sql or storage features. People usually don't get excited about upgrading the > client libraries :) But our SQL itself is slowly drifting towards ANSI/ISO compliance and that has often brought subtle changes that break _applications_. It is not a big issue to change libraries if you have to change the application anyway. ----------------- Hannu
Tom Lane wrote on Wed, 12.03.2003 at 18:19: > Barry Lind <blind@xythos.com> writes: > > One addition I would personally like to see (it comes up in my apps > > code) is the ability to detect whether the server is big endian or > > little endian. When using binary cursors this is necessary in order to > > read int data. > > Actually, my hope is to eliminate that business entirely by > standardizing the on-the-wire representation for binary data; note the > reference to send/receive routines in the original message. For integer > data this is simple enough: network byte order will be it. I'm not sure > yet what to do about float data. Use IEEE floats or just report the representation in the startup packet. The X11 protocol does this for all data, even integers - the client expresses a wish what it wants and the server tells it what it gets (so two intel boxes need not convert to "network byte order" at both ends). > > 2) Better support for domains. Currently the jdbc driver is broken with > > regards to domains (although no one has reported this yet). The driver > > will treat a datatype that is a domain as an unknown/unsupported > > datatype. It would be great if the T response included the 'base' > > datatype for a domain attribute so that the driver would know what > > parsing routines to call to convert to/from the text representation the > > backend expects. > > I'm unconvinced that we need do this in the protocol, as opposed to > letting the client figure it out with metadata inquiries. If we should, > I'd be inclined to just replace the typeid field with the base typeid, > and not mention the domain to the frontend at all. Comments? > > > So I would request the ability of the client to set a max rows parameter > > for query results. If a query were to return more than the max > > number of rows, the client would be given a handle (essentially a cursor > > name) that it could use to fetch additional sets of rows. > > How about simply erroring out if the query returns more than X rows? Or just using prepare/execute - fetch - fetch - fetch ... > > 4) Protocol level support of PREPARE. In jdbc and most other > > interfaces, there is support for parameterized SQL. If you want to take > > advantage of the performance benefits of reusing parsed plans you have > > to use the PREPARE SQL statement. > > This argument seems self-contradictory to me. There is no such benefit > unless you're going to re-use the statement many times. Nor do I see > how pushing PREPARE down to the protocol level will create any > improvement in its performance. I suspect that he actually means support for binary transmission of parameters for a previously-prepared statement here. > > So what I would like to see is the ability for the client to set a MAX > > VALUE size parameter. The server would send up to this amount of data > > for any column. If the value was longer than MAX VALUE, the server > > would respond with a handle that the client could use to get the rest of > > the value (in chunks of MAX VALUE) if it wanted to. > > I don't think I want to embed this in the protocol, either; especially > not when we don't have even the beginnings of backend support for it. > I think such a feature should be implemented and proven as callable > functions first, and then we could think about pushing it down into the > protocol. IIRC, Oracle has such a feature in its support for Large Objects (LONG datatype). If the object data is longer than xxx bytes you will need specialized access to it.
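For integer columns the conversion being discussed here is just the classic byte-swap; for floats, one plausible scheme (assuming IEEE 754 on both ends, as suggested above) is to swap the bit pattern the same way. A minimal sketch, not the actual backend send routines:

    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>

    /* Hypothetical wire conversion for int4: network byte order. */
    uint32_t int4_send(int32_t val)
    {
        return htonl((uint32_t) val);
    }

    /* Hypothetical wire conversion for float4, assuming IEEE 754 on
     * both ends: reinterpret the bits as an integer and swap those. */
    uint32_t float4_send(float val)
    {
        uint32_t bits;
        memcpy(&bits, &val, sizeof bits);
        return htonl(bits);
    }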
Also, when stepping with single fetches, you will always get handles for LONG objects; if fetching more than one row you'll get raw data. BTW, I'm not advocating such behaviour. ---------------- Hannu
Alvaro Herrera wrote on Wed, 12.03.2003 at 17:10: > On Wed, Mar 12, 2003 at 03:02:23PM -0000, Dave Page wrote: > > > > This may be problematic in the future if we change attrelid, attnum. > > > > I suggested using names to Tom for this reason, but he preferred to use > > attrelid/attnum. > > Oh, and what happened to the attlognum idea? If something that needs > it is going to be implemented the column should probably be added now > and used instead of attnum. I guess that attlognum is _not_ needed here. The only need for attlognum would be for automatic column ordering when doing "select * " and "\d table"; for all other purposes attnum is much more reliable. ----------------- Hannu
> > One addition I would personally like to see (it comes up in my > > apps code) is the ability to detect whether the server is big > > endian or little endian. When using binary cursors this is > > necessary in order to read int data. > > Actually, my hope is to eliminate that business entirely by > standardizing the on-the-wire representation for binary data; note > the reference to send/receive routines in the original message. For > integer data this is simple enough: network byte order will be it. > I'm not sure yet what to do about float data. When we're talking about sending data across the wire, are we talking about a format that would let the server use sendfile() for sending the data to the client? Having a database that can send data to the client efficiently would be a nice change of pace, given that RDBMSs are notoriously slow (slower than NFS) at sending files to clients. -sc -- Sean Chittenden
Hannu Krosing wrote: > Tom Lane wrote on Wed, 12.03.2003 at 18:19: > >>Actually, my hope is to eliminate that business entirely by >>standardizing the on-the-wire representation for binary data; note the >>reference to send/receive routines in the original message. For integer >>data this is simple enough: network byte order will be it. I'm not sure >>yet what to do about float data. > > > Use IEEE floats or just report the representation in the startup packet. > > The X11 protocol does this for all data, even integers - the client > expresses a wish what it wants and the server tells it what it gets (so > two intel boxes need not convert to "network byte order" at both > ends). IIOP/CDR behaves similarly for performance reasons -- "receiver makes it right". It also defines a representation for all of the CORBA idl basic types, wide characters, fixed-point types, structures, etc. A far-reaching, wild suggestion would be to replace the postmaster with a CORBA-based server process with a well-defined interface. At a minimum, if a binary protocol is the ultimate destination, perhaps some of the mapping of various types could be borrowed from the specs. Mike Mascari mascarm@mascari.com
It's rumoured that Hiroshi Inoue once said: > Tom Lane wrote: >> >> "Dave Page" <dpage@vale-housing.co.uk> writes: >> > No, but with them we can avoid cluttering the wire protocol with >> > fields for all this, and the JDBC required data. With 2 numeric >> > columns (attrelid, attnum), any application/interface can query the >> > system catalogs easily for whatever extra info they like. >> >> This is my feeling also. We shouldn't try to guess in the protocol >> exactly what set of information will be wanted by a frontend; we >> should just provide the catalog keys needed to look up whatever is >> wanted. > > Does looking up by the catalog keys take no cost? Obviously there is cost, but doing a lookup only on demand has got to be cheaper in the long run than including the entire column definition in the message whether it's wanted or not? Regards, Dave.
Dave Page wrote: > > It's rumoured that Hiroshi Inoue once said: > > Tom Lane wrote: > >> > >> "Dave Page" <dpage@vale-housing.co.uk> writes: > >> > No, but with them we can avoid cluttering the wire protocol with > >> > fields for all this, and the JDBC required data. With 2 numeric > >> > columns (attrelid, attnum), any application/interface can query the > >> > system catalogs easily for whatever extra info they like. > >> > >> This is my feeling also. We shouldn't try to guess in the protocol > >> exactly what set of information will be wanted by a frontend; we > >> should just provide the catalog keys needed to look up whatever is > >> wanted. > > > > Does looking up by the catalog keys take no cost? > > Obviously there is cost, but doing a lookup only on demand has got to be > cheaper in the long run than including the entire column definition in the > message whether it's wanted or not? So if there are 100 fields, should we ask the backend the column name 100 times? regards, Hiroshi Inoue http://www.geocities.jp/inocchichichi/psqlodbc/
> > Obviously there is cost, but doing a lookup only on demand, has got to be > > cheaper in the long run than including the entire column definition in the > > message whether it's wanted or not? > > So if there are 100 fields, should we ask the backend > the column name 100 times ? No, you do a single select that returns 100 results... Then you cache them in your frontend... Chris
Hiroshi Inoue wrote on Thu, 13.03.2003 at 12:03: > Dave Page wrote: > > > > > Does looking up by the catalog keys take no cost? > > > > Obviously there is cost, but doing a lookup only on demand has got to be > > cheaper in the long run than including the entire column definition in the > > message whether it's wanted or not? > > So if there are 100 fields, should we ask the backend > the column name 100 times? or once for all columns if you prefer using IN. ----------------- Hannu
Christopher Kings-Lynne wrote: > > > Obviously there is cost, but doing a lookup only on demand has got to be > > > cheaper in the long run than including the entire column definition in the > > > message whether it's wanted or not? > > > > So if there are 100 fields, should we ask the backend > > the column name 100 times? > > No, you do a single select that returns 100 results... Does a single select mean a fast select? regards, Hiroshi Inoue
"Dave Page" <dpage@vale-housing.co.uk> writes: > It's rumoured that Hiroshi Inoue once said: >> Does looking up by the catalog keys take no cost ? > Obviously there is cost, but doing a lookup only on demand, has got to be > cheaper in the long run than including the entire column definition in the > message whether it's wanted or not? More to the point, the cost is paid by applications that want the functionality, and not by those that don't. It'd probably be reasonable for client libraries to maintain a cache of column info, so that they only have to query the backend about a particular column ID once per connection. (This is one reason why I want attrelid/attnum to be the info given on the wire: it's stable enough to be used over a whole session to index such a cache. Names could not safely be used that way.) regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > Barry Lind <blind@xythos.com> writes: > > > 4) Protocol level support of PREPARE. In jdbc and most other > > interfaces, there is support for parameterized SQL. If you want to take > > advantage of the performance benefits of reusing parsed plans you have > > to use the PREPARE SQL statement. > > This argument seems self-contradictory to me. There is no such benefit > unless you're going to re-use the statement many times. Nor do I see > how pushing PREPARE down to the protocol level will create any > improvement in its performance. "You're going to re-use the statement many times" is true (or should be true) for every statement in every web site and other OLTP system. Even if the query appears on only a single web page and is executed only once on that web page, the nature of high-volume web sites is that that page will be executed hundreds or thousands of times per minute. This is why the Perl DBI, for example, has a prepare_cached() which provides automatic caching of prepared handles. With Oracle I was able to use this exclusively on a large high-volume web site to keep thousands of prepared handles. Every query was prepared only once per Apache process. There is a performance benefit to using placeholders and prepared queries in that the plan doesn't need to be regenerated repeatedly. Ideally every query should either be a big DSS query where the time spent in the optimizer is irrelevant, or an OLTP transaction using placeholders where again the time spent in the optimizer is irrelevant because it only needs to be run once. This would allow the optimizer to grow in complexity. For example it could explore both sides of the decision tree in places where now we have heuristics to pick the probable better plan. Postgres's optimizer is pretty impressive currently, but the constant attention to avoiding high-cost optimizations limits it. There is also a security benefit. The idea of mixing parameters into the queries even at the driver level gives me the willies. The database then has to parse them back out of the query string. If there's a bug in the driver or any kind of mismatch between the backend parser and the driver quoting then there could be security holes. -- greg
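The prepare-once/execute-many pattern described here is already expressible with the SQL-level PREPARE; a rough libpq sketch (the table and statement names are invented for illustration):

    #include <stdio.h>
    #include <libpq-fe.h>

    static void run_many(PGconn *conn)
    {
        /* Parsed and planned once per connection. */
        PQclear(PQexec(conn,
            "PREPARE get_user (int4) AS "
            "SELECT name FROM users WHERE id = $1"));

        /* Re-used cheaply; only the tiny EXECUTE string is parsed on
         * each round trip, which is the remaining cost that
         * protocol-level PREPARE support could shave off. */
        for (int id = 1; id <= 1000; id++)
        {
            char sql[64];
            snprintf(sql, sizeof sql, "EXECUTE get_user(%d)", id);
            PQclear(PQexec(conn, sql));
        }
    }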
> > No, you do a single select that returns 100 results... > > Does a single select mean a fast select? Provably always? Perhaps not. But requesting data via a single select statement returning 100 rows means that you only have to submit the request once, and it means that the result set only has to be set up once, which is quite likely to save some work. If you're going to read 100 lines of data from a file, would it make more sense to: a) Open the file 100 times, fseek() to the right spot, and then close the file, or b) Open the file once, and do 100 fseek() calls? If you're just opening a file, there's not a huge amount of overhead associated with that. Setting up a DBMS query is quite a bit more costly, so that the effort looks like: a) Open a session connection, submit a query for one record, and process it, then close the connection, and do this 100 times, versus b) Open a session connection, submit a query returning 100 records, and process them all. The overhead surrounding submitting and processing the query is quite a bit more than that involved in opening/closing files, so cutting down on it should be pretty worthwhile... -- (reverse (concatenate 'string "gro.gultn@" "enworbbc")) http://cbbrowne.com/info/nonrdbms.html "When we understand knowledge-based systems, it will be as before -- except our fingertips will have been singed." -- Alan Perlis
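Concretely, once the client has the attrelid from a RowDescription, one catalog query can describe every column of the table at once; a sketch (the libpq calls are real, the query is just one plausible shape for such a lookup):

    #include <stdio.h>
    #include <libpq-fe.h>

    /* One round trip fetches name and nullability for all columns,
     * instead of one catalog query per column. */
    static PGresult *describe_all_columns(PGconn *conn, const char *attrelid)
    {
        char sql[256];

        snprintf(sql, sizeof sql,
                 "SELECT attnum, attname, attnotnull "
                 "FROM pg_attribute "
                 "WHERE attrelid = %s AND attnum > 0",
                 attrelid);
        return PQexec(conn, sql);
    }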
Barry Lind wrote: > 3) Protocol level support for CURSORs. It would be nice if cursor > support was done at the protocol level and not as a SQL command. I want to second this proposal. Currently I avoid using cursors in my programs since a) they need much more logic and _string_concatenation_ to be handled transparently by a library (prepend the query with DECLARE X CURSOR FOR), then (FETCH n FROM X), then (CLOSE X). That's inefficient. b) I have a really bad feeling to have the backend parse (FETCH FROM X) every time I ask for a (single) row c) I hate that the backend retransmits column names etc. for every fetch I issue. This information is mostly unneeded but the backend cannot know better. Of course these issues can be addressed by using FETCH n (n>10) but this kludge is only needed because the FETCH protocol is so inefficient. Think about the number of bytes transferred for "select 2000 lines of integers" with and without declare/fetch/close. Imagine a result set of 1 to 20000 integers given back (depending on parameters) for an interactive program (e.g. browsing a customer list by initials). Would you prefer a cursor (much more constant overhead, even for single results) or all in one (and wait longer for a first result)? I'd love to tell the backend to give a "descriptor" for this query back and use it efficiently to get data and/or metadata (see ODBC, JDBC, sqlda or dynamic sql). Perhaps it's most efficient to ask for N initial results (which are instantly returned). Christof (who implemented dynamic sql for ecpg) PS: perhaps this protocol infrastructure is also well suited to return large bytea values (<M bytes : return inline, > return a descriptor). [Also proposed by Barry Lind.] PPS: I'm perfectly fine with returning attrelid/attnum. Then the client can control how much effort is spent determining only the requested metadata.
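For reference, the string-concatenation kludge complained about in (a) looks roughly like this in a driver today (simplified sketch; the cursor name and buffer size are arbitrary, and a transaction is assumed to be open already):

    #include <stdio.h>
    #include <libpq-fe.h>

    static void fetch_via_cursor(PGconn *conn, const char *user_query)
    {
        char buf[1024];

        /* Prepend DECLARE ... CURSOR FOR to the user's query. */
        snprintf(buf, sizeof buf, "DECLARE c1 CURSOR FOR %s", user_query);
        PQclear(PQexec(conn, buf));

        /* Each FETCH is re-parsed by the backend, and the row
         * description comes back every time although it never changes. */
        PQclear(PQexec(conn, "FETCH 10 FROM c1"));

        PQclear(PQexec(conn, "CLOSE c1"));
    }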
On Tue, 11 Mar 2003, Bruce Momjian wrote: > Six months would be June 1 beta, so maybe that is still a good target. We released v7.3 just before Dec 1st, so six months is May 1st, not June 1st ...
Tom Lane wrote: > Barry Lind <blind@xythos.com> writes: > >>One addition I would personally like to see (it comes up in my apps >>code) is the ability to detect whether the server is big endian or >>little endian. When using binary cursors this is necessary in order to >>read int data. > > > Actually, my hope is to eliminate that business entirely by > standardizing the on-the-wire representation for binary data; note the > reference to send/receive routines in the original message. For integer > data this is simple enough: network byte order will be it. I'm not sure > yet what to do about float data. > Great. > >>2) Better support for domains. Currently the jdbc driver is broken with >>regards to domains (although no one has reported this yet). The driver >>will treat a datatype that is a domain as an unknown/unsupported >>datatype. It would be great if the T response included the 'base' >>datatype for a domain attribute so that the driver would know what >>parsing routines to call to convert to/from the text representation the >>backend expects. > > > I'm unconvinced that we need do this in the protocol, as opposed to > letting the client figure it out with metadata inquiries. If we should, > I'd be inclined to just replace the typeid field with the base typeid, > and not mention the domain to the frontend at all. Comments? > I don't have a strong opinion on this one. I can live with the current functionality. It isn't too much work to look up the base type. > >>So I would request the ability of the client to set a max rows parameter >> for query results. If a query were to return more than the max >>number of rows, the client would be given a handle (essentially a cursor >>name) that it could use to fetch additional sets of rows. > > > How about simply erroring out if the query returns more than X rows? > This shouldn't be an error condition. I want to fetch all of the rows, I just don't want to have to buffer them all in memory. Consider the following example. Select statement #1 is 'select id from foo', statement #2 is 'update bar set x = y where foo_id = ?'. The program logic issues statement #1 and then starts iterating through the results and then issues statement #2 for some of those results. If statement #1 returns a large number of rows, the program can run out of memory if all the rows from #1 need to be buffered in memory. What would be nice is if the protocol allowed getting some rows from #1 but not all, so that the connection could be used to issue some #2 statements. > >>4) Protocol level support of PREPARE. In jdbc and most other >>interfaces, there is support for parameterized SQL. If you want to take >>advantage of the performance benefits of reusing parsed plans you have >>to use the PREPARE SQL statement. > > > This argument seems self-contradictory to me. There is no such benefit > unless you're going to re-use the statement many times. Nor do I see > how pushing PREPARE down to the protocol level will create any > improvement in its performance. > There is a benefit if you do reuse the statement multiple times. The performance problem is the minimum of two round trips to the server that are required. A protocol solution to this would be to allow the client to send multiple requests at one time to the server. But as I type that I realize that can already be done, by having multiple semicolon-separated SQL commands sent at once. So I probably have everything I need for this already.
I can just queue up the 'deallocate' calls and piggyback them onto the next real call to the server. > >>So what I would like to see is the ability for the client to set a MAX >>VALUE size parameter. The server would send up to this amount of data >>for any column. If the value was longer than MAX VALUE, the server >>would respond with a handle that the client could use to get the rest of >>the value (in chunks of MAX VALUE) if it wanted to. > > > I don't think I want to embed this in the protocol, either; especially > not when we don't have even the beginnings of backend support for it. > I think such a feature should be implemented and proven as callable > functions first, and then we could think about pushing it down into the > protocol. > That is fine. > >>6) Better over the wire support for bytea. The current encoding of >>binary data \000 results in a significant expansion in the size of data >>transmitted. It would be nice if bytea data didn't result in 2 or 3 >>times data expansion. > > > AFAICS the only context where this could make sense is binary > transmission of parameters for a previously-prepared statement. We do > have all the pieces for that on the roadmap. > Actually it is the select of binary data that I was referring to. Are you suggesting that the over-the-wire format for bytea in a query result will be binary (instead of the ASCII-encoded text format as it currently exists)? > regards, tom lane > I am looking forward to all of the protocol changes. thanks, --Barry
On Wed, 12 Mar 2003, Justin Clift wrote: > Marc G. Fournier wrote: > <snip> > >>Would it be feasible to investigate approaches for having the Win32 and > >>PITR work be shared amongst a few very-interested volunteers, so that > >>people can cover for each other's downtime? Not sure of the > >>confidentiality level of the Win32/PITR patches at present, but I'd > >>guess there would be at least a few solid volunteers willing to > >>contribute to the Win32/PITR ports if we asked for people to step > >>forwards. > > > > Why should we be the ones to ask for ppl to step forward to volunteer to > > help? Shouldn't it be the responsibility of the developer working on it > > to admit that there is no way they will make the schedule and call for > > help? > > Hadn't thought of that. Um.. how about "whatever works"? It sounds > like the kind of thing where some people may be offended if we suddenly > started asking for extra volunteers for their bits, and others in the > same situation wouldn't. The thing is, it's not often, but ppl have been popping up on the list asking for things to do ... we have the TODO list that we point ppl to, but if someone is working on a project and needs help, that is the perfect time for them to pop up and try and lure the person over :) But it also doesn't stop someone from asking for help as well ... hell, how many ppl are working on something that someone else is working on from a different angle, that would end up stronger by working together?
> > > Does looking up by the catalog keys take no cost? > > > > Obviously there is cost, but doing a lookup only on demand has got to be > > cheaper in the long run than including the entire column definition in the > > message whether it's wanted or not? > > So if there are 100 fields, should we ask the backend > the column name 100 times? Also, doesn't the planner/executor already have all the needed info available? Thus answering from prepare would be a lot cheaper than selecting the descriptor after prepare (extra network roundtrips, plus parsing, planning and executing the descriptor lookup). If this were at the protocol level, the client could flag what info is wanted, and thus avoid all unwanted overhead. Andreas
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes: > Also doesn't the planner/executor already have all needed info available ? Not directly, and not necessarily in the form that the client would want it in (eg, converting type OID to type name isn't free). I don't care to load either the backend or the protocol down with the responsibility for offering every piece of column data that a client could possibly want as part of RowDescription. Besides, elsewhere in this thread we were hearing about how RowDescription is already too much overhead for some people ;-) To my mind, the argument in favor of this feature is essentially that it saves ODBC/JDBC from needing to duplicate the backend's SQL parser; which is a legitimate concern. But that doesn't translate to saying that we should push functionality out of the clients and into the backend when it wouldn't be in the backend otherwise. That's just moving code around on the basis of some rather-shaky arguments about performance. And what happens when your client wants something different from the exact functionality that was pushed to the backend? You're back to square one. regards, tom lane
Barry Lind <blind@xythos.com> writes: >> AFAICS the only context where this could make sense is binary >> transmission of parameters for a previously-prepared statement. We do >> have all the pieces for that on the roadmap. >> > Actually it is the select of binary data that I was referring to. Are > you suggesting that the over-the-wire format for bytea in a query result > will be binary (instead of the ASCII-encoded text format as it currently > exists)? See binary cursors ... regards, tom lane
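The binary-cursor route pointed to here looks like this today (table and column names invented; note that fetched int4 bytes currently arrive in the server's native byte order, which is exactly the portability problem discussed elsewhere in this thread):

    #include <libpq-fe.h>

    static void select_binary(PGconn *conn)
    {
        PGresult *res;

        PQclear(PQexec(conn, "BEGIN"));
        PQclear(PQexec(conn,
            "DECLARE bc BINARY CURSOR FOR SELECT id, data FROM images"));

        res = PQexec(conn, "FETCH ALL FROM bc");
        /* PQgetvalue() now yields raw bytes rather than escaped text;
         * PQgetlength() gives the true value length. */
        PQclear(res);

        PQclear(PQexec(conn, "CLOSE bc"));
        PQclear(PQexec(conn, "COMMIT"));
    }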
Hiroshi Inoue wrote: > Dave Page wrote: > > > > It's rumoured that Hiroshi Inoue once said: > > > > > > Does looking up by the catalog keys take no cost? > > > > Obviously there is cost, but doing a lookup only on demand has got to > > be cheaper in the long run than including the entire column definition > > in the message whether it's wanted or not? > > So if there are 100 fields, should we ask the backend > the column name 100 times? No, the column name is there already. This is useful for additional info such as uniqueness, nullability, default value, constraints and so on. Surely we don't want the entire relevant bits of the system catalogues sent with every query unless we specifically request them? Regards, Dave.
Zeugswetter Andreas wrote: > If this were at the protocol level, the client could flag > what info is wanted, and thus avoid all unwanted overhead. But the client usually won't know at that point. For example, an ODBC app executes a query and reads the tuples returned. After that has occurred, the app calls SQLDescribeCol to describe a column in the resultset... Regards, Dave.
Couldn't it be done optionally, so the clients that want the info pay the price and those that don't want it get the speed and lower bandwidth? Just a thought. andrew ----- Original Message ----- From: "Tom Lane" <tgl@sss.pgh.pa.us> > "Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes: > > Also, doesn't the planner/executor already have all the needed info available? > > Not directly, and not necessarily in the form that the client would want > it in (eg, converting type OID to type name isn't free). I don't care > to load either the backend or the protocol down with the responsibility > for offering every piece of column data that a client could possibly > want as part of RowDescription. > > Besides, elsewhere in this thread we were hearing about how > RowDescription is already too much overhead for some people ;-) > > To my mind, the argument in favor of this feature is essentially that > it saves ODBC/JDBC from needing to duplicate the backend's SQL parser; > which is a legitimate concern. But that doesn't translate to saying > that we should push functionality out of the clients and into the > backend when it wouldn't be in the backend otherwise. That's just > moving code around on the basis of some rather-shaky arguments about > performance. And what happens when your client wants something > different from the exact functionality that was pushed to the backend? > You're back to square one. > > regards, tom lane
Tom Lane wrote: > Barry Lind <blind@xythos.com> writes: > >>>AFAICS the only context where this could make sense is binary >>>transmission of parameters for a previously-prepared statement. We do >>>have all the pieces for that on the roadmap. >>> >> >>Actually it is the select of binary data that I was referring to. Are >>you suggesting that the over-the-wire format for bytea in a query result >>will be binary (instead of the ASCII-encoded text format as it currently >>exists)? > > > See binary cursors ... Generally that is not an option. It either requires users to code to postgresql-specific sql syntax, or requires the driver to do it magically for them. The latter runs into all the issues that I raised on cursor support. In general the jdbc driver is expected to execute arbitrary sql statements any application might want to send it. The driver is handicapped because it doesn't know really anything about that sql statement (other than it is a select vs an update or delete). Specifically it doesn't know what tables or columns that SQL will access or how many rows a select will return. All of this knowledge is in the backend, and short of implementing a full sql parser in java this knowledge will never exist in the front end. Many of the things I put on my wish list for the protocol stem from this. Where there are two ways to do something (use cursors or not, use prepared statements or not, use binary cursors or not) the driver either needs to a) choose one way and always use it, b) infer from the sql statement which way will be better, or c) require the user to tell us. The problem with a) is that it may not always be the correct choice. The problem with b) is that generally this isn't possible, and the problem with c) is it requires that the user write code that isn't portable across different databases. I would like to simply do a) in all cases. But that means that one of the two options should always (or almost always) be the best choice. So in the case of "use cursors or not", it would be nice if using cursors added little or no overhead such that it could always be used. In the case of "use prepared statements vs not", it would be nice if prepared statements added little or no overhead so that they could always be used. And finally in the case of "use binary or regular cursors" it would be nice if binary cursors could always be used. The Oracle SQL*Net protocol supports most of this. Though it has been a few years since I worked with it, the oracle protocol has many of the features I am looking for (and perhaps the reason I am looking for them is that I have seen them used there before). Essentially the Oracle protocol lets you do the following operations: open, parse, describe, bind, execute, fetch, close. A request from the client to the server specifies what operations it wants to perform on a sql statement. So a client could request to do all seven operations (which is essentially what the current postgres protocol does today). Or it could issue an open, parse call, which is essentially the same thing as the PREPARE sql statement, followed by a describe, bind, execute, fetch, which is similar to an EXECUTE and FETCH sql statement, and finally a close, which is similar to a CLOSE and DEALLOCATE sql. The describe request is generally only done once even though you may do multiple fetches (unlike today's protocol, which includes the describe information on every fetch, even if you are fetching one row at a time).
The Oracle approach gives the client complete flexibility to do a lot, without requiring that the client start parsing sql statements and doing things like prepending DECLARE CURSOR, or FETCH, in order to reformat the application's sql statement into the postgresql way of doing things. --Barry
Dave Page wrote: > > > > So if there are 100 fields, should we ask the backend > > > the column name 100 times? > > > > No, the column name is there already. The column name isn't there. If a field has an alias name, the alias is there. regards, Hiroshi Inoue
Barry Lind <blind@xythos.com> writes: > Tom Lane wrote: >> See binary cursors ... > Generally that is not an option. It either requires users to code to > postgresql-specific sql syntax, or requires the driver to do it > magically for them. Fair enough. I don't see anything much wrong with a GUC option that says "send SELECT output in binary format". This is not really a protocol issue since the ASCII and BINARY choices both exist at the protocol level --- there is nothing in the protocol saying binary data can only be returned by FETCH and not by SELECT. The main problem with it in present releases is that binary data is architecture-dependent and so encouraging its general use seems like a really bad idea. But if we manage to get send/receive conversion routines in there, most of that issue would go away. > The describe request is generally only > done once even though you may do multiple fetches (unlike today's protocol, > which includes the describe information on every fetch, even if you are > fetching one row at a time). I'm less than excited about changing that, because it breaks clients that don't want to remember past RowDescriptions (libpq being the front-line victim), and it guarantees loss-of-synchronization failures anytime the client misassociates rowdescription with query. In exchange for that, we get what exactly? Fetching one row at a time is *guaranteed* to be inefficient. The correct response if that bothers you is to fetch multiple rows at a time, not to make a less robust protocol. regards, tom lane
Tom Lane wrote: > > "Dave Page" <dpage@vale-housing.co.uk> writes: > > It's rumoured that Hiroshi Inoue once said: > >> Does looking up by the catalog keys take no cost? > > > Obviously there is cost, but doing a lookup only on demand has got to be > > cheaper in the long run than including the entire column definition in the > > message whether it's wanted or not? > > More to the point, the cost is paid by applications that want the > functionality, and not by those that don't. > > It'd probably be reasonable for client libraries to maintain a cache > of column info, so that they only have to query the backend about a > particular column ID once per connection. Is that the kind of thing the server can easily force on the clients? Hmm, as for PREPAREd statements, it seems much better to implement functions which return field info for the statement than to rely on such a protocol-level change. regards, Hiroshi Inoue http://www.geocities.jp/inocchichichi/psqlodbc/
Hiroshi Inoue <Inoue@tpf.co.jp> writes: > Hmm as for PREPAREd statements, it seems much better to > implement functions which returns fields info for the > statement than relying on such a protocol level change. Well, we're changing the protocol anyway for other purposes, so the extra burden of a change here doesn't seem large. I think it's more useful to worry about what functionality we want than to worry about whether it's changed... regards, tom lane
Tom Lane wrote: > Barry Lind <blind@xythos.com> writes: > >>Tom Lane wrote: >> >>>See binary cursors ... > > >>Generally that is not an option. It either requires users to code to >>postgresql specific sql syntax, or requires the driver to do it >>magically for them. > > > Fair enough. I don't see anything much wrong with a GUC option that > says "send SELECT output in binary format". This is not really a > protocol issue since the ASCII and BINARY choices both exist at the > protocol level --- there is nothing in the protocol saying binary data > can only be returned by FETCH and not by SELECT. The main problem with > it in present releases is that binary data is architecture-dependent and > so encouraging its general use seems like a really bad idea. But if we > manage to get send/receive conversion routines in there, most of that > issue would go away. > That would be great. > >>The describe request is generally only >>done once even though you may do multiple fetchs (unlike todays protocol >>which includes the describe information on every fetch, even if you are >>fetching one row at a time). > > > I'm less than excited about changing that, because it breaks clients > that don't want to remember past RowDescriptions (libpq being the > front-line victim), and it guarantees loss-of-synchronization failures > anytime the client misassociates rowdescription with query. In exchange > for that, we get what exactly? Fetching one row at a time is > *guaranteed* to be inefficient. The correct response if that bothers > you is to fetch multiple rows at a time, not to make a less robust > protocol. I don't feel strongly either way on this one, but IIRC the SQL standard for cursors only specifies fetching one record at a time (at least that is how MSSQL and DB2 implement it). Thus portable code is likely to only fetch one record at a time. The current row description isn't too big, but with the changes being suggested it might become so. thanks, --Barry
Marc G. Fournier wrote: > On Tue, 11 Mar 2003, Bruce Momjian wrote: > > > Six months would be June 1 beta, so maybe that is still a good target. > > We released v7.3 just before Dec 1st, so six months is May 1st, not June > 1st ... Six months is June 1 --- December (1), January-May (5) == 6. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
--On Thursday, March 13, 2003 21:32:36 -0500 Bruce Momjian <pgman@candle.pha.pa.us> wrote: > Marc G. Fournier wrote: >> On Tue, 11 Mar 2003, Bruce Momjian wrote: >> >> > Six months would be June 1 beta, so maybe that is still a good target. >> >> We released v7.3 just before Dec 1st, so six months is May 1st, not June >> 1st ... > > Six months is June 1 --- December (1), January-May (5) == 6. > PostgreSQL agrees:

ler=# select date('2002-12-01') +
ler-# '6 months'::interval;
      ?column?
---------------------
 2003-06-01 00:00:00
(1 row)

ler=#

-- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 972-414-9812 E-Mail: ler@lerctr.org US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749
It's rumoured that Hiroshi Inoue once said: > > > > So if there are 100 fields, should we ask the backend > > > the column name 100 times? > > > > No, the column name is there already. > > The column name isn't there. If a field has an alias name, the alias is > there. Ahh yes, you're right of course. The docs are a little sketchy on that detail (http://www.postgresql.org/docs/view.php?version=7.3&idoc=1&file=protocol-message-formats.html). Regards, Dave
Tom Lane wrote: > Barry Lind <blind@xythos.com> writes: >>The describe request is generally only >>done once even though you may do multiple fetches (unlike today's protocol, >>which includes the describe information on every fetch, even if you are >>fetching one row at a time). > > > I'm less than excited about changing that, because it breaks clients > that don't want to remember past RowDescriptions (libpq being the > front-line victim), and it guarantees loss-of-synchronization failures > anytime the client misassociates rowdescription with query. In exchange > for that, we get what exactly? Fetching one row at a time is > *guaranteed* to be inefficient. The correct response if that bothers > you is to fetch multiple rows at a time, not to make a less robust > protocol. I don't think that protocol support for cursors should change the behavior of executing all seven stages by default. A "FETCH ..." command would get processed like any other (e.g. "SELECT ...") and metadata is sent back, too (which corresponds to the describe stage IIRC). New programs have the option to use the backwards-compatible high-level access via PQexec(c,"FETCH FROM X"), which does all seven steps at once, or use the new low-level way, e.g. PQexec_new(c,"SELECT ...", query_parameter_descriptor, what_to_do (*), lines_to_return_without_cursor_overhead), which should return at most the specified lines and (if needed) a cursor descriptor (most likely an int) for subsequent PQfetch and PQclose calls. I really like the idea of PGresult as an argument (cursor descriptor) for PQfetch (instead of an int) because it may even copy the metadata to the new PGresult, or perhaps replace the values in the original PGresult (if we decide to go this way). [proposed signature: PGresult *PQfetch(PGresult *result_of_the_select, how_many_lines, perhaps_even_offset/position)] Additionally, there should be a PQclose and perhaps a PQprocess(PGresult *, things_to_do (*)) if we want to be able to separate every step. If you know you are never interested in metadata, you can omit the describe flag altogether. [null indication and type specification is of course always needed to access the actual data] Christof *) open, parse, describe, bind, execute, fetch, close PS: If we decide to omit the lines_to_return_without_cursor_overhead optimization, the new architecture would still be a big win for *DBC. This optimization cannot get a GUC variable instead of a protocol parameter since this would break clients: should they specify fetch+close to enable it? If yes, there's no easy way to implement the old behavior (all seven stages, no limit on returned lines). If no, the client cannot specify to omit the fetch without changing it (limit 0). PPS: Query parameter passing is another topic, but I tend to propose a PGresult variant for specifying them (of course each with its type).
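Pulling the proposed entry points above together in one place (these are the proposals from that message, not existing libpq functions; the parameter lists are guesses at what the signatures might look like):

    #include <libpq-fe.h>

    /* Run only the requested stages (open/parse/describe/bind/execute/
     * fetch/close flags) and return at most initial_rows rows plus a
     * cursor descriptor for the rest. */
    PGresult *PQexec_new(PGconn *conn, const char *query,
                         const PGresult *params, /* parameters as a PGresult variant */
                         int stages_to_run,
                         int initial_rows);

    /* Fetch more rows through the descriptor returned by PQexec_new. */
    PGresult *PQfetch(PGresult *result_of_the_select,
                      int how_many_lines,
                      int offset);

    /* Run further stages on an existing descriptor, then release it. */
    PGresult *PQprocess(PGresult *result_of_the_select, int things_to_do);
    void PQclose(PGresult *result_of_the_select);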
Christof Petig wrote: > If you know you are never interested in metadata, you can omit the > describe flag altogether. [null indication and type specification is of > course always needed to access the actual data] More exactly, they are sent separately: null indication is per row ('D'/'B') and type specification is per query ('T'). If the client does not ask for metadata one might omit attrelid, attnum (*) and field name in the 'T' packet. One might argue whether this small win (per query times columns) justifies implementing the feature. But then we'd need a method to query them later on (otherwise *DBC could never omit them at first). Christof *) they are not there, yet ;-)
> > for that, we get what exactly? Fetching one row at a time is > > *guaranteed* to be inefficient. The correct response if that bothers > > you is to fetch multiple rows at a time, not to make a less robust > > protocol. > I don't feel strongly either way on this one, but IIRC the SQL standard > for cursors only specifies fetching one record at a time (at least that > is how MSSQL and DB2 implement it). Thus portable code is likely to > only fetch one record at a time. The current row description isn't too > big, but with the changes being suggested it might become so. Yes, I think it should be the responsibility of the higher-level interfaces (ODBC, esql/c, ...) to fetch rows in optimal packets (e.g. 4-16k at a time). This is what other DBs' interfaces do for you. The first fetch gets a packet from the db, subsequent fetches are answered from the client-side cache until it is empty, then the next fetch gets the next packet from the db ... The trick would be to ask for n bytes of row data (maybe a range), which might be 1 - n rows (or maybe even only part of one row if it is very wide?). I think this is one of the things the new protocol should be able to cope with. Andreas
So, just to throw out a wild idea: if you're talking about making large changes to the on-the-wire protocol, have you considered using an existing database protocol? This would avoid having to reinvent the wheel every time postgres implements a new feature like prepared queries, bind arrays, or metadata information. There is a free implementation of the TDS (Tabular Data Stream) protocol used by Sybase and MSSQL. I don't know how much of it would be interesting for postgres and how much is Sybase/MSSQL-specific. It would be pretty neat if postgres could use precisely the same on-the-wire protocol as other major databases, just requiring a separate high-level driver to interpret the semantic meaning of the data. At the very least, it sounds interesting to compare and contrast, to understand what advantages and disadvantages the features of TDS have, before postgres possibly misses good ideas or makes the same mistakes. -- greg
Greg Stark <gsstark@mit.edu> writes: > So, just to throw out a wild idea: If you're talking about making large > changes to the on-the-wire protocol. Have you considered using an existing > database protocol? Yeah, I have. Didn't look promising --- there's no percentage unless we're 100% compatible, which creates a lot of problems (eg, can't ship type OIDs to frontend anymore). What I actually looked into was RDA, but I doubt that TDS would be any closer to our needs... regards, tom lane
A long time ago, in a galaxy far, far away, gsstark@mit.edu (Greg Stark) wrote: > So, just to throw out a wild idea: if you're talking about making large > changes to the on-the-wire protocol, have you considered using an existing > database protocol? This would avoid having to reinvent the wheel every time > postgres implements a new feature like prepared queries, bind arrays, or > metadata information. > > There is a free implementation of the TDS (Tabular Data Stream) protocol used > by Sybase and MSSQL. I don't know how much of it would be interesting for > postgres and how much is Sybase/MSSQL-specific. > > It would be pretty neat if postgres could use precisely the same on-the-wire > protocol as other major databases, just requiring a separate high-level driver > to interpret the semantic meaning of the data. > > At the very least, it sounds interesting to compare and contrast, to > understand what advantages and disadvantages the features of TDS have, before > postgres possibly misses good ideas or makes the same mistakes. Let me take the liberty of pointing people to documentation for the TDS protocol: <http://www.freetds.org/tds.html> I agree that there are far worse ideas, when opening up the DB protocol, than to look at some existing protocols, and TDS would seem a reasonable one. It doesn't look overly efficient, but it's not overtly gratuitously inefficient... -- (reverse (concatenate 'string "moc.enworbbc@" "enworbbc")) http://www.ntlug.org/~cbbrowne/linuxxian.html DOS: n., A small annoying boot virus that causes random spontaneous system crashes, usually just before saving a massive project. Easily cured by Unix. See also MS-DOS, IBM-DOS, DR-DOS. -- from David Vicker's .plan
Tom Lane writes: > So are you voting against adding any attribute-ID info to > RowDescription? While I'm not that thrilled with it myself, it seems > relatively harmless as long as we can keep the overhead down. I'm > okay with attrelid/attnum, but would gripe about including more than that. At the beginning of this thread you raised a number of points where the identity of the column of origin is not well-defined. I haven't seen an answer to that. Whether the identity of the column is provided through numbers or, as was originally requested, through names seems to be irrelevant for those questions. Now assume someone wanted to define a method to identify the column of origin for a limited set of query types. Would you align the definition with what the current planner and executor structures can easily give you or would you use a more "mathematical" definition? And assuming it's the latter, do you feel confident that that definition will not constrain development of the planner and executor structures in the future? -- Peter Eisentraut peter_e@gmx.net
Peter Eisentraut <peter_e@gmx.net> writes: > At the beginning of this thread you raised a number of points where the > identity of the column of origin is not well-defined. I haven't seen an > answer to that. Yes, Dave did answer --- basically, he's happy with not providing any column identity data in the cases where it's not obvious what the answer should be. And in particular he doesn't want the mechanism to drill down into view definitions (which is less than obviously right to me, but if that's what he wants it sure eliminates a lot of definitional issues). Given that agreement I don't have a problem with providing the functionality. It will take a few more lines in the parser, but not an unreasonable amount I think. > Would you align the definition > with what the current planner and executor structures can easily give you > or would you use a more "mathematical" definition? And assuming it's the > latter, do you feel confident that that definition will not constrain > development of the planner and executor structures in the future? I'm not too concerned about it given the before-view-expansion proviso. Once the rewriter and planner go into action, the contents of the query tree do start to look rather implementation-dependent, but what the parser does is pretty well constrained by the SQL spec. regards, tom lane
Justin Clift wrote: [ ... ] > The problem Dave is suggesting this as a first attempt at a solution for > is that with ODBC, a frontend (i.e. OpenOffice) asks the ODBC driver > which columns are NULLable, etc. And the ODBC driver is getting the > info wrong, then passing back the incorrect info. And that's probably why OpenOffice doesn't allow you to update a view, even if there are rules allowing this from psql ... This, in my book, is an EPITA for end users ... > So, when a person goes to insert a row into a table with a > SERIAL/SEQUENCE based column, OpenOffice has been told the column isn't > NULLable and forces the user to enter a value. Voila, it doesn't work > with sequences. :( > > It's likely possible to add to the ODBC driver some way of getting the > info right, but Dave is also looking for a way of making this easier > into the future for similar problems. i.e. Let the database explicitly > have info about what each column can do. I'd second that, even if it takes time ... Emmanuel Charpentier
Tom Lane wrote on Fri, 14.03.2003 at 19:15: > Greg Stark <gsstark@mit.edu> writes: > > So, just to throw out a wild idea: if you're talking about making large > > changes to the on-the-wire protocol, have you considered using an existing > > database protocol? > > Yeah, I have. Didn't look promising --- there's no percentage unless > we're 100% compatible, which creates a lot of problems (eg, can't ship > type OIDs to frontend anymore). Surely there is a way to ship type info, even for UDTs (user defined types), as nowadays most big databases support those. > What I actually looked into was RDA, but I doubt that TDS would be any > closer to our needs... I remember someone started cleaning up IO in order to move it into a separate module, with the aim of making multiple implementations (RDA, TDS, XML, native JDBC wire protocol if it ever becomes a reality, etc.) possible. While not exactly pertinent to the new wire protocol, this effort, if completed, would make it much easier to have backwards compatibility on the wire level. PS. Another feature I'd like is individually turning off warnings and notices. ------------ Hannu
Tom Lane writes: > * Backend's ReadyForQuery message (Z message) should carry an indication > of current transaction status (idle/in transaction/in aborted transaction) > so that frontend need not guess at state. Perhaps also indicate > autocommit status. If we do this, could we get rid of the messy autocommit GUC option and handle autocommit in the client? Before sending a command, the client could check the transaction status and automatically choose to start a new transaction. That way we could get rid of the current mess that every client needs to send a SET autocommit command before it can safely do anything. -- Peter Eisentraut peter_e@gmx.net
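A sketch of what the client side could then do; PQtransactionStatus() and PQTRANS_IDLE are hypothetical names here, standing for whatever call ends up exposing the proposed status field of the Z message:

    #include <libpq-fe.h>

    static void exec_autocommit_off(PGconn *conn, const char *query)
    {
        /* If the last ReadyForQuery reported "idle", quietly open a
         * transaction first; the user commits or rolls back explicitly. */
        if (PQtransactionStatus(conn) == PQTRANS_IDLE)
            PQclear(PQexec(conn, "BEGIN"));

        PQclear(PQexec(conn, query));
    }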
Tom Lane writes: > Yes, Dave did answer --- basically, he's happy with not providing any > column identity data in the cases where it's not obvious what the answer > should be. And in particular he doesn't want the mechanism to drill > down into view definitions (which is less than obviously right to me, > but if that's what he wants it sure eliminates a lot of definitional > issues). I don't get it. Say I execute SELECT a, b, c FROM foo;. In order to update that query, the application needs to create some update statement, say UPDATE foo SET a = entered_value;. So the application already knows that "foo" is the table and "a" is the column. So if the application wants to know about details on the column "a", it can execute SELECT whatever FROM pg_attribute, pg_class WHERE relname = 'foo' AND attname = 'a'; With this proposed change, it can replace that with SELECT whatever FROM pg_attribute, pg_class WHERE oid = X AND attnum = Y; With the difference that the first version always works while the second version only sometimes works, and when that "sometimes" applies is determined by the rule that it should be "obvious". That doesn't seem right to me. -- Peter Eisentraut peter_e@gmx.net
Peter Eisentraut wrote: > Tom Lane writes: > > > * Backend's ReadyForQuery message (Z message) should carry an indication > > of current transaction status (idle/in transaction/in aborted transaction) > > so that frontend need not guess at state. Perhaps also indicate > > autocommit status. > > If we do this, could we get rid of the messy autocommit GUC option and > handle autocommit in the client? Before sending a command, the client > could check the transaction status and automatically choose to start a new > transaction. That way we could get rid of the current mess that every > client needs to send a SET autocommit command before it can safely do > anything. What if folks want all their connections to have autocommit off? It seems best handled in the server. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Peter Eisentraut <peter_e@gmx.net> writes: > Tom Lane writes: >> * Backend's ReadyForQuery message (Z message) should carry an indication >> of current transaction status (idle/in transaction/in aborted transaction) >> so that frontend need not guess at state. Perhaps also indicate >> autocommit status. > If we do this, could we get rid of the messy autocommit GUC option and > handle autocommit in the client? Hmm ... that's a thought ... not very backwards-compatible with 7.3, but I think I like it better than continuing on with the GUC option. As you say, that path is looking messier all the time. Comments anyone? regards, tom lane
On Mon, 2003-03-17 at 20:48, Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > If we do this, could we get rid of the messy autocommit GUC option and > > handle autocommit in the client? > > Hmm ... that's a thought ... not very backwards-compatible with 7.3, > but I think I like it better than continuing on with the GUC option. > As you say, that path is looking messier all the time. Yes, definitely -- replacing the 7.3 GUC option would be great, that's a pretty bad hack IMHO. Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
Peter Eisentraut <peter_e@gmx.net> writes: > ... So the application already knows > that "foo" is the table and "a" is the column. So if the application > wants to know about details on the column "a", it can execute > SELECT whatever FROM pg_attribute, pg_class WHERE relname = 'foo' AND attname = 'a'; > With this proposed change, it can replace that with > SELECT whatever FROM pg_attribute, pg_class WHERE oid = X AND attnum = Y; Dave will correct me if I'm wrong --- but I think the issue here is that the client-side library (think ODBC or JDBC) needs to gain this level of understanding of a query that is presented to it as an SQL-source string. So no, it doesn't already know that "foo" is the table and "a" is the column. To find that out, it has to duplicate a lot of backend code. regards, tom lane
Peter Eisentraut wrote: > I don't get it. Say I execute SELECT a, b, c FROM foo;. In order to > update that query, the application needs to create some update statement, > say UPDATE foo SET a = entered_value;. So the application already knows > that "foo" is the table and "a" is the column. So if the application > wants to know about details on the column "a", it can execute > > SELECT whatever FROM pg_attribute, pg_class WHERE relname = 'foo' AND attname = 'a'; Once the statement uses aliases for the result columns, this can go wrong (e.g. SELECT a AS b FROM foo defeats a lookup on attname = 'b'). And determining the correct table is difficult at best. > With this proposed change, it can replace that with > > SELECT whatever FROM pg_attribute, pg_class WHERE oid = X AND attnum = Y; > > With the difference that the first version always works and the second > version sometimes works, and when that sometimes is is determined by the > rule that it should be "obvious". That doesn't seem right to me. I have the impression that the first solution sometimes works (and other times gets it really wrong; you can't tell without parsing the query) and the second solution works when possible (and gives no answer elsewhere). The only cases where I expect the second one to fail are: natural joins (where the table is ambiguous), expressions, and function calls. Of course you can't expect a sane result in these cases. On second thought, I do not want the column name (or alias) to get discarded (because otherwise you can't get the alias), but *DBC desperately needs attrelid and attnum. Christof
> -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: 18 March 2003 06:26 > To: Peter Eisentraut > Cc: Dave Page; PostgreSQL Development > Subject: Re: [HACKERS] Roadmap for FE/BE protocol redesign > > > Peter Eisentraut <peter_e@gmx.net> writes: > > ... So the application already knows > > that "foo" is the table and "a" is the column. So if the > application > > wants to know about details on the column "a", it can > execute SELECT > > whatever FROM pg_attribute, pg_class WHERE relname = 'foo' > AND attname > > = 'a'; With this proposed change, it can replace that with SELECT > > whatever FROM pg_attribute, pg_class WHERE oid = X AND attnum = Y; > > Dave will correct me if I'm wrong --- but I think the issue > here is that the client-side library (think ODBC or JDBC) > needs to gain this level of understanding of a query that is > presented to it as an SQL-source string. So no, it doesn't > already know that "foo" is the table and "a" is the column. > To find that out, it has to duplicate a lot of backend code. That's basically it. pgAdmin jumps through hoops trying to parse the SQL the user enters to figure out tables and columns etc., and currently the ODBC driver doesn't even bother in many cases (it does have a 'Parse Statements' option but that rarely helps) and ends up returning incorrect info. I don't know much about the JDBC driver, but Barry did say this fix would save him from exactly the same problems. Regards, Dave.
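To make the driver-side payoff concrete: with per-column origin data in the result, the lookup collapses to one catalog probe with no SQL parsing at all. A minimal sketch, assuming libpq grows origin accessors (PQftable and PQftablecol are assumed names here, mirroring the existing PQfname/PQftype):

    #include <stdio.h>
    #include <libpq-fe.h>

    /* Sketch: find out, for column 0 of a result, which real table and
     * column it came from, then ask the catalogs about NULLability. */
    void
    describe_column0(PGconn *conn, const PGresult *res)
    {
        Oid  reloid = PQftable(res, 0);     /* originating table's OID     */
        int  attnum = PQftablecol(res, 0);  /* column number in that table */
        char query[200];

        snprintf(query, sizeof(query),
                 "SELECT attname, attnotnull FROM pg_attribute "
                 "WHERE attrelid = %u AND attnum = %d",
                 (unsigned int) reloid, attnum);
        PQclear(PQexec(conn, query));   /* inspect the result in real code */
    }

This is exactly the query Peter showed, but the driver never has to guess "foo" and "a" from the query string.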
Bruce Momjian writes: > What if folks want all their connections autocommit off. For interactive use, people can record their preferred setting in ~/.psqlrc or something like that. But any non-interactive program is written with a specific autocommit setting in mind. Either it assumes that single statements are automatically committed or it issues explicit COMMIT commands. That means, the choice of autocommit mode is always ultimately determined by the client, never by the server or its administrator. For that very reason most standard interfaces define in their specification whether applications must assume autocommit behavior or must not do so. -- Peter Eisentraut peter_e@gmx.net
Peter Eisentraut wrote: > Bruce Momjian writes: > > > What if folks want all their connections autocommit off. > > For interactive use, people can record their preferred setting in > ~/.psqlrc or something like that. But that only works for psql, right? How would this be done at the libpq level? Environment variables? GUC seems a whole lot cleaner. > But any non-interactive program is written with a specific autocommit > setting in mind. Either it assumes that single statements are > automatically committed or it issues explicit COMMIT commands. That > means, the choice of autocommit mode is always ultimately determined by > the client, never by the server or its administrator. > > For that very reason most standard interfaces define in their > specification whether applications must assume autocommit behavior or must > not do so. I understand. I just don't see any value in pushing that logic into each client when we can do it centrally in the server. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On Tue, Mar 18, 2003 at 05:18:02PM -0500, Bruce Momjian wrote: > Peter Eisentraut wrote: > > Bruce Momjian writes: > > > > > What if folks want all their connections autocommit off. > > > > For interactive use, people can record their preferred setting in > > ~/.psqlrc or something like that. > > But that only works for psql, right? How would this be done at the > libpq level? Environment variables? GUC seems a whole lot cleaner. I think an environment variable would be right. The current method is not clean in the sense that a client cannot decide what she wants; she just accepts the decision from the DBA. Thus, an application can't be written with a certain value in mind, because the DBA can change the setting at any time. Client-side decision is the wiser proposal, I think. > I understand. I just don't see any value in pushing that logic into > each client when we can do it centrally in the server. The server doesn't know beforehand what the client wants. -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "No renuncies a nada. No te aferres a nada"
Alvaro Herrera wrote: > On Tue, Mar 18, 2003 at 05:18:02PM -0500, Bruce Momjian wrote: > > Peter Eisentraut wrote: > > > Bruce Momjian writes: > > > > > > > What if folks want all their connections autocommit off. > > > > > > For interactive use, people can record their preferred setting in > > > ~/.psqlrc or something like that. > > > > But that only works for psql, right? How would this be done at the > > libpq level? Environment variables? GUC seems a whole lot cleaner. > > I think an environment variable would be right. The current method is > not clean in the sense that a client cannot decide what she wants; she > just accepts the decision from the DBA. Thus, an application can't be > written with a certain value in mind, because the DBA can change the > setting at any time. The client can say "SET autocommit to off" or on. It can use SHOW to see the setting. Environment variables are used mostly for connecting, and once connected, we use GUC. In fact, an environment variable seems wrong because it isn't integrated into the client the way a SET command is. > Client-side decision is the wiser proposal, I think. > > > I understand. I just don't see any value in pushing that logic into > > each client when we can do it centrally in the server. > > The server doesn't know beforehand what the client wants. True, but GUC seems like the way to go, and we have per-user/db settings for GUC. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On 16 Mar 2003, Hannu Krosing wrote: > Tom Lane wrote on Fri, 2003-03-14 at 19:15: > > Greg Stark <gsstark@mit.edu> writes: > > > So, just to throw out a wild idea: If you're talking about making large > > > changes to the on-the-wire protocol. Have you considered using an existing > > > database protocol? > > > > What I actually looked into was RDA, but I doubt that TDS would be any > > closer to our needs... > > I remember someone started cleaning up IO in order to move it into a > separate module with the aim of making multiple implementations (RDA, > TDS, XML, native JDBC wire protocol if it ever becomes a reality, etc.) > possible. That was me; I did an initial cut of separating the FE/BE code from the rest, but ran short on time. Hoping to get back to it one of these days. My primary interest was in getting the DRDA protocol supported in a clean fashion. For those mentioning RDA, I believe that standard is pushing up the daisies. DRDA is about the only standards game in town at this point; it has client-side support from just about every vendor (IBM obviously, Oracle, Sybase, MS) and server-side support of some sort from DB2 and a couple of others (the MS SNA gateway, for example, is/has a DRDA server). Mostly through gateways and add-on products, but it's a far cry better than any other effort I'm aware of. > While not exactly pertinent to the new wire protocol, this effort, if > completed, would make it much easier to have backwards compatibility on > the wire level. I think this would be a good idea all around, and would make future changes/replacements to the FE/BE protocol a lot cleaner. > ------------ > Hannu Brian
Bruce Momjian writes: > True, but GUC seems like the way to go, and we have per-user/db settings > for GUC. But the required autocommit setting depends neither on the user nor the database, it depends on the identity of the client application. That type of granularity is not offered by GUC. -- Peter Eisentraut peter_e@gmx.net
Peter Eisentraut wrote: > Bruce Momjian writes: > > > True, but GUC seems like the way to go, and we have per-user/db settings > > for GUC. > > But the required autocommit setting depends neither on the user nor the > database, it depends on the identity of the client application. That type > of granularity is not offered by GUC. True, but the standard also requires autocommit off, so I can imagine folks wanting it off by default for various users/databases. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Monday 10 March 2003 10:51 am, Tom Lane wrote: > > * XML support? If we do anything, I'd want some extensible solution to > allowing multiple query-result output formats from the backend, not an > XML-specific hack. For one thing, that would allow the actual appearance > of any XML support to happen later. It seems this would also be a good solution to a previous discussion about boolean representation. The postgres output of t/f is perfectly reasonable, but can be somewhat confusing to someone who relies on PHP's typecasting. In the discussion, someone mentioned that if you take in a variable directly from the database and cast it to boolean, both 't' and 'f' will cast to true. It turned out to be even more confusing because MySQL uses 0/1, which casts properly. If I remember correctly, there was even talk of adding a run-time parameter similar to the datestyle. If it were all handled in the query-result output formatting functions like you suggest, that would seem like a much cleaner solution. Regards, Jeff Davis
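The pitfall Jeff describes is easy to reproduce in any client that treats the text value as truthy. A minimal illustration in C (nothing here is new; boolout already emits "t"/"f"):

    #include <string.h>
    #include <stdbool.h>
    #include <libpq-fe.h>

    /* Wrong: any non-empty string "looks true", so 'f' is true too --
     * the same trap PHP's typecasting falls into. */
    static bool
    naive_bool(const PGresult *res, int row, int col)
    {
        return PQgetvalue(res, row, col)[0] != '\0';
    }

    /* Right for the current text format: compare explicitly. */
    static bool
    text_bool(const PGresult *res, int row, int col)
    {
        return strcmp(PQgetvalue(res, row, col), "t") == 0;
    }

A pluggable output format (or a binary result format with real type information) would let the client library do this once, correctly, instead of every application re-learning it.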
I am wondering if it would be possible to return the value of the first serial field in the row that was incremented on an insert? -- Dave Cramer <dave@fastcrypt.com> Cramer Consulting
Dave Cramer <dave@fastcrypt.com> writes: > I am wondering if it would be possible to return the value of the first > serial field in the row that was incremented on an insert ? That's a language issue, not a protocol issue, and no I'm not taking it on for 7.4. regards, tom lane
Tom Lane wrote: > Dave Cramer <dave@fastcrypt.com> writes: > > I am wondering if it would be possible to return the value of the first > > serial field in the row that was incremented on an insert ? > > That's a language issue, not a protocol issue, and no I'm not taking it > on for 7.4. What I was wondering is if we could create a currval() call that takes no arguments, and returns the most recent sequence id assigned. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
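For reference, the two-step pattern Bruce's zero-argument currval() would shorten looks like this today (the table and sequence names are made up for illustration):

    #include <stdio.h>
    #include <libpq-fe.h>

    /* Status quo: the client must hardwire the sequence name and pay
     * for a second round trip to learn the id it just created. */
    void
    insert_and_fetch_id(PGconn *conn)
    {
        PGresult *res;

        PQclear(PQexec(conn, "INSERT INTO files (filename) VALUES ('x')"));
        res = PQexec(conn, "SELECT currval('files_id_seq')");
        printf("new id = %s\n", PQgetvalue(res, 0, 0));
        PQclear(res);
    }

The zero-argument variant would only remove the hardwired name, not the extra query.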
Tom, Forgive me for being dense, but how do you characterize this as a "language issue"? Dave On Fri, 2003-03-21 at 10:54, Bruce Momjian wrote: > Tom Lane wrote: > > Dave Cramer <dave@fastcrypt.com> writes: > > > I am wondering if it would be possible to return the value of the first > > > serial field in the row that was incremented on an insert ? > > > > That's a language issue, not a protocol issue, and no I'm not taking it > > on for 7.4. > > What I was wondering is if we could create a currval() call that takes > no arguments, and returns the most recent sequence id assigned. -- Dave Cramer <dave@fastcrypt.com> Cramer Consulting
Bruce Momjian <pgman@candle.pha.pa.us> writes: > What I was wondering is if we could create a currval() call that takes > no arguments, and returns the most recent sequence id assigned. Why? That's still an extra query that the client has to issue, and currval in that form would be an amazingly fragile programming tool. (What if some trigger causes an autoincrement on some other sequence than the one you are thinking about?) I liked the INSERT ... RETURNING and UPDATE ... RETURNING syntax extensions that Philip Warner (IIRC) proposed awhile back. Those would get the job done much more flexibly than anything else that's been suggested. That's why I think it's a language problem and not a protocol problem. regards, tom lane
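Under that proposal the exchange collapses to one round trip. A sketch (the RETURNING syntax is, at this point, only a proposal, and the table/column names are made up):

    #include <stdio.h>
    #include <libpq-fe.h>

    /* Proposed shape: the INSERT itself hands back the assigned value,
     * so there is no currval() guesswork and no trigger race. */
    void
    insert_returning_id(PGconn *conn)
    {
        PGresult *res = PQexec(conn,
            "INSERT INTO files (filename) VALUES ('x') RETURNING id");

        if (PQresultStatus(res) == PGRES_TUPLES_OK)
            printf("new id = %s\n", PQgetvalue(res, 0, 0));
        PQclear(res);
    }

This also explains Tom's classification: nothing in the wire protocol changes; the statement simply produces an ordinary result set.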
Sounds good. --------------------------------------------------------------------------- Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > What I was wondering is if we could create a currval() call that takes > > no arguments, and returns the most recent sequence id assigned. > > Why? That's still an extra query that the client has to issue, and > currval in that form would be an amazingly fragile programming tool. > (What if some trigger causes an autoincrement on some other sequence > than the one you are thinking about?) > > I liked the INSERT ... RETURNING and UPDATE ... RETURNING syntax > extensions that Philip Warner (IIRC) proposed awhile back. Those would > get the job done much more flexibly than anything else that's been > suggested. That's why I think it's a language problem and not a > protocol problem. > > regards, tom lane -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian writes: > True, but the standard also requires autocommit off, so I can imagine > folks wanting it off by default for various users/database. The standard only covers embedded C programs. Other interfaces that are covered by other standards require other settings. -- Peter Eisentraut peter_e@gmx.net
Peter Eisentraut wrote: > Bruce Momjian writes: > > > True, but the standard also requires autocommit off, so I can imagine > > folks wanting it off by default for various users/database. > > The standard only covers embedded C programs. Other interfaces that are > covered by other standards require other settings. My point was that _the_ standard requires autocommit off, not that everyone is using autocommit off. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Peter Eisentraut wrote: >> The standard only covers embedded C programs. Other interfaces that are >> covered by other standards require other settings. > My point was that _the_ standard requires autocommit off, not that > everyone is using autocommit off. Peter's point is good: the only interface that is actually subject to the part of the spec you are quoting is ecpg. It could reasonably be argued that libpq should only support autocommit-on, because that's the historical behavior that programs written atop libpq expect. The other client libraries have their own specs to adhere to. regards, tom lane
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Peter Eisentraut wrote: > >> The standard only covers embedded C programs. Other interfaces that are > >> covered by other standards require other settings. > > > My point was that _the_ standard requires autocommit off, not that > > everyone is using autocommit off. > > Peter's point is good: the only interface that is actually subject to the > part of the spec you are quoting is ecpg. It could reasonably be argued > that libpq should only support autocommit-on, because that's the > historical behavior that programs written atop libpq expect. The other > client libraries have their own specs to adhere to. But isn't that like saying that the spec doesn't apply to libpq at all? Why would the autocommit part of the specification not apply when other parts do? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > But isn't that like saying that the spec doesn't apply to libpq at all. > Why would autocommit not apply but other queries specification apply? No, you're missing the point. Essentially all the client libraries offer their own autocommit behavior on top of what the spec says. libpq is alone in not having any such client-side logic. So it's not a foregone conclusion that we should implement autocommit on/off logic on the server side as a substitute for adding it to libpq. We have now tried doing it on the server side, and we are finding that we don't like the side-effects; so it's time to revisit that decision. regards, tom lane
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > But isn't that like saying that the spec doesn't apply to libpq at all. > > Why would autocommit not apply but other queries specification apply? > > No, you're missing the point. Essentially all the client libraries > offer their own autocommit behavior on top of what the spec says. Well, which ones? jdbc (which prefers the server autocommit), ecpg, odbc (?), and Perl. Anyone else? > libpq is alone in not having any such client-side logic. So it's not > a foregone conclusion that we should implement autocommit on/off logic > on the server side as a substitute for adding it to libpq. We have > now tried doing it on the server side, and we are finding that we don't > like the side-effects; so it's time to revisit that decision. My concern is bloating the API for all languages based on libpq, and psql and stuff like that. Heck, even pgadmin would have to have a button for it because a SET couldn't control it. I know the server side isn't optimal, but is the alternative worse? Also, if Peter wants clients to be able to just issue a query and know it commits, should we just disable setting autocommit in postgresql.conf and per-user/db, and force applications to turn it on explicitly? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > My concern is bloating the API for all languages based on libpq, and > psql and stuff like that. Heck, even pgadmin would have to have a > button for it because a SET couldn't control it. Peter's point, AIUI, is that that is a good thing. The problem with SET for autocommit is that every client program has to *know*, without question, which setting it is using. Autocommit is just about as dangerous as a GUC variable that, say, silently swaps the meanings of INSERT and DELETE --- if you don't know which setting you are using then you will get the wrong results. Thus it is not "convenient" to allow autocommit to be set in all the weird and wonderful ways that we allow GUC variables to be set; it's more like handing a razor blade to a baby. We've already found out that all client-side apps have to explicitly force autocommit to the setting they want, or they break. How is that a good thing? I think we *should* go back to the assumption that libpq-based programs see autocommit-on behavior unless they take specific action to prevent it. And that means that the client program has to take action to select autocommit off, not that the admin can flick a switch remotely that will affect clients. The real point is that both the client application and the client library need to know what the autocommit behavior is. This is why adding autocommit to the library APIs is the right thing, not the wrong thing. When there are ways other than a library API call to set the autocommit behavior, then one party or the other is out of the loop about what the current behavior is, and that gets us right back into the same mess. Basically I think that Peter is arguing that autocommit as a GUC variable is a wrong and dangerous idea. And I'm forced to agree. I was wrong to put it in, and I'm now willing to take it out again. At the time it seemed like a reasonable shortcut around changing the protocol --- but now that we're changing the protocol anyway, it's better to get rid of the shortcut. regards, tom lane
In general I agree with Tom. GUC is the wrong mechanism for autocommit. The reason being that it isn't a system administrator's decision what value autocommit should have. It is generally dictated by the client protocol or application. Now that being said, it is nice for the client to be able to simply tell the server "you are in autocommit mode until I tell you otherwise", instead of having to worry about trapping each commit and rollback request to make sure you insert the proper BEGIN. The current GUC autocommit is nice in that it makes it easier for the client to simply turn the state on or off. It is a problem because it is a GUC parameter and therefore can be changed by the admin (thus you don't know what your initial state is without asking the server) or the user (via SQL SET, thus you don't know that it has changed). As I said in my earlier mail note, from a jdbc perspective I can get it to work whichever way is decided (in fact the jdbc driver will probably need to support all of the ways, depending on whether it is talking to a 7.2, 7.3 or 7.4 backend). My preference (given that I am detecting a willingness to make more significant changes in this area than I was expecting) would be to drop the GUC autocommit parameter, replacing that functionality with the ability to set and discover the autocommit state via the FE/BE protocol. thanks, --Barry Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > >>My concern is bloating the API for all languages based on libpq, and >>psql and stuff like that. Heck, even pgadmin would have to have a >>button for it because a SET couldn't control it. > > > Peter's point, AIUI, is that that is a good thing. > > The problem with SET for autocommit is that every client program has to > *know*, without question, which setting it is using. Autocommit is just > about as dangerous as a GUC variable that, say, silently swaps the > meanings of INSERT and DELETE --- if you don't know which setting you > are using then you will get the wrong results. > > Thus it is not "convenient" to allow autocommit to be set in all the > weird and wonderful ways that we allow GUC variables to be set; it's > more like handing a razor blade to a baby. We've already found out that > all client-side apps have to explicitly force autocommit to the setting > they want, or they break. How is that a good thing? > > I think we *should* go back to the assumption that libpq-based programs > see autocommit-on behavior unless they take specific action to prevent > it. And that means that the client program has to take action to select > autocommit off, not that the admin can flick a switch remotely that will > affect clients. > > The real point is that both the client application and the client > library need to know what the autocommit behavior is. This is why > adding autocommit to the library APIs is the right thing, not the wrong > thing. When there are ways other than a library API call to set the > autocommit behavior, then one party or the other is out of the loop > about what the current behavior is, and that gets us right back into the > same mess. > > Basically I think that Peter is arguing that autocommit as a GUC > variable is a wrong and dangerous idea. And I'm forced to agree. > I was wrong to put it in, and I'm now willing to take it out again. > At the time it seemed like a reasonable shortcut around changing the > protocol --- but now that we're changing the protocol anyway, it's > better to get rid of the shortcut.
> > regards, tom lane
Man, I lost another vote! :-) --------------------------------------------------------------------------- Barry Lind wrote: > In general I agree with Tom. GUC is the wrong mechanism for autocommit. > The reason being that it isn't a system administrator's decision what > value autocommit should have. It is generally dictated by the client > protocol or application. > > Now that being said, it is nice for the client to be able to simply tell > the server "you are in autocommit mode until I tell you otherwise", > instead of having to worry about trapping each commit and rollback > request to make sure you insert the proper BEGIN. > > The current GUC autocommit is nice in that it makes it easier for the > client to simply turn the state on or off. It is a problem because it > is a GUC parameter and therefore can be changed by the admin (thus you > don't know what your initial state is without asking the server) or the > user (via SQL SET, thus you don't know that it has changed). > > As I said in my earlier mail note, from a jdbc perspective I can get it > to work whichever way is decided (in fact the jdbc driver will probably > need to support all of the ways, depending on whether it is talking to a > 7.2, 7.3 or 7.4 backend). > > My preference (given that I am detecting a willingness to make more > significant changes in this area than I was expecting) would be to drop > the GUC autocommit parameter, replacing that functionality with the > ability to set and discover the autocommit state via the FE/BE protocol. > > thanks, > --Barry > > Tom Lane wrote: > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > >>My concern is bloating the API for all languages based on libpq, and > >>psql and stuff like that. Heck, even pgadmin would have to have a > >>button for it because a SET couldn't control it. > > > > > > Peter's point, AIUI, is that that is a good thing. > > > > The problem with SET for autocommit is that every client program has to > > *know*, without question, which setting it is using. Autocommit is just > > about as dangerous as a GUC variable that, say, silently swaps the > > meanings of INSERT and DELETE --- if you don't know which setting you > > are using then you will get the wrong results. > > > > Thus it is not "convenient" to allow autocommit to be set in all the > > weird and wonderful ways that we allow GUC variables to be set; it's > > more like handing a razor blade to a baby. We've already found out that > > all client-side apps have to explicitly force autocommit to the setting > > they want, or they break. How is that a good thing? > > > > I think we *should* go back to the assumption that libpq-based programs > > see autocommit-on behavior unless they take specific action to prevent > > it. And that means that the client program has to take action to select > > autocommit off, not that the admin can flick a switch remotely that will > > affect clients. > > > > The real point is that both the client application and the client > > library need to know what the autocommit behavior is. This is why > > adding autocommit to the library APIs is the right thing, not the wrong > > thing. When there are ways other than a library API call to set the > > autocommit behavior, then one party or the other is out of the loop > > about what the current behavior is, and that gets us right back into the > > same mess. > > > > Basically I think that Peter is arguing that autocommit as a GUC > > variable is a wrong and dangerous idea. And I'm forced to agree.
> > I was wrong to put it in, and I'm now willing to take it out again. > > At the time it seemed like a reasonable shortcut around changing the > > protocol --- but now that we're changing the protocol anyway, it's > > better to get rid of the shortcut. > > > > regards, tom lane -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Man, I lost another vote! :-) Happens to us all ;-) But this discussion is far from over ... not that many people have weighed in yet. Also, even if autocommit's fate is sealed, we still have to think about how to handle the other variables Barry identified as trouble spots. regards, tom lane
I was thinking that if we had a SET PERMANENT that couldn't be changed, interfaces that want to control variables could set them permanently, and others could let users control it. Actually, if we reported changes to the client, the interface could override any change the user made --- just an idea. --------------------------------------------------------------------------- Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Man, I lost another vote! :-) > > Happens to us all ;-) > > But this discussion is far from over ... not that many people have > weighed in yet. Also, even if autocommit's fate is sealed, we still > have to think about how to handle the other variables Barry identified > as trouble spots. > > regards, tom lane -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian wrote: > I was thinking if we had a SET PERMANENT that couldn't be changed, > interfaces that want to control variables could set them perminantly, > and others could let users control it. SET PERMANENT only works for those variables that can only have one value for a given client protocol (for example datestyle in jdbc). Whereas autocommit needs to be changeable by the jdbc driver since the jdbc spec allows the user to call an API method to change the setting. > > Actually, if we reported change to the client, the interface could > override any change the user made --- just an idea. Reporting the change to the client seems sufficient for the client to take whatever actions are necessary. --Barry > > --------------------------------------------------------------------------- > > Tom Lane wrote: > >>Bruce Momjian <pgman@candle.pha.pa.us> writes: >> >>>Man, I lost another vote! :-) >> >>Happens to us all ;-) >> >>But this discussion is far from over ... not that many people have >>weighed in yet. Also, even if autocommit's fate is sealed, we still >>have to think about how to handle the other variables Barry identified >>as trouble spots. >> >> regards, tom lane >> > >
Barry Lind wrote: > > > Bruce Momjian wrote: > > I was thinking if we had a SET PERMANENT that couldn't be changed, > > interfaces that want to control variables could set them perminantly, > > and others could let users control it. > > SET PERMANENT only works for those variables that can only have one > value for a given client protocol (for example datestyle in jdbc). > Whereas autocommit needs to be changeable by the jdbc driver since the > jdbc spec allows the user to call an API method to change the setting. Oh, good point. > > > > > Actually, if we reported change to the client, the interface could > > override any change the user made --- just an idea. > > Reporting the change to the client seems sufficient for the client to > take whatever actions are necessary. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
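To sketch what "reporting the change to the client" could mean in practice — the message format and callback here are entirely hypothetical, just an illustration of the driver-side handling Barry describes:

    #include <string.h>
    #include <libpq-fe.h>

    /*
     * Hypothetical: suppose the new protocol carries an asynchronous
     * "parameter changed" report and the library surfaces it through a
     * callback.  A JDBC-style driver could then re-assert whatever mode
     * its setAutoCommit() API promised, even if a user-level SET
     * changed it underneath.
     */
    static void
    on_parameter_report(PGconn *conn, const char *name, const char *value)
    {
        const char *wanted = "off";     /* chosen via the driver's API */

        if (strcmp(name, "autocommit") == 0 && strcmp(value, wanted) != 0)
            PQclear(PQexec(conn, "SET autocommit TO off"));
    }

Whether the driver overrides the change (as above) or merely updates its internal notion of the state is a per-interface decision; the protocol just has to make the change visible.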
Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > ... So the application already knows > > that "foo" is the table and "a" is the column. So if the application > > wants to know about details on the column "a", it can execute > > SELECT whatever FROM pg_attribute, pg_class WHERE relname = 'foo' AND attname = 'a'; > > With this proposed change, it can replace that with > > SELECT whatever FROM pg_attribute, pg_class WHERE oid = X AND attnum = Y; > > Dave will correct me if I'm wrong --- but I think the issue here is that > the client-side library (think ODBC or JDBC) needs to gain this level of > understanding of a query that is presented to it as an SQL-source > string. So no, it doesn't already know that "foo" is the table and "a" > is the column. To find that out, it has to duplicate a lot of backend > code. Perhaps, rather than changing the protocol to include attrelid/attnum information for the query, we should instead implement a command that would yield the query's result information directly:

fileinfo=> QUERY RESULTS SELECT * from files;
 classname |  attname   |         atttype          | classid  | typeid | typemod
-----------+------------+--------------------------+----------+--------+---------
 files     | filename   | character varying(1024)  | 59422343 |   1043 |    1028
 files     | mode       | bit(32)                  | 59422343 |   1560 |      32
 files     | size       | bigint                   | 59422343 |     20 |      -1
 files     | uid        | integer                  | 59422343 |     23 |      -1
 files     | gid        | integer                  | 59422343 |     23 |      -1
 files     | createtime | timestamp with time zone | 59422343 |   1184 |      -1
 files     | modtime    | timestamp with time zone | 59422343 |   1184 |      -1
 files     | device     | integer                  | 59422343 |     23 |      -1
 files     | inode      | integer                  | 59422343 |     23 |      -1
 files     | nlinks     | integer                  | 59422343 |     23 |      -1
(10 rows)

Each tuple result of the QUERY RESULTS command (some other name for it could be selected; this is just an example) would describe a column that would be returned by the query being examined, and the tuples would be sent in the left-to-right order that the columns they describe would appear (or, alternatively, another column like attnum could be sent that numbers the columns, starting with 1). When a particular piece of information is unavailable, a NULL is sent in its place -- just as you'd expect. An example of such columns would be:

fileinfo=> QUERY RESULTS SELECT CAST(1 AS integer), CAST(2 AS bigint);
 classname | attname | atttype | classid | typeid | typemod
-----------+---------+---------+---------+--------+---------
           | int4    | integer |         |     23 |      -1
           | int8    | bigint  |         |     20 |      -1
(2 rows)

(psql shows NULLs as no value, so that's what I'm showing above). Anyway, it's just a thought, but it's something that could be used by literally everything. And, of course, QUERY RESULTS should be able to operate recursively, thus "QUERY RESULTS QUERY RESULTS ... SELECT ..." (which could be made a special case if necessary). The downside of this is that client libraries that wanted information about what a query would return would have to send two queries through the parser. But the upside is that you take that hit only if you need the information. And if you plan to issue a particular query a lot, you can issue the above command once and you're done. I have no idea how hard this would be to implement. I'm assuming that EXPLAIN goes through a lot of the same code paths that this does, so it may make sense to make this a variant of EXPLAIN (e.g., EXPLAIN RESULTS SELECT...). -- Kevin Brown kevin@sysexperts.com
I like this idea because it uses our existing query API to return result information. Added to TODO: * Allow clients to get data types, typmod, schema.table.column names from result sets, either via the backend protocol or a new QUERYINFO command --------------------------------------------------------------------------- Kevin Brown wrote: > Tom Lane wrote: > > Peter Eisentraut <peter_e@gmx.net> writes: > > > ... So the application already knows > > > that "foo" is the table and "a" is the column. So if the application > > > wants to know about details on the column "a", it can execute > > > SELECT whatever FROM pg_attribute, pg_class WHERE relname = 'foo' AND attname = 'a'; > > > With this proposed change, it can replace that with > > > SELECT whatever FROM pg_attribute, pg_class WHERE oid = X AND attnum = Y; > > > > Dave will correct me if I'm wrong --- but I think the issue here is that > > the client-side library (think ODBC or JDBC) needs to gain this level of > > understanding of a query that is presented to it as an SQL-source > > string. So no, it doesn't already know that "foo" is the table and "a" > > is the column. To find that out, it has to duplicate a lot of backend > > code. > > Perhaps, rather than changing the protocol to include attrelid/attnum > information for the query, we should instead implement a command that > would yield the query's result information directly:
> fileinfo=> QUERY RESULTS SELECT * from files;
>  classname |  attname   |         atttype          | classid  | typeid | typemod
> -----------+------------+--------------------------+----------+--------+---------
>  files     | filename   | character varying(1024)  | 59422343 |   1043 |    1028
>  files     | mode       | bit(32)                  | 59422343 |   1560 |      32
>  files     | size       | bigint                   | 59422343 |     20 |      -1
>  files     | uid        | integer                  | 59422343 |     23 |      -1
>  files     | gid        | integer                  | 59422343 |     23 |      -1
>  files     | createtime | timestamp with time zone | 59422343 |   1184 |      -1
>  files     | modtime    | timestamp with time zone | 59422343 |   1184 |      -1
>  files     | device     | integer                  | 59422343 |     23 |      -1
>  files     | inode      | integer                  | 59422343 |     23 |      -1
>  files     | nlinks     | integer                  | 59422343 |     23 |      -1
> (10 rows)
> Each tuple result of the QUERY RESULTS command (some other name for it > could be selected, this is just an example) would describe a column > that would be returned by the query being examined, and the tuples > would be sent in the left-to-right order that the columns they > describe would appear (or, alternatively, another column like attnum > could be sent that numbers the columns, starting with 1). > > When a particular piece of information is unavailable, a NULL is sent > in its place -- just as you'd expect. An example of such a column > would be:
> fileinfo=> QUERY RESULTS SELECT CAST(1 AS integer), CAST(2 AS bigint);
>  classname | attname | atttype | classid | typeid | typemod
> -----------+---------+---------+---------+--------+---------
>            | int4    | integer |         |     23 |      -1
>            | int8    | bigint  |         |     20 |      -1
> (2 rows)
> (psql shows NULLs as no value, so that's what I'm showing above). > > Anyway, it's just a thought, but it's something that could be used by > literally everything. And, of course, QUERY RESULTS should be able to > operate recursively, thus "QUERY RESULTS QUERY RESULTS ... SELECT ..." > (which could be made a special case if necessary). > > The downside of this is that client libraries that wanted information > about what a query would return would have to send two queries through > the parser.
> But the upside is that you take that hit only if you need > the information. And if you plan to issue a particular query a lot, > you can issue the above command once and you're done. > > I have no idea how hard this would be to implement. I'm assuming that > EXPLAIN goes through a lot of the same code paths that this does, so > it may make sense to make this a variant of EXPLAIN (e.g., EXPLAIN > RESULTS SELECT...). > > -- > Kevin Brown kevin@sysexperts.com -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Added to TODO: > * Allow clients to get data types, typmod, schema.table.column names from > result sets, either via the backend protocol or a new QUERYINFO command Uh ... this is already done, at least as far as the protocol is concerned. We do need to backfill support in libpq, etc. regards, tom lane
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Added to TODO: > > > * Allow clients to get data types, typmod, schema.table.column names from > > result sets, either via the backend protocol or a new QUERYINFO command > > Uh ... this is already done, at least as far as the protocol is > concerned. We do need to backfill support in libpq, etc. Oh, really. That is great. I didn't realize you had handled that. I will update the TODO: * Update clients to use data types, typmod, schema.table.column names of result sets using new query protocol I assume we will need new libpq functions? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Tom Lane wrote: >> Uh ... this is already done, at least as far as the protocol is >> concerned. We do need to backfill support in libpq, etc. > I assume we will need new libpq functions? Yeah. regards, tom lane
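A guess at what those additions might look like, following the existing PQfname()/PQftype() naming convention — the exact names are assumptions here, nothing committed:

    #include <libpq-fe.h>

    /* Assumed additions for exposing the new per-column result data;
     * only typmod access (PQfmod) exists in today's libpq. */
    extern Oid PQftable(const PGresult *res, int field_num);    /* source table's OID   */
    extern int PQftablecol(const PGresult *res, int field_num); /* attnum in that table */
    extern int PQfformat(const PGresult *res, int field_num);   /* text or binary data  */

Clients could then resolve schema.table.column names with a single catalog query keyed on the OID, as discussed upthread.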