Thread: ECPG FETCH readahead
Hi, we improved ECPG quite a lot in 9.0 because we worked and still working with an Informix to PostgreSQL migration project. We came across a pretty big performance problem that can be seen in every "naive" application that uses only FETCH 1, FETCH RELATIVE or FETCH ABSOLUTE. These are almost the only FETCH variations usable in Informix, i.e. it doesn't have the grammar for fetching N rows at once. Instead, the Client SDK libraries do caching themselves behind the scenes to reduce network turnaround time. This is what we implemented for ECPG, so by default it fetches 256 rows at once if possible and serves the application from memory. The number of cached rows can be changed using the ECPGFETCHSZ environment variable. The cursor readahead is activated by "ecpg -r fetch_readahead". The implementation splits ECPGdo() and ecpg_execute() in ecpglib/execute.c so the different parts are callable from the newly introduced cursor.c code. Three new API calls are introduced: ECPGopen(), ECPGfetch() and ECPGclose(). Obviously, OPEN and CLOSE use ECPGopen() and ECPGclose(), respectively. They build and tear down the cache structures besides calling the main ECPGdo() behind the scenes. ECPGopen() also discovers the total number of records in the recordset, so the previous ECPG "deficiency" (backend limitation) that sqlca.sqlerrd[2] didn't report the (possibly estimated) number of rows in the resultset is now overcome. This slows down OPEN for cursors serving larger datasets but it makes possible to position the readahead window using MOVE ABSOLUTE no matter what FORWARD/BACKWARD/ABSOLUTE/RELATIVE variants are used by the application. And the caching is more than overweighs the slowdown in OPEN it seems. ECPGfetch() is the more interesting one, this handles FETCH and MOVE statements and follows the absolute position of the cursor in the client, too. In Informix, the DECLARE statement is used for creating a cursor descriptor, it can be OPENed/CLOSEd several times and the "FREE curname" statement tears down the cursor descriptor. In our implementation, OPEN and CLOSE sets up and tears down the caching structure, The DECLARE statement didn't lose its declarative nature and the FREE statement is still only usable only for prepared statements. I chose this path because this way this feature can be used in native mode as well. It is usable even if the application itself uses FETCH N. The readahead window can be set externally to the application to squeeze out more performance in batch programs. The patch size is over 2MB because I introduced a new regression test called fetch2.pgc that does a lot of work on a recordset having 400 rows. It browses the recordset back and forth with: - FETCH FORWARD 1/FETCH BACKWARD 1 - FETCH FORWARD 5/FETCH BACKWARD 5 - FETCH ABSOLUTE +N/FETCH ABSOLUTE -N - FETCH FORWARD 3+MOVE FORWARD 7, also backwards - FETCH RELATIVE +2/FETCH RELATIVE -2 This test is compiled both with and without "-r fetch_readahead", so I was able to verify that the two runs produce the same output. Also, fetch.pgc, dyntest.pgc and sqlda.pgc are also compiled with and without "-r fetch_readahead", for verifying that both SQL and SQLDA descriptors are working the same way as before. E.g. PGresult for SQL descriptors are not simply assigned anymore, they are copied using PQcopyResult() without tuples and a bunch of PQsetvalue() calls to copy only the proper rows from the cache or all rows if no cache. The split parts of ecpg_execute() are intentionally kept the original wording (especially the "ecpg_execute" function name) in ecpg_log() messages to eliminate any impact on other regression tests. If this is not desired, a patch for this can come later. Because of the patch size, the compressed version is attached. Comments? Best regards, Zoltán Böszörményi -- Bible has answers for everything. Proof: "But let your communication be, Yea, yea; Nay, nay: for whatsoever is more than these cometh of evil." (Matthew 5:37) - basics of digital technology. "May your kingdom come" - superficial description of plate tectonics ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/
Attachment
Boszormenyi Zoltan wrote: > Hi, > > we improved ECPG quite a lot in 9.0 because we worked and > still working with an Informix to PostgreSQL migration project. > > We came across a pretty big performance problem that can be seen in > every "naive" application that uses only FETCH 1, FETCH RELATIVE > or FETCH ABSOLUTE. These are almost the only FETCH variations > usable in Informix, i.e. it doesn't have the grammar for fetching N rows > at once. Instead, the Client SDK libraries do caching themselves > behind the scenes to reduce network turnaround time. I assume our ecpg version supports >1 fetch values, even in Informix mode. Does it make sense to add lots of code to our ecpg then? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. +
Hi, 2010-06-23 22:42 keltezéssel, Bruce Momjian írta: > Boszormenyi Zoltan wrote: > >> Hi, >> >> we improved ECPG quite a lot in 9.0 because we worked and >> still working with an Informix to PostgreSQL migration project. >> >> We came across a pretty big performance problem that can be seen in >> every "naive" application that uses only FETCH 1, FETCH RELATIVE >> or FETCH ABSOLUTE. These are almost the only FETCH variations >> usable in Informix, i.e. it doesn't have the grammar for fetching N rows >> at once. Instead, the Client SDK libraries do caching themselves >> behind the scenes to reduce network turnaround time. >> > I assume our ecpg version supports>1 fetch values, even in Informix > mode. Does it make sense to add lots of code to our ecpg then? > I think, yes, it does make sense. Because we are talking about porting a whole lot of COBOL applications. The ESQL/C or ECPG connector was already written the Informix quirks in mind, so it fetches only one record at a time passing it to the application. And similar performance is expected from ECPG - which excpectation is not fulfilled currently because libecpg doesn't do the same caching as ESQL/C does. And FYI, I haven't added a whole lot of code, most of the code changes in the patch is execute.c refactoring. ECPGdo() was split into several functions, the new parts are still doing the same things. I can make the test case much smaller, I only needed to test crossing the readahead window boundary. This would also make the patch much smaller. And this readahead is not on by default, it's only activated by "ecpg -r fetch_readahead". Best regards, Zoltán Böszörményi
On 24/06/10 10:27, Böszörményi Zoltán wrote: > And this readahead is not on by default, it's only activated > by "ecpg -r fetch_readahead". Is there a reason not to enable it by default? I'm a bit worried that it will receive no testing if it's not always on. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
2010-06-24 11:04 keltezéssel, Heikki Linnakangas írta: > On 24/06/10 10:27, Böszörményi Zoltán wrote: >> And this readahead is not on by default, it's only activated >> by "ecpg -r fetch_readahead". > > Is there a reason not to enable it by default? I'm a bit worried that > it will receive no testing if it's not always on. Because in the first step I wanted to minimize the impact on regression test stderr results. This is what I mentioned in the initial mail, I stuck to the original wording of ecpg_log() messages in the split-up parts of the original ECPGdo() and ecpg_execute() exactly for this reason. The usual policy for ecpg_log() is to report the function name where it was issued. I was also thinking about a new feature for pg_regress, to compare stdout results of two regression tests automatically so a difference can be reported as an error. It would be good for automated testing of features in ECPG that can be toggled, like auto-prepare and fetch readahead. It might come in handy in other subsystems, too. Best regards, Zoltán Böszörményi
On Wed, Jun 23, 2010 at 04:42:37PM -0400, Bruce Momjian wrote: > I assume our ecpg version supports >1 fetch values, even in Informix > mode. Does it make sense to add lots of code to our ecpg then? Yes, it does. The big question that zoltan and I haven't figured out yet, is whether it makes sense to add all this even for native ecpg mode. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org ICQ 179140304, AIM/Yahoo/Skype michaelmeskes, Jabber meskes@jabber.org VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
> I think, yes, it does make sense. Because we are talking > about porting a whole lot of COBOL applications. COBOL??? > The ESQL/C or ECPG connector was already written > the Informix quirks in mind, so it fetches only one record > at a time passing it to the application. And similar performance > is expected from ECPG - which excpectation is not fulfilled > currently because libecpg doesn't do the same caching as > ESQL/C does. Eh, you are talking about a program you wrote for your customer or they wrote themselves, right? I simply refuse to add this stuff only to fix this situation for that one customer of yours if it only hits them. Now the thing to discuss is how common is this situation. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org ICQ 179140304, AIM/Yahoo/Skype michaelmeskes, Jabber meskes@jabber.org VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
On Thu, Jun 24, 2010 at 12:04:30PM +0300, Heikki Linnakangas wrote: > Is there a reason not to enable it by default? I'm a bit worried > that it will receive no testing if it's not always on. Yes, there is a reason, namely that you don't need it in native mode at all. ECPG can read as many records as you want in one go, something ESQL/C apparently cannot do. This patch is a workaround for that restriction. I still do not really see that this feature gives us an advantage in native mode though. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org ICQ 179140304, AIM/Yahoo/Skype michaelmeskes, Jabber meskes@jabber.org VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
On Jun 24, 2010, at 2:13 PM, Michael Meskes wrote: >> I think, yes, it does make sense. Because we are talking >> about porting a whole lot of COBOL applications. > > COBOL??? > yes, COBOL :). it is much more common than people think. it is not the first COBOL request for PostgreSQL hitting my desk. in our concrete example we are using a C module written with ECPG which is magically attached to tons of COBOL code ... >> The ESQL/C or ECPG connector was already written >> the Informix quirks in mind, so it fetches only one record >> at a time passing it to the application. And similar performance >> is expected from ECPG - which excpectation is not fulfilled >> currently because libecpg doesn't do the same caching as >> ESQL/C does. > > Eh, you are talking about a program you wrote for your customer or they wrote > themselves, right? I simply refuse to add this stuff only to fix this situation > for that one customer of yours if it only hits them. Now the thing to discuss > is how common is this situation. > > Michael i think that this cursor issue is a pretty common thing for many codes. people are usually not aware of the fact that network round trips and parsing which are naturally related to "FETCH 1" area lot more expensive than fetching one row somewhere deep inside the DB engine. out there there are many applications which fetch data row by row. if an app fetches data row by row in PostgreSQL it willbe A LOT slower than in, say, Informix because most commercial database clients will cache data inside a cursor behindthe scenes to avoid the problem we try to solve. currently we are talking about a performance penalty of factor 5 or so. so - it is not a small thing; it is a big difference. regards, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de
2010-06-24 14:13 keltezéssel, Michael Meskes írta: >> I think, yes, it does make sense. Because we are talking >> about porting a whole lot of COBOL applications. >> > COBOL??? > Yes, OpenCOBOL... >> The ESQL/C or ECPG connector was already written >> the Informix quirks in mind, so it fetches only one record >> at a time passing it to the application. And similar performance >> is expected from ECPG - which excpectation is not fulfilled >> currently because libecpg doesn't do the same caching as >> ESQL/C does. >> > Eh, you are talking about a program you wrote for your customer or they wrote > themselves, right? I simply refuse to add this stuff only to fix this situation > for that one customer of yours if it only hits them. Now the thing to discuss > is how common is this situation. > The OpenCOBOL database connector was written by them but the problem is more generic. There are many "naive" applications (elsewhere, too) using cursors but fetching one record at a time perhaps for portability reasons. This patch provides a big performance boost for those. Best regards, Zoltán Böszörményi
On Thu, Jun 24, 2010 at 8:19 AM, Michael Meskes <meskes@postgresql.org> wrote: > On Thu, Jun 24, 2010 at 12:04:30PM +0300, Heikki Linnakangas wrote: >> Is there a reason not to enable it by default? I'm a bit worried >> that it will receive no testing if it's not always on. > > Yes, there is a reason, namely that you don't need it in native mode at all. > ECPG can read as many records as you want in one go, something ESQL/C > apparently cannot do. This patch is a workaround for that restriction. I still > do not really see that this feature gives us an advantage in native mode > though. So, who has the next action on this patch? Does Zoltan need to revise it, or does Michael need to review it, or where are we? Thanks, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi, Robert Haas írta: > On Thu, Jun 24, 2010 at 8:19 AM, Michael Meskes <meskes@postgresql.org> wrote: > >> On Thu, Jun 24, 2010 at 12:04:30PM +0300, Heikki Linnakangas wrote: >> >>> Is there a reason not to enable it by default? I'm a bit worried >>> that it will receive no testing if it's not always on. >>> >> Yes, there is a reason, namely that you don't need it in native mode at all. >> ECPG can read as many records as you want in one go, something ESQL/C >> apparently cannot do. This patch is a workaround for that restriction. I still >> do not really see that this feature gives us an advantage in native mode >> though. >> > > So, who has the next action on this patch? Does Zoltan need to revise > it, or does Michael need to review it, or where are we? > Michael reviewed it shortly in private and I need to send a new revision anyway, regardless of his comments. I will refresh it ASAP. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
2010-10-14 11:56 keltezéssel, Boszormenyi Zoltan írta: > Hi, > > Robert Haas írta: >> On Thu, Jun 24, 2010 at 8:19 AM, Michael Meskes <meskes@postgresql.org> wrote: >> >>> On Thu, Jun 24, 2010 at 12:04:30PM +0300, Heikki Linnakangas wrote: >>> >>>> Is there a reason not to enable it by default? I'm a bit worried >>>> that it will receive no testing if it's not always on. >>>> >>> Yes, there is a reason, namely that you don't need it in native mode at all. >>> ECPG can read as many records as you want in one go, something ESQL/C >>> apparently cannot do. This patch is a workaround for that restriction. I still >>> do not really see that this feature gives us an advantage in native mode >>> though. >>> >> So, who has the next action on this patch? Does Zoltan need to revise >> it, or does Michael need to review it, or where are we? >> > Michael reviewed it shortly in private and I need to send > a new revision anyway, regardless of his comments. > I will refresh it ASAP. The ASAP took a little long. The attached patch is in git diff format, because (1) the regression test intentionally doesn't do ECPGdebug() so the patch isn't dominated by a 2MB stderr file, so this file is empty and (2) regular diff cannot cope with empty new files. Anyway, here is the new version with a new feature. Implementation details: - New ecpglib/cursor.c handles the client-side accounting of cursors. - Functions in ecpglib/execute.c are split into smaller functions so useful operations can be used by the new cursor.c - ecpg -r fetch_readahead enables readahead by default for all cursors. - Default readahead size is 256 rows, it can be modified by an environment variable, ECPGFETCHSZ. - *NEW FEATURE* Readahead can be individually enabled or disabled by ECPG-side grammar: DECLARE curname [ [ NO ] READAHEAD ] CURSOR FOR ... Without [ NO ] READAHEAD, the default behaviour is used for cursors. - Since the server and the client may disagree on the cursor position if readahead is used, ECPG preprocessor throws an error if WHERE CURRENT OF is used on such cursors. - The default assumed behaviour of cursors with readahead is NO SCROLL. If you want readahead and backward scrolling, SCROLL keyword is mandatory. The regression test creates a table with 513 rows, so it spans 2 full and 1 incomplete readahead window. It reads the table with two cursors, one with readahead and one without by 1 records forward and backward. This is repeated with reading 5 records at a time. Then the whole table is read into a chain of sqlda structures forward and backward. All other regression tests pass as well. The original regression tests also pass with these changes, the split of execute.c was risky in this regard. Now the split is done more cleanly than in the previous version, the file is not as rearranged as before. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
Hi,
2011-11-16 20:51 keltezéssel, Boszormenyi Zoltan írta:
New patch attached against yesterday's GIT tree. Changes:
- documented the new ECPG_INVALID_CURSOR error code
- consistently free everything in error paths in cursor.c
2011-11-16 20:51 keltezéssel, Boszormenyi Zoltan írta:
2010-10-14 11:56 keltezéssel, Boszormenyi Zoltan írta:Hi, Robert Haas írta:On Thu, Jun 24, 2010 at 8:19 AM, Michael Meskes <meskes@postgresql.org> wrote:On Thu, Jun 24, 2010 at 12:04:30PM +0300, Heikki Linnakangas wrote:Is there a reason not to enable it by default? I'm a bit worried that it will receive no testing if it's not always on.Yes, there is a reason, namely that you don't need it in native mode at all. ECPG can read as many records as you want in one go, something ESQL/C apparently cannot do. This patch is a workaround for that restriction. I still do not really see that this feature gives us an advantage in native mode though.So, who has the next action on this patch? Does Zoltan need to revise it, or does Michael need to review it, or where are we?Michael reviewed it shortly in private and I need to send a new revision anyway, regardless of his comments. I will refresh it ASAP.The ASAP took a little long. The attached patch is in git diff format, because (1) the regression test intentionally doesn't do ECPGdebug() so the patch isn't dominated by a 2MB stderr file, so this file is empty and (2) regular diff cannot cope with empty new files. Anyway, here is the new version with a new feature. Implementation details: - New ecpglib/cursor.c handles the client-side accounting of cursors. - Functions in ecpglib/execute.c are split into smaller functions so useful operations can be used by the new cursor.c - ecpg -r fetch_readahead enables readahead by default for all cursors. - Default readahead size is 256 rows, it can be modified by an environment variable, ECPGFETCHSZ. - *NEW FEATURE* Readahead can be individually enabled or disabled by ECPG-side grammar: DECLARE curname [ [ NO ] READAHEAD ] CURSOR FOR ... Without [ NO ] READAHEAD, the default behaviour is used for cursors. - Since the server and the client may disagree on the cursor position if readahead is used, ECPG preprocessor throws an error if WHERE CURRENT OF is used on such cursors. - The default assumed behaviour of cursors with readahead is NO SCROLL. If you want readahead and backward scrolling, SCROLL keyword is mandatory. The regression test creates a table with 513 rows, so it spans 2 full and 1 incomplete readahead window. It reads the table with two cursors, one with readahead and one without by 1 records forward and backward. This is repeated with reading 5 records at a time. Then the whole table is read into a chain of sqlda structures forward and backward. All other regression tests pass as well. The original regression tests also pass with these changes, the split of execute.c was risky in this regard. Now the split is done more cleanly than in the previous version, the file is not as rearranged as before.
New patch attached against yesterday's GIT tree. Changes:
- documented the new ECPG_INVALID_CURSOR error code
- consistently free everything in error paths in cursor.c
Best regards, Zoltán Böszörményi
-- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
On Thu, Dec 29, 2011 at 10:46:23AM +0100, Boszormenyi Zoltan wrote: > 2011-11-16 20:51 keltez?ssel, Boszormenyi Zoltan ?rta: > > 2010-10-14 11:56 keltez?ssel, Boszormenyi Zoltan ?rta: > >>> On Thu, Jun 24, 2010 at 8:19 AM, Michael Meskes <meskes@postgresql.org> wrote: > >>>> On Thu, Jun 24, 2010 at 12:04:30PM +0300, Heikki Linnakangas wrote: > >>>>> Is there a reason not to enable it by default? I'm a bit worried > >>>>> that it will receive no testing if it's not always on. > >>>>> > >>>> Yes, there is a reason, namely that you don't need it in native mode at all. > >>>> ECPG can read as many records as you want in one go, something ESQL/C > >>>> apparently cannot do. This patch is a workaround for that restriction. I still > >>>> do not really see that this feature gives us an advantage in native mode > >>>> though. We yet lack a consensus on whether native ECPG apps should have access to the feature. My 2c is to make it available, because it's useful syntactic sugar. If your program independently processes each row of an arbitrary-length result set, current facilities force you to add an extra outer loop to batch the FETCHes for every such code site. Applications could define macros to abstract that pattern, but this seems common-enough to justify bespoke handling. Besides, minimalists already use libpq directly. I suggest enabling the feature by default but drastically reducing the default readahead chunk size from 256 to, say, 5. That still reduces the FETCH round trip overhead by 80%, but it's small enough not to attract pathological behavior on a workload where each row is a 10 MiB document. I would not offer an ecpg-time option to disable the feature per se. Instead, let the user set the default chunk size at ecpg time. A setting of 1 effectively disables the feature, though one could later re-enable it with ECPGFETCHSZ. > > The ASAP took a little long. The attached patch is in git diff format, > > because (1) the regression test intentionally doesn't do ECPGdebug() > > so the patch isn't dominated by a 2MB stderr file, so this file is empty > > and (2) regular diff cannot cope with empty new files. Avoid the empty file with fprintf(STDERR, "Dummy non-empty error output\n"); > > - *NEW FEATURE* Readahead can be individually enabled or disabled > > by ECPG-side grammar: > > DECLARE curname [ [ NO ] READAHEAD ] CURSOR FOR ... > > Without [ NO ] READAHEAD, the default behaviour is used for cursors. Likewise, this may as well take a chunk size rather than a yes/no. The patch adds warnings: error.c: In function `ecpg_raise': error.c:281: warning: format not a string literal and no format arguments error.c:281: warning: format not a string literal and no format arguments The patch adds few comments and no larger comments explaining its higher-level ideas. That makes it much harder to review. In this regard it follows the tradition of the ECPG code, but let's depart from that tradition for new work. I mention a few cases below where the need for commentary is acute. I tested full reads of various "SELECT * FROM generate_series(1, $1)" commands over a 50ms link, and the patch gives a sound performance improvement. With no readahead, a mere N=100 takes 5s to drain. With readahead enabled, N=10000 takes only 3s. Performance was quite similar to that of implementing my own batching with "FETCH 256 FROM cur". When I kicked up ECPGFETCHSZ (after fixing its implementation -- see below) past the result set size, performance was comparable to that of simply passing the query through psql. > --- a/doc/src/sgml/ecpg.sgml > +++ b/doc/src/sgml/ecpg.sgml > @@ -5289,6 +5315,17 @@ while (1) > </varlistentry> > > <varlistentry> > + <term>-231 (<symbol>ECPG_INVALID_CURSOR</symbol>)</term> > + <listitem> > + <para> > + The cursor you are trying to use with readahead has not been opened yet (SQLSTATE 34000), > + invalid values were passed to libecpg (SQLSTATE 42P11) hor not in prerequisite state, i.e. Typo. > --- /dev/null > +++ b/src/interfaces/ecpg/ecpglib/cursor.c > @@ -0,0 +1,730 @@ cursor.c contains various >78-col lines. pgindent has limited ability to improve those, so please split them. > +static struct cursor_descriptor * > +add_cursor(int lineno, struct connection *con, const char *name, bool scrollable, int64 n_tuples, bool *existing) > +{ > + struct cursor_descriptor *desc, > + *ptr, *prev = NULL; > + bool found = false; > + > + if (!name || name[0] == '\0') > + { > + if (existing) > + *existing = false; > + return NULL; > + } > + > + ptr = con->cursor_desc; > + while (ptr) > + { > + int ret = strcasecmp(ptr->name, name); > + > + if (ret == 0) > + { > + found = true; > + break; > + } > + if (ret > 0) > + break; > + > + prev = ptr; > + ptr = ptr->next; > + } Any reason not to use find_cursor() here? > +static void > +del_cursor(struct connection *con, const char *name) > +{ > + struct cursor_descriptor *ptr, *prev = NULL; > + bool found = false; > + > + ptr = con->cursor_desc; > + while (ptr) > + { > + int ret = strcasecmp(ptr->name, name); > + > + if (ret == 0) > + { > + found = true; > + break; > + } > + if (ret > 0) > + break; > + > + prev = ptr; > + ptr = ptr->next; > + } Any reason not to use find_cursor() here? > +bool > +ECPGopen(const int lineno, const int compat, const int force_indicator, > + const char *connection_name, const bool questionmarks, > + const char *curname, const int st, const char *query, ...) > +{ > + va_list args; > + bool ret, scrollable; > + char *new_query, *ptr, *whold, *noscroll, *scroll, *dollar0; > + struct sqlca_t *sqlca = ECPGget_sqlca(); > + > + if (!query) > + { > + ecpg_raise(lineno, ECPG_EMPTY, ECPG_SQLSTATE_ECPG_INTERNAL_ERROR, NULL); > + return false; > + } > + ptr = strstr(query, "for "); > + if (!ptr) > + { > + ecpg_raise(lineno, ECPG_INVALID_STMT, ECPG_SQLSTATE_ECPG_INTERNAL_ERROR, NULL); > + return false; > + } > + whold = strstr(query, "with hold "); > + dollar0 = strstr(query, "$0"); > + > + noscroll = strstr(query, "no scroll "); > + scroll = strstr(query, "scroll "); A query like 'SELECT 1 AS "with hold "' fools these lexical tests. Capture that information in the parser rather than attempting to reconstruct it here. > + scrollable = (noscroll == NULL) && (scroll != NULL) && (scroll < ptr); > + > + new_query = ecpg_alloc(strlen(curname) + strlen(ptr) + (whold ? 10 : 0) + 32, lineno); > + if (!new_query) > + return false; > + sprintf(new_query, "declare %s %s cursor %s%s", > + (dollar0 && (dollar0 < ptr) ? "$0" : curname), > + (scrollable ? "scroll" : "no scroll"), > + (whold ? "with hold " : ""), > + ptr); > + > + /* Set the fetch size the first time we are called. */ > + if (fetch_size == 0) > + { > + char *fsize_str = getenv("ECPGFETCHSZ"); > + char *endptr = NULL; > + int fsize; > + > + if (fsize_str) > + { > + fsize = strtoul(fsize_str, &endptr, 10); > + if (endptr || (fsize < 4)) > + fetch_size = DEFAULTFETCHSIZE; "endptr" will never be NULL; use "*endptr". As it stands, the code always ignores ECPGFETCHSZ. An unusable ECPGFETCHSZ should procedure an error, not silently give no effect. Why a minimum of 4? > + else > + fetch_size = fsize; > + } > + else > + fetch_size = DEFAULTFETCHSIZE; > + } > + > + va_start(args, query); > + ret = ecpg_do(lineno, compat, force_indicator, connection_name, questionmarks, st, new_query, args); > + va_end(args); > + > + ecpg_free(new_query); > + > + /* > + * If statement went OK, add the cursor and discover the > + * number of rows in the recordset. This will slow down OPEN > + * but we gain a lot with caching. > + */ > + if (ret /* && sqlca->sqlerrd[2] == 0 */) Why is the commented code there? > + { > + struct connection *con = ecpg_get_connection(connection_name); > + struct cursor_descriptor *cur; > + bool existing; > + int64 n_tuples; > + > + if (scrollable) > + { > + PGresult *res; > + char *query; > + char *endptr = NULL; > + > + query = ecpg_alloc(strlen(curname) + strlen("move all in ") + 2, lineno); > + sprintf(query, "move all in %s", curname); > + res = PQexec(con->connection, query); > + n_tuples = strtoull(PQcmdTuples(res), &endptr, 10); > + PQclear(res); > + ecpg_free(query); > + > + /* Go back to the beginning of the resultset. */ > + query = ecpg_alloc(strlen(curname) + strlen("move absolute 0 in ") + 2, lineno); > + sprintf(query, "move absolute 0 in %s", curname); > + res = PQexec(con->connection, query); > + PQclear(res); > + ecpg_free(query); > + } > + else > + { > + n_tuples = 0; > + } You give this rationale for the above code: On Thu, Jun 17, 2010 at 02:09:47PM +0200, Boszormenyi Zoltan wrote: > ECPGopen() also discovers the total number of records in the recordset, > so the previous ECPG "deficiency" (backend limitation) that sqlca.sqlerrd[2] > didn't report the (possibly estimated) number of rows in the resultset > is now > overcome. This slows down OPEN for cursors serving larger datasets > but it makes possible to position the readahead window using MOVE > ABSOLUTE no matter what FORWARD/BACKWARD/ABSOLUTE/RELATIVE > variants are used by the application. And the caching is more than > overweighs > the slowdown in OPEN it seems. From the documentation for Informix and Oracle, those databases do not populate sqlerrd[2] this way: http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqlt.doc/sqltmst189.htm http://docs.oracle.com/cd/A57673_01/DOC/api/doc/PC_22/ch10.htm#toc139 The performance impact will vary widely depending on the query cost per row and the fraction of rows the application will actually retrieve. Consider a complex aggregate returning only a handful of rows. Consider SELECT * on a 1B-row table with the application ceasing reads after 1000 rows. Performance aside, this will yield double execution of any volatile functions involved. So, I think we ought to diligently avoid this step. (Failing that, the documentation must warn about the extra full cursor scan and this feature must stay disabled by default.) > + > + /* Add the cursor */ > + cur = add_cursor(lineno, con, curname, scrollable, n_tuples, &existing); > + > + /* > + * Report the number of tuples for the [scrollable] cursor. > + * The server didn't do it for us. > + */ > + sqlca->sqlerrd[2] = (cur->n_tuples < LONG_MAX ? cur->n_tuples : LONG_MAX); > + } > + > + return ret; > +} > + > +static bool > +ecpg_cursor_execute(struct statement * stmt, struct cursor_descriptor *cur) > +{ > + char tmp[64]; > + char *query; > + int64 start_pos; > + > + if ((cur->cache_pos >= cur->start_pos) && cur->res && (cur->cache_pos < cur->start_pos + PQntuples(cur->res))) > + { > + stmt->results = cur->res; > + ecpg_free_params(stmt, true, stmt->lineno); > + return true; > + } Why does ecpg_cursor_execute() also call ecpg_free_params()? Offhand, it seems that ECPGfetch() always takes care of that and is the more appropriate place, seeing it's the one calling ecpg_build_params(). > --- a/src/interfaces/ecpg/ecpglib/data.c > +++ b/src/interfaces/ecpg/ecpglib/data.c > @@ -120,7 +120,7 @@ check_special_value(char *ptr, double *retval, char **endptr) > } > > bool > -ecpg_get_data(const PGresult *results, int act_tuple, int act_field, int lineno, > +ecpg_get_data(const PGresult *results, int var_index, int act_tuple, int act_field, int lineno, > enum ECPGttype type, enum ECPGttype ind_type, > char *var, char *ind, long varcharsize, long offset, > long ind_offset, enum ARRAY_TYPE isarray, enum COMPAT_MODE compat, bool force_indicator) This function could sure use a block comment such as would be customary in src/backend. Compare the one at heap_update(), for example. > --- a/src/interfaces/ecpg/ecpglib/error.c > +++ b/src/interfaces/ecpg/ecpglib/error.c > @@ -268,6 +268,20 @@ ecpg_raise(int line, int code, const char *sqlstate, const char *str) > ecpg_gettext("could not connect to database \"%s\" on line %d"), str, line); > break; > > + case ECPG_INVALID_CURSOR: > + if (strcmp(sqlstate, ECPG_SQLSTATE_OBJECT_NOT_IN_PREREQUISITE_STATE) == 0) > + snprintf(sqlca->sqlerrm.sqlerrmc, sizeof(sqlca->sqlerrm.sqlerrmc), > + > + /* > + * translator: this string will be truncated at 149 characters > + * expanded. > + */ > + ecpg_gettext("cursor can only scan forward")); Every other message works in the line number somehow; this should do the same. > --- a/src/interfaces/ecpg/ecpglib/execute.c > +++ b/src/interfaces/ecpg/ecpglib/execute.c > @@ -1707,46 +1809,20 @@ ecpg_execute(struct statement * stmt) > } > > bool > -ECPGdo(const int lineno, const int compat, const int force_indicator, const char *connection_name, const bool questionmarks,const int st, const char *query,...) > +ecpg_do_prologue(int lineno, const int compat, const int force_indicator, > + const char *connection_name, const bool questionmarks, > + enum ECPG_statement_type statement_type, const char *query, > + va_list args, struct statement **stmt_out) A block comment would especially help this function, considering the name tells one so little. > --- a/src/interfaces/ecpg/ecpglib/extern.h > +++ b/src/interfaces/ecpg/ecpglib/extern.h > @@ -60,6 +60,12 @@ struct statement > bool questionmarks; > struct variable *inlist; > struct variable *outlist; > + char *oldlocale; > + const char **dollarzero; > + int ndollarzero; > + const char **param_values; > + int nparams; > + PGresult *results; > }; Please comment the members of this struct like we do in most of src/include. dollarzero has something to do with dynamic cursor names, right? Does it have other roles? > --- a/src/interfaces/ecpg/preproc/ecpg.header > +++ b/src/interfaces/ecpg/preproc/ecpg.header > @@ -111,6 +115,26 @@ mmerror(int error_code, enum errortype type, const char *error, ...) > } > > /* > + * set use_fetch_readahead based on the current cursor > + * doesn't return if the cursor is not declared > + */ > +static void > +set_cursor_readahead(const char *curname) > +{ > + struct cursor *ptr; > + > + for (ptr = cur; ptr != NULL; ptr = ptr->next) > + { > + if (strcmp(ptr->name, curname) == 0) > + break; > + } > + if (!ptr) > + mmerror(PARSE_ERROR, ET_FATAL, "cursor \"%s\" does not exist", curname); > + > + use_fetch_readahead = ptr->fetch_readahead; > +} Following add_additional_variables(), use strcasecmp() for literal cursor names and strcmp() for cursor name host variables. > --- a/src/interfaces/ecpg/preproc/extern.h > +++ b/src/interfaces/ecpg/preproc/extern.h > @@ -24,12 +24,17 @@ extern bool autocommit, > force_indicator, > questionmarks, > regression_mode, > - auto_prepare; > + auto_prepare, > + fetch_readahead; > +extern bool use_fetch_readahead; The names of the last two variables don't make clear the difference between them. I suggest default_fetch_readahead and current_fetch_readahead. > --- /dev/null > +++ b/src/interfaces/ecpg/test/preproc/cursor-readahead.pgc > @@ -0,0 +1,244 @@ > +#include <stdio.h> > +#include <malloc.h> Why <malloc.h>? I only see this calling free(); use <stdlib.h> instead. Thanks, nm
Hi, first, thank you for answering and for the review. 2012-03-02 17:41 keltezéssel, Noah Misch írta: > On Thu, Dec 29, 2011 at 10:46:23AM +0100, Boszormenyi Zoltan wrote: >> 2011-11-16 20:51 keltez?ssel, Boszormenyi Zoltan ?rta: >>> 2010-10-14 11:56 keltez?ssel, Boszormenyi Zoltan ?rta: >>>>> On Thu, Jun 24, 2010 at 8:19 AM, Michael Meskes <meskes@postgresql.org> wrote: >>>>>> On Thu, Jun 24, 2010 at 12:04:30PM +0300, Heikki Linnakangas wrote: >>>>>>> Is there a reason not to enable it by default? I'm a bit worried >>>>>>> that it will receive no testing if it's not always on. >>>>>>> >>>>>> Yes, there is a reason, namely that you don't need it in native mode at all. >>>>>> ECPG can read as many records as you want in one go, something ESQL/C >>>>>> apparently cannot do. This patch is a workaround for that restriction. I still >>>>>> do not really see that this feature gives us an advantage in native mode >>>>>> though. > We yet lack a consensus on whether native ECPG apps should have access to the > feature. I don't even remember about any opinion on this matter. So, at this point don't know whether it's lack of interest. We also have a saying "silence means agreement". :-) > My 2c is to make it available, because it's useful syntactic sugar. Thanks, we thought the same. > If your program independently processes each row of an arbitrary-length result > set, current facilities force you to add an extra outer loop to batch the > FETCHes for every such code site. Applications could define macros to > abstract that pattern, but this seems common-enough to justify bespoke > handling. We have similar opinions. > Besides, minimalists already use libpq directly. Indeed. On the other hand, ECPG provides a safety net with syntax checking so it's useful for not minimalist types. :-) > I suggest enabling the feature by default but drastically reducing the default > readahead chunk size from 256 to, say, 5. > That still reduces the FETCH round > trip overhead by 80%, but it's small enough not to attract pathological > behavior on a workload where each row is a 10 MiB document. I see. How about 8? Nice "round" power of 2 value, still small and avoids 87.5% of overhead. BTW, the default disabled behaviour was to avoid "make check" breakage, see below. > I would not offer > an ecpg-time option to disable the feature per se. Instead, let the user set > the default chunk size at ecpg time. A setting of 1 effectively disables the > feature, though one could later re-enable it with ECPGFETCHSZ. This means all code previously going through ECPGdo() would go through ECPGopen()/ECPGfetch()/ECPGclose(). This is more intrusive and all regression tests that were only testing certain features would also test the readahead feature, too. Also, the test for WHERE CURRENT OF at ecpg time would have to be done at runtime, possibly making previously working code fail if ECPGFETCHSZ is enabled. How about still allowing "NO READAHEAD" cursors that compile into plain ECPGdo()? This way, ECPGFETCHSZ don't interfere with WHERE CURRENT OF. But this would mean code changes everywhere where WHERE CURRENT OF is used. Or how about a new feature in the backend, so ECPG can do UPDATE/DELETE ... WHERE OFFSET N OF cursor and the offset of computed from the actual cursor position and the position known by the application? This way an app can do readahead and do work on rows collected by the cursor with WHERE CURRENT OF which gets converted to WHERE OFFSET OF behind the scenes. > >>> The ASAP took a little long. The attached patch is in git diff format, >>> because (1) the regression test intentionally doesn't do ECPGdebug() >>> so the patch isn't dominated by a 2MB stderr file, so this file is empty >>> and (2) regular diff cannot cope with empty new files. > Avoid the empty file with fprintf(STDERR, "Dummy non-empty error output\n"); Fixed. > >>> - *NEW FEATURE* Readahead can be individually enabled or disabled >>> by ECPG-side grammar: >>> DECLARE curname [ [ NO ] READAHEAD ] CURSOR FOR ... >>> Without [ NO ] READAHEAD, the default behaviour is used for cursors. > Likewise, this may as well take a chunk size rather than a yes/no. Done. > The patch adds warnings: > error.c: In function `ecpg_raise': > error.c:281: warning: format not a string literal and no format arguments > error.c:281: warning: format not a string literal and no format arguments Fixed. > The patch adds few comments and no larger comments explaining its higher-level > ideas. That makes it much harder to review. In this regard it follows the > tradition of the ECPG code, but let's depart from that tradition for new work. > I mention a few cases below where the need for commentary is acute. Understood. Adding comments as I go over that code again. > I tested full reads of various "SELECT * FROM generate_series(1, $1)" commands > over a 50ms link, and the patch gives a sound performance improvement. With > no readahead, a mere N=100 takes 5s to drain. With readahead enabled, N=10000 > takes only 3s. Performance was quite similar to that of implementing my own > batching with "FETCH 256 FROM cur". When I kicked up ECPGFETCHSZ (after > fixing its implementation -- see below) past the result set size, performance > was comparable to that of simply passing the query through psql. > >> --- a/doc/src/sgml/ecpg.sgml >> +++ b/doc/src/sgml/ecpg.sgml >> @@ -5289,6 +5315,17 @@ while (1) >> </varlistentry> >> >> <varlistentry> >> + <term>-231 (<symbol>ECPG_INVALID_CURSOR</symbol>)</term> >> + <listitem> >> + <para> >> + The cursor you are trying to use with readahead has not been opened yet (SQLSTATE 34000), >> + invalid values were passed to libecpg (SQLSTATE 42P11) hor not in prerequisite state, i.e. > Typo. Fixed. > >> --- /dev/null >> +++ b/src/interfaces/ecpg/ecpglib/cursor.c >> @@ -0,0 +1,730 @@ > cursor.c contains various >78-col lines. pgindent has limited ability to > improve those, so please split them. > >> +static struct cursor_descriptor * >> +add_cursor(int lineno, struct connection *con, const char *name, bool scrollable, int64 n_tuples, bool *existing) >> +{ >> + struct cursor_descriptor *desc, >> + *ptr, *prev = NULL; >> + bool found = false; >> + >> + if (!name || name[0] == '\0') >> + { >> + if (existing) >> + *existing = false; >> + return NULL; >> + } >> + >> + ptr = con->cursor_desc; >> + while (ptr) >> + { >> + int ret = strcasecmp(ptr->name, name); >> + >> + if (ret == 0) >> + { >> + found = true; >> + break; >> + } >> + if (ret > 0) >> + break; >> + >> + prev = ptr; >> + ptr = ptr->next; >> + } > Any reason not to use find_cursor() here? Because both add_cursor() and del_cursor() needs the "prev" pointer. I now modified find_cursor() and they use it. > >> +static void >> +del_cursor(struct connection *con, const char *name) >> +{ >> + struct cursor_descriptor *ptr, *prev = NULL; >> + bool found = false; >> + >> + ptr = con->cursor_desc; >> + while (ptr) >> + { >> + int ret = strcasecmp(ptr->name, name); >> + >> + if (ret == 0) >> + { >> + found = true; >> + break; >> + } >> + if (ret > 0) >> + break; >> + >> + prev = ptr; >> + ptr = ptr->next; >> + } > Any reason not to use find_cursor() here? > >> +bool >> +ECPGopen(const int lineno, const int compat, const int force_indicator, >> + const char *connection_name, const bool questionmarks, >> + const char *curname, const int st, const char *query, ...) >> +{ >> + va_list args; >> + bool ret, scrollable; >> + char *new_query, *ptr, *whold, *noscroll, *scroll, *dollar0; >> + struct sqlca_t *sqlca = ECPGget_sqlca(); >> + >> + if (!query) >> + { >> + ecpg_raise(lineno, ECPG_EMPTY, ECPG_SQLSTATE_ECPG_INTERNAL_ERROR, NULL); >> + return false; >> + } >> + ptr = strstr(query, "for "); >> + if (!ptr) >> + { >> + ecpg_raise(lineno, ECPG_INVALID_STMT, ECPG_SQLSTATE_ECPG_INTERNAL_ERROR, NULL); >> + return false; >> + } >> + whold = strstr(query, "with hold "); >> + dollar0 = strstr(query, "$0"); >> + >> + noscroll = strstr(query, "no scroll "); >> + scroll = strstr(query, "scroll "); > A query like 'SELECT 1 AS "with hold "' fools these lexical tests. But SELECT 1 AS "with hold" doesn't go through ECPGopen(), it's run by ECPGdo() so no breakage there. ecpglib functions are not intended to be called from manually constructed C code. > Capture > that information in the parser rather than attempting to reconstruct it here. Okay, this makes sense anyway. > >> + scrollable = (noscroll == NULL) && (scroll != NULL) && (scroll < ptr); >> + >> + new_query = ecpg_alloc(strlen(curname) + strlen(ptr) + (whold ? 10 : 0) + 32, lineno); >> + if (!new_query) >> + return false; >> + sprintf(new_query, "declare %s %s cursor %s%s", >> + (dollar0 && (dollar0 < ptr) ? "$0" : curname), >> + (scrollable ? "scroll" : "no scroll"), >> + (whold ? "with hold " : ""), >> + ptr); >> + >> + /* Set the fetch size the first time we are called. */ >> + if (fetch_size == 0) >> + { >> + char *fsize_str = getenv("ECPGFETCHSZ"); >> + char *endptr = NULL; >> + int fsize; >> + >> + if (fsize_str) >> + { >> + fsize = strtoul(fsize_str, &endptr, 10); >> + if (endptr || (fsize < 4)) >> + fetch_size = DEFAULTFETCHSIZE; > "endptr" will never be NULL; use "*endptr". As it stands, the code always > ignores ECPGFETCHSZ. You're right. > An unusable ECPGFETCHSZ should procedure an error, not > silently give no effect. Point taken. Which error handling do imagine? abort() or simply returning false and raise and error in SQLCA? > Why a minimum of 4? I forgot. > >> + else >> + fetch_size = fsize; >> + } >> + else >> + fetch_size = DEFAULTFETCHSIZE; >> + } >> + >> + va_start(args, query); >> + ret = ecpg_do(lineno, compat, force_indicator, connection_name, questionmarks, st, new_query, args); >> + va_end(args); >> + >> + ecpg_free(new_query); >> + >> + /* >> + * If statement went OK, add the cursor and discover the >> + * number of rows in the recordset. This will slow down OPEN >> + * but we gain a lot with caching. >> + */ >> + if (ret /* && sqlca->sqlerrd[2] == 0 */) > Why is the commented code there? Some leftover from testing, it shouldn't be there. > >> + { >> + struct connection *con = ecpg_get_connection(connection_name); >> + struct cursor_descriptor *cur; >> + bool existing; >> + int64 n_tuples; >> + >> + if (scrollable) >> + { >> + PGresult *res; >> + char *query; >> + char *endptr = NULL; >> + >> + query = ecpg_alloc(strlen(curname) + strlen("move all in ") + 2, lineno); >> + sprintf(query, "move all in %s", curname); >> + res = PQexec(con->connection, query); >> + n_tuples = strtoull(PQcmdTuples(res), &endptr, 10); >> + PQclear(res); >> + ecpg_free(query); >> + >> + /* Go back to the beginning of the resultset. */ >> + query = ecpg_alloc(strlen(curname) + strlen("move absolute 0 in ") + 2, lineno); >> + sprintf(query, "move absolute 0 in %s", curname); >> + res = PQexec(con->connection, query); >> + PQclear(res); >> + ecpg_free(query); >> + } >> + else >> + { >> + n_tuples = 0; >> + } > You give this rationale for the above code: > > On Thu, Jun 17, 2010 at 02:09:47PM +0200, Boszormenyi Zoltan wrote: >> ECPGopen() also discovers the total number of records in the recordset, >> so the previous ECPG "deficiency" (backend limitation) that sqlca.sqlerrd[2] >> didn't report the (possibly estimated) number of rows in the resultset >> is now >> overcome. This slows down OPEN for cursors serving larger datasets >> but it makes possible to position the readahead window using MOVE >> ABSOLUTE no matter what FORWARD/BACKWARD/ABSOLUTE/RELATIVE >> variants are used by the application. And the caching is more than >> overweighs >> the slowdown in OPEN it seems. > From the documentation for Informix and Oracle, those databases do not > populate sqlerrd[2] this way: > http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqlt.doc/sqltmst189.htm > http://docs.oracle.com/cd/A57673_01/DOC/api/doc/PC_22/ch10.htm#toc139 The problem here is that Informix in the field in fact returns the number of rows in the cursor and the customer we developed this readahead code for relied on this. Maybe this was eliminated in newer versions of Informix to make it faster. > The performance impact will vary widely depending on the query cost per row > and the fraction of rows the application will actually retrieve. Consider a > complex aggregate returning only a handful of rows. Indeed. > Consider SELECT * on a > 1B-row table with the application ceasing reads after 1000 rows. Performance > aside, this will yield double execution of any volatile functions involved. > So, I think we ought to diligently avoid this step. (Failing that, the > documentation must warn about the extra full cursor scan and this feature must > stay disabled by default.) OK, how about enabling it for Informix-compat mode only, or only via an environment variable? I agree it should be documented. > >> + >> + /* Add the cursor */ >> + cur = add_cursor(lineno, con, curname, scrollable, n_tuples, &existing); >> + >> + /* >> + * Report the number of tuples for the [scrollable] cursor. >> + * The server didn't do it for us. >> + */ >> + sqlca->sqlerrd[2] = (cur->n_tuples < LONG_MAX ? cur->n_tuples : LONG_MAX); >> + } >> + >> + return ret; >> +} >> + >> +static bool >> +ecpg_cursor_execute(struct statement * stmt, struct cursor_descriptor *cur) >> +{ >> + char tmp[64]; >> + char *query; >> + int64 start_pos; >> + >> + if ((cur->cache_pos >= cur->start_pos) && cur->res && (cur->cache_pos < cur->start_pos + PQntuples(cur->res))) >> + { >> + stmt->results = cur->res; >> + ecpg_free_params(stmt, true, stmt->lineno); >> + return true; >> + } > Why does ecpg_cursor_execute() also call ecpg_free_params()? Offhand, it > seems that ECPGfetch() always takes care of that and is the more appropriate > place, seeing it's the one calling ecpg_build_params(). I will look at it. >> --- a/src/interfaces/ecpg/ecpglib/data.c >> +++ b/src/interfaces/ecpg/ecpglib/data.c >> @@ -120,7 +120,7 @@ check_special_value(char *ptr, double *retval, char **endptr) >> } >> >> bool >> -ecpg_get_data(const PGresult *results, int act_tuple, int act_field, int lineno, >> +ecpg_get_data(const PGresult *results, int var_index, int act_tuple, int act_field, int lineno, >> enum ECPGttype type, enum ECPGttype ind_type, >> char *var, char *ind, long varcharsize, long offset, >> long ind_offset, enum ARRAY_TYPE isarray, enum COMPAT_MODE compat, bool force_indicator) > This function could sure use a block comment such as would be customary in > src/backend. Compare the one at heap_update(), for example. OK. > >> --- a/src/interfaces/ecpg/ecpglib/error.c >> +++ b/src/interfaces/ecpg/ecpglib/error.c >> @@ -268,6 +268,20 @@ ecpg_raise(int line, int code, const char *sqlstate, const char *str) >> ecpg_gettext("could not connect to database \"%s\" on line %d"), str, line); >> break; >> >> + case ECPG_INVALID_CURSOR: >> + if (strcmp(sqlstate, ECPG_SQLSTATE_OBJECT_NOT_IN_PREREQUISITE_STATE) == 0) >> + snprintf(sqlca->sqlerrm.sqlerrmc, sizeof(sqlca->sqlerrm.sqlerrmc), >> + >> + /* >> + * translator: this string will be truncated at 149 characters >> + * expanded. >> + */ >> + ecpg_gettext("cursor can only scan forward")); > Every other message works in the line number somehow; this should do the same. Fixed. > >> --- a/src/interfaces/ecpg/ecpglib/execute.c >> +++ b/src/interfaces/ecpg/ecpglib/execute.c >> @@ -1707,46 +1809,20 @@ ecpg_execute(struct statement * stmt) >> } >> >> bool >> -ECPGdo(const int lineno, const int compat, const int force_indicator, const char *connection_name, const bool questionmarks,const int st, const char *query,...) >> +ecpg_do_prologue(int lineno, const int compat, const int force_indicator, >> + const char *connection_name, const bool questionmarks, >> + enum ECPG_statement_type statement_type, const char *query, >> + va_list args, struct statement **stmt_out) > A block comment would especially help this function, considering the name > tells one so little. OK. > >> --- a/src/interfaces/ecpg/ecpglib/extern.h >> +++ b/src/interfaces/ecpg/ecpglib/extern.h >> @@ -60,6 +60,12 @@ struct statement >> bool questionmarks; >> struct variable *inlist; >> struct variable *outlist; >> + char *oldlocale; >> + const char **dollarzero; >> + int ndollarzero; >> + const char **param_values; >> + int nparams; >> + PGresult *results; >> }; > Please comment the members of this struct like we do in most of src/include. OK. > dollarzero has something to do with dynamic cursor names, right? Does it have > other roles? Yes, it had other roles. ECPG supports user variables in cases where the PostgreSQL grammar doesn't. There's this rule: ECPG: var_valueNumericOnly addon if ($1[0] == '$') { free($1); $1 = mm_strdup("$0"); } The "var_value: NumericOnly" case in gram.y can show up in a lot of cases. This feature was there before the dynamic cursor. You can even use them together which means more than one $0 placeholders in the statement. E.g.: FETCH :amount FROM :curname; gets translated to FETCH $0 FROM $0; by ecpg, and both the amount and the cursor name is passed in in user variables. The value is needed by cursor.c, this is why this "dollarzero" pointer is needed. > >> --- a/src/interfaces/ecpg/preproc/ecpg.header >> +++ b/src/interfaces/ecpg/preproc/ecpg.header >> @@ -111,6 +115,26 @@ mmerror(int error_code, enum errortype type, const char *error, ...) >> } >> >> /* >> + * set use_fetch_readahead based on the current cursor >> + * doesn't return if the cursor is not declared >> + */ >> +static void >> +set_cursor_readahead(const char *curname) >> +{ >> + struct cursor *ptr; >> + >> + for (ptr = cur; ptr != NULL; ptr = ptr->next) >> + { >> + if (strcmp(ptr->name, curname) == 0) >> + break; >> + } >> + if (!ptr) >> + mmerror(PARSE_ERROR, ET_FATAL, "cursor \"%s\" does not exist", curname); >> + >> + use_fetch_readahead = ptr->fetch_readahead; >> +} > Following add_additional_variables(), use strcasecmp() for literal cursor > names and strcmp() for cursor name host variables. After modifying the grammar to use numeric values for readahead window size, this function and the "use_fetch_readahead" variable are not needed anymore. > >> --- a/src/interfaces/ecpg/preproc/extern.h >> +++ b/src/interfaces/ecpg/preproc/extern.h >> @@ -24,12 +24,17 @@ extern bool autocommit, >> force_indicator, >> questionmarks, >> regression_mode, >> - auto_prepare; >> + auto_prepare, >> + fetch_readahead; >> +extern bool use_fetch_readahead; > The names of the last two variables don't make clear the difference between > them. I suggest default_fetch_readahead and current_fetch_readahead. Now fetch_readahead is int, the other one is no more. > >> --- /dev/null >> +++ b/src/interfaces/ecpg/test/preproc/cursor-readahead.pgc >> @@ -0,0 +1,244 @@ >> +#include <stdio.h> >> +#include <malloc.h> > Why <malloc.h>? I only see this calling free(); use <stdlib.h> instead. Fixed. I had to update the code, the previous patch didn't apply cleanly to current GIT. I will send the new patch some time next week after fixing the "make check" breakage. > > Thanks, > nm > -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
On Fri, Mar 02, 2012 at 11:41:05AM -0500, Noah Misch wrote: > We yet lack a consensus on whether native ECPG apps should have access to the > feature. My 2c is to make it available, because it's useful syntactic sugar. > If your program independently processes each row of an arbitrary-length result > set, current facilities force you to add an extra outer loop to batch the > FETCHes for every such code site. Applications could define macros to > abstract that pattern, but this seems common-enough to justify bespoke > handling. Besides, minimalists already use libpq directly. Sorry, I don't really understand what you're saying here. The program logic won't change at all when using this feature or what do I misunderstand? > I suggest enabling the feature by default but drastically reducing the default > readahead chunk size from 256 to, say, 5. That still reduces the FETCH round > trip overhead by 80%, but it's small enough not to attract pathological > behavior on a workload where each row is a 10 MiB document. I would not offer > an ecpg-time option to disable the feature per se. Instead, let the user set > the default chunk size at ecpg time. A setting of 1 effectively disables the > feature, though one could later re-enable it with ECPGFETCHSZ. Using 1 to effectively disable the feature is fine with me, but I strongly object any default enabling this feature. It's farily easy to create cases with pathological behaviour and this features is not standard by any means. I figure a normal programmer would expect only one row being transfered when fetching one. Other than that, thanks for the great review. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
2012-03-04 17:16 keltezéssel, Michael Meskes írta: > On Fri, Mar 02, 2012 at 11:41:05AM -0500, Noah Misch wrote: >> We yet lack a consensus on whether native ECPG apps should have access to the >> feature. My 2c is to make it available, because it's useful syntactic sugar. >> If your program independently processes each row of an arbitrary-length result >> set, current facilities force you to add an extra outer loop to batch the >> FETCHes for every such code site. Applications could define macros to >> abstract that pattern, but this seems common-enough to justify bespoke >> handling. Besides, minimalists already use libpq directly. > Sorry, I don't really understand what you're saying here. The program logic > won't change at all when using this feature or what do I misunderstand? The program logic shouldn't change at all. He meant that extra coding effort is needed if you want manual caching. It requires 2 loops instead of 1 if you use FETCH N (N>1). > >> I suggest enabling the feature by default but drastically reducing the default >> readahead chunk size from 256 to, say, 5. That still reduces the FETCH round >> trip overhead by 80%, but it's small enough not to attract pathological >> behavior on a workload where each row is a 10 MiB document. I would not offer >> an ecpg-time option to disable the feature per se. Instead, let the user set >> the default chunk size at ecpg time. A setting of 1 effectively disables the >> feature, though one could later re-enable it with ECPGFETCHSZ. > Using 1 to effectively disable the feature is fine with me, Something else would be needed. For DML with WHERE CURRENT OF cursor, the feature should stay disabled even with the environment variable is set without adding any decoration to the DECLARE statement. The safe thing would be to distinguish between uncached (but cachable) and uncachable cases. A single value cannot work. > but I strongly > object any default enabling this feature. It's farily easy to create cases with > pathological behaviour and this features is not standard by any means. I figure > a normal programmer would expect only one row being transfered when fetching > one. > > Other than that, thanks for the great review. > > Michael Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
On Sun, Mar 04, 2012 at 05:34:50PM +0100, Boszormenyi Zoltan wrote: > The program logic shouldn't change at all. He meant that extra coding effort > is needed if you want manual caching. It requires 2 loops instead of 1 if you use > FETCH N (N>1). Ah, thanks for the explanation. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
On Sun, Mar 04, 2012 at 05:16:06PM +0100, Michael Meskes wrote: > On Fri, Mar 02, 2012 at 11:41:05AM -0500, Noah Misch wrote: > > I suggest enabling the feature by default but drastically reducing the default > > readahead chunk size from 256 to, say, 5. That still reduces the FETCH round > > trip overhead by 80%, but it's small enough not to attract pathological > > behavior on a workload where each row is a 10 MiB document. I would not offer > > an ecpg-time option to disable the feature per se. Instead, let the user set > > the default chunk size at ecpg time. A setting of 1 effectively disables the > > feature, though one could later re-enable it with ECPGFETCHSZ. > > Using 1 to effectively disable the feature is fine with me, but I strongly > object any default enabling this feature. It's farily easy to create cases with > pathological behaviour and this features is not standard by any means. I figure > a normal programmer would expect only one row being transfered when fetching > one. On further reflection, I agree with you here. The prospect for queries that call volatile functions changed my mind; they would exhibit different functional behavior under readahead. We mustn't silently give affected programs different semantics. Thanks, nm
On Sun, Mar 04, 2012 at 04:33:32PM +0100, Boszormenyi Zoltan wrote: > 2012-03-02 17:41 keltez?ssel, Noah Misch ?rta: > > On Thu, Dec 29, 2011 at 10:46:23AM +0100, Boszormenyi Zoltan wrote: > > I suggest enabling the feature by default but drastically reducing the default > > readahead chunk size from 256 to, say, 5. > > > > That still reduces the FETCH round > > trip overhead by 80%, but it's small enough not to attract pathological > > behavior on a workload where each row is a 10 MiB document. > > I see. How about 8? Nice "round" power of 2 value, still small and avoids > 87.5% of overhead. Having pondered the matter further, I now agree with Michael that the feature should stay disabled by default. See my response to him for rationale. Assuming that conclusion holds, we can recommended a higher value to users who enable the feature at all. Your former proposal of 256 seems fine. > BTW, the default disabled behaviour was to avoid "make check" breakage, > see below. > > > I would not offer > > an ecpg-time option to disable the feature per se. Instead, let the user set > > the default chunk size at ecpg time. A setting of 1 effectively disables the > > feature, though one could later re-enable it with ECPGFETCHSZ. > > This means all code previously going through ECPGdo() would go through > ECPGopen()/ECPGfetch()/ECPGclose(). This is more intrusive and all > regression tests that were only testing certain features would also > test the readahead feature, too. It's a good sort of intrusiveness, reducing the likelihood of introducing bugs basically unrelated to readahead that happen to afflict only ECPGdo() or only the cursor.c interfaces. Let's indeed not have any preexisting test cases use readahead per se, but having them use the cursor.c interfaces anyway will build confidence in the new code. The churn in expected debug output isn't ideal, but I don't prefer the alternative of segmenting the implementation for the sake of the test cases. > Also, the test for WHERE CURRENT OF at ecpg time would have to be done > at runtime, possibly making previously working code fail if ECPGFETCHSZ is enabled. Good point. > How about still allowing "NO READAHEAD" cursors that compile into plain ECPGdo()? > This way, ECPGFETCHSZ don't interfere with WHERE CURRENT OF. But this would > mean code changes everywhere where WHERE CURRENT OF is used. ECPGFETCHSZ should only affect cursors that make no explicit mention of READAHEAD. I'm not sure whether that should mean actually routing READHEAD 1 cursors through ECPGdo() or simply making sure that cursor.c achieves the same outcome; see later for a possible reason to still do the latter. > Or how about a new feature in the backend, so ECPG can do > UPDATE/DELETE ... WHERE OFFSET N OF cursor > and the offset of computed from the actual cursor position and the position known > by the application? This way an app can do readahead and do work on rows collected > by the cursor with WHERE CURRENT OF which gets converted to WHERE OFFSET OF > behind the scenes. That's a neat idea, but I would expect obstacles threatening our ability to use it automatically for readahead. You would have to make the cursor a SCROLL cursor. We'll often pass a negative offset, making the operation fail if the cursor query used FOR UPDATE. Volatile functions in the query will get more calls. That's assuming the operation will map internally to something like MOVE N; UPDATE ... WHERE CURRENT OF; MOVE -N. You might come up with innovations to mitigate those obstacles, but those innovations would probably also apply to MOVE/FETCH. In any event, this would constitute a substantive patch in its own right. One way out of trouble here is to make WHERE CURRENT OF imply READHEAD 1/READHEAD 0 (incidentally, perhaps those two should be synonyms) on the affected cursor. If the cursor has some other readahead quantity declared explicitly, throw an error during preprocessing. Failing a reasonable resolution, I'm prepared to withdraw my suggestion of making ECPGFETCHSZ always-usable. It's nice to have, not critical. > >> +bool > >> +ECPGopen(const int lineno, const int compat, const int force_indicator, > >> + const char *connection_name, const bool questionmarks, > >> + const char *curname, const int st, const char *query, ...) > >> +{ > >> + va_list args; > >> + bool ret, scrollable; > >> + char *new_query, *ptr, *whold, *noscroll, *scroll, *dollar0; > >> + struct sqlca_t *sqlca = ECPGget_sqlca(); > >> + > >> + if (!query) > >> + { > >> + ecpg_raise(lineno, ECPG_EMPTY, ECPG_SQLSTATE_ECPG_INTERNAL_ERROR, NULL); > >> + return false; > >> + } > >> + ptr = strstr(query, "for "); > >> + if (!ptr) > >> + { > >> + ecpg_raise(lineno, ECPG_INVALID_STMT, ECPG_SQLSTATE_ECPG_INTERNAL_ERROR, NULL); > >> + return false; > >> + } > >> + whold = strstr(query, "with hold "); > >> + dollar0 = strstr(query, "$0"); > >> + > >> + noscroll = strstr(query, "no scroll "); > >> + scroll = strstr(query, "scroll "); > > A query like 'SELECT 1 AS "with hold "' fools these lexical tests. > > But SELECT 1 AS "with hold" doesn't go through ECPGopen(), it's run by ECPGdo() > so no breakage there. ecpglib functions are not intended to be called from manually > constructed C code. I tried something likeEXEC SQL DECLARE cur CURSOR FOR SELECT * FROM generate_series(1 , $1) AS t("with hold "); It wrongly generated this backend command:declare cur no scroll cursor with hold for select * from generate_series ( 1 ,$1 ) as t ( "with hold " ) > > An unusable ECPGFETCHSZ should procedure an error, not > > silently give no effect. > > Point taken. Which error handling do imagine? abort() or simply returning false > and raise and error in SQLCA? The latter. > >> + /* > >> + * If statement went OK, add the cursor and discover the > >> + * number of rows in the recordset. This will slow down OPEN > >> + * but we gain a lot with caching. > >> + */ > >> + if (ret /* && sqlca->sqlerrd[2] == 0 */) > > Why is the commented code there? > > Some leftover from testing, it shouldn't be there. > > > > >> + { > >> + struct connection *con = ecpg_get_connection(connection_name); > >> + struct cursor_descriptor *cur; > >> + bool existing; > >> + int64 n_tuples; > >> + > >> + if (scrollable) > >> + { > >> + PGresult *res; > >> + char *query; > >> + char *endptr = NULL; > >> + > >> + query = ecpg_alloc(strlen(curname) + strlen("move all in ") + 2, lineno); > >> + sprintf(query, "move all in %s", curname); > >> + res = PQexec(con->connection, query); > >> + n_tuples = strtoull(PQcmdTuples(res), &endptr, 10); > >> + PQclear(res); > >> + ecpg_free(query); > >> + > >> + /* Go back to the beginning of the resultset. */ > >> + query = ecpg_alloc(strlen(curname) + strlen("move absolute 0 in ") + 2, lineno); > >> + sprintf(query, "move absolute 0 in %s", curname); > >> + res = PQexec(con->connection, query); > >> + PQclear(res); > >> + ecpg_free(query); > >> + } > >> + else > >> + { > >> + n_tuples = 0; > >> + } > > You give this rationale for the above code: > > > > On Thu, Jun 17, 2010 at 02:09:47PM +0200, Boszormenyi Zoltan wrote: > >> ECPGopen() also discovers the total number of records in the recordset, > >> so the previous ECPG "deficiency" (backend limitation) that sqlca.sqlerrd[2] > >> didn't report the (possibly estimated) number of rows in the resultset > >> is now > >> overcome. This slows down OPEN for cursors serving larger datasets > >> but it makes possible to position the readahead window using MOVE > >> ABSOLUTE no matter what FORWARD/BACKWARD/ABSOLUTE/RELATIVE > >> variants are used by the application. And the caching is more than > >> overweighs > >> the slowdown in OPEN it seems. > > From the documentation for Informix and Oracle, those databases do not > > populate sqlerrd[2] this way: > > http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqlt.doc/sqltmst189.htm > > http://docs.oracle.com/cd/A57673_01/DOC/api/doc/PC_22/ch10.htm#toc139 > > The problem here is that Informix in the field in fact returns the number of rows > in the cursor and the customer we developed this readahead code for relied on this. > Maybe this was eliminated in newer versions of Informix to make it faster. > > > The performance impact will vary widely depending on the query cost per row > > and the fraction of rows the application will actually retrieve. Consider a > > complex aggregate returning only a handful of rows. > > Indeed. > > > Consider SELECT * on a > > 1B-row table with the application ceasing reads after 1000 rows. Performance > > aside, this will yield double execution of any volatile functions involved. > > So, I think we ought to diligently avoid this step. (Failing that, the > > documentation must warn about the extra full cursor scan and this feature must > > stay disabled by default.) > > OK, how about enabling it for Informix-compat mode only, or only via an > environment variable? I agree it should be documented. For a query where backend execution cost dominates the cost of transferring rows to the client, does Informix take roughly twice the normal time to execute the query via an ESQL/C cursor? Is that acceptable overhead for every "ecpg -C" user? (FWIW, I've never used Informix-compat mode.) If not, the feature deserves its own option. Whatever the trigger condition, shouldn't this apply independent of whether readahead is in use for a given cursor? (This could constitute a reason to use the cursor.c interfaces for every cursor.) Does some vendor-neutral standard define semantics for sqlerrd, or has it propagated by imitation? This reminds me to mention: your documentation should note that the use of readahead or the option that enables sqlerrd[2] calculation may change the outcome of queries calling volatile functions. See how the DECLARE documentation page discusses this hazard for SCROLL/WITH HOLD cursors. > >> --- a/src/interfaces/ecpg/ecpglib/extern.h > >> +++ b/src/interfaces/ecpg/ecpglib/extern.h > >> @@ -60,6 +60,12 @@ struct statement > >> bool questionmarks; > >> struct variable *inlist; > >> struct variable *outlist; > >> + char *oldlocale; > >> + const char **dollarzero; > >> + int ndollarzero; > >> + const char **param_values; > >> + int nparams; > >> + PGresult *results; > >> }; > > Please comment the members of this struct like we do in most of src/include. > > OK. > > > dollarzero has something to do with dynamic cursor names, right? Does it have > > other roles? > > Yes, it had other roles. ECPG supports user variables in cases where the > PostgreSQL grammar doesn't. There's this rule: > > ECPG: var_valueNumericOnly addon > if ($1[0] == '$') > { > free($1); > $1 = mm_strdup("$0"); > } > > The "var_value: NumericOnly" case in gram.y can show up in a lot of cases. > This feature was there before the dynamic cursor. You can even use them together > which means more than one $0 placeholders in the statement. E.g.: > FETCH :amount FROM :curname; > gets translated to > FETCH $0 FROM $0; > by ecpg, and both the amount and the cursor name is passed in in user variables. > The value is needed by cursor.c, this is why this "dollarzero" pointer is needed. Thanks for that explanation; the situation is clearer to me now. nm
2012-03-05 19:56 keltezéssel, Noah Misch írta: > > Having pondered the matter further, I now agree with Michael that the feature > should stay disabled by default. See my response to him for rationale. > Assuming that conclusion holds, we can recommended a higher value to users who > enable the feature at all. Your former proposal of 256 seems fine. OK. > >> BTW, the default disabled behaviour was to avoid "make check" breakage, >> see below. >> >>> I would not offer >>> an ecpg-time option to disable the feature per se. Instead, let the user set >>> the default chunk size at ecpg time. A setting of 1 effectively disables the >>> feature, though one could later re-enable it with ECPGFETCHSZ. >> This means all code previously going through ECPGdo() would go through >> ECPGopen()/ECPGfetch()/ECPGclose(). This is more intrusive and all >> regression tests that were only testing certain features would also >> test the readahead feature, too. > It's a good sort of intrusiveness, reducing the likelihood of introducing bugs > basically unrelated to readahead that happen to afflict only ECPGdo() or only > the cursor.c interfaces. Let's indeed not have any preexisting test cases use > readahead per se, but having them use the cursor.c interfaces anyway will > build confidence in the new code. The churn in expected debug output isn't > ideal, but I don't prefer the alternative of segmenting the implementation for > the sake of the test cases. I see. > >> Also, the test for WHERE CURRENT OF at ecpg time would have to be done >> at runtime, possibly making previously working code fail if ECPGFETCHSZ is enabled. > Good point. > >> How about still allowing "NO READAHEAD" cursors that compile into plain ECPGdo()? >> This way, ECPGFETCHSZ don't interfere with WHERE CURRENT OF. But this would >> mean code changes everywhere where WHERE CURRENT OF is used. > ECPGFETCHSZ should only affect cursors that make no explicit mention of > READAHEAD. I'm not sure whether that should mean actually routing READHEAD 1 > cursors through ECPGdo() or simply making sure that cursor.c achieves the same > outcome; see later for a possible reason to still do the latter. > >> Or how about a new feature in the backend, so ECPG can do >> UPDATE/DELETE ... WHERE OFFSET N OF cursor >> and the offset of computed from the actual cursor position and the position known >> by the application? This way an app can do readahead and do work on rows collected >> by the cursor with WHERE CURRENT OF which gets converted to WHERE OFFSET OF >> behind the scenes. > That's a neat idea, but I would expect obstacles threatening our ability to > use it automatically for readahead. You would have to make the cursor a > SCROLL cursor. We'll often pass a negative offset, making the operation fail > if the cursor query used FOR UPDATE. Volatile functions in the query will get > more calls. That's assuming the operation will map internally to something > like MOVE N; UPDATE ... WHERE CURRENT OF; MOVE -N. You might come up with > innovations to mitigate those obstacles, but those innovations would probably > also apply to MOVE/FETCH. In any event, this would constitute a substantive > patch in its own right. I was thinking along the lines of a Portal keeping the ItemPointerData for each tuple in the last FETCH statement. The WHERE OFFSET N OF cursor would treat the offset value relative to the tuple order returned by FETCH. So, OFFSET 0 OF == CURRENT OF and other values of N are negative. This way, it doesn't matter if the cursor is SCROLL, NO SCROLL or havethe default behaviour with "SCROLL in some cases".Then ECPGopen() doesn't have to play games with the DECLARE statement. Only ECPGfetch() needs to play with MOVE statements, passing different offsets to the backend, not what the application passed. > One way out of trouble here is to make WHERE CURRENT OF imply READHEAD > 1/READHEAD 0 (incidentally, perhaps those two should be synonyms) on the > affected cursor. If the cursor has some other readahead quantity declared > explicitly, throw an error during preprocessing. I played with this idea a while ago, from a different point of view. If the ECPG code had the DECLARE mycur, DML ... WHERE CURRENT OF mycur and OPEN mycur in exactly this order, i.e. WHERE CURRENT OF appears in a standalone function between DECLARE and the first OPEN for the cursor, then ECPG disabled readahead automatically for that cursor and for that cursor only. But this requires effort on the user of ECPG and can be very fragile. Code cleanup with reordering functions can break previously working code. > Failing a reasonable resolution, I'm prepared to withdraw my suggestion of > making ECPGFETCHSZ always-usable. It's nice to have, not critical. > >>>> +bool >>>> +ECPGopen(const int lineno, const int compat, const int force_indicator, >>>> + const char *connection_name, const bool questionmarks, >>>> + const char *curname, const int st, const char *query, ...) >>>> +{ >>>> + va_list args; >>>> + bool ret, scrollable; >>>> + char *new_query, *ptr, *whold, *noscroll, *scroll, *dollar0; >>>> + struct sqlca_t *sqlca = ECPGget_sqlca(); >>>> + >>>> + if (!query) >>>> + { >>>> + ecpg_raise(lineno, ECPG_EMPTY, ECPG_SQLSTATE_ECPG_INTERNAL_ERROR, NULL); >>>> + return false; >>>> + } >>>> + ptr = strstr(query, "for "); >>>> + if (!ptr) >>>> + { >>>> + ecpg_raise(lineno, ECPG_INVALID_STMT, ECPG_SQLSTATE_ECPG_INTERNAL_ERROR, NULL); >>>> + return false; >>>> + } >>>> + whold = strstr(query, "with hold "); >>>> + dollar0 = strstr(query, "$0"); >>>> + >>>> + noscroll = strstr(query, "no scroll "); >>>> + scroll = strstr(query, "scroll "); >>> A query like 'SELECT 1 AS "with hold "' fools these lexical tests. >> But SELECT 1 AS "with hold" doesn't go through ECPGopen(), it's run by ECPGdo() >> so no breakage there. ecpglib functions are not intended to be called from manually >> constructed C code. > I tried something like > EXEC SQL DECLARE cur CURSOR FOR SELECT * FROM generate_series(1 , $1) AS t("with hold "); > It wrongly generated this backend command: > declare cur no scroll cursor with hold for select * from generate_series ( 1 , $1 ) as t ( "with hold " ) Ah, ok. The grammar test in ecpg is better. >>> An unusable ECPGFETCHSZ should procedure an error, not >>> silently give no effect. >> Point taken. Which error handling do imagine? abort() or simply returning false >> and raise and error in SQLCA? > The latter. > >>>> + /* >>>> + * If statement went OK, add the cursor and discover the >>>> + * number of rows in the recordset. This will slow down OPEN >>>> + * but we gain a lot with caching. >>>> + */ >>>> + if (ret /* && sqlca->sqlerrd[2] == 0 */) >>> Why is the commented code there? >> Some leftover from testing, it shouldn't be there. Actually, now I remember. It was a wishful thinking on my part that whenever PG supports returning a number of rows at opening the cursor, correct or estimated, this code shouldn't be executed, just accept what the backend has given us. >> >>>> + { >>>> + struct connection *con = ecpg_get_connection(connection_name); >>>> + struct cursor_descriptor *cur; >>>> + bool existing; >>>> + int64 n_tuples; >>>> + >>>> + if (scrollable) >>>> + { >>>> + PGresult *res; >>>> + char *query; >>>> + char *endptr = NULL; >>>> + >>>> + query = ecpg_alloc(strlen(curname) + strlen("move all in ") + 2, lineno); >>>> + sprintf(query, "move all in %s", curname); >>>> + res = PQexec(con->connection, query); >>>> + n_tuples = strtoull(PQcmdTuples(res), &endptr, 10); >>>> + PQclear(res); >>>> + ecpg_free(query); >>>> + >>>> + /* Go back to the beginning of the resultset. */ >>>> + query = ecpg_alloc(strlen(curname) + strlen("move absolute 0 in ") + 2, lineno); >>>> + sprintf(query, "move absolute 0 in %s", curname); >>>> + res = PQexec(con->connection, query); >>>> + PQclear(res); >>>> + ecpg_free(query); >>>> + } >>>> + else >>>> + { >>>> + n_tuples = 0; >>>> + } >>> You give this rationale for the above code: >>> >>> On Thu, Jun 17, 2010 at 02:09:47PM +0200, Boszormenyi Zoltan wrote: >>>> ECPGopen() also discovers the total number of records in the recordset, >>>> so the previous ECPG "deficiency" (backend limitation) that sqlca.sqlerrd[2] >>>> didn't report the (possibly estimated) number of rows in the resultset >>>> is now >>>> overcome. This slows down OPEN for cursors serving larger datasets >>>> but it makes possible to position the readahead window using MOVE >>>> ABSOLUTE no matter what FORWARD/BACKWARD/ABSOLUTE/RELATIVE >>>> variants are used by the application. And the caching is more than >>>> overweighs >>>> the slowdown in OPEN it seems. >>> From the documentation for Informix and Oracle, those databases do not >>> populate sqlerrd[2] this way: >>> http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqlt.doc/sqltmst189.htm >>> http://docs.oracle.com/cd/A57673_01/DOC/api/doc/PC_22/ch10.htm#toc139 >> The problem here is that Informix in the field in fact returns the number of rows >> in the cursor and the customer we developed this readahead code for relied on this. >> Maybe this was eliminated in newer versions of Informix to make it faster. >> >>> The performance impact will vary widely depending on the query cost per row >>> and the fraction of rows the application will actually retrieve. Consider a >>> complex aggregate returning only a handful of rows. >> Indeed. >> >>> Consider SELECT * on a >>> 1B-row table with the application ceasing reads after 1000 rows. Performance >>> aside, this will yield double execution of any volatile functions involved. >>> So, I think we ought to diligently avoid this step. (Failing that, the >>> documentation must warn about the extra full cursor scan and this feature must >>> stay disabled by default.) >> OK, how about enabling it for Informix-compat mode only, or only via an >> environment variable? I agree it should be documented. > For a query where backend execution cost dominates the cost of transferring > rows to the client, does Informix take roughly twice the normal time to > execute the query via an ESQL/C cursor? Is that acceptable overhead for every > "ecpg -C" user? (FWIW, I've never used Informix-compat mode.) If not, the > feature deserves its own option. > > Whatever the trigger condition, shouldn't this apply independent of whether > readahead is in use for a given cursor? I guess so. > (This could constitute a reason to > use the cursor.c interfaces for every cursor.) Indeed. > Does some vendor-neutral standard define semantics for sqlerrd, or has it > propagated by imitation? No idea. > This reminds me to mention: your documentation should note that the use of > readahead or the option that enables sqlerrd[2] calculation may change the > outcome of queries calling volatile functions. See how the DECLARE > documentation page discusses this hazard for SCROLL/WITH HOLD cursors. OK. > >>>> --- a/src/interfaces/ecpg/ecpglib/extern.h >>>> +++ b/src/interfaces/ecpg/ecpglib/extern.h >>>> @@ -60,6 +60,12 @@ struct statement >>>> bool questionmarks; >>>> struct variable *inlist; >>>> struct variable *outlist; >>>> + char *oldlocale; >>>> + const char **dollarzero; >>>> + int ndollarzero; >>>> + const char **param_values; >>>> + int nparams; >>>> + PGresult *results; >>>> }; >>> Please comment the members of this struct like we do in most of src/include. >> OK. >> >>> dollarzero has something to do with dynamic cursor names, right? Does it have >>> other roles? >> Yes, it had other roles. ECPG supports user variables in cases where the >> PostgreSQL grammar doesn't. There's this rule: >> >> ECPG: var_valueNumericOnly addon >> if ($1[0] == '$') >> { >> free($1); >> $1 = mm_strdup("$0"); >> } >> >> The "var_value: NumericOnly" case in gram.y can show up in a lot of cases. >> This feature was there before the dynamic cursor. You can even use them together >> which means more than one $0 placeholders in the statement. E.g.: >> FETCH :amount FROM :curname; >> gets translated to >> FETCH $0 FROM $0; >> by ecpg, and both the amount and the cursor name is passed in in user variables. >> The value is needed by cursor.c, this is why this "dollarzero" pointer is needed. > Thanks for that explanation; the situation is clearer to me now. > > nm > -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
On Tue, Mar 06, 2012 at 07:07:41AM +0100, Boszormenyi Zoltan wrote: > 2012-03-05 19:56 keltez?ssel, Noah Misch ?rta: > >> Or how about a new feature in the backend, so ECPG can do > >> UPDATE/DELETE ... WHERE OFFSET N OF cursor > >> and the offset of computed from the actual cursor position and the position known > >> by the application? This way an app can do readahead and do work on rows collected > >> by the cursor with WHERE CURRENT OF which gets converted to WHERE OFFSET OF > >> behind the scenes. > > That's a neat idea, but I would expect obstacles threatening our ability to > > use it automatically for readahead. You would have to make the cursor a > > SCROLL cursor. We'll often pass a negative offset, making the operation fail > > if the cursor query used FOR UPDATE. Volatile functions in the query will get > > more calls. That's assuming the operation will map internally to something > > like MOVE N; UPDATE ... WHERE CURRENT OF; MOVE -N. You might come up with > > innovations to mitigate those obstacles, but those innovations would probably > > also apply to MOVE/FETCH. In any event, this would constitute a substantive > > patch in its own right. > > I was thinking along the lines of a Portal keeping the ItemPointerData > for each tuple in the last FETCH statement. The WHERE OFFSET N OF cursor > would treat the offset value relative to the tuple order returned by FETCH. > So, OFFSET 0 OF == CURRENT OF and other values of N are negative. > This way, it doesn't matter if the cursor is SCROLL, NO SCROLL or have > the default behaviour with "SCROLL in some cases". Then ECPGopen() > doesn't have to play games with the DECLARE statement. Only ECPGfetch() > needs to play with MOVE statements, passing different offsets to the backend, > not what the application passed. That broad approach sounds promising. The main other consideration that comes to mind is a plan to limit resource usage for a cursor that reads, say, 1B rows. However, I think attempting to implement this now will significantly decrease the chance of getting the core patch features committed now. > > One way out of trouble here is to make WHERE CURRENT OF imply READHEAD > > 1/READHEAD 0 (incidentally, perhaps those two should be synonyms) on the > > affected cursor. If the cursor has some other readahead quantity declared > > explicitly, throw an error during preprocessing. > > I played with this idea a while ago, from a different point of view. > If the ECPG code had the DECLARE mycur, DML ... WHERE CURRENT OF mycur > and OPEN mycur in exactly this order, i.e. WHERE CURRENT OF appears in > a standalone function between DECLARE and the first OPEN for the cursor, > then ECPG disabled readahead automatically for that cursor and for that > cursor only. But this requires effort on the user of ECPG and can be very > fragile. Code cleanup with reordering functions can break previously > working code. Don't the same challenges apply to accurately reporting an error when the user specifies WHERE CURRENT OF for a readahead cursor?
On Tue, Mar 6, 2012 at 6:06 AM, Noah Misch <noah@leadboat.com> wrote: > On Tue, Mar 06, 2012 at 07:07:41AM +0100, Boszormenyi Zoltan wrote: >> 2012-03-05 19:56 keltez?ssel, Noah Misch ?rta: >> >> Or how about a new feature in the backend, so ECPG can do >> >> UPDATE/DELETE ... WHERE OFFSET N OF cursor >> >> and the offset of computed from the actual cursor position and the position known >> >> by the application? This way an app can do readahead and do work on rows collected >> >> by the cursor with WHERE CURRENT OF which gets converted to WHERE OFFSET OF >> >> behind the scenes. >> > That's a neat idea, but I would expect obstacles threatening our ability to >> > use it automatically for readahead. You would have to make the cursor a >> > SCROLL cursor. We'll often pass a negative offset, making the operation fail >> > if the cursor query used FOR UPDATE. Volatile functions in the query will get >> > more calls. That's assuming the operation will map internally to something >> > like MOVE N; UPDATE ... WHERE CURRENT OF; MOVE -N. You might come up with >> > innovations to mitigate those obstacles, but those innovations would probably >> > also apply to MOVE/FETCH. In any event, this would constitute a substantive >> > patch in its own right. >> >> I was thinking along the lines of a Portal keeping the ItemPointerData >> for each tuple in the last FETCH statement. The WHERE OFFSET N OF cursor >> would treat the offset value relative to the tuple order returned by FETCH. >> So, OFFSET 0 OF == CURRENT OF and other values of N are negative. >> This way, it doesn't matter if the cursor is SCROLL, NO SCROLL or have >> the default behaviour with "SCROLL in some cases". Then ECPGopen() >> doesn't have to play games with the DECLARE statement. Only ECPGfetch() >> needs to play with MOVE statements, passing different offsets to the backend, >> not what the application passed. > > That broad approach sounds promising. The main other consideration that comes > to mind is a plan to limit resource usage for a cursor that reads, say, 1B > rows. However, I think attempting to implement this now will significantly > decrease the chance of getting the core patch features committed now. > >> > One way out of trouble here is to make WHERE CURRENT OF imply READHEAD >> > 1/READHEAD 0 (incidentally, perhaps those two should be synonyms) on the >> > affected cursor. If the cursor has some other readahead quantity declared >> > explicitly, throw an error during preprocessing. >> >> I played with this idea a while ago, from a different point of view. >> If the ECPG code had the DECLARE mycur, DML ... WHERE CURRENT OF mycur >> and OPEN mycur in exactly this order, i.e. WHERE CURRENT OF appears in >> a standalone function between DECLARE and the first OPEN for the cursor, >> then ECPG disabled readahead automatically for that cursor and for that >> cursor only. But this requires effort on the user of ECPG and can be very >> fragile. Code cleanup with reordering functions can break previously >> working code. > > Don't the same challenges apply to accurately reporting an error when the user > specifies WHERE CURRENT OF for a readahead cursor? I think we need either an updated version of this patch that's ready for commit real soon now, or we need to postpone it to 9.3. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
2012-03-15 21:59 keltezéssel, Robert Haas írta: > I think we need either an updated version of this patch that's ready for commit real > soon now, or we need to postpone it to 9.3. Sorry for the delay, I had been busy with other tasks and I rewrote this code to better cope with unknown result size, scrollable cursors and negative cursor positions. I think all points raised by Noah is addressed: per-cursor readahead window size, extensive comments, documentation and not enabling result set size discovery. Also, I noticed this in passing: static void free_variable(struct variable * var) { struct variable *var_next; if (var == NULL) return; var_next = var->next; ecpg_free(var); while (var_next) { var = var_next; var_next = var->next; ecpg_free(var); } } I rewrote this as below to eliminate manual unrolling of the loop, they are equivalent: static void free_variable(struct variable * var) { struct variable *var_next; while (var) { var_next = var->next; ecpg_free(var); var = var_next; } } The problem with WHERE CURRENT OF is solved by a little more grammar and ecpglib code, which effectively does a MOVE ABSOLUTE N before executing the DML with WHERE CURRENT OF clause. No patching of the backend. This way, the new ECPG caching code is compatible with older servers but obviously reduces the efficiency of caching. Attached are two patches, the first one is the feature patch. The second patch makes all cursor statements go through the new caching functions with 1 tuple readahead window size. It only changes the preprocessed code and stderr logs of the regression tests affected, not their results. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig& Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
On Sat, Mar 24, 2012 at 10:49:07AM +0100, Boszormenyi Zoltan wrote: > Sorry for the delay, I had been busy with other tasks and I rewrote this code > to better cope with unknown result size, scrollable cursors and negative > cursor positions. > > I think all points raised by Noah is addressed: per-cursor readahead window size, > extensive comments, documentation and not enabling result set size discovery. The new comments are a nice improvement; thanks. > The problem with WHERE CURRENT OF is solved by a little more grammar > and ecpglib code, which effectively does a MOVE ABSOLUTE N before > executing the DML with WHERE CURRENT OF clause. No patching of the > backend. This way, the new ECPG caching code is compatible with older > servers but obviously reduces the efficiency of caching. Good plan. > diff -dcrpN postgresql.orig/doc/src/sgml/ecpg.sgml postgresql/doc/src/sgml/ecpg.sgml > *** postgresql.orig/doc/src/sgml/ecpg.sgml 2012-03-12 09:24:31.699560098 +0100 > --- postgresql/doc/src/sgml/ecpg.sgml 2012-03-24 10:15:00.538924601 +0100 > *************** EXEC SQL COMMIT; > *** 454,459 **** > --- 454,479 ---- > details. > </para> > > + <para> > + ECPG may use cursor readahead to improve performance of programs > + that use single-row FETCH statements. Option <literal>-R number</literal> > + option for ECPG modifies the default for all cursors from <literal>NO READAHEAD</literal> This sentence duplicates the word "option". > + to <literal>READAHEAD number</literal>. Explicit <literal>READAHEAD number</literal> or > + <literal>NO READAHEAD</literal> turns cursor readahead on (with <literal>number</literal> > + as the size of the readahead window) or off for the specified cursor, > + respectively. For cursors using a non-0 readahead window size is 256 rows, The number 256 is no longer present in your implementation. > + the window size may be modified by setting the <literal>ECPGFETCHSZ</literal> > + environment variable to a different value. I had in mind that DECLARE statements adorned with READAHEAD syntax would always behave precisely as written, independent of ECPGFETCHSZ or "ecpg -R". Unadorned DECLARE statements would use ECPGFETCHSZ if set, then the value passed to "ecpg -R" if provided, and finally a value of 1 (no readahead) in the absence of both ECPGFETCHSZ and "ecpg -R". Did you do it differently for a particular reason? I don't particularly object to what you've implemented, but I'd be curious to hear your thinking. > + </para> > + > + <para> > + <command>UPDATE</command> or <command>DELETE</command> with the > + <literal>WHERE CURRENT OF</literal> clause, cursor readahead may or may not > + actually improve performance, as <command>MOVE</command> statement has to be > + sent to the backend before the DML statement to ensure correct cursor position > + in the backend. > + </para> This sentence seems to be missing a word near its beginning. > *************** DECLARE <replaceable class="PARAMETER">c > *** 6639,6649 **** > </listitem> > </varlistentry> > </variablelist> > > <para> > ! For the meaning of the cursor options, > ! see <xref linkend="sql-declare">. > </para> > </refsect1> > > <refsect1> > --- 6669,6728 ---- > </listitem> > </varlistentry> > </variablelist> > + </refsect1> > + > + <refsect1> > + <title>Cursor options</title> > > <para> > ! For the meaning of other cursor options, see <xref linkend="sql-declare">. > </para> > + > + <variablelist> > + <varlistentry> > + <term><literal>READAHEAD number</literal></term> > + <term><literal>NO READAHEAD</literal></term> > + <listitem> > + <para> > + <literal>READAHEAD number</literal> makes the ECPG preprocessor and > + runtime library use a client-side cursor accounting and data readahead > + during <command>FETCH</command>. This improves performance for programs > + that use single-row <command>FETCH</command> statements. > + </para> > + > + <para> > + <literal>NO READAHEAD</literal> disables data readahead in case > + <parameter>-R number</parameter> is used for compiling the file. > + </para> > + </listitem> > + </varlistentry> One of the new sections about readahead should somehow reference the hazard around volatile functions. > + > + <varlistentry> > + <term><literal>OPEN RETURNS LAST ROW POSITION</literal></term> > + <term><literal>OPEN RETURNS 0 FOR LAST ROW POSITION</literal></term> > + <listitem> > + <para> > + When the cursor is opened, it's possible to discover the size of the result set > + using <command>MOVE ALL</command> which traverses the result set and move > + the cursor back to the beginning using <command>MOVE ABSOLUTE 0</command>. > + The size of the result set is returned in sqlca.sqlerrd[2]. > + </para> > + > + <para> > + This slows down opening the cursor from the application point of view > + but may also have side effects. If the cursor query contains volatile function > + calls with side effects, they will be evaluated twice because of traversing > + the result set this way during <command>OPEN</command>. > + </para> > + > + <para> > + The default is not to discover. > + </para> I mildly oppose having syntax to enable this per-cursor. Readahead is a generally handy feature that I might use in new programs, but this feature is more of a hack for compatibility with some old Informix version. For new code, authors should do their own MOVEs instead of using this option. The patch also adds an undocumented ECPGOPENRETURNSRESULTSETSIZE environment variable to control this. I see no use for such a thing, because a program's dependency on sqlerrd[2] is fixed at build time. If a program does not reference sqlerrd[2] for a cursor, there's no benefit from choosing at runtime to populate that value anyway. If a program does reference it, any option needed to have it set correctly had better be set at build time and apply to every run of the final program. IOW, I suggest having only the "ecpg"-time option to enable this behavior. Thoughts? > + </listitem> > + </varlistentry> > + > + </variablelist> > + > </refsect1> > > <refsect1> > diff -dcrpN postgresql.orig/doc/src/sgml/ref/ecpg-ref.sgml postgresql/doc/src/sgml/ref/ecpg-ref.sgml > *** postgresql.orig/doc/src/sgml/ref/ecpg-ref.sgml 2011-08-07 11:29:16.003256959 +0200 > --- postgresql/doc/src/sgml/ref/ecpg-ref.sgml 2012-03-24 10:13:08.679284063 +0100 > *************** PostgreSQL documentation > *** 171,176 **** > --- 171,196 ---- > </varlistentry> > > <varlistentry> > + <term><option>-R <replaceable>number</replaceable></option></term> > + <listitem> > + <para> > + Turn on cursor readahead by default using <replaceable>number</replaceable> > + as readahead window size. > + </para> > + </listitem> > + </varlistentry> > + > + <varlistentry> > + <term><option>--detect-cursor-resultset-size</option></term> > + <listitem> > + <para> > + Detect the cursor result set size during <command>OPEN</command> and > + return it in sqlca.sqlerrd[2]. Make reference to this option in the ecpg-sqlca section, where sqlerrd[2] has its primary documentation. > + </para> > + </listitem> > + </varlistentry> > + > + <varlistentry> > <term><option>-t</option></term> > <listitem> > <para> I did not re-review the code changes in the same detail as before, but nothing caught my attention when scanning through them. If some particular section has changed in a tricky way and deserves a close look, let me know. The documentation-cosmetic and policy questions I raise above shouldn't entail large-scale patch churn, so I've marked the patch Ready for Committer. Thanks, nm
2012-03-29 02:43 keltezéssel, Noah Misch írta: > On Sat, Mar 24, 2012 at 10:49:07AM +0100, Boszormenyi Zoltan wrote: >> Sorry for the delay, I had been busy with other tasks and I rewrote this code >> to better cope with unknown result size, scrollable cursors and negative >> cursor positions. >> >> I think all points raised by Noah is addressed: per-cursor readahead window size, >> extensive comments, documentation and not enabling result set size discovery. > The new comments are a nice improvement; thanks. > >> The problem with WHERE CURRENT OF is solved by a little more grammar >> and ecpglib code, which effectively does a MOVE ABSOLUTE N before >> executing the DML with WHERE CURRENT OF clause. No patching of the >> backend. This way, the new ECPG caching code is compatible with older >> servers but obviously reduces the efficiency of caching. > Good plan. > >> diff -dcrpN postgresql.orig/doc/src/sgml/ecpg.sgml postgresql/doc/src/sgml/ecpg.sgml >> *** postgresql.orig/doc/src/sgml/ecpg.sgml 2012-03-12 09:24:31.699560098 +0100 >> --- postgresql/doc/src/sgml/ecpg.sgml 2012-03-24 10:15:00.538924601 +0100 >> *************** EXEC SQL COMMIT; >> *** 454,459 **** >> --- 454,479 ---- >> details. >> </para> >> >> +<para> >> + ECPG may use cursor readahead to improve performance of programs >> + that use single-row FETCH statements. Option<literal>-R number</literal> >> + option for ECPG modifies the default for all cursors from<literal>NO READAHEAD</literal> > This sentence duplicates the word "option". Fixed. > >> + to<literal>READAHEAD number</literal>. Explicit<literal>READAHEAD number</literal> or >> +<literal>NO READAHEAD</literal> turns cursor readahead on (with<literal>number</literal> >> + as the size of the readahead window) or off for the specified cursor, >> + respectively. For cursors using a non-0 readahead window size is 256 rows, > The number 256 is no longer present in your implementation. Indeed. Oversight, that part of the sentence is deleted. > >> + the window size may be modified by setting the<literal>ECPGFETCHSZ</literal> >> + environment variable to a different value. > I had in mind that DECLARE statements adorned with READAHEAD syntax would > always behave precisely as written, independent of ECPGFETCHSZ or "ecpg -R". > Unadorned DECLARE statements would use ECPGFETCHSZ if set, then the value > passed to "ecpg -R" if provided, and finally a value of 1 (no readahead) in > the absence of both ECPGFETCHSZ and "ecpg -R". Did you do it differently for > a particular reason? I don't particularly object to what you've implemented, > but I'd be curious to hear your thinking. What I had in mind was: NO READAHEAD == READAHEAD 0. This will translate into 1 tuple sized readahead window that cannot be modified by ECPGFETCHSZ. READAHEAD 1 also means uncached by default but ECPGFETCHSZ may modify the readahead window size. Values greater than 1 means cached by default with the specified window size but it can also be overridden (even disabled) just in case. This part can be debated, I agree. This is in add_cursor(): -----8<-----8<-----8<-----8<-----8<-----8<----- desc->cachable = (ra_size > 0); /* * If this cursor is allowed to be cached, the readahead * windows is also allowed to be overridden (possibly * even disabled) by default_fetch_size if it's set. */ if (desc->cachable) desc->ra_size = (default_fetch_size > 0 ? default_fetch_size : ra_size); else desc->ra_size = 1; -----8<-----8<-----8<-----8<-----8<-----8<----- The check (default_fetch_size > 0) can be modified so it reads (default_fetch_size > ra_size) so the override can only happen upwards. This part is policy that can be debated and modified accordingly, it doesn't affect the internals of the caching code. > >> +</para> >> + >> +<para> >> +<command>UPDATE</command> or<command>DELETE</command> with the >> +<literal>WHERE CURRENT OF</literal> clause, cursor readahead may or may not >> + actually improve performance, as<command>MOVE</command> statement has to be >> + sent to the backend before the DML statement to ensure correct cursor position >> + in the backend. >> +</para> > This sentence seems to be missing a word near its beginning. Sounds like a corner case to me, but I am not a native english speaker. Which one sounds better: UPDATE or DELETE with the WHERE CURRENT OF clause... or UPDATE or DELETE statements with the WHERE CURRENT OF clause... ? I went with the second. The committer can change the wording in any way he wants. > >> *************** DECLARE<replaceable class="PARAMETER">c >> *** 6639,6649 **** >> </listitem> >> </varlistentry> >> </variablelist> >> >> <para> >> ! For the meaning of the cursor options, >> ! see<xref linkend="sql-declare">. >> </para> >> </refsect1> >> >> <refsect1> >> --- 6669,6728 ---- >> </listitem> >> </varlistentry> >> </variablelist> >> +</refsect1> >> + >> +<refsect1> >> +<title>Cursor options</title> >> >> <para> >> ! For the meaning of other cursor options, see<xref linkend="sql-declare">. >> </para> >> + >> +<variablelist> >> +<varlistentry> >> +<term><literal>READAHEAD number</literal></term> >> +<term><literal>NO READAHEAD</literal></term> >> +<listitem> >> +<para> >> +<literal>READAHEAD number</literal> makes the ECPG preprocessor and >> + runtime library use a client-side cursor accounting and data readahead >> + during<command>FETCH</command>. This improves performance for programs >> + that use single-row<command>FETCH</command> statements. >> +</para> >> + >> +<para> >> +<literal>NO READAHEAD</literal> disables data readahead in case >> +<parameter>-R number</parameter> is used for compiling the file. >> +</para> >> +</listitem> >> +</varlistentry> > One of the new sections about readahead should somehow reference the hazard > around volatile functions. Done. >> + >> +<varlistentry> >> +<term><literal>OPEN RETURNS LAST ROW POSITION</literal></term> >> +<term><literal>OPEN RETURNS 0 FOR LAST ROW POSITION</literal></term> >> +<listitem> >> +<para> >> + When the cursor is opened, it's possible to discover the size of the result set >> + using<command>MOVE ALL</command> which traverses the result set and move >> + the cursor back to the beginning using<command>MOVE ABSOLUTE 0</command>. >> + The size of the result set is returned in sqlca.sqlerrd[2]. >> +</para> >> + >> +<para> >> + This slows down opening the cursor from the application point of view >> + but may also have side effects. If the cursor query contains volatile function >> + calls with side effects, they will be evaluated twice because of traversing >> + the result set this way during<command>OPEN</command>. >> +</para> >> + >> +<para> >> + The default is not to discover. >> +</para> > I mildly oppose having syntax to enable this per-cursor. Readahead is a > generally handy feature that I might use in new programs, but this feature is > more of a hack for compatibility with some old Informix version. For new > code, authors should do their own MOVEs instead of using this option. > > The patch also adds an undocumented ECPGOPENRETURNSRESULTSETSIZE environment > variable to control this. I see no use for such a thing, because a program's > dependency on sqlerrd[2] is fixed at build time. If a program does not > reference sqlerrd[2] for a cursor, there's no benefit from choosing at runtime > to populate that value anyway. If a program does reference it, any option > needed to have it set correctly had better be set at build time and apply to > every run of the final program. > > IOW, I suggest having only the "ecpg"-time option to enable this behavior. > Thoughts? I wanted to make it available for non-Informix mode and a way to disable it if the command line option enables it. But the extra environment variable seems to be silly indeed. I deleted that part of the code. >> +</listitem> >> +</varlistentry> >> + >> +</variablelist> >> + >> </refsect1> >> >> <refsect1> >> diff -dcrpN postgresql.orig/doc/src/sgml/ref/ecpg-ref.sgml postgresql/doc/src/sgml/ref/ecpg-ref.sgml >> *** postgresql.orig/doc/src/sgml/ref/ecpg-ref.sgml 2011-08-07 11:29:16.003256959 +0200 >> --- postgresql/doc/src/sgml/ref/ecpg-ref.sgml 2012-03-24 10:13:08.679284063 +0100 >> *************** PostgreSQL documentation >> *** 171,176 **** >> --- 171,196 ---- >> </varlistentry> >> >> <varlistentry> >> +<term><option>-R<replaceable>number</replaceable></option></term> >> +<listitem> >> +<para> >> + Turn on cursor readahead by default using<replaceable>number</replaceable> >> + as readahead window size. >> +</para> >> +</listitem> >> +</varlistentry> >> + >> +<varlistentry> >> +<term><option>--detect-cursor-resultset-size</option></term> >> +<listitem> >> +<para> >> + Detect the cursor result set size during<command>OPEN</command> and >> + return it in sqlca.sqlerrd[2]. > Make reference to this option in the ecpg-sqlca section, where sqlerrd[2] has > its primary documentation. Done. >> +</para> >> +</listitem> >> +</varlistentry> >> + >> +<varlistentry> >> <term><option>-t</option></term> >> <listitem> >> <para> > > I did not re-review the code changes in the same detail as before, but nothing > caught my attention when scanning through them. If some particular section > has changed in a tricky way and deserves a close look, let me know. Well, the extra comments should explain everything. The rewrite did not change the execute.c function split, only comments were added. The changes in cursor.c were mainly about being able to properly deal with negative cursor positions, i.e. traversing the cursor backwards, e.g. with FETCH LAST or FETCH ABSOLUTE -N. > The documentation-cosmetic and policy questions I raise above shouldn't entail > large-scale patch churn, so I've marked the patch Ready for Committer. Thanks. The patch with the above changes is attached. Also attached the second patch from the previous mail, in case the committer wants to consider it. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig& Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
2012-03-29 12:59 keltezéssel, Boszormenyi Zoltan írta: > 2012-03-29 02:43 keltezéssel, Noah Misch írta: >> On Sat, Mar 24, 2012 at 10:49:07AM +0100, Boszormenyi Zoltan wrote: >>> + the window size may be modified by setting the<literal>ECPGFETCHSZ</literal> >>> + environment variable to a different value. >> I had in mind that DECLARE statements adorned with READAHEAD syntax would >> always behave precisely as written, independent of ECPGFETCHSZ or "ecpg -R". >> Unadorned DECLARE statements would use ECPGFETCHSZ if set, then the value >> passed to "ecpg -R" if provided, and finally a value of 1 (no readahead) in >> the absence of both ECPGFETCHSZ and "ecpg -R". Did you do it differently for >> a particular reason? I don't particularly object to what you've implemented, >> but I'd be curious to hear your thinking. > > What I had in mind was: > > NO READAHEAD == READAHEAD 0. This will translate into 1 tuple sized > readahead window that cannot be modified by ECPGFETCHSZ. After the core patch, this is not totally true yet. The "NO READAHEAD" cursors don't go through the new cursor functions at all. But cursor.c is prepared for the second patch that makes all cursors use the caching code. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig& Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
On Thu, Mar 29, 2012 at 12:59:40PM +0200, Boszormenyi Zoltan wrote: > 2012-03-29 02:43 keltez?ssel, Noah Misch ?rta: >> On Sat, Mar 24, 2012 at 10:49:07AM +0100, Boszormenyi Zoltan wrote: >>> + to<literal>READAHEAD number</literal>. Explicit<literal>READAHEAD number</literal> or >>> +<literal>NO READAHEAD</literal> turns cursor readahead on (with<literal>number</literal> >>> + as the size of the readahead window) or off for the specified cursor, >>> + respectively. For cursors using a non-0 readahead window size is 256 rows, >> The number 256 is no longer present in your implementation. > > Indeed. Oversight, that part of the sentence is deleted. > >> >>> + the window size may be modified by setting the<literal>ECPGFETCHSZ</literal> >>> + environment variable to a different value. >> I had in mind that DECLARE statements adorned with READAHEAD syntax would >> always behave precisely as written, independent of ECPGFETCHSZ or "ecpg -R". >> Unadorned DECLARE statements would use ECPGFETCHSZ if set, then the value >> passed to "ecpg -R" if provided, and finally a value of 1 (no readahead) in >> the absence of both ECPGFETCHSZ and "ecpg -R". Did you do it differently for >> a particular reason? I don't particularly object to what you've implemented, >> but I'd be curious to hear your thinking. > > What I had in mind was: > > NO READAHEAD == READAHEAD 0. This will translate into 1 tuple sized > readahead window that cannot be modified by ECPGFETCHSZ. > > READAHEAD 1 also means uncached by default but ECPGFETCHSZ may > modify the readahead window size. To me, it feels too magical to make READAHEAD 1 and READAHEAD 0 differ only in whether ECPGFETCHSZ can override. I'm willing to go with whatever consensus arises on this topic, though. > This part is policy that can be debated and modified accordingly, > it doesn't affect the internals of the caching code. Agreed. >>> +</para> >>> + >>> +<para> >>> +<command>UPDATE</command> or<command>DELETE</command> with the >>> +<literal>WHERE CURRENT OF</literal> clause, cursor readahead may or may not >>> + actually improve performance, as<command>MOVE</command> statement has to be >>> + sent to the backend before the DML statement to ensure correct cursor position >>> + in the backend. >>> +</para> >> This sentence seems to be missing a word near its beginning. > > Sounds like a corner case to me, but I am not a native english speaker. > Which one sounds better: > > UPDATE or DELETE with the WHERE CURRENT OF clause... > > or > > UPDATE or DELETE statements with the WHERE CURRENT OF clause... > ? Here is the simplest change to make the original grammatical: Given an UPDATE or DELETE with the WHERE CURRENT OF clause, ... I might write the whole thing this way: To execute an UPDATE or DELETE statement bearing the WHERE CURRENT OF clause, ECPG may first execute an implicit MOVE tosynchronize the server cursor position with the local cursor position. This can reduce or eliminate the performance benefitof readahead for affected cursors. >> one of the new sections about readahead should somehow reference the hazard >> around volatile functions. > > Done. I don't see the mention in your latest patch. You do mention it for the sqlerrd[2] compatibility stuff. >>> +<varlistentry> >>> +<term><literal>OPEN RETURNS LAST ROW POSITION</literal></term> >>> +<term><literal>OPEN RETURNS 0 FOR LAST ROW POSITION</literal></term> >>> +<listitem> >>> +<para> >>> + When the cursor is opened, it's possible to discover the size of the result set >>> + using<command>MOVE ALL</command> which traverses the result set and move >>> + the cursor back to the beginning using<command>MOVE ABSOLUTE 0</command>. >>> + The size of the result set is returned in sqlca.sqlerrd[2]. >>> +</para> >>> + >>> +<para> >>> + This slows down opening the cursor from the application point of view >>> + but may also have side effects. If the cursor query contains volatile function >>> + calls with side effects, they will be evaluated twice because of traversing >>> + the result set this way during<command>OPEN</command>. >>> +</para> >>> + >>> +<para> >>> + The default is not to discover. >>> +</para> >> I mildly oppose having syntax to enable this per-cursor. Readahead is a >> generally handy feature that I might use in new programs, but this feature is >> more of a hack for compatibility with some old Informix version. For new >> code, authors should do their own MOVEs instead of using this option. >> >> The patch also adds an undocumented ECPGOPENRETURNSRESULTSETSIZE environment >> variable to control this. I see no use for such a thing, because a program's >> dependency on sqlerrd[2] is fixed at build time. If a program does not >> reference sqlerrd[2] for a cursor, there's no benefit from choosing at runtime >> to populate that value anyway. If a program does reference it, any option >> needed to have it set correctly had better be set at build time and apply to >> every run of the final program. >> >> IOW, I suggest having only the "ecpg"-time option to enable this behavior. >> Thoughts? > > I wanted to make it available for non-Informix mode and a way to > disable it if the command line option enables it. I can understand the desire to have a way to disable it. Perhaps you have a bunch of old Informix code, so you enable the global option and later discover some expensive cursors that don't use sqlerrd[2]. It would be nice to locally suppress the behavior for those cursors. Still, we're looking at dedicated ECPG syntax, quite visible even to folks with no interest in Informix. We have eschewed littering our syntax with compatibility aids, and I like it that way. IMO, an option to the "ecpg" preprocessor is an acceptable level of muddle to provide this aid. A DECLARE CURSOR syntax extension goes too far. Thanks, nm
On Thu, Mar 29, 2012 at 01:03:41PM -0400, Noah Misch wrote: > Still, we're looking at dedicated ECPG syntax, quite visible even to folks > with no interest in Informix. We have eschewed littering our syntax with > compatibility aids, and I like it that way. IMO, an option to the "ecpg" > preprocessor is an acceptable level of muddle to provide this aid. A DECLARE > CURSOR syntax extension goes too far. +1 from me. Let's not add special ecpg sql syntax if we don't absolutely have to. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
2012-03-29 20:34 keltezéssel, Michael Meskes írta: > On Thu, Mar 29, 2012 at 01:03:41PM -0400, Noah Misch wrote: >> Still, we're looking at dedicated ECPG syntax, quite visible even to folks >> with no interest in Informix. We have eschewed littering our syntax with >> compatibility aids, and I like it that way. IMO, an option to the "ecpg" >> preprocessor is an acceptable level of muddle to provide this aid. A DECLARE >> CURSOR syntax extension goes too far. > +1 from me. > > Let's not add special ecpg sql syntax if we don't absolutely have to. > > Michael This is what I waited for, finally another opinion. I will delete this from the grammar. What about the other question about the 0 vs 1 distinction? Without the second patch, 0 drives the cursor with the old ECPGdo() code. All other values drive the cursor via the new ECPGopen() et.al. Should we keep this behaviour? In this case the 0 vs 1 distinction is not needed in add_cursor() as the 0 value doesn't even reach ECPGopen(). But if we consider including the second patch as well, should we keep the "NO READAHEAD" option in the grammar and in cursor.c? After solving the problem with WHERE CURRENT OF, this and the 0-vs-1 distinction starts to feel pointless. And what about the override via the environment variable? Should it work only upwards? Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig& Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
2012-03-29 19:03 keltezéssel, Noah Misch írta: > To me, it feels too magical to make READAHEAD 1 and READAHEAD 0 differ only in > whether ECPGFETCHSZ can override. I'm willing to go with whatever consensus > arises on this topic, though. I now removed this distinction because the grammar does it so the value 0 doesn't even reach ECPGopen(). >> This part is policy that can be debated and modified accordingly, >> it doesn't affect the internals of the caching code. > Agreed. > >>>> +</para> >>>> + >>>> +<para> >>>> +<command>UPDATE</command> or<command>DELETE</command> with the >>>> +<literal>WHERE CURRENT OF</literal> clause, cursor readahead may or may not >>>> + actually improve performance, as<command>MOVE</command> statement has to be >>>> + sent to the backend before the DML statement to ensure correct cursor position >>>> + in the backend. >>>> +</para> >>> This sentence seems to be missing a word near its beginning. >> Sounds like a corner case to me, but I am not a native english speaker. >> Which one sounds better: >> >> UPDATE or DELETE with the WHERE CURRENT OF clause... >> >> or >> >> UPDATE or DELETE statements with the WHERE CURRENT OF clause... >> ? > Here is the simplest change to make the original grammatical: > > Given an UPDATE or DELETE with the WHERE CURRENT OF clause, ... > > I might write the whole thing this way: > > To execute an UPDATE or DELETE statement bearing the WHERE CURRENT OF > clause, ECPG may first execute an implicit MOVE to synchronize the server > cursor position with the local cursor position. This can reduce or > eliminate the performance benefit of readahead for affected cursors. OK, I used your text. >>> one of the new sections about readahead should somehow reference the hazard >>> around volatile functions. >> Done. > I don't see the mention in your latest patch. You do mention it for the > sqlerrd[2] compatibility stuff. sqlerrd[2] compatibility stuff? I mentioned it in section "ecpg-sqlca", this is the main documentation section, not the compatibility one AFAIK. Anyway, I now reference the volatile function hazard in the first paragraphs added to section "ecpg-cursors". > I can understand the desire to have a way to disable it. Perhaps you have a > bunch of old Informix code, so you enable the global option and later discover > some expensive cursors that don't use sqlerrd[2]. It would be nice to locally > suppress the behavior for those cursors. > > Still, we're looking at dedicated ECPG syntax, quite visible even to folks > with no interest in Informix. We have eschewed littering our syntax with > compatibility aids, and I like it that way. IMO, an option to the "ecpg" > preprocessor is an acceptable level of muddle to provide this aid. A DECLARE > CURSOR syntax extension goes too far. I have now deleted this grammar extension. Attached is the new core feature patch. Summary of changes: - documentation changes according to your query - cursor.c doesn't distinguish between 0 and 1, this is the job of the grammar. - I saved some MOVE statements in case the cursor position in the backend is what we want at the moment. This means READAHEAD 1 is at the same level of performance like without caching, there are no extra implicit MOVEs. WHERE CURRENT OF also avoids MOVEs in this case. The caching code is now clearly faster in case when there are a lot of MOVEs without actual FETCHes. This is only an if () condition around the previously coded MOVE block in ecpg_cursor_execute(). It doesn't change the semantics, only omits the MOVE in case it's not needed. I also refreshed the second patch that drives all cursors with the new cursor functions. Because the runtime doesn't handle 0, 1 is the default readahead window size. Now, NO READAHEAD would mean READAHEAD 1 but ECPGFETCHSZ can also override it. This means that NO READAHEAD makes no sense, so it's removed from the grammar and the documentation is changed accordingly. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig& Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
On Fri, Mar 30, 2012 at 12:48:07AM +0200, Boszormenyi Zoltan wrote: > 2012-03-29 19:03 keltez?ssel, Noah Misch ?rta: >>>> one of the new sections about readahead should somehow reference the hazard >>>> around volatile functions. >>> Done. >> I don't see the mention in your latest patch. You do mention it for the >> sqlerrd[2] compatibility stuff. > > sqlerrd[2] compatibility stuff? I mentioned it in section "ecpg-sqlca", this is the main > documentation section, not the compatibility one AFAIK. Anyway, I now reference the volatile > function hazard in the first paragraphs added to section "ecpg-cursors". This patch adds two features, and those features are independent from a user perspective. The primary feature is cursor readahead, and the secondary feature is "ecpg --detect-cursor-resultset-size" (the referent of my above "sqlerrd[2] compatibility stuff" reference). Each feature has independent semantic implications when the application uses cursors on queries that call volatile functions. Under --detect-cursor-resultset-size, we will execute functions for all rows at OPEN time and again for each row at FETCH time. When you declare a cursor with "READAHEAD n" and do not FETCH it to the end, up to "n" unFETCHed rows will nonetheless have their functions executed. If the volatile function is something like clock_timestamp(), the application will observe the executions to have happened in clusters of "n" rather than in step with the application's FETCH calls. Your latest patch revision hints at the semantic implications for "ecpg --detect-cursor-resultset-size", but it does not mention them for readahead. Then again, perhaps it's sufficiently obvious to not warrant mention. Without knowing internals, I would not expect users to guess the consequence of "ecpg --detect-cursor-resultset-size". With readahead, it may be guessable enough. Thanks, nm
On Fri, Mar 30, 2012 at 12:48:07AM +0200, Boszormenyi Zoltan wrote: > Attached is the new core feature patch. Summary of changes: > ... > I also refreshed the second patch that drives all cursors with the new > ... I'm slightly confused here. It seems Zoltan added a second patch *after* Noah marked this patch as ready for committer. That second patch seems to apply cleanly after the first one got applied. Now, which one was reviewed and is considered ready for commit? The first one? Or both? Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
On Sat, Apr 07, 2012 at 01:20:08PM +0200, Michael Meskes wrote: > On Fri, Mar 30, 2012 at 12:48:07AM +0200, Boszormenyi Zoltan wrote: > > Attached is the new core feature patch. Summary of changes: > > ... > > I also refreshed the second patch that drives all cursors with the new > > ... > > I'm slightly confused here. It seems Zoltan added a second patch *after* Noah > marked this patch as ready for committer. That second patch seems to apply > cleanly after the first one got applied. Now, which one was reviewed and is > considered ready for commit? The first one? Or both? Both. The second patch appeared after my first review, based on a comment in that review. I looked at it during my re-review before marking the overall project Ready for Committer. I do call your attention to a question I raised in my second review: if a program contains "DECLARE foo READAHEAD 5 CURSOR FOR ..." and the user runs the program with ECPGFETCHSZ=10 in the environment, should that cursor use a readahead window of 5 or of 10? Original commentary: http://archives.postgresql.org/message-id/20120329004323.GA17329@tornado.leadboat.com
On Sat, Apr 07, 2012 at 11:50:42AM -0400, Noah Misch wrote: > Both. The second patch appeared after my first review, based on a comment in > that review. I looked at it during my re-review before marking the overall > project Ready for Committer. Thanks. > I do call your attention to a question I raised in my second review: if a > program contains "DECLARE foo READAHEAD 5 CURSOR FOR ..." and the user runs > the program with ECPGFETCHSZ=10 in the environment, should that cursor use a > readahead window of 5 or of 10? Original commentary: > http://archives.postgresql.org/message-id/20120329004323.GA17329@tornado.leadboat.com I'd say it should be 5. I don't like an environment variable overwriting a hard-coded setting. I think this is what you, Noah, thought, too, right? I'd say let's change this. Is it possible to allow just READAHEAD without a number? In that case I would accept the environment variable. And some comments mostly directed at Zoltan: ecpg --help says ...default 0 (disabled)..., but options less than 1 are not accepted and the default setting of 1 has a comment "Disabled by default". I guess this needs to be adjusted. Is there a reason why two new options for ecpg were invented? Normally ecpg options define how the preprocessor works but not the resulting binary. Well, different preprocessor behaviour might result in different binary behaviour of course. The only option that only effects the resulting binary is "-r" for "runtime". Again, this is not completely true as the option has to make its way into the binary, but that's it. Now I wonder whether it would make more sense to add the two options as runtime options instead. The --detect-cursor-resultset-size option should work there without a problem. I haven't delved into the source code enough to find out if -R changes something in the compiler stage. The test case cursor-readahead.pgc has a comment saying "test automatic prepare for all statements". Copy/Paste error? I cannot find a test that tests the environment variable giving the fetch size. Could you please point me to that? Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
2012-04-08 16:25 keltezéssel, Michael Meskes írta: > On Sat, Apr 07, 2012 at 11:50:42AM -0400, Noah Misch wrote: >> Both. The second patch appeared after my first review, based on a comment in >> that review. I looked at it during my re-review before marking the overall >> project Ready for Committer. > Thanks. > >> I do call your attention to a question I raised in my second review: if a >> program contains "DECLARE foo READAHEAD 5 CURSOR FOR ..." and the user runs >> the program with ECPGFETCHSZ=10 in the environment, should that cursor use a >> readahead window of 5 or of 10? Original commentary: >> http://archives.postgresql.org/message-id/20120329004323.GA17329@tornado.leadboat.com > I'd say it should be 5. I don't like an environment variable overwriting a > hard-coded setting. I think this is what you, Noah, thought, too, right? I'd > say let's change this. Do you want me to change this or will you do it? I am on holiday and will be back to work on wednesday. The possibility to test different readahead window sizes without modifying the source and recompiling was useful. > Is it possible to allow just READAHEAD without a number? > In that case I would accept the environment variable. After the 2nd patch is applied, this is exactly the case. The cursors are driven and accounted using the new functions but with the readahead window being a single row as the default value for fetch_readahead is 1. > > And some comments mostly directed at Zoltan: > > ecpg --help says ...default 0 (disabled)..., but options less than 1 are not > accepted and the default setting of 1 has a comment "Disabled by default". I > guess this needs to be adjusted. Yes, the help text was not changed in the 2nd patch, I missed that. > > Is there a reason why two new options for ecpg were invented? Normally ecpg > options define how the preprocessor works but not the resulting binary. The -R option simply provides a default without ornamenting the DECLARE statement. > Well, > different preprocessor behaviour might result in different binary behaviour of > course. The only option that only effects the resulting binary is "-r" for > "runtime". Again, this is not completely true as the option has to make its way > into the binary, but that's it. Now I wonder whether it would make more sense > to add the two options as runtime options instead. The > --detect-cursor-resultset-size option should work there without a problem. You are right. This can be a suboption to "-r". > I > haven't delved into the source code enough to find out if -R changes something > in the compiler stage. "-R" works just like "-r" in the sense that a value gets passed to the runtime. "-R" simply changes the default value that gets passed if no READAHEAD N clause is specified for a cursor. This is true only if you intend to apply both paches. Without the 2nd patch and fetch_readahead=0 (no -R option given) or NO READAHEAD is specified for a cursor, the compiler makes a distinction between uncachable cursors driven by ECPGdo() and cachable cursors driven by the new runtime functions. With the 2nd patch applied, this distinction is no more. > > The test case cursor-readahead.pgc has a comment saying "test automatic prepare > for all statements". Copy/Paste error? It must be, yes. > I cannot find a test that tests the environment variable giving the fetch size. > Could you please point me to that? I didn't write such a test. The reason is that while variables are exported by make from the Makefile to the binaries run by make e.g. CFLAGS et.al. for $(CC), "make check" simply runs pg_regress once which uses its own configuration file that doesn't have a way to set or unset an environment variable. This could be a useful extension to pg_regress though. Best regards, Zoltán Böszörményi > > Michael -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig& Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
On Sun, Apr 08, 2012 at 06:35:33PM +0200, Boszormenyi Zoltan wrote: > Do you want me to change this or will you do it? I am on holiday > and will be back to work on wednesday. I don't think waiting till later this week is a real problem. > The possibility to test different readahead window sizes > without modifying the source and recompiling was useful. Sure, but you can still do that when not defining a fixed number in the statement. > The -R option simply provides a default without ornamenting > the DECLARE statement. Could you please incorporate these changes, too, when you're back from vacation? > >I cannot find a test that tests the environment variable giving the fetch size. > >Could you please point me to that? > > I didn't write such a test. The reason is that while variables are > exported by make from the Makefile to the binaries run by make > e.g. CFLAGS et.al. for $(CC), "make check" simply runs pg_regress > once which uses its own configuration file that doesn't have a > way to set or unset an environment variable. This could be a useful > extension to pg_regress though. How about calling setenv() from the test program itself? Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
On Sun, Apr 08, 2012 at 04:25:01PM +0200, Michael Meskes wrote: > On Sat, Apr 07, 2012 at 11:50:42AM -0400, Noah Misch wrote: > > I do call your attention to a question I raised in my second review: if a > > program contains "DECLARE foo READAHEAD 5 CURSOR FOR ..." and the user runs > > the program with ECPGFETCHSZ=10 in the environment, should that cursor use a > > readahead window of 5 or of 10? Original commentary: > > http://archives.postgresql.org/message-id/20120329004323.GA17329@tornado.leadboat.com > > I'd say it should be 5. I don't like an environment variable overwriting a > hard-coded setting. I think this is what you, Noah, thought, too, right? Yes.
2012-04-08 19:38 keltezéssel, Michael Meskes írta: > On Sun, Apr 08, 2012 at 06:35:33PM +0200, Boszormenyi Zoltan wrote: >> Do you want me to change this or will you do it? I am on holiday >> and will be back to work on wednesday. > I don't think waiting till later this week is a real problem. OK. > >> The possibility to test different readahead window sizes >> without modifying the source and recompiling was useful. > Sure, but you can still do that when not defining a fixed number in the > statement. OK. > >> The -R option simply provides a default without ornamenting >> the DECLARE statement. > Could you please incorporate these changes, too, when you're back from vacation? Sure. So, it's established that a specified READAHEAD N should not be overridden. Even an explicit READAHEAD 1. Only a non-decorated cursor can be overridden, even if a different default readahead window size is specified with e.g. "ecpg -R 8". If ECPGFETCHSZ is not present, 8 will be used, if ECPGFETCHSZ is present, its value will be used. ECPGopen() will need an extra bool argument to distinguish this. Is this acceptable? Noah, Michael? > >>> I cannot find a test that tests the environment variable giving the fetch size. >>> Could you please point me to that? >> I didn't write such a test. The reason is that while variables are >> exported by make from the Makefile to the binaries run by make >> e.g. CFLAGS et.al. for $(CC), "make check" simply runs pg_regress >> once which uses its own configuration file that doesn't have a >> way to set or unset an environment variable. This could be a useful >> extension to pg_regress though. > How about calling setenv() from the test program itself? Sure, I didn't think about it. It should be done before the first EXEC SQL OPEN cursor. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig& Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
On Tue, Apr 10, 2012 at 09:35:21AM +0200, Boszormenyi Zoltan wrote: > So, it's established that a specified READAHEAD N should not > be overridden. Even an explicit READAHEAD 1. > > Only a non-decorated cursor can be overridden, even if > a different default readahead window size is specified with > e.g. "ecpg -R 8". If ECPGFETCHSZ is not present, 8 will be used, > if ECPGFETCHSZ is present, its value will be used. ECPGopen() > will need an extra bool argument to distinguish this. > > Is this acceptable? Noah, Michael? Sounds perfect.
On Tue, Apr 10, 2012 at 10:37:22AM -0400, Noah Misch wrote: > > Only a non-decorated cursor can be overridden, even if > > a different default readahead window size is specified with > > e.g. "ecpg -R 8". If ECPGFETCHSZ is not present, 8 will be used, > > if ECPGFETCHSZ is present, its value will be used. ECPGopen() > > will need an extra bool argument to distinguish this. > > > > Is this acceptable? Noah, Michael? > > Sounds perfect. Fine by me. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
2012-04-10 16:55 keltezéssel, Michael Meskes írta: > On Tue, Apr 10, 2012 at 10:37:22AM -0400, Noah Misch wrote: >>> Only a non-decorated cursor can be overridden, even if >>> a different default readahead window size is specified with >>> e.g. "ecpg -R 8". If ECPGFETCHSZ is not present, 8 will be used, >>> if ECPGFETCHSZ is present, its value will be used. ECPGopen() >>> will need an extra bool argument to distinguish this. >>> >>> Is this acceptable? Noah, Michael? >> Sounds perfect. > Fine by me. > > Michael OK. Next question: now that both patches are intended to be applied, should I send a unified single patch that contains the previous functionality and the required fixes or a new one that only contains the last required fixes? Thanks in advance, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig& Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
On Tue, Apr 10, 2012 at 05:24:55PM +0200, Boszormenyi Zoltan wrote: > OK. Next question: now that both patches are intended to be applied, > should I send a unified single patch that contains the previous functionality > and the required fixes or a new one that only contains the last required fixes? I'm fine with whatever is easier for you. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
2012-04-10 17:34 keltezéssel, Michael Meskes írta: > On Tue, Apr 10, 2012 at 05:24:55PM +0200, Boszormenyi Zoltan wrote: >> OK. Next question: now that both patches are intended to be applied, >> should I send a unified single patch that contains the previous functionality >> and the required fixes or a new one that only contains the last required fixes? > I'm fine with whatever is easier for you. > > Michael I guess the second option is easier for all of us because reviewing it doesn't invalidate the previous ones. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig& Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Hi, 2012-04-10 16:55 keltezéssel, Michael Meskes írta: > On Tue, Apr 10, 2012 at 10:37:22AM -0400, Noah Misch wrote: >>> Only a non-decorated cursor can be overridden, even if >>> a different default readahead window size is specified with >>> e.g. "ecpg -R 8". If ECPGFETCHSZ is not present, 8 will be used, >>> if ECPGFETCHSZ is present, its value will be used. ECPGopen() >>> will need an extra bool argument to distinguish this. >>> >>> Is this acceptable? Noah, Michael? >> Sounds perfect. > Fine by me. > > Michael you commented on "two new options were added and they should be suboptions to -r". I looked at "man getopt_long" to see what I can do about the "-R" option and there seems to be a getsubopt() call which is an extension to getopt_long. My manpage under Fedora 16 says this: NAME getsubopt - parse suboption arguments from a string SYNOPSIS #include <stdlib.h> int getsubopt(char **optionp, char * const *tokens, char **valuep); Feature Test Macro Requirements for glibc (see feature_test_macros(7)): getsubopt(): _XOPEN_SOURCE >= 500 || _XOPEN_SOURCE && _XOPEN_SOURCE_EXTENDED || /* Since glibc 2.12: */ _POSIX_C_SOURCE >= 200809L I wonder whether the manual parsing of "-r" suboptions may be rewritten using this function or PostgreSQL supports systems without the above X/Open or POSIX support levels. Anyway, to make it possible to rewrite using the above call, I modified "-R" and it's now "-r readahead=number". Documentation is adjusted. With the above, it would be possible to use a comma separated list of "-r" suboptions, e.g. "-r prepare,questionmarks,readahead=16" in one option. Summary of other changes: - The result set size detection is a suboption of "-r", documentation is adjusted. - Only undecorated cursors use ECPGFETCHSZ, documentation is adjusted - "ecpg --help says ...default 0 (disabled)..." fixed. - Comment in cursor-readahead.pgc is fixed. - New regression test that exercises ECPGFETCHSZ=8 and a "non-readahead" cursor. The stderr file shows the "fetch forward 8" executed by the runtime. - Also added a note to the documentation about a possible performance trap if a previously written ECPG application uses its own custom readahead via multi-row FETCH statements. This patch should be applied over the two patches I last sent. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig& Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
On Tue, Apr 10, 2012 at 07:56:35PM +0200, Boszormenyi Zoltan wrote: > With the above, it would be possible to use a comma separated list of "-r" > suboptions, e.g. "-r prepare,questionmarks,readahead=16" in one option. Yes, that sounds like a good plan. But of course it's outside the scope of this patch, so we can add this later on. > - Also added a note to the documentation about a possible performance trap > if a previously written ECPG application uses its own custom readahead via > multi-row FETCH statements. I didn't know that before you send this patch. Noah, did you? Frankly, I don't like this at all. If I got it right that means a FETCH N is essantially computed as N times FETCH 1 unless you either add a non-standard option to the DECLARE statement or you add a command-line option to ecpg. Did I get that right? If so we would deliberately make ecpglib work incorrectly and remove performance. Why is that? I'm interested in what others think, but to me that sounds like a show-stopper. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
2012-04-16 04:46 keltezéssel, Michael Meskes írta: > On Tue, Apr 10, 2012 at 07:56:35PM +0200, Boszormenyi Zoltan wrote: >> With the above, it would be possible to use a comma separated list of "-r" >> suboptions, e.g. "-r prepare,questionmarks,readahead=16" in one option. > Yes, that sounds like a good plan. But of course it's outside the scope of this > patch, so we can add this later on. > >> - Also added a note to the documentation about a possible performance trap >> if a previously written ECPG application uses its own custom readahead via >> multi-row FETCH statements. > I didn't know that before you send this patch. Noah, did you? > > Frankly, I don't like this at all. If I got it right that means a FETCH N is > essantially computed as N times FETCH 1 unless you either add a non-standard > option to the DECLARE statement or you add a command-line option to ecpg. Did I > get that right? Yes, just like when the readahead window set to 256, FETCH 1024 will iterate through 4 windows or FETCH 64 iterates through the same window 4 times. This is the idea behind the "readahead window". > If so we would deliberately make ecpglib work incorrectly and remove > performance. Why is that? I'm interested in what others think, but to me that > sounds like a show-stopper. How about allowing the readahead window to be resized for the non-decorated case if the runtime encounters FETCH N and N is greater than the previous window? > > Michael > -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig& Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
On Mon, Apr 16, 2012 at 06:24:57AM +0200, Boszormenyi Zoltan wrote: > Yes, just like when the readahead window set to 256, FETCH 1024 > will iterate through 4 windows or FETCH 64 iterates through the > same window 4 times. This is the idea behind the "readahead window". Really? It's definitely not the idea behind FETCH 1024. Using the same window 4 times for FETCH 64 is the idea though, I agree. > How about allowing the readahead window to be resized for the > non-decorated case if the runtime encounters FETCH N and N is > greater than the previous window? To be resized by what? IMO a FETCH N should always be a FETCH N, no matter what, i.e. if the readahead window is larger, use it, but even if it's smaller we should still fetch N at the same time. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
2012-04-16 18:04 keltezéssel, Michael Meskes írta: > On Mon, Apr 16, 2012 at 06:24:57AM +0200, Boszormenyi Zoltan wrote: >> Yes, just like when the readahead window set to 256, FETCH 1024 >> will iterate through 4 windows or FETCH 64 iterates through the >> same window 4 times. This is the idea behind the "readahead window". > Really? It's definitely not the idea behind FETCH 1024. Using the same window 4 > times for FETCH 64 is the idea though, I agree. OK. I would like to stretch your agreement a little. :-) Can we state that caching means that if the cache should serve the incoming request(s) until the request spills out of it? If your answer to the above is "yes", then please consider this case: - readahead window is 256 (via ECPGFETCHSZ) - FETCH 64 was executed twice, so you are in the middle of the cache - FETCH 1024 is requested So, if I understand you correctly, you expect this scenario: - set a "one-time" readahead window size ( N - # of rows that can be served = 1024 - 128 = 768) so the next FETCH by theruntime will fullfill this request fully - serve the request's first 128 rows from the current cache - for the 129th row, FETCH 768 will be executed - all subsequent requests use the old readahead size >> How about allowing the readahead window to be resized for the >> non-decorated case if the runtime encounters FETCH N and N is >> greater than the previous window? > To be resized by what? By the new FETCH request. Instead of the above, I imagined this: - the runtime notices that the new request is larger than the current readahead window size, modifies the readahead windowsize upwards, so the next FETCH will use it - serve the request's first 128 rows from the current cache - for the 129th row, FETCH 1024 will be executed and the remaining 768 rows will be served from the new cache - all subsequent requests use the new readahead size, 1024 > IMO a FETCH N should always be a FETCH N, no matter > what This part of your statement contradicts with caching. :-) > , i.e. if the readahead window is larger, use it, but even if it's smaller > we should still fetch N at the same time. So, there can be occasional one-time larger requests but smaller ones should apply the set window size, right? Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig& Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
On Mon, Apr 16, 2012 at 07:18:07PM +0200, Boszormenyi Zoltan wrote: > OK. I would like to stretch your agreement a little. :-) > ... Yeah, you got a point here. > By the new FETCH request. Instead of the above, I imagined this: > - the runtime notices that the new request is larger than the current > readahead window size, modifies the readahead window size upwards, > so the next FETCH will use it > - serve the request's first 128 rows from the current cache > - for the 129th row, FETCH 1024 will be executed and the remaining > 768 rows will be served from the new cache That means window size goes up to 1024-128 for that one case? > - all subsequent requests use the new readahead size, 1024 Sounds reasonable to me. > So, there can be occasional one-time larger requests but > smaller ones should apply the set window size, right? Yes. I do agree that FETCH N cannot fetch N all the time, but please make it work like what you suggested to make sure people don't have to recompile. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
2012-04-17 05:52 keltezéssel, Michael Meskes írta: > On Mon, Apr 16, 2012 at 07:18:07PM +0200, Boszormenyi Zoltan wrote: >> OK. I would like to stretch your agreement a little. :-) >> ... > Yeah, you got a point here. > >> By the new FETCH request. Instead of the above, I imagined this: >> - the runtime notices that the new request is larger than the current >> readahead window size, modifies the readahead window size upwards, >> so the next FETCH will use it >> - serve the request's first 128 rows from the current cache >> - for the 129th row, FETCH 1024 will be executed and the remaining >> 768 rows will be served from the new cache > That means window size goes up to 1024-128 for that one case? I listed two scenarios. 1. occasional bump of the readahead window for large requests, for smaller requests it uses the originally set size 2. permanent bump of the readahead window for large requests (larger than previously seen), all subsequent requests use the new size Both can be implemented easily, which one do you prefer? If you always use very large requests, 1) behaves like 2) > >> - all subsequent requests use the new readahead size, 1024 > Sounds reasonable to me. > >> So, there can be occasional one-time larger requests but >> smaller ones should apply the set window size, right? > Yes. I do agree that FETCH N cannot fetch N all the time, but please make it > work like what you suggested to make sure people don't have to recompile. > > Michael -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig& Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
On Tue, Apr 17, 2012 at 06:02:34AM +0200, Boszormenyi Zoltan wrote: > I listed two scenarios. > 1. occasional bump of the readahead window for large requests, > for smaller requests it uses the originally set size > 2. permanent bump of the readahead window for large requests > (larger than previously seen), all subsequent requests use > the new size > > Both can be implemented easily, which one do you prefer? > If you always use very large requests, 1) behaves like 2) I'd say let's go for #2. #1 is probably more efficient but not what the programmer asked us to do. After all it's easy to increase the window size accordingly if you want so as a programmer. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
Hi, 2012-04-17 06:48 keltezéssel, Michael Meskes írta: > On Tue, Apr 17, 2012 at 06:02:34AM +0200, Boszormenyi Zoltan wrote: >> I listed two scenarios. >> 1. occasional bump of the readahead window for large requests, >> for smaller requests it uses the originally set size >> 2. permanent bump of the readahead window for large requests >> (larger than previously seen), all subsequent requests use >> the new size >> >> Both can be implemented easily, which one do you prefer? >> If you always use very large requests, 1) behaves like 2) > I'd say let's go for #2. #1 is probably more efficient but not what the > programmer asked us to do. After all it's easy to increase the window size > accordingly if you want so as a programmer. > > Michael OK, I will implement #2. Another question popped up: what to do with FETCH ALL? The current readahead window size or temporarily bumping it to say some tens of thousands can be used. We may not know how much is the "all records". This, although lowers performance, saves memory. Please, don't apply this patch yet. I discovered a rather big hole that can confuse the cursor position tracking if you do this: DECLARE mycur; MOVE ABSOLUTE n IN mycur; MOVE BACKWARD m IN mycur; If (n+m) is greater, but (n-m) is smaller than the number of rows in the cursor, the backend's and the caching code's ideas about where the cursor is will differ. I need to fix this before it can be applied. That will also need a new round of review. Sorry for that. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig& Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
> OK, I will implement #2. Another question popped up: what to do > with FETCH ALL? The current readahead window size or temporarily > bumping it to say some tens of thousands can be used. We may not > know how much is the "all records". This, although lowers performance, > saves memory. I would say doing a large fetch in two or three batches won't cost too much in terms of performance. > Please, don't apply this patch yet. I discovered a rather big hole > that can confuse the cursor position tracking if you do this: > ... > That will also need a new round of review. Sorry for that. No problem, better to find it now instead of after release. Anyway, I moved the patch to 2012-next (I hope I did it correctly) so 2012-1 can be closed. Let's try to get this patch done in the next commit fest. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
Hi, I am restarting this old thread... :-) 2012-04-24 10:17 keltezéssel, Michael Meskes írta: >> OK, I will implement #2. Another question popped up: what to do >> with FETCH ALL? The current readahead window size or temporarily >> bumping it to say some tens of thousands can be used. We may not >> know how much is the "all records". This, although lowers performance, >> saves memory. > I would say doing a large fetch in two or three batches won't cost too much in > terms of performance. > >> Please, don't apply this patch yet. I discovered a rather big hole >> that can confuse the cursor position tracking if you do this: >> ... >> That will also need a new round of review. Sorry for that. > No problem, better to find it now instead of after release. > > Anyway, I moved the patch to 2012-next (I hope I did it correctly) so 2012-1 > can be closed. Let's try to get this patch done in the next commit fest. > > Michael I had time to look into this patch of mine again after about 1.5 years. Frankly, this time was too long to remember every detail of the patch and looking at parts of the patch as a big entity was confusing. So I started fresh and to make review easier, I broke the patch up into small pieces that all build on each other. I have also fixed quite a few bugs, mostly in my code, but some in the ECPG parser and the regression tests as well. I have put the broken up patchset into a GIT tree of mine at GitHub: https://github.com/zboszor/ecpg-readahead/ but the huge compressed patch is also attached for reference. It was generated with $ git diff 221e92f64c6e136e550ec2592aac3ae0d4623209..870922676e6ae0faa4ebbf94b92e0b97ec418e16 ECPG regression tests are now Valgrind-clean except two of them but both are pre-existing bugs. 1. ecpg/test/compat_informix/rfmtlong.pgc points out a problem in ecpg/compatlib/informix.c ==5036== 1 errors in context 1 of 4: ==5036== Invalid read of size 4 ==5036== at 0x4E3453C: rfmtlong (informix.c:941) ==5036== by 0x4007DA: fmtlong.constprop.0 (rfmtlong.pgc:22) ==5036== by 0x4006BE: main (rfmtlong.pgc:45) ==5036== Address 0x60677d8 is 24 bytes inside a block of size 25 alloc'd ==5036== at 0x4C28409: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==5036== by 0x4E34268: rfmtlong (informix.c:783) ==5036== by 0x4007DA: fmtlong.constprop.0 (rfmtlong.pgc:22) ==5036== by 0x4006BE: main (rfmtlong.pgc:45) The same error is reported 4 times. 2. ecpg_add_mem() seems to leak memory: ==5463== 256 bytes in 16 blocks are definitely lost in loss record 1 of 1 ==5463== at 0x4C2A121: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==5463== by 0x4E3E153: ecpg_alloc (memory.c:21) ==5463== by 0x4E3E212: ecpg_add_mem (memory.c:110) ==5463== by 0x4E3542B: ecpg_store_result (execute.c:409) ==5463== by 0x4E37E5A: ecpg_process_output (execute.c:1777) ==5463== by 0x4E38CCA: ecpg_do (execute.c:2137) ==5463== by 0x4E38D8A: ECPGdo (execute.c:2159) ==5463== by 0x400A82: fn (alloc.pgc:51) ==5463== by 0x5152C52: start_thread (pthread_create.c:308) ==5463== by 0x545C13C: clone (clone.S:113) The last two issue we talked about in this thread are also implemented: - permanently raise the readahead window if the application sends a bigger FETCH command, and - temporarily raise the readahead window for FETCH ALL commands The cursor position tracking was completely rewritten, so the client side properly follows the cursor position known by the backend and doesn't skip MOVE statements where it shouldn't. The previously known bug is completely eliminated this way. Please, review that patch. Thanks in advance and best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
<div class="moz-cite-prefix">2013-08-17 12:08 keltezéssel, Boszormenyi Zoltan írta:<br /></div><blockquote cite="mid:520F4B90.2010800@cybertec.at"type="cite">Hi, <br /><br /> I am restarting this old thread... :-) <br /><br /> 2012-04-2410:17 keltezéssel, Michael Meskes írta: <br /><blockquote type="cite"><blockquote type="cite">OK, I will implement#2. Another question popped up: what to do <br /> with FETCH ALL? The current readahead window size or temporarily<br /> bumping it to say some tens of thousands can be used. We may not <br /> know how much is the "all records".This, although lowers performance, <br /> saves memory. <br /></blockquote> I would say doing a large fetch in twoor three batches won't cost too much in <br /> terms of performance. <br /><br /><blockquote type="cite">Please, don'tapply this patch yet. I discovered a rather big hole <br /> that can confuse the cursor position tracking if you dothis: <br /> ... <br /> That will also need a new round of review. Sorry for that. <br /></blockquote> No problem, betterto find it now instead of after release. <br /><br /> Anyway, I moved the patch to 2012-next (I hope I did it correctly)so 2012-1 <br /> can be closed. Let's try to get this patch done in the next commit fest. <br /><br /> Michael<br /></blockquote><br /> I had time to look into this patch of mine again after about 1.5 years. <br /> Frankly,this time was too long to remember every detail of the patch <br /> and looking at parts of the patch as a big entitywas confusing. <br /><br /> So I started fresh and to make review easier, I broke the patch up <br /> into small piecesthat all build on each other. I have also fixed quite <br /> a few bugs, mostly in my code, but some in the ECPG parserand <br /> the regression tests as well. <br /><br /> I have put the broken up patchset into a GIT tree of mine atGitHub: <br /><a class="moz-txt-link-freetext" href="https://github.com/zboszor/ecpg-readahead/">https://github.com/zboszor/ecpg-readahead/</a><br/> but the huge compressedpatch is also attached for reference. <br /> It was generated with <br /><br /> $ git diff 221e92f64c6e136e550ec2592aac3ae0d4623209..870922676e6ae0faa4ebbf94b92e0b97ec418e16<br/><br /> ECPG regression tests are nowValgrind-clean except two of them <br /> but both are pre-existing bugs. <br /><br /> 1. ecpg/test/compat_informix/rfmtlong.pgcpoints out a problem in <br /> ecpg/compatlib/informix.c <br /><br /> ==5036== 1errors in context 1 of 4: <br /> ==5036== Invalid read of size 4 <br /> ==5036== at 0x4E3453C: rfmtlong (informix.c:941)<br /> ==5036== by 0x4007DA: fmtlong.constprop.0 (rfmtlong.pgc:22) <br /> ==5036== by 0x4006BE: main(rfmtlong.pgc:45) <br /> ==5036== Address 0x60677d8 is 24 bytes inside a block of size 25 alloc'd <br /> ==5036== at 0x4C28409: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) <br /> ==5036== by 0x4E34268:rfmtlong (informix.c:783) <br /> ==5036== by 0x4007DA: fmtlong.constprop.0 (rfmtlong.pgc:22) <br /> ==5036== by 0x4006BE: main (rfmtlong.pgc:45) <br /><br /> The same error is reported 4 times. <br /><br /> 2. ecpg_add_mem()seems to leak memory: <br /><br /> ==5463== 256 bytes in 16 blocks are definitely lost in loss record 1 of1 <br /> ==5463== at 0x4C2A121: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) <br /> ==5463== by 0x4E3E153: ecpg_alloc (memory.c:21) <br /> ==5463== by 0x4E3E212: ecpg_add_mem (memory.c:110) <br /> ==5463== by 0x4E3542B: ecpg_store_result (execute.c:409) <br /> ==5463== by 0x4E37E5A: ecpg_process_output (execute.c:1777)<br /> ==5463== by 0x4E38CCA: ecpg_do (execute.c:2137) <br /> ==5463== by 0x4E38D8A: ECPGdo (execute.c:2159)<br /> ==5463== by 0x400A82: fn (alloc.pgc:51) <br /> ==5463== by 0x5152C52: start_thread (pthread_create.c:308)<br /> ==5463== by 0x545C13C: clone (clone.S:113) <br /><br /> The last two issue we talked aboutin this thread are also implemented: <br /> - permanently raise the readahead window if the application sends a <br/> bigger FETCH command, and <br /> - temporarily raise the readahead window for FETCH ALL commands <br /><br /> Thecursor position tracking was completely rewritten, so the client side <br /> properly follows the cursor position knownby the backend and doesn't <br /> skip MOVE statements where it shouldn't. The previously known <br /> bug is completelyeliminated this way. <br /><br /> Please, review that patch. <br /></blockquote><br /> I have another idea to makeECPG building on this huge patch.<br /><br /> Currently, UPDATE/DELETE WHERE CURRENT OF has to issue a MOVE<br /> beforethe command in case the cursor positions known by the application<br /> and the backend are different.<br /><br />My idea builds on the fact that UPDATE/DELETE RETURNING is present<br /> in all supported back branches.<br /><br /> Amini-parser only understanding SELECT, UPDATE and DELETE should<br /> be added to ecpglib, so<br /><br /> DECLARE cursorCURSOR FOR SELECT ...<br /><br /> and<br /><br /> PREPARE prepared_stmt FROM :query;<br /> DECLARE cursor CURSORFOR prepared_stmt;<br /><br /> can be analyzed and tweaked behind the application's back.<br /><br /> This is neededto detect whether a query is a simple updatable<br /> scan of a table, and returning errors early to the applicationif it's not,<br /> without actually sending the UPDATE/DELETE WHERE CURRENT OF<br /> query to the backend.<br/><br /> For the purpose of WHERE CURRENT OF, I would add a ctid<br /> column at the end of the targelist thatis treated like "resjunk"<br /> in the backend when returning data to the application.<br /><br /> So, SELECTs wouldreturn the ctid information of the tuples.<br /> The cursor query was a FETCH N with abs(N)>1 because of<br /> thereadahead. For this reason, the cursor positions known<br /> by the application and the backend are different.<br /><br/> The extra MOVE can be eliminated by replacing<br /><br /> UPDATE table SET ... WHERE CURRENT OF cursor;<br /><br/> with<br /><br /> UPDATE table SET ... WHERE ctid='...' RETURNING ctid;<br /><br /> Of course, if the originalquery contained RETURNING, the<br /> ctid field would be appended in resjunk fashion.<br /><br /> The new ctid canbe saved back into the cache using PQsetvalue()<br /> and the (position ; new ctid) pair into a hash, both can be<br />looked up in case the application wants to modify the<br /> same tuple again. Some protection is needed against growing<br/> the hash too big. But usually[*] programs don't go back<br /> to modify the same record twice.<br /><br /> [*]This is only an educated guess.<br /><br /> How about it?<br /><br /> Best regards,<br /> Zoltán Böszörményi<br /><br/><blockquote cite="mid:520F4B90.2010800@cybertec.at" type="cite"><br /> Thanks in advance and best regards, <br />Zoltán Böszörményi <br /><br /><br /><fieldset class="mimeAttachmentHeader"></fieldset><br /><pre wrap=""> </pre></blockquote><br /><br /><pre class="moz-signature" cols="90">-- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a> <aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a> </pre>
2013-08-17 12:08 keltezéssel, Boszormenyi Zoltan írta: > > I have put the broken up patchset into a GIT tree of mine at GitHub: > https://github.com/zboszor/ecpg-readahead/ > but the huge compressed patch is also attached for reference. I merged current PG GIT HEAD in the above tree and fixed a merge conflict caused by commit 673b527534893a4a8adb3cdef52fc645c13598ce The huge patch is attached for reference. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
On Wed, 2013-09-04 at 10:06 +0200, Boszormenyi Zoltan wrote: > 2013-08-17 12:08 keltezéssel, Boszormenyi Zoltan írta: > > > > I have put the broken up patchset into a GIT tree of mine at GitHub: > > https://github.com/zboszor/ecpg-readahead/ > > but the huge compressed patch is also attached for reference. > > I merged current PG GIT HEAD in the above tree and fixed a merge conflict > caused by commit 673b527534893a4a8adb3cdef52fc645c13598ce > > The huge patch is attached for reference. The documentation doesn't build: openjade:ecpg.sgml:478:8:E: end tag for "LITERAL" omitted, but OMITTAG NO was specified openjade:ecpg.sgml:477:40: start tag was here openjade:ecpg.sgml:478:8:E: end tag for "LITERAL" omitted, but OMITTAG NO was specified openjade:ecpg.sgml:477:20: start tag was here openjade:ecpg.sgml:478:8:E: end tag for "LITERAL" omitted, but OMITTAG NO was specified openjade:ecpg.sgml:473:81: start tag was here openjade:ecpg.sgml:478:8:E: end tag for "LITERAL" omitted, but OMITTAG NO was specified openjade:ecpg.sgml:473:56: start tag was here
2013-09-07 03:25 keltezéssel, Peter Eisentraut írta: > On Wed, 2013-09-04 at 10:06 +0200, Boszormenyi Zoltan wrote: >> 2013-08-17 12:08 keltezéssel, Boszormenyi Zoltan írta: >>> I have put the broken up patchset into a GIT tree of mine at GitHub: >>> https://github.com/zboszor/ecpg-readahead/ >>> but the huge compressed patch is also attached for reference. >> I merged current PG GIT HEAD in the above tree and fixed a merge conflict >> caused by commit 673b527534893a4a8adb3cdef52fc645c13598ce >> >> The huge patch is attached for reference. > The documentation doesn't build: > > openjade:ecpg.sgml:478:8:E: end tag for "LITERAL" omitted, but OMITTAG NO was specified > openjade:ecpg.sgml:477:40: start tag was here > openjade:ecpg.sgml:478:8:E: end tag for "LITERAL" omitted, but OMITTAG NO was specified > openjade:ecpg.sgml:477:20: start tag was here > openjade:ecpg.sgml:478:8:E: end tag for "LITERAL" omitted, but OMITTAG NO was specified > openjade:ecpg.sgml:473:81: start tag was here > openjade:ecpg.sgml:478:8:E: end tag for "LITERAL" omitted, but OMITTAG NO was specified > openjade:ecpg.sgml:473:56: start tag was here Thanks, fixed. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
You need to update the dblink regression tests.
2013-09-10 03:04 keltezéssel, Peter Eisentraut írta: > You need to update the dblink regression tests. Done. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
Proposal: UPDATE/DELETE ... WHERE OFFSET n OF cursor_name, was: Re: New ECPG idea, was: Re: ECPG FETCH readahead
From
Boszormenyi Zoltan
Date:
Hi, 2013-08-17 13:02 keltezéssel, Boszormenyi Zoltan írta: [snip, discussion of WHERE CURRENT OF in the ECPG client lib] I had a second thought about it and the client side caching and parser behind the application's back seems to be an overkill. Instead, I propose a different solution, which is a logical extension of FETCH { FORWARD | BACKWARD } N, which is a PostgreSQL extension. The proposed solution would be: UPDATE / DELETE ... WHERE OFFSET SignedIconst OF cursor_name I imagine that FETCH would keep the array of TIDs/ItemPointerDatas of the last FETCH statement. The argument to OFFSET would be mostly in negative terms, with 0 being equivalent of WHERE CURRENT OF. E.g.: FETCH 2 FROM mycur; -- fetches two rows UPDATE mytab SET ... WHERE OFFSET -1 OF mycur; -- updates the first row UPDATE mytab SET ... WHERE OFFSET 0 OF mycur; -- updates current row or FETCH 3 FROM mycur; -- fetches two rows, reaches end of the cursor UPDATE mytab SET ... WHERE OFFSET -2 OF mycur; -- updates the first row UPDATE mytab SET ... WHERE OFFSET -1 OF mycur; -- updates the second row UPDATE mytab SET ... WHERE OFFSET 0 OF mycur; -- throws an error like WHERE CURRENT OF or FETCH 3 FROM mycur; -- fetches two rows, reaches end of the cursor MOVE BACKWARD 2 IN mycur; UPDATE mytab SET ... WHERE OFFSET 0 OF mycur; -- updates the first row (now current) UPDATE mytab SET ... WHERE OFFSET 1 OF mycur; -- updates the second row The cached array can be kept valid until the next FETCH statement, even if moves out of the interval of the array, except in case the application changes the sign of the cursor position, e.g. previously it used MOVE ABSOLUTE with positive numbers and suddenly it switches to backward scanning with MOVE ABSOLUTE <negative number> or vice-versa. This would solve the only source of slowdown in the client side cursor caching in ECPG present in my current ECPG cursor readahead patch, there would be no more MOVE + UPDATE/DELETE WHERE CURRENT OF. On the other hand, exploiting this proposed feature in ECPG would make it incompatible with older servers unless it detects the server version it connects to and uses the current method. Comments? Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Boszormenyi Zoltan escribió: > 2013-09-10 03:04 keltezéssel, Peter Eisentraut írta: > >You need to update the dblink regression tests. > > Done. Dude, this is an humongous patch. I was shocked by it initially, but on further reading, I observed that it's only a huge patch which also does some mechanical changes to test output. I think it'd be better to split the part that's responsible for the changed lines in test output mentioning "ecpg_process_output". That should be a reasonably small patch which changes ecpg_execute slightly and adds the new function, is followed by the enormous resulting mechanical changes in test output. It should be possible to commit that relatively quickly. Then there's the rest of the patch, which would adds a huge pile of new code. I think there are some very minor changes to backend code as well -- would it make sense to post that as a separate piece? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
2013-10-11 00:16 keltezéssel, Alvaro Herrera írta: > Boszormenyi Zoltan escribió: >> 2013-09-10 03:04 keltezéssel, Peter Eisentraut írta: >>> You need to update the dblink regression tests. >> Done. > Dude, this is an humongous patch. You *know* that the patch is available in pieces at https://github.com/zboszor/ecpg-readahead right? You must have read the new "initial" announcement at http://archives.postgresql.org/message-id/520F4B90.2010800@cybertec.at You can review and merge them one by one. The transformation of ecpg_execute() and friends starts at 2d04ff83984c63c34e55175317e3e431eb58e00c > I was shocked by it initially, but on > further reading, I observed that it's only a huge patch which also does > some mechanical changes to test output. I think it'd be better to split > the part that's responsible for the changed lines in test output > mentioning "ecpg_process_output". It's 1e0747576e96aae3ec8c60c46baea130aaf8916e in the above repository. Also, another huge regression test patch is where the cursor readahead is enabled unconditionally: 27e43069082b29cb6fa4d3414e6484ec7fb80cbe > That should be a reasonably small > patch which changes ecpg_execute slightly and adds the new function, is > followed by the enormous resulting mechanical changes in test output. > It should be possible to commit that relatively quickly. Then there's > the rest of the patch, which would adds a huge pile of new code. > > I think there are some very minor changes to backend code as well -- > would it make sense to post that as a separate piece? It's 2ad207e6371a33d6a985c76ac066dd51ed5681cb Also, cd881f0363b1aff1508cfa347e8df6b4981f0ee7 to fix the dblink regression test. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
2013-10-11 00:16 keltezéssel, Alvaro Herrera írta: > Boszormenyi Zoltan escribió: >> 2013-09-10 03:04 keltezéssel, Peter Eisentraut írta: >>> You need to update the dblink regression tests. >> Done. > Dude, this is an humongous patch. I was shocked by it initially, but on > further reading, I observed that it's only a huge patch which also does > some mechanical changes to test output. I think it'd be better to split > the part that's responsible for the changed lines in test output > mentioning "ecpg_process_output". That should be a reasonably small > patch which changes ecpg_execute slightly and adds the new function, is > followed by the enormous resulting mechanical changes in test output. > It should be possible to commit that relatively quickly. Then there's > the rest of the patch, which would adds a huge pile of new code. > > I think there are some very minor changes to backend code as well -- > would it make sense to post that as a separate piece? I had to rebase the patch against current (today morning's) GIT, since there were a few changes against ECPG in the meantime. The old contents of my GIT repository was removed so you need to clone it fresh. https://github.com/zboszor/ecpg-readahead.git I won't post the humongous patch again, since sending a 90KB compressed file to everyone on the list is rude. You can pull the commits individually from the above repository. For the same reason, I won't add the DECLARE CURSOR command tag change separately since this is also part of this big feature. I have reordered some patches, like some independent bug fixes against the ECPG parser and regression tests. The backend change is also added early. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: > The old contents of my GIT repository was removed so you need to > clone it fresh. https://github.com/zboszor/ecpg-readahead.git > I won't post the humongous patch again, since sending a 90KB > compressed file to everyone on the list is rude. Patches of that weight show up on a regular basis. I don't think it's rude. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
2013-11-12 07:01 keltezéssel, Noah Misch írta: > On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: >> The old contents of my GIT repository was removed so you need to >> clone it fresh. https://github.com/zboszor/ecpg-readahead.git >> I won't post the humongous patch again, since sending a 90KB >> compressed file to everyone on the list is rude. > Patches of that weight show up on a regular basis. I don't think it's rude. OK, here it is. -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
2013-11-12 07:15 keltezéssel, Boszormenyi Zoltan írta: > 2013-11-12 07:01 keltezéssel, Noah Misch írta: >> On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: >>> The old contents of my GIT repository was removed so you need to >>> clone it fresh. https://github.com/zboszor/ecpg-readahead.git >>> I won't post the humongous patch again, since sending a 90KB >>> compressed file to everyone on the list is rude. >> Patches of that weight show up on a regular basis. I don't think it's rude. > > OK, here it is. I have rebased the patchset after "ecpg: Split off mmfatal() from mmerror()" since it caused merge conflicts. It's at the usual place again, you need to clone it from scratch if you are interested in looking at git diff/log I have removed some previous ecpg_log() debug output and the total patch size is not so huge any more but I am going to post the split-up set in parts. Attached is the first few patches that are strictly generic ECPG fixes. They can be applied independently and obvious enough. Subsequent patches will come as reply to this email. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
Modify the DECLARE CURSOR command tag depending on the scrollable flag
From
Boszormenyi Zoltan
Date:
2013-11-20 14:41 keltezéssel, Boszormenyi Zoltan írta: > 2013-11-12 07:15 keltezéssel, Boszormenyi Zoltan írta: >> 2013-11-12 07:01 keltezéssel, Noah Misch írta: >>> On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: >>>> The old contents of my GIT repository was removed so you need to >>>> clone it fresh. https://github.com/zboszor/ecpg-readahead.git >>>> I won't post the humongous patch again, since sending a 90KB >>>> compressed file to everyone on the list is rude. >>> Patches of that weight show up on a regular basis. I don't think it's rude. >> >> OK, here it is. > > ... > Subsequent patches will come as reply to this email. Attached is the patch that modified the command tag returned by the DECLARE CURSOR command. It returns "DECLARE SCROLL CURSOR" or "DECLARE NO SCROLL CURSOR" depending on the cursor's scrollable flag that can be determined internally even if neither is asked explicitly. It is expected by the ECPG cursor readahead code. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
2013-11-20 14:41 keltezéssel, Boszormenyi Zoltan írta: > 2013-11-12 07:15 keltezéssel, Boszormenyi Zoltan írta: >> 2013-11-12 07:01 keltezéssel, Noah Misch írta: >>> On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: >>>> The old contents of my GIT repository was removed so you need to >>>> clone it fresh. https://github.com/zboszor/ecpg-readahead.git >>>> I won't post the humongous patch again, since sending a 90KB >>>> compressed file to everyone on the list is rude. >>> Patches of that weight show up on a regular basis. I don't think it's rude. >> >> OK, here it is. > > ... > Subsequent patches will come as reply to this email. Infrastructure changes in ecpglib/execute.c to split up ECPGdo and ecpg_execute() and expose the parts as functions internal to ecpglib. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
2013-11-20 14:41 keltezéssel, Boszormenyi Zoltan írta: > 2013-11-12 07:15 keltezéssel, Boszormenyi Zoltan írta: >> 2013-11-12 07:01 keltezéssel, Noah Misch írta: >>> On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: >>>> The old contents of my GIT repository was removed so you need to >>>> clone it fresh. https://github.com/zboszor/ecpg-readahead.git >>>> I won't post the humongous patch again, since sending a 90KB >>>> compressed file to everyone on the list is rude. >>> Patches of that weight show up on a regular basis. I don't think it's rude. >> >> OK, here it is. > > ... > Subsequent patches will come as reply to this email. ecpg_log() fixes after part 1 that produces a lot of regression test changes. This patch is over 200K in itself so I send it separately and compressed. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
2013-11-20 14:41 keltezéssel, Boszormenyi Zoltan írta: > 2013-11-12 07:15 keltezéssel, Boszormenyi Zoltan írta: >> 2013-11-12 07:01 keltezéssel, Noah Misch írta: >>> On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: >>>> The old contents of my GIT repository was removed so you need to >>>> clone it fresh. https://github.com/zboszor/ecpg-readahead.git >>>> I won't post the humongous patch again, since sending a 90KB >>>> compressed file to everyone on the list is rude. >>> Patches of that weight show up on a regular basis. I don't think it's rude. >> >> OK, here it is. > > ... > Subsequent patches will come as reply to this email. Further ecpglib/execute.c changes. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
2013-11-20 14:41 keltezéssel, Boszormenyi Zoltan írta: > 2013-11-12 07:15 keltezéssel, Boszormenyi Zoltan írta: >> 2013-11-12 07:01 keltezéssel, Noah Misch írta: >>> On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: >>>> The old contents of my GIT repository was removed so you need to >>>> clone it fresh. https://github.com/zboszor/ecpg-readahead.git >>>> I won't post the humongous patch again, since sending a 90KB >>>> compressed file to everyone on the list is rude. >>> Patches of that weight show up on a regular basis. I don't think it's rude. >> >> OK, here it is. > > ... > Subsequent patches will come as reply to this email. This is another, semi independent subfeature of ECPG readahead. It's about 150K by itself, so I send it compressed. The purpose of this patch is to track (sub-)transactions and cursors in ecpglib to reduce network turnaround and speed up the application in certain cases. E.g. cursors are discarded upon ROLLBACK TO SAVEPOINT and ecpglib needs to know about it. When an unknown savepoint or cursor name is sent, ecpglib would not send the command to the server in an open transaction after this patch. Instead, it flips a "client-side error" flag and returns the same error the backend would in this case. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
2013-11-20 14:41 keltezéssel, Boszormenyi Zoltan írta: > 2013-11-12 07:15 keltezéssel, Boszormenyi Zoltan írta: >> 2013-11-12 07:01 keltezéssel, Noah Misch írta: >>> On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: >>>> The old contents of my GIT repository was removed so you need to >>>> clone it fresh. https://github.com/zboszor/ecpg-readahead.git >>>> I won't post the humongous patch again, since sending a 90KB >>>> compressed file to everyone on the list is rude. >>> Patches of that weight show up on a regular basis. I don't think it's rude. >> >> OK, here it is. > > ... > Subsequent patches will come as reply to this email. Patch 24 slightly speeds up ECPGsetcommit() at runtime, shifting the strcmp() cost to the ecpg preprocessor. It also avoids the confusion coming from a manually crafted C source that sends a string different from "on" and "off". Patch 25 is another subfeature of the ECPG readahead code. ECPGopen() gets the user-specified (or unspecified) scrollable flag. ECPGfetch() gets the amount of tuples to fetch separately as a fixed value, which may or may not be also present in the variable argument list. It makes it unnecessary to parse the FETCH/MOVE query and extract the amount of tuples that way. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
2013-11-20 14:41 keltezéssel, Boszormenyi Zoltan írta: > 2013-11-12 07:15 keltezéssel, Boszormenyi Zoltan írta: >> 2013-11-12 07:01 keltezéssel, Noah Misch írta: >>> On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: >>>> The old contents of my GIT repository was removed so you need to >>>> clone it fresh. https://github.com/zboszor/ecpg-readahead.git >>>> I won't post the humongous patch again, since sending a 90KB >>>> compressed file to everyone on the list is rude. >>> Patches of that weight show up on a regular basis. I don't think it's rude. >> >> OK, here it is. > > ... > Subsequent patches will come as reply to this email. $SUBJECT, the real thing. Needs all previous patches. Compared the the previous incarnation, a lot of debugging ecpg_log() lines are removed. The newly introduced regression tests have much smaller .stderr outputs because of this, the patch is about 800K smaller in total. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
2013-11-20 15:25 keltezéssel, Boszormenyi Zoltan írta: > 2013-11-20 14:41 keltezéssel, Boszormenyi Zoltan írta: >> 2013-11-12 07:15 keltezéssel, Boszormenyi Zoltan írta: >>> 2013-11-12 07:01 keltezéssel, Noah Misch írta: >>>> On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: >>>>> The old contents of my GIT repository was removed so you need to >>>>> clone it fresh. https://github.com/zboszor/ecpg-readahead.git >>>>> I won't post the humongous patch again, since sending a 90KB >>>>> compressed file to everyone on the list is rude. >>>> Patches of that weight show up on a regular basis. I don't think it's rude. >>> >>> OK, here it is. >> >> ... >> Subsequent patches will come as reply to this email. > Followup patches/subfeatures for the ECPG readahead code. The last (31st) patch drives all cursors through the readahead code and changes a lot of regression tests. It's over 100K so I send it compressed. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
Boszormenyi Zoltan <zb@cybertec.at> writes: > Attached is the patch that modified the command tag returned by > the DECLARE CURSOR command. It returns "DECLARE SCROLL CURSOR" > or "DECLARE NO SCROLL CURSOR" depending on the cursor's > scrollable flag that can be determined internally even if neither is > asked explicitly. This does not strike me as an acceptable change. It will break any code that's expecting the existing command tag, for little or no benefit to most applications. Even if backwards compatibility were of no concern, I'm not convinced it's a good thing to expose the backend's internal choices about query plans used for cursors, which is what this is basically doing. > It is expected by the ECPG cursor readahead code. And that doesn't sound like a sufficient excuse. You should only assume a cursor is scrollable if SCROLL was specified in the cursor declaration command, which it'd seem to me is something ECPG would or easily could know about commands it issues. regards, tom lane
Re: Modify the DECLARE CURSOR command tag depending on the scrollable flag
From
Boszormenyi Zoltan
Date:
2013-11-23 22:01 keltezéssel, Tom Lane írta: > Boszormenyi Zoltan <zb@cybertec.at> writes: >> Attached is the patch that modified the command tag returned by >> the DECLARE CURSOR command. It returns "DECLARE SCROLL CURSOR" >> or "DECLARE NO SCROLL CURSOR" depending on the cursor's >> scrollable flag that can be determined internally even if neither is >> asked explicitly. > This does not strike me as an acceptable change. It will break any code > that's expecting the existing command tag, for little or no benefit > to most applications. Even if backwards compatibility were of no concern, > I'm not convinced it's a good thing to expose the backend's internal > choices about query plans used for cursors, which is what this is > basically doing. I saw code in the backend allowing a cursor to be scrollable, although it was not declared as such. How about ripping that out? That way there would be no incentive for lazy SQL coding using simple cursors. You can argue that it would also break application compatibility but on the other hand, such a code has always been buggy and should be fixed. >> It is expected by the ECPG cursor readahead code. > And that doesn't sound like a sufficient excuse. You should only assume a > cursor is scrollable if SCROLL was specified in the cursor declaration > command, which it'd seem to me is something ECPG would or easily could > know about commands it issues. Yes, it can and I have a patch in the series passing this info to ecpglib. I am also arguing for backward compatibility on a different angle: this small backend change would still allow using simple cursors in ECPG while using the cursor readahead. And it's not the first time drivers have to adapt to new PostgreSQL major versions. If it was, I wouldn't have the courage to set a precedent either. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Boszormenyi Zoltan <zb@cybertec.at> writes: > 2013-11-23 22:01 keltez�ssel, Tom Lane �rta: >> Boszormenyi Zoltan <zb@cybertec.at> writes: >>> Attached is the patch that modified the command tag returned by >>> the DECLARE CURSOR command. It returns "DECLARE SCROLL CURSOR" >>> or "DECLARE NO SCROLL CURSOR" depending on the cursor's >>> scrollable flag that can be determined internally even if neither is >>> asked explicitly. >> This does not strike me as an acceptable change. It will break any code >> that's expecting the existing command tag, for little or no benefit >> to most applications. Even if backwards compatibility were of no concern, >> I'm not convinced it's a good thing to expose the backend's internal >> choices about query plans used for cursors, which is what this is >> basically doing. > I saw code in the backend allowing a cursor to be scrollable, although > it was not declared as such. How about ripping that out? That also fails the unnecessary-backwards-compatibility-break test. regards, tom lane
Re: Modify the DECLARE CURSOR command tag depending on the scrollable flag
From
Boszormenyi Zoltan
Date:
<div class="moz-cite-prefix">2013-11-27 19:16 keltezéssel, Tom Lane írta:<br /></div><blockquote cite="mid:19730.1385576177@sss.pgh.pa.us"type="cite"><pre wrap="">Boszormenyi Zoltan <a class="moz-txt-link-rfc2396E" href="mailto:zb@cybertec.at"><zb@cybertec.at></a>writes: </pre><blockquote type="cite"><pre wrap="">2013-11-23 22:01 keltezéssel, Tom Lane írta: </pre><blockquote type="cite"><pre wrap="">Boszormenyi Zoltan <a class="moz-txt-link-rfc2396E" href="mailto:zb@cybertec.at"><zb@cybertec.at></a>writes: </pre><blockquote type="cite"><pre wrap="">Attached is the patch that modified the command tag returned by the DECLARE CURSOR command. It returns "DECLARE SCROLL CURSOR" or "DECLARE NO SCROLL CURSOR" depending on the cursor's scrollable flag that can be determined internally even if neither is asked explicitly. </pre></blockquote></blockquote></blockquote><pre wrap=""> </pre><blockquote type="cite"><blockquote type="cite"><pre wrap="">This does not strike me as an acceptable change. It willbreak any code that's expecting the existing command tag, for little or no benefit to most applications. Even if backwards compatibility were of no concern, I'm not convinced it's a good thing to expose the backend's internal choices about query plans used for cursors, which is what this is basically doing. </pre></blockquote></blockquote><pre wrap=""> </pre><blockquote type="cite"><pre wrap="">I saw code in the backend allowing a cursor to be scrollable, although it was not declared as such. How about ripping that out? </pre></blockquote><pre wrap=""> That also fails the unnecessary-backwards-compatibility-break test.</pre></blockquote><br /> If you read the rest of themail, it turns out it wasn't a serious question.<br /><br /> Getting the SCROLL / NO SCROLL flags from the preprocessoris no problem.<br /> The only problem is when it's unspecified.<br /><br /> Treating it as NO SCROLL (or addingit to the DECLARE command behind<br /> the application's back) would break apps that want to scroll backward.<br />(Not ABSOLUTE.)<br /><br /> On the other hand, what problems would one face when adding SCROLL<br /> implicitly if it'sunspecified? It's not a workable solution either, see below.<br /><br /> As the documentation suggests, an applicationthat wants to use<br /> UPDATE/DELETE WHERE CURRENT OF ..., it's highly recommended<br /> that the cursor is fora FOR UPDATE query. Watch this scenario:<br /><br /> zozo=> begin;<br /> BEGIN<br /> zozo=> declare mycur cursorfor select * from t1 for update;<br /> DECLARE CURSOR<br /> zozo=> fetch from mycur;<br /> id | t <br /> ----+---<br/> 1 | a<br /> (1 row)<br /><br /> zozo=> fetch from mycur;<br /> id | t <br /> ----+---<br /> 2 | b<br/> (1 row)<br /><br /> zozo=> update t1 set t=t||'_x' where current of mycur;<br /> UPDATE 1<br /> zozo=> fetchfrom mycur;<br /> id | t <br /> ----+---<br /> 3 | c<br /> (1 row)<br /><br /> zozo=> delete from t1 where currentof mycur;<br /> DELETE 1<br /> zozo=> move absolute 0 in mycur;<br /> MOVE 0<br /> zozo=> fetch from mycur;<br/> id | t <br /> ----+---<br /> 1 | a<br /> (1 row)<br /><br /> zozo=> fetch from mycur;<br /> id | t <br/> ----+---<br /> (0 rows)<br /><br /> Although the server complains about MOVE BACKWARD, it's not<br /> complaining aboutMOVE ABSOLUTE, despite it's clearly moving<br /> backward. The cursor position is tracked in the backend in a long<br/> variable and it's not overflowed. This is also legacy behaviour,<br /> changing it would break backward compatibility.<br/><br /> The other problem I see is with the documentation: it says that<br /> the INSENSITIVE keyword isjust a placeholder, all cursors are insensitive.<br /> It's clearly false. Moving back to the start, previously existingrows<br /> won't show up again. It's not strictly a sensitive cursor, either,<br /> because the row with id=2 wouldshow up with the new value of "t".<br /> This neither sensitive nor insensitive behaviour is what the SQL<br /> standardcalls an "asensitive" cursor. It would worth a doc change.<br /> This is what's written in 9.3:<br /><br /> "<br/> If the cursor's query includes <tt class="LITERAL">FOR UPDATE</tt> or <tt class="LITERAL">FOR SHARE</tt>, then returnedrows are locked at the time they are first fetched, in the same way as for a regular <a href="http://www.postgresql.org/docs/9.3/interactive/sql-select.html">SELECT</a>command with these options. In addition,the returned rows will be the most up-to-date versions; therefore these options provide the equivalent of what theSQL standard calls a <span class="QUOTE">"sensitive cursor"</span>. (Specifying <tt class="LITERAL">INSENSITIVE</tt> togetherwith <tt class="LITERAL">FOR UPDATE</tt> or <tt class="LITERAL">FOR SHARE</tt> is an error.)<br /> "<br /> ( <a class="moz-txt-link-freetext" href="http://www.postgresql.org/docs/9.3/interactive/sql-declare.html">http://www.postgresql.org/docs/9.3/interactive/sql-declare.html</a> )<br/><br /> However, if the cursor is declared without FOR UPDATE, both<br /> the explicit SCROLL keyword (or implicit,if the query is simple),<br /> scrolling backward and DML with WHERE CURRENT OF are allowed.<br /> In this case,the cursor is really insensitive, FETCH statements<br /> after MOVE ABSOLUTE 0 return all rows with their original data.<br/><br /> This is just to show that adding SCROLL behind the application's<br /> back is also pointless. If the query(which can also be a prepared<br /> statement in ECPG) contains FOR UPDATE, adding SCROLL to the<br /> DECLARE statementwould make it fail.<br /><br /> If you consider all these:<br /><br /> - certain combinations of query and DECLAREstmt flags fail;<br /> - adding NO SCROLL is breaking backward compatibility;<br /> - the readahead code has to reallyknow whether the cursor is<br /> scrollable so it can behave just like the server;<br /><br /> then returning theSCROLL / NO SCROLL flag in the command tag is<br /> not a bad solution in my view. In fact, this was the only workable<br/> solution I could come up with to make it work reliably when neither<br /> SCROLL nor NO SCROLL is specifiedby the application.<br /><br /> Best regards,<br /> Zoltán Böszörményi<br /><br /><pre class="moz-signature" cols="90">-- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a> <aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a> </pre>
Re: Modify the DECLARE CURSOR command tag depending on the scrollable flag
From
Alvaro Herrera
Date:
Boszormenyi Zoltan escribió: > If you consider all these: > > - certain combinations of query and DECLARE stmt flags fail; > - adding NO SCROLL is breaking backward compatibility; > - the readahead code has to really know whether the cursor is > scrollable so it can behave just like the server; > > then returning the SCROLL / NO SCROLL flag in the command tag is > not a bad solution in my view. In fact, this was the only workable > solution I could come up with to make it work reliably when neither > SCROLL nor NO SCROLL is specified by the application. Would it work to have a function of some sort to which you give a cursor name and it returns whether it is scrollable or not? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Boszormenyi Zoltan <zb@cybertec.at> writes: > If you consider all these: > - certain combinations of query and DECLARE stmt flags fail; > - adding NO SCROLL is breaking backward compatibility; > - the readahead code has to really know whether the cursor is > scrollable so it can behave just like the server; If you're claiming that readahead inside ECPG will behave absolutely transparently in all cases, I think that's bogus anyway. Consider for example a query that will get a divide-by-zero error when it computes the 110th row --- but the application stops after fetching 100 rows. Everything's fine, until you insert some readahead logic. Queries containing volatile functions might also not be happy about readahead. Given these considerations, I think it'd be better to allow explicit application control over whether read-ahead happens for a particular query. And I have no problem whatsoever with requiring that the cursor be explicitly marked SCROLL or NO SCROLL before read-ahead will occur. regards, tom lane
Re: Modify the DECLARE CURSOR command tag depending on the scrollable flag
From
Peter Eisentraut
Date:
On 11/27/13, 2:49 PM, Alvaro Herrera wrote: > Would it work to have a function of some sort to which you give a cursor > name and it returns whether it is scrollable or not? That might make sense. I think this case is similar to the question whether a view is updatable. You wouldn't put that information in the CREATE VIEW command tag.
Re: Modify the DECLARE CURSOR command tag depending on the scrollable flag
From
Peter Eisentraut
Date:
On 11/27/13, 3:47 PM, Tom Lane wrote: > Given these considerations, I think it'd be better to allow explicit > application control over whether read-ahead happens for a particular > query. And I have no problem whatsoever with requiring that the cursor > be explicitly marked SCROLL or NO SCROLL before read-ahead will occur. Well, technically, unspecified means NO SCROLL according to the SQL standard. A lot of applications in ECPG are ported from other systems, which might make that assumption. It wouldn't be very nice to have to change all that.
Peter Eisentraut <peter_e@gmx.net> writes: > On 11/27/13, 3:47 PM, Tom Lane wrote: >> Given these considerations, I think it'd be better to allow explicit >> application control over whether read-ahead happens for a particular >> query. And I have no problem whatsoever with requiring that the cursor >> be explicitly marked SCROLL or NO SCROLL before read-ahead will occur. > Well, technically, unspecified means NO SCROLL according to the SQL > standard. A lot of applications in ECPG are ported from other systems, > which might make that assumption. It wouldn't be very nice to have to > change all that. Hm. So you're suggesting that ECPG fix this problem by inserting an explicit NO SCROLL clause into translated DECLARE CURSOR commands, if there's not a SCROLL clause? That would solve the problem of the ECPG library not being sure which behavior applies, but it might break existing apps that were unknowingly relying on a simple cursor being scrollable. OTOH any such app would be subject to breakage anyway as a result of planner changes, so it's hard to complain against this, as long as it's happening in a major version update. I'm for it. regards, tom lane
Re: Modify the DECLARE CURSOR command tag depending on the scrollable flag
From
Boszormenyi Zoltan
Date:
2013-11-27 20:49 keltezéssel, Alvaro Herrera írta: > Boszormenyi Zoltan escribió: > >> If you consider all these: >> >> - certain combinations of query and DECLARE stmt flags fail; >> - adding NO SCROLL is breaking backward compatibility; >> - the readahead code has to really know whether the cursor is >> scrollable so it can behave just like the server; >> >> then returning the SCROLL / NO SCROLL flag in the command tag is >> not a bad solution in my view. In fact, this was the only workable >> solution I could come up with to make it work reliably when neither >> SCROLL nor NO SCROLL is specified by the application. > Would it work to have a function of some sort to which you give a cursor > name and it returns whether it is scrollable or not? D'oh. Yes, that would also work. Thanks for the idea. :-) I will implement it and adapt my remaining patches. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Re: Modify the DECLARE CURSOR command tag depending on the scrollable flag
From
Boszormenyi Zoltan
Date:
2013-11-28 00:17 keltezéssel, Tom Lane írta: > Peter Eisentraut <peter_e@gmx.net> writes: >> On 11/27/13, 3:47 PM, Tom Lane wrote: >>> Given these considerations, I think it'd be better to allow explicit >>> application control over whether read-ahead happens for a particular >>> query. And I have no problem whatsoever with requiring that the cursor >>> be explicitly marked SCROLL or NO SCROLL before read-ahead will occur. >> Well, technically, unspecified means NO SCROLL according to the SQL >> standard. A lot of applications in ECPG are ported from other systems, >> which might make that assumption. It wouldn't be very nice to have to >> change all that. > Hm. So you're suggesting that ECPG fix this problem by inserting an > explicit NO SCROLL clause into translated DECLARE CURSOR commands, if > there's not a SCROLL clause? > > That would solve the problem of the ECPG library not being sure which > behavior applies, but it might break existing apps that were unknowingly > relying on a simple cursor being scrollable. OTOH any such app would be > subject to breakage anyway as a result of planner changes, so it's hard to > complain against this, as long as it's happening in a major version > update. > > I'm for it. If all such apps are expected to be ported from other system, then yes, that would also work. However, I am not so sure about this. One thing is sure. With this change, ecpglib can still work with older PostgreSQL versions but the application's behaviour changes if the cursor doesn't have an explicit SCROLL/NO SCROLL. In my first mail yesterday, I wrote "such a code has always been buggy and should be fixed" and I meant it seriously. I was only half serious with ripping this support code out of the backend. However, this feature might be deprecated and removed in about 3 major release or what the usual policy is. Inconsistency between different clients is also no good. You can only enforce similar client behaviour with the server behaviour. Anyway, is explicitly adding NO SCROLL the preferred solution for everyone? Best regards, Zoltán Böszörményi > > regards, tom lane > -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Re: Modify the DECLARE CURSOR command tag depending on the scrollable flag
From
Michael Meskes
Date:
On Thu, Nov 28, 2013 at 09:04:17AM +0100, Boszormenyi Zoltan wrote: > >>Well, technically, unspecified means NO SCROLL according to the SQL > >>standard. A lot of applications in ECPG are ported from other systems, That means by automatically adding a literal NO SCROLL to the command we obey standard, right? That's fine by me. > >behavior applies, but it might break existing apps that were unknowingly > >relying on a simple cursor being scrollable. OTOH any such app would be > >subject to breakage anyway as a result of planner changes, so it's hard to > >complain against this, as long as it's happening in a major version > >update. Ported applications might be in the same boat. I'm not sure if all other DBMSs stick with the standard if nothing is specified. So again, adding it might help make it clearer. > Anyway, is explicitly adding NO SCROLL the preferred solution for everyone? +1 from me. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at gmail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
Re: Modify the DECLARE CURSOR command tag depending on the scrollable flag
From
Boszormenyi Zoltan
Date:
2013-11-28 09:55 keltezéssel, Michael Meskes írta: > On Thu, Nov 28, 2013 at 09:04:17AM +0100, Boszormenyi Zoltan wrote: >>>> Well, technically, unspecified means NO SCROLL according to the SQL >>>> standard. A lot of applications in ECPG are ported from other systems, > That means by automatically adding a literal NO SCROLL to the command we obey > standard, right? That's fine by me. > >>> behavior applies, but it might break existing apps that were unknowingly >>> relying on a simple cursor being scrollable. OTOH any such app would be >>> subject to breakage anyway as a result of planner changes, so it's hard to >>> complain against this, as long as it's happening in a major version >>> update. > Ported applications might be in the same boat. I'm not sure if all other DBMSs > stick with the standard if nothing is specified. So again, adding it might help > make it clearer. > >> Anyway, is explicitly adding NO SCROLL the preferred solution for everyone? > +1 from me. Thanks, I will rework the patches this way. > > Michael -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
2013-11-20 14:53 keltezéssel, Boszormenyi Zoltan írta: > 2013-11-20 14:41 keltezéssel, Boszormenyi Zoltan írta: >> 2013-11-12 07:15 keltezéssel, Boszormenyi Zoltan írta: >>> 2013-11-12 07:01 keltezéssel, Noah Misch írta: >>>> On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: >>>>> The old contents of my GIT repository was removed so you need to >>>>> clone it fresh. https://github.com/zboszor/ecpg-readahead.git >>>>> I won't post the humongous patch again, since sending a 90KB >>>>> compressed file to everyone on the list is rude. >>>> Patches of that weight show up on a regular basis. I don't think it's rude. >>> >>> OK, here it is. >> >> ... >> Subsequent patches will come as reply to this email. > > Infrastructure changes in ecpglib/execute.c to split up > ECPGdo and ecpg_execute() and expose the parts as > functions internal to ecpglib. Rebased after killing the patch that changed the DECLARE CURSOR command tag. All infrastructure patches are attached, some of them compressed. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
2013-11-20 15:25 keltezéssel, Boszormenyi Zoltan írta: > 2013-11-20 14:41 keltezéssel, Boszormenyi Zoltan írta: >> 2013-11-12 07:15 keltezéssel, Boszormenyi Zoltan írta: >>> 2013-11-12 07:01 keltezéssel, Noah Misch írta: >>>> On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: >>>>> The old contents of my GIT repository was removed so you need to >>>>> clone it fresh. https://github.com/zboszor/ecpg-readahead.git >>>>> I won't post the humongous patch again, since sending a 90KB >>>>> compressed file to everyone on the list is rude. >>>> Patches of that weight show up on a regular basis. I don't think it's rude. >>> >>> OK, here it is. >> >> ... >> Subsequent patches will come as reply to this email. > > $SUBJECT, the real thing. Needs all previous patches. > Compared the the previous incarnation, a lot of debugging > ecpg_log() lines are removed. The newly introduced regression > tests have much smaller .stderr outputs because of this, > the patch is about 800K smaller in total. Rebased after killing the patch that changed the DECLARE CURSOR command tag. The ECPG readahead patch and all the small following patches are attached. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
Re: Modify the DECLARE CURSOR command tag depending on the scrollable flag
From
Peter Eisentraut
Date:
On Wed, 2013-11-27 at 18:17 -0500, Tom Lane wrote: > Hm. So you're suggesting that ECPG fix this problem by inserting an > explicit NO SCROLL clause into translated DECLARE CURSOR commands, if > there's not a SCROLL clause? I wouldn't go that far yet. Do I understand this right that the future readahead code needs separate behavior depending on whether a cursor is scrollable? I would think that whatever you do with NO SCROLL cursors would also work with SCROLL cursors, so if you don't know what the cursor is, just use the code for NO SCROLL.
Re: Modify the DECLARE CURSOR command tag depending on the scrollable flag
From
Boszormenyi Zoltan
Date:
2013-11-29 04:56 keltezéssel, Peter Eisentraut írta: > On Wed, 2013-11-27 at 18:17 -0500, Tom Lane wrote: >> Hm. So you're suggesting that ECPG fix this problem by inserting an >> explicit NO SCROLL clause into translated DECLARE CURSOR commands, if >> there's not a SCROLL clause? > I wouldn't go that far yet. > > Do I understand this right that the future readahead code needs separate > behavior depending on whether a cursor is scrollable? The same code is used. However, when the cursor is not scrollable, it returns an error early to the application when it wants to go backward. It still allows FETCH/MOVE ABSOLUTE to an earlier position just like the backend, in which case scanning the cursor is restarted from the beginning. If the cursor is not scrollable, the readahead code must not serve tuples from the currently populated cache backward. If the cursor is scrollable, FETCH BACKWARD is served from the cache. > I would think > that whatever you do with NO SCROLL cursors would also work with SCROLL > cursors, so if you don't know what the cursor is, just use the code for > NO SCROLL. > > > > -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
The changes are very well divided into logical units, so I think I could understand the ideas. I'm not too familiar with the ecpg internals, so consider this at least a coding review. git apply: Clean, except for one finding below Build: no errors/warnings Documentation build: no errors/warnings. The changes do appear in ecpg.html. Regression tests: all passed Non-Linux platforms: I can't verify, but I haven't noticed any new dependencies added. Comments (in the source code): good. (My) additional comments: 22.patch -------- tuples_left > 0 instead of just tuples_left seems to me safer in for-loops. Not sure if there's a convention for this though. 23.patch -------- git apply --verbose ~/cybertec/patches/ecpq/23.patch /home/anton/cybertec/patches/ecpq/23.patch:494: space before tab in indent. /*------ /home/anton/cybertec/patches/ecpq/23.patch:495: space before tab in indent. translator: this stringwill be truncated at 149 characters expanded. */ /home/anton/cybertec/patches/ecpq/23.patch:4019: trailing whitespace. exec sql close :curname; Tests - 23.patch ---------------- src/interfaces/ecpg/test/sql/cursorsubxact.pgc /* * Test the implicit RELEASE SAVEPOINT if a SAVEPOINT * is used with an already existing name. */ Shouldn't it be "... if a CURSOR is used with an already existing name?". Or just "... implicit RELEASE SAVEPOINT after an error"? I'd also appreciate a comment where exactly the savepoint is (implicitly) released. 23.patch and 24.patch --------------------- SO_MAJOR_VERSION and also interfaces/ecpg/include/ecpglib.h is changed Thus all client applications probably need to be preprocessed & compiled against the PG 9.4. I don't know how this is usually enforced. I'm mentioning it for the sake of completeness. // Antonin Houska (Tony) On 11/28/2013 03:21 PM, Boszormenyi Zoltan wrote: > 2013-11-20 14:53 keltezéssel, Boszormenyi Zoltan írta: >> 2013-11-20 14:41 keltezéssel, Boszormenyi Zoltan írta: >>> 2013-11-12 07:15 keltezéssel, Boszormenyi Zoltan írta: >>>> 2013-11-12 07:01 keltezéssel, Noah Misch írta: >>>>> On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: >>>>>> The old contents of my GIT repository was removed so you need to >>>>>> clone it fresh. https://github.com/zboszor/ecpg-readahead.git >>>>>> I won't post the humongous patch again, since sending a 90KB >>>>>> compressed file to everyone on the list is rude. >>>>> Patches of that weight show up on a regular basis. I don't think it's rude. >>>> >>>> OK, here it is. >>> >>> ... >>> Subsequent patches will come as reply to this email. >> >> Infrastructure changes in ecpglib/execute.c to split up >> ECPGdo and ecpg_execute() and expose the parts as >> functions internal to ecpglib. > > Rebased after killing the patch that changed the DECLARE CURSOR command tag. > All infrastructure patches are attached, some of them compressed. > > Best regards, > Zoltán Böszörményi > > > >
Thanks for the review. 2013-12-03 16:48 keltezéssel, Antonin Houska írta: > The changes are very well divided into logical units, so I think I could > understand the ideas. I'm not too familiar with the ecpg internals, so > consider this at least a coding review. > > > git apply: Clean, except for one finding below > > Build: no errors/warnings > > Documentation build: no errors/warnings. The changes do appear in ecpg.html. > > Regression tests: all passed > > Non-Linux platforms: I can't verify, but I haven't noticed any new > dependencies added. > > Comments (in the source code): good. > > > (My) additional comments: > > > 22.patch > -------- > > tuples_left > 0 > > instead of just > > tuples_left > > seems to me safer in for-loops. Not sure if there's a convention for > this though. I'll look at it, maybe the >0 had a reason. > > 23.patch > -------- > > git apply --verbose ~/cybertec/patches/ecpq/23.patch > /home/anton/cybertec/patches/ecpq/23.patch:494: space before tab in indent. > /*------ > /home/anton/cybertec/patches/ecpq/23.patch:495: space before tab in indent. > translator: this string will be truncated at > 149 characters expanded. */ > /home/anton/cybertec/patches/ecpq/23.patch:4019: trailing whitespace. > exec sql close :curname; I will fix the extra spaces. > Tests - 23.patch > ---------------- > > src/interfaces/ecpg/test/sql/cursorsubxact.pgc > > > /* > * Test the implicit RELEASE SAVEPOINT if a SAVEPOINT > * is used with an already existing name. > */ > > Shouldn't it be "... if a CURSOR is used with an already existing > name?". Or just "... implicit RELEASE SAVEPOINT after an error"? > I'd also appreciate a comment where exactly the savepoint is > (implicitly) released. If you do this: SAVEPOINT A; <some operations> SAVEPOINT A; /* same savepoint name */ then the operations between the two are implicitly committed to the higher subtransaction, i.e. it works as if there was a RELEASE SAVEPOINT A; before the second SAVEPOINT A; It happens to be tested with a DECLARE CURSOR statement and the runtime has to refresh its knowledge about the cursor by propagating it into a subtransaction one level higher. > 23.patch and 24.patch > --------------------- > > SO_MAJOR_VERSION and also interfaces/ecpg/include/ecpglib.h is changed > > Thus all client applications probably need to be preprocessed & compiled > against the PG 9.4. I don't know how this is usually enforced. I'm > mentioning it for the sake of completeness. The soversion has changed because of the changes in already existing exported functions. > > // Antonin Houska (Tony) > > > On 11/28/2013 03:21 PM, Boszormenyi Zoltan wrote: >> 2013-11-20 14:53 keltezéssel, Boszormenyi Zoltan írta: >>> 2013-11-20 14:41 keltezéssel, Boszormenyi Zoltan írta: >>>> 2013-11-12 07:15 keltezéssel, Boszormenyi Zoltan írta: >>>>> 2013-11-12 07:01 keltezéssel, Noah Misch írta: >>>>>> On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: >>>>>>> The old contents of my GIT repository was removed so you need to >>>>>>> clone it fresh. https://github.com/zboszor/ecpg-readahead.git >>>>>>> I won't post the humongous patch again, since sending a 90KB >>>>>>> compressed file to everyone on the list is rude. >>>>>> Patches of that weight show up on a regular basis. I don't think it's rude. >>>>> OK, here it is. >>>> ... >>>> Subsequent patches will come as reply to this email. >>> Infrastructure changes in ecpglib/execute.c to split up >>> ECPGdo and ecpg_execute() and expose the parts as >>> functions internal to ecpglib. >> Rebased after killing the patch that changed the DECLARE CURSOR command tag. >> All infrastructure patches are attached, some of them compressed. >> >> Best regards, >> Zoltán Böszörményi >> >> >> >> -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
2013-12-03 16:48 keltezéssel, Antonin Houska írta: > 22.patch > -------- > > tuples_left > 0 > > instead of just > > tuples_left > > seems to me safer in for-loops. Not sure if there's a convention for > this though. I decided not to change this. The "tuples_left" variable starts at a non-negative value and decreased by one in every iteration. The previous code also used the same "test for non-zero" test with the variable "ntuples". > 23.patch > -------- > > git apply --verbose ~/cybertec/patches/ecpq/23.patch > /home/anton/cybertec/patches/ecpq/23.patch:494: space before tab in indent. > /*------ > /home/anton/cybertec/patches/ecpq/23.patch:495: space before tab in indent. > translator: this string will be truncated at > 149 characters expanded. */ > /home/anton/cybertec/patches/ecpq/23.patch:4019: trailing whitespace. > exec sql close :curname; I fixed this in this patch and all subsequent ones. > Tests - 23.patch > ---------------- > > src/interfaces/ecpg/test/sql/cursorsubxact.pgc > > > /* > * Test the implicit RELEASE SAVEPOINT if a SAVEPOINT > * is used with an already existing name. > */ > > Shouldn't it be "... if a CURSOR is used with an already existing > name?". Or just "... implicit RELEASE SAVEPOINT after an error"? > I'd also appreciate a comment where exactly the savepoint is > (implicitly) released. I have already answered this in my previous answer. All patches are attached again for completeness. > 23.patch and 24.patch > --------------------- > > SO_MAJOR_VERSION and also interfaces/ecpg/include/ecpglib.h is changed > > Thus all client applications probably need to be preprocessed & compiled > against the PG 9.4. I don't know how this is usually enforced. I'm > mentioning it for the sake of completeness. > > // Antonin Houska (Tony) > > > On 11/28/2013 03:21 PM, Boszormenyi Zoltan wrote: >> 2013-11-20 14:53 keltezéssel, Boszormenyi Zoltan írta: >>> 2013-11-20 14:41 keltezéssel, Boszormenyi Zoltan írta: >>>> 2013-11-12 07:15 keltezéssel, Boszormenyi Zoltan írta: >>>>> 2013-11-12 07:01 keltezéssel, Noah Misch írta: >>>>>> On Mon, Nov 11, 2013 at 10:17:54AM +0100, Boszormenyi Zoltan wrote: >>>>>>> The old contents of my GIT repository was removed so you need to >>>>>>> clone it fresh. https://github.com/zboszor/ecpg-readahead.git >>>>>>> I won't post the humongous patch again, since sending a 90KB >>>>>>> compressed file to everyone on the list is rude. >>>>>> Patches of that weight show up on a regular basis. I don't think it's rude. >>>>> OK, here it is. >>>> ... >>>> Subsequent patches will come as reply to this email. >>> Infrastructure changes in ecpglib/execute.c to split up >>> ECPGdo and ecpg_execute() and expose the parts as >>> functions internal to ecpglib. >> Rebased after killing the patch that changed the DECLARE CURSOR command tag. >> All infrastructure patches are attached, some of them compressed. >> >> Best regards, >> Zoltán Böszörményi >> >> >> >> -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
2013-11-28 15:23 keltezéssel, Boszormenyi Zoltan írta: > Rebased after killing the patch that changed the DECLARE CURSOR command tag. > The ECPG readahead patch and all the small following patches are attached. Fixed the extra spaces that "git apply" complains about. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
2013-12-04 14:51 keltezéssel, Boszormenyi Zoltan írta: > 2013-12-03 16:48 keltezéssel, Antonin Houska írta: > >> Tests - 23.patch >> ---------------- >> >> src/interfaces/ecpg/test/sql/cursorsubxact.pgc >> >> >> /* >> * Test the implicit RELEASE SAVEPOINT if a SAVEPOINT >> * is used with an already existing name. >> */ >> >> Shouldn't it be "... if a CURSOR is used with an already existing >> name?". Or just "... implicit RELEASE SAVEPOINT after an error"? >> I'd also appreciate a comment where exactly the savepoint is >> (implicitly) released. > > I have already answered this in my previous answer. And I was wrong in that. The comments in the test were rearranged a little and the fact in the above comment is now actually tested. Some harmless unused variables were also removed and an uninitialized variable usage was fixed. Because of these and the above changes a lot of patches need to be rebased. All patches are attached again for completeness. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
Rebased patches after the regression test and other details were fixed in the infrastructure part. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Attachment
Tested git apply and build again. No warnings. The regression test also looks good to me now. I'm done with this review. (Not sure if I should move it to 'ready for committer' status or the CFM should do). // Antonin Houska (Tony) On 12/06/2013 02:01 PM, Boszormenyi Zoltan wrote: > 2013-12-04 14:51 keltezéssel, Boszormenyi Zoltan írta: >> 2013-12-03 16:48 keltezéssel, Antonin Houska írta: >> >>> Tests - 23.patch >>> ---------------- >>> >>> src/interfaces/ecpg/test/sql/cursorsubxact.pgc >>> >>> >>> /* >>> * Test the implicit RELEASE SAVEPOINT if a SAVEPOINT >>> * is used with an already existing name. >>> */ >>> >>> Shouldn't it be "... if a CURSOR is used with an already existing >>> name?". Or just "... implicit RELEASE SAVEPOINT after an error"? >>> I'd also appreciate a comment where exactly the savepoint is >>> (implicitly) released. >> >> I have already answered this in my previous answer. > > And I was wrong in that. The comments in the test were rearranged > a little and the fact in the above comment is now actually tested. > > Some harmless unused variables were also removed and an > uninitialized variable usage was fixed. Because of these and the above > changes a lot of patches need to be rebased. > > All patches are attached again for completeness. > > Best regards, > Zoltán Böszörményi >
On 12/6/13, 9:58 AM, Antonin Houska wrote: > Tested git apply and build again. No warnings. > > The regression test also looks good to me now. > > I'm done with this review. > > (Not sure if I should move it to 'ready for committer' status or the CFM > should do). You should do that, but I'll do it now.
This patch didn't make it out of the 2013-11 commit fest. You should move it to the next commit fest (probably with an updated patch) before January 15th.
2013-12-21 14:56 keltezéssel, Peter Eisentraut írta: > This patch didn't make it out of the 2013-11 commit fest. You should > move it to the next commit fest (probably with an updated patch) > before January 15th. Done. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Boszormenyi Zoltan escribió: > All patches are attached again for completeness. Thanks. I pushed a commit comprising patches 09 through 14. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera escribió: > Boszormenyi Zoltan escribió: > > > All patches are attached again for completeness. > > Thanks. I pushed a commit comprising patches 09 through 14. Now also pushed 15 to 17. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Boszormenyi Zoltan escribió: > Rebased patches after the regression test and other details were fixed > in the infrastructure part. This thread started in 2010, and various pieces have been applied already and some others have changed in nature. Would you please post a new patchset, containing rebased patches that still need application, in a new email thread to be linked in the commitfest entry? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
2014-01-16 22:13 keltezéssel, Alvaro Herrera írta: > Alvaro Herrera escribió: >> Boszormenyi Zoltan escribió: >> >>> All patches are attached again for completeness. >> Thanks. I pushed a commit comprising patches 09 through 14. > Now also pushed 15 to 17. Thanks very much.
2014-01-16 23:42 keltezéssel, Alvaro Herrera írta: > Boszormenyi Zoltan escribió: >> Rebased patches after the regression test and other details were fixed >> in the infrastructure part. > This thread started in 2010, and various pieces have been applied > already and some others have changed in nature. Would you please post a > new patchset, containing rebased patches that still need application, in > a new email thread to be linked in the commitfest entry? I will do that. Best regards, Zoltán Böszörményi
Hi, Alvaro Herrera wrote: > Boszormenyi Zoltan escribió: >> Rebased patches after the regression test and other details were fixed >> in the infrastructure part. > > This thread started in 2010, and various pieces have been applied > already and some others have changed in nature. Would you please post a > new patchset, containing rebased patches that still need application, in > a new email thread to be linked in the commitfest entry? I hope Thunderbird did the right thing and didn't include the reference message ID when I told it to "edit as new". So supposedly this is a new thread but with all the cc: addresses kept. I have rebased all remaining patches and kept the numbering. All the patches from 18 to 25 that were previously in the "ECPG infrastructure" CF link are included here. There is no functional change. Best regards, Zoltán Böszörményi
Attachment
2014-01-18 18:08 keltezéssel, Boszormenyi Zoltan írta: > Hi, > > > Alvaro Herrera wrote: >> Boszormenyi Zoltan escribió: >>> Rebased patches after the regression test and other details were fixed >>> in the infrastructure part. >> >> This thread started in 2010, and various pieces have been applied >> already and some others have changed in nature. Would you please post a >> new patchset, containing rebased patches that still need application, in >> a new email thread to be linked in the commitfest entry? > > I hope Thunderbird did the right thing and didn't include the reference > message ID when I told it to "edit as new". So supposedly this is a new > thread but with all the cc: addresses kept. > > I have rebased all remaining patches and kept the numbering. > All the patches from 18 to 25 that were previously in the > "ECPG infrastructure" CF link are included here. > > There is no functional change. Because of the recent bugfixes in the ECPG area, the patchset didn't apply cleanly anymore. Rebased with no functional change. Best regards, Zoltán Böszörményi
Attachment
2014-04-16 10:54 keltezéssel, Boszormenyi Zoltan írta: > 2014-01-18 18:08 keltezéssel, Boszormenyi Zoltan írta: >> Hi, >> >> >> Alvaro Herrera wrote: >>> Boszormenyi Zoltan escribió: >>>> Rebased patches after the regression test and other details were fixed >>>> in the infrastructure part. >>> >>> This thread started in 2010, and various pieces have been applied >>> already and some others have changed in nature. Would you please post a >>> new patchset, containing rebased patches that still need application, in >>> a new email thread to be linked in the commitfest entry? >> >> I hope Thunderbird did the right thing and didn't include the reference >> message ID when I told it to "edit as new". So supposedly this is a new >> thread but with all the cc: addresses kept. >> >> I have rebased all remaining patches and kept the numbering. >> All the patches from 18 to 25 that were previously in the >> "ECPG infrastructure" CF link are included here. >> >> There is no functional change. > > Because of the recent bugfixes in the ECPG area, the patchset > didn't apply cleanly anymore. Rebased with no functional change. The patches themselves are a little bigger though. It's because the patches are generated with 10 lines of context, this was set in my $HOME/.gitconfig: [diff] context = 10 Best regards, Zoltán Böszörményi
I haven't been too familiar with the ECPG internals so far but tried to do my best. Generic criteria ---------------- * Does it follow the project coding guidelines? Yes. * Are there portability issues? Shouldn't be. I even noticed the code tries to avoid platform-specific behaviour of standard library function - see comment about strtoll() in Linux in 25.patch. (I personally don't know how that function works elsewhere but that shouldn't matter.) * Are the comments sufficient and accurate? Yes, mostly. Just a few improvements recommended below. * Does it do what it says, correctly? IMO, yes. * Does it produce compiler warnings? No. * Can you make it crash? No. Only some of the parts deserve comments: 23.patch -------- Reviewed earlier as a component of the relate patch (http://www.postgresql.org/message-id/52A1E61A.7010100@gmail.com) and minimum changes done since that time. Nevertheless, a few more comments: * How about a regression test for the new ECPGcursor_dml() function? * ECPGtrans() - arguments are explained, but the return (bool) value should be as well. * line breaks (pgindent might help): static void output_cursor_name(struct cursor *ptr) { instead of static void output_cursor_name(struct cursor *ptr) { * confused messages in src/interfaces/ecpg/test/sql/cursorsubxact.pgc, starting at 100: exec sql open :curname; if (sqlca.sqlcode < 0) printf("open %s (case sensitive) failed with SQLSTATE %5s\n", curname, sqlca.sqlstate); else printf("close %s (case sensitive) succeeded\n", curname); I suppose both should be "open". 26.patch (the READAHEAD feature itself) --------------------------------------- I tried to understand the code but couldn't find any obvious error. The coding style looks clean. Maybe just the arguments and return value of the ecpglib functinons (ECPGopen, ECPGopen, ECPGfetch, ...) deserve a bit more attention. As for tests, I find them comprehensive and almost everything they do is clear to me. Just the following was worth questions: * sql-cursor-ra-fetch.stderr [NO_PID]: ecpg_execute on line 169: query: move forward all in scroll_cur; with 0 parameter(s) on connection regress1 ... [NO_PID]: ecpg_execute on line 169: query: move relative -3 in scroll_cur; with 0 parameter(s) on As the first iteration is special anyway, wouldn't "move absolute -3" be more efficient than the existing 2 commands? The following test (also FETCH RELATIVE) uses "move absolute": [NO_PID]: ecpg_execute on line 186: query: move absolute -20 in scroll_cur; with 0 parameter(s) on connection regress1 Other than this, I had an idea to improve the behaviour if READAHEAD is smaller than the actual step, but then realized that 29.patch actually does fix that :-) * cursor-ra-move.pgc What's the relevant difference between unspec_cur1 and unspec_cur2 cursors? There's no difference in scrollability or ordering. And the tests seem to be identical. * cursor-ra-swdir.pgc No questions * cursorsubxact.pgc This appears to be the test we discussed earlier: http://www.postgresql.org/message-id/52A1CAB6.9020600@cybertec.at The only difference I see is minor change of log message of DECLARE command. Therefore I didn't have to recheck the logic of the test. 28.patch -------- * ecpg_cursor_do_move_absolute() and ecpg_cursor_do_move_all() should better be declared in 26.patch. Just a pedantic comment - all the parts will probably be applied all at once. Other ----- Besides the individual parts I propose some typo fixes and improvements in wording: * doc/src/sgml/ecpg.sgml 462c462 < ECPG does cursor accounting in its runtime library and this makes possible --- > ECPG does cursor accounting in its runtime library and this makes it possible 504c504 < recommended to recompile using option <option>-r readahead=number</option> --- > recommended to recompile it using option <option>-r readahead=number</option> * src/interfaces/ecpg/ecpglib/extern.h 97c97 < * The cursor was created in this level of * (sub-)transaction. --- > * The cursor was created at this level of (sub-)transaction. * src/interfaces/ecpg/ecpglib/README.cursor+subxact 4c4 < Contents of tuples returned by a cursor always reflect the data present at --- > Contents of tuples returned by a cursor always reflects the data present at 29c29 < needs two operations. If the next command issued by the application spills --- > need two operations. If the next command issued by the application spills 32c32 < kinds of commands as is after 3 cache misses. FETCH FORWARD/BACKWARD allows --- > kinds of commands "as is" after 3 cache misses. FETCH FORWARD/BACKWARD allows 81c81 < I can also be negative (but also 1-based) if the application started with --- > It can also be negative (but also 1-based) if the application started with 132c132 < is that (sub-)transactions are also needed to be tracked. These two are --- > is that (sub-)transactions also need to be tracked. These two are 148c148 < and he issued command may be executed without an error, causing unwanted --- > and the issued command may be executed without an error, causing unwanted In addition, please consider the following stuff in src/interfaces/ecpg/ecpglib/README.cursor+subxact - it can't be expressed as diff because I'm not 100% sure about the intended meaning "either implicitly propagated" - I'd expect "either ... or ...". Or remove no "either" at all? "Committing a transaction or releasing a subtransaction make cursors ..." -> "Committing a transaction or releasing a subtransactionpropagates the cursor(s) ... " ? "The idea behind cursor readahead is to move one tuple less than asked by the application ..." My understanding of sql-cursor-ra-fetch.stderr is that the application does not have to do MOVE explicitly. Maybe "... to move to the adjacent lower / upper tuple of the supposed beginning of the result set and then have ecpglib perform FETCH FORWARD / BACKWARD N respectively ..." Consider it just a tentative proposal, I can easily be wrong :-) That's it for my review. // Tony On 04/16/2014 10:54 AM, Boszormenyi Zoltan wrote: > 2014-01-18 18:08 keltezéssel, Boszormenyi Zoltan írta: >> Hi, >> >> >> Alvaro Herrera wrote: >>> Boszormenyi Zoltan escribió: >>>> Rebased patches after the regression test and other details were fixed >>>> in the infrastructure part. >>> >>> This thread started in 2010, and various pieces have been applied >>> already and some others have changed in nature. Would you please post a >>> new patchset, containing rebased patches that still need application, in >>> a new email thread to be linked in the commitfest entry? >> >> I hope Thunderbird did the right thing and didn't include the reference >> message ID when I told it to "edit as new". So supposedly this is a new >> thread but with all the cc: addresses kept. >> >> I have rebased all remaining patches and kept the numbering. >> All the patches from 18 to 25 that were previously in the >> "ECPG infrastructure" CF link are included here. >> >> There is no functional change. > > Because of the recent bugfixes in the ECPG area, the patchset > didn't apply cleanly anymore. Rebased with no functional change. > > Best regards, > Zoltán Böszörményi >
Antonin Houska wrote: > I haven't been too familiar with the ECPG internals so far but tried to > do my best. I'm afraid we're stuck on this patch until Michael has time to review it, or some other committer wants to acquire maintainership rights in the ECPG code. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Hi, thanks for the review. 2014-04-23 17:20 keltezéssel, Antonin Houska írta: > I haven't been too familiar with the ECPG internals so far but tried to > do my best. > > > Generic criteria > ---------------- > > * Does it follow the project coding guidelines? > > Yes. > > > * Are there portability issues? > > Shouldn't be. I even noticed the code tries to avoid platform-specific > behaviour of standard library function - see comment about strtoll() in > Linux in 25.patch. (I personally don't know how that function works > elsewhere but that shouldn't matter.) > > > * Are the comments sufficient and accurate? > > Yes, mostly. Just a few improvements recommended below. > > > * Does it do what it says, correctly? > IMO, yes. > > > * Does it produce compiler warnings? > No. > > > * Can you make it crash? > No. > > > Only some of the parts deserve comments: > > > 23.patch > -------- > > Reviewed earlier as a component of the relate patch > (http://www.postgresql.org/message-id/52A1E61A.7010100@gmail.com) > and minimum changes done since that time. Nevertheless, a few more comments: > > * How about a regression test for the new ECPGcursor_dml() function? It makes sense to add one. > > * ECPGtrans() - arguments are explained, but the return (bool) value > should be as well. All exported ECPG functions returns bool. IIRC the code generated by "EXEC SQL WHENEVER <something-else-than-CONTINUE>" makes use of the returned value. > > * line breaks (pgindent might help): > > static void > output_cursor_name(struct cursor *ptr) > { > > instead of > > static void output_cursor_name(struct cursor *ptr) > { OK. > > > * confused messages in src/interfaces/ecpg/test/sql/cursorsubxact.pgc, > starting at 100: > > exec sql open :curname; > if (sqlca.sqlcode < 0) > printf("open %s (case sensitive) failed with SQLSTATE > %5s\n", curname, sqlca.sqlstate); > else > printf("close %s (case sensitive) succeeded\n", curname); > > I suppose both should be "open". Yes, oversight. > > > 26.patch (the READAHEAD feature itself) > --------------------------------------- > > I tried to understand the code but couldn't find any obvious error. The > coding style looks clean. Maybe just the arguments and return value of > the ecpglib functinons (ECPGopen, ECPGopen, ECPGfetch, ...) deserve a > bit more attention. What do you mean exactly? > > As for tests, I find them comprehensive and almost everything they do is > clear to me. Just the following was worth questions: > > * sql-cursor-ra-fetch.stderr > > [NO_PID]: ecpg_execute on line 169: query: move forward all in > scroll_cur; with 0 parameter(s) on connection regress1 > ... > [NO_PID]: ecpg_execute on line 169: query: move relative -3 in > scroll_cur; with 0 parameter(s) on > > As the first iteration is special anyway, wouldn't "move absolute -3" be > more efficient than the existing 2 commands? The caching code tries to do the correct thing whichever direction the cursor is scanned. AFAIR this one explicitly tests invalidating the readahead window if you fall off it by using MOVE FORWARD ALL. > > The following test (also FETCH RELATIVE) uses "move absolute": > > [NO_PID]: ecpg_execute on line 186: query: move absolute -20 in > scroll_cur; with 0 parameter(s) on connection regress1 > > > Other than this, I had an idea to improve the behaviour if READAHEAD is > smaller than the actual step, but then realized that 29.patch actually > does fix that :-) B-) > * cursor-ra-move.pgc > > What's the relevant difference between unspec_cur1 and unspec_cur2 > cursors? There's no difference in scrollability or ordering. And the > tests seem to be identical. Oh. Erm. That is a leftover from a previous incarnation when I thought it's a good idea to let the server return the actual scrollability flag because the server can actually know. The simple query when running in psql is implicitly scrollable but the query with the join is not. That idea was shot down with prejudice but I forgot to adjust the test and remove one of these cursors. Now all queries where scrollability is not specified are explicitly modified (behind the application's back) to have NO SCROLL. ( Please, don't revise your previous opinion, Tom Lane included... ;-) ) > > > * cursor-ra-swdir.pgc > > No questions > > > * cursorsubxact.pgc > > This appears to be the test we discussed earlier: > http://www.postgresql.org/message-id/52A1CAB6.9020600@cybertec.at > > The only difference I see is minor change of log message of DECLARE > command. Therefore I didn't have to recheck the logic of the test. > > > 28.patch > -------- > > * ecpg_cursor_do_move_absolute() and ecpg_cursor_do_move_all() should > better be declared in 26.patch. Just a pedantic comment - all the parts > will probably be applied all at once. When I re-roll the patchset, I will move this chunk to 26. > > > Other > ----- > > Besides the individual parts I propose some typo fixes and > improvements in wording: > > > * doc/src/sgml/ecpg.sgml > > 462c462 > < ECPG does cursor accounting in its runtime library and this makes > possible > --- >> ECPG does cursor accounting in its runtime library and this makes > it possible > 504c504 > < recommended to recompile using option <option>-r > readahead=number</option> > --- >> recommended to recompile it using option <option>-r > readahead=number</option> > > > * src/interfaces/ecpg/ecpglib/extern.h > > 97c97 > < * The cursor was created in this level of * (sub-)transaction. > --- >> * The cursor was created at this level of (sub-)transaction. > > * src/interfaces/ecpg/ecpglib/README.cursor+subxact > > 4c4 > < Contents of tuples returned by a cursor always reflect the data present at > --- >> Contents of tuples returned by a cursor always reflects the data > present at Contents of tuples ... reflect ... Plural. The verb is correct as is. :-) > 29c29 > < needs two operations. If the next command issued by the application spills > --- >> need two operations. If the next command issued by the application spills I'll re-read and see what context "need" is in. Hey, a context diff would have made it more obvious. ;-) > 32c32 > < kinds of commands as is after 3 cache misses. FETCH FORWARD/BACKWARD > allows > --- >> kinds of commands "as is" after 3 cache misses. FETCH FORWARD/BACKWARD > allows > 81c81 > < I can also be negative (but also 1-based) if the application started with > --- >> It can also be negative (but also 1-based) if the application started with Sure. This sentence is not about "I" but it's true either way. :-D > 132c132 > < is that (sub-)transactions are also needed to be tracked. These two are > --- >> is that (sub-)transactions also need to be tracked. These two are > 148c148 > < and he issued command may be executed without an error, causing unwanted > --- >> and the issued command may be executed without an error, causing unwanted I will fix these. > > > In addition, please consider the following stuff in > src/interfaces/ecpg/ecpglib/README.cursor+subxact - it can't be > expressed as diff because I'm not 100% sure about the intended meaning > > > "either implicitly propagated" - I'd expect "either ... or ...". Or > remove no "either" at all? > > > > "Committing a transaction or releasing a subtransaction make cursors ..." > -> > "Committing a transaction or releasing a subtransaction propagates the > cursor(s) ... " > ? > > > > "The idea behind cursor readahead is to move one tuple less than asked > by the application ..." > > My understanding of sql-cursor-ra-fetch.stderr is that the application > does not have to do MOVE explicitly. Maybe > > "... to move to the adjacent lower / upper tuple of the supposed > beginning of the result set and then have ecpglib perform FETCH FORWARD > / BACKWARD N respectively ..." > > Consider it just a tentative proposal, I can easily be wrong :-) I will read your comments again with fresh eyes. > > That's it for my review. Thank you very much. > > // Tony > > On 04/16/2014 10:54 AM, Boszormenyi Zoltan wrote: >> 2014-01-18 18:08 keltezéssel, Boszormenyi Zoltan írta: >>> Hi, >>> >>> >>> Alvaro Herrera wrote: >>>> Boszormenyi Zoltan escribió: >>>>> Rebased patches after the regression test and other details were fixed >>>>> in the infrastructure part. >>>> This thread started in 2010, and various pieces have been applied >>>> already and some others have changed in nature. Would you please post a >>>> new patchset, containing rebased patches that still need application, in >>>> a new email thread to be linked in the commitfest entry? >>> I hope Thunderbird did the right thing and didn't include the reference >>> message ID when I told it to "edit as new". So supposedly this is a new >>> thread but with all the cc: addresses kept. >>> >>> I have rebased all remaining patches and kept the numbering. >>> All the patches from 18 to 25 that were previously in the >>> "ECPG infrastructure" CF link are included here. >>> >>> There is no functional change. >> Because of the recent bugfixes in the ECPG area, the patchset >> didn't apply cleanly anymore. Rebased with no functional change. >> >> Best regards, >> Zoltán Böszörményi >> > >
[Now I'm only replying where my explanation seems useful. If you expect anything else, please remind me.] On 04/23/2014 06:41 PM, Boszormenyi Zoltan wrote: > > All exported ECPG functions returns bool. IIRC the code generated by > "EXEC SQL WHENEVER <something-else-than-CONTINUE>" makes use > of the returned value. ok >> >> 26.patch (the READAHEAD feature itself) >> --------------------------------------- >> Maybe just the arguments and return value of >> the ecpglib functinons (ECPGopen, ECPGopen, ECPGfetch, ...) deserve a >> bit more attention. > > What do you mean exactly? Basically the missing description of return type was most blatant, but you explained it above. Now that I see some of the existing library functions, the descriptions of parameters are neither too eloquent. So ignore this remark. >> >> * sql-cursor-ra-fetch.stderr >> >> [NO_PID]: ecpg_execute on line 169: query: move forward all in >> scroll_cur; with 0 parameter(s) on connection regress1 >> ... >> [NO_PID]: ecpg_execute on line 169: query: move relative -3 in >> scroll_cur; with 0 parameter(s) on >> >> As the first iteration is special anyway, wouldn't "move absolute -3" be >> more efficient than the existing 2 commands? > > The caching code tries to do the correct thing whichever direction > the cursor is scanned. AFAIR this one explicitly tests invalidating > the readahead window if you fall off it by using MOVE FORWARD ALL. I have no doubt about correctness of the logic, just suspect that a single MOVE command could do the action. Perhaps consider it my paranoia and let committer judge if it's worth a change (especially if the related amount of coding would seem inadequate). >> >> Other >> ----- >> >> Besides the individual parts I propose some typo fixes and >> improvements in wording: >> >> >> * doc/src/sgml/ecpg.sgml In general, I'm not English native speaker, can be wrong in some cases. Just pointed out what I thought is worth checking. // Tony
On 04/23/2014 05:24 PM, Alvaro Herrera wrote: > Antonin Houska wrote: >> I haven't been too familiar with the ECPG internals so far but tried to >> do my best. > > I'm afraid we're stuck on this patch until Michael has time to review > it, or some other committer wants to acquire maintainership rights in > the ECPG code. > Committer availability might well be the issue, but missing review probably too. Whether this review is enough to move the patch to "ready for committer" - I tend to let the next CFM decide. (I don't find it productive to ignite another round of discussion about kinds of reviews - already saw some.) // Tony
On Thu, Apr 24, 2014 at 12:15:41AM +0200, Antonin Houska wrote: > Whether this review is enough to move the patch to "ready for committer" > - I tend to let the next CFM decide. (I don't find it productive to > ignite another round of discussion about kinds of reviews - already saw > some.) In today's CF process, we trust reviewers to make that decision. When all unambiguous defects known to you have been resolved, please mark the patch Ready for Committer. Your review covered more ground than the average review, so don't worry about it being too cursory to qualify. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Thanks an awful lot Antonin. > Committer availability might well be the issue, but missing review > probably too. Yes, you're right. If my taks is mostly one last glance and a commit I will make time for that. > Whether this review is enough to move the patch to "ready for committer" > - I tend to let the next CFM decide. (I don't find it productive to > ignite another round of discussion about kinds of reviews - already saw > some.) I saw some remarks in your review that Zoltan wants to address. Once I got the updated version I'll have a look at it. Zoltan, could you send a new version by end of day tomorrow? I'll be sitting on a plane for a longer time again on Saturday. :) Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at gmail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
2014-04-24 14:50 keltezéssel, Michael Meskes írta: > Thanks an awful lot Antonin. > >> Committer availability might well be the issue, but missing review >> probably too. > Yes, you're right. If my taks is mostly one last glance and a commit I will make time for that. > >> Whether this review is enough to move the patch to "ready for committer" >> - I tend to let the next CFM decide. (I don't find it productive to >> ignite another round of discussion about kinds of reviews - already saw >> some.) > I saw some remarks in your review that Zoltan wants to address. Once I got the > updated version I'll have a look at it. > > Zoltan, could you send a new version by end of day tomorrow? I'll be sitting on > a plane for a longer time again on Saturday. :) I will try to. Thanks in advance, Zoltán > > Michael
Just a quickie: I remember noticing earlier that a few comments on functions would probably get mangled badly by pgindent. You probably want to wrap them in /*----- */ to avoid this. In a very quick glance now I saw them in ecpg_get_data, ecpg_cursor_next_pos, ECPGfetch. Perhaps you want to run pgindent on the files you modify, review the changes, and apply tweaks to avoid unwanted ones. (I don't mean to submit pgindented files, because they will be fixed later on anyway; I only suggest tweaking things that would be damaged.) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
2014-04-24 15:19 keltezéssel, Boszormenyi Zoltan írta: > 2014-04-24 14:50 keltezéssel, Michael Meskes írta: >> Thanks an awful lot Antonin. >> >>> Committer availability might well be the issue, but missing review >>> probably too. >> Yes, you're right. If my taks is mostly one last glance and a commit I will make time >> for that. >> >>> Whether this review is enough to move the patch to "ready for committer" >>> - I tend to let the next CFM decide. (I don't find it productive to >>> ignite another round of discussion about kinds of reviews - already saw >>> some.) >> I saw some remarks in your review that Zoltan wants to address. Once I got the >> updated version I'll have a look at it. >> >> Zoltan, could you send a new version by end of day tomorrow? I'll be sitting on >> a plane for a longer time again on Saturday. :) > > I will try to. Unfortunately, I won't make the deadline because of life (I had to attend a funeral today) and because Antonin has opened a can of worms with this comment: > * How about a regression test for the new ECPGcursor_dml() function? There are some aspects that may need a new discussion. The SQL standard wants an "updatable cursor" for positioned DML (i.e. UPDATE/DELETE with the WHERE CURRENT OF clause) This means passing FOR UPDATE in the query. FOR UPDATE + SCROLL cursor is an impossible combination, ERROR is thrown when DECLARE is executed. This combination can (and should?) be detected in the ECPG preprocessor and it would prevent runtime errors. It's not implemented at the moment. Fortunately, a previous discussion resulted in explicitly passing NO SCROLL for cursors where SCROLL is not specified, it's in 25.patch I intend to extend it a little for SQL standard compliance with embedded SQL. FOR UPDATE should also implicitly mean NO SCROLL. Both the FOR UPDATE + explicit SCROLL and the explicit SCROLL + usage of positioned DML can be detected in the preprocessor and they should throw an error. Then the regression test would really make sense. But these checks in ECPG would still leave a big hole, and it's the other DECLARE variant with the query passed in a prepared statement with "EXEC SQL PREPARE prep_stmt FROM :query_string;" Plugging this hole would require adding a simplified syntax checker to libecpg that only checks the SelectStmt or re-adding the backend code to tell the application the cursor's scrollable (and perhaps the updatable) property. I must have forgotten but surely this was the reason for changing the DECLARE command tag in the first place which was shot down already. So, only the other choice remains, the syntax checker in ecpglib. I think implementing it would make the caching code miss 9.4, since it adds a whole new set of code but the Perl magic for the ECPG syntax checker may be mostly reusable here. Best regards, Zoltán Böszörményi