Thread: ProcessUtility_hook
We can hook SELECT, INSERT, UPDATE and DELETE with ExecutorXxx_hooks,
but there is no way to hook DDL commands. Can I submit a patch to add
ProcessUtility_hook?

The pg_stat_statements module also handles DDL commands the same as
normal queries. Fortunately, autovacuum calls vacuum() directly, so the
statistics will not be filled with VACUUM and ANALYZE commands issued
by autovacuum.

There might be other uses for the hook. Since we can filter all SQL
commands using ExecutorRun_hook and ProcessUtility_hook, we could write
query text for database audit with a programmable filter in the hooks.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Itagaki Takahiro escreveu:
> We can hook SELECT, INSERT, UPDATE and DELETE with ExecutorXxx_hooks,
> but there is no way to hook DDL commands. Can I submit a patch to add
> ProcessUtility_hook?
>
+1. Hey, I was thinking about that yesterday. ;)

> The pg_stat_statements module also handles DDL commands the same as
> normal queries. Fortunately, autovacuum calls vacuum() directly, so
> the statistics will not be filled with VACUUM and ANALYZE commands
> issued by autovacuum.
>
I haven't looked at the code, but maybe there is a way to hook
something there too. But I'd leave it alone for now, unless it's a few
lines of code.

--
Euler Taveira de Oliveira
http://www.timbira.com/
Here is a patch to add ProcessUtility_hook to handle all DDL commands
in user modules. (ProcessUtility_hook_20091021.patch)

It adds a global variable 'ProcessUtility_hook' and exports the
original function as 'standard_ProcessUtility'.

Euler Taveira de Oliveira <euler@timbira.com> wrote:
> > The pg_stat_statements module also handles DDL commands the same as
> > normal queries. Fortunately, autovacuum calls vacuum() directly, so
> > the statistics will not be filled with VACUUM and ANALYZE commands
> > issued by autovacuum.
> >
> I haven't looked at the code, but maybe there is a way to hook
> something there too. But I'd leave it alone for now, unless it's a
> few lines of code.

I see it is debatable whether pg_stat_statements should support the
hook, so I split that change into a separate patch.
(pgss_utility_hook_20091021.patch)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
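[The shape of the proposal mirrors the existing executor hooks: core
keeps a global function pointer and exports the original body as
standard_ProcessUtility, and a loadable module saves and chains any
previously installed hook. A minimal sketch of the module side,
assuming the hook signature from this era of the patch (the argument
list changed in later releases):

    #include "postgres.h"
    #include "fmgr.h"
    #include "tcop/utility.h"

    PG_MODULE_MAGIC;

    void _PG_init(void);

    static ProcessUtility_hook_type prev_ProcessUtility = NULL;

    /* Observe the utility (DDL) statement, then hand control onward. */
    static void
    my_ProcessUtility(Node *parsetree, const char *queryString,
                      ParamListInfo params, bool isTopLevel,
                      DestReceiver *dest, char *completionTag)
    {
        /* e.g. write queryString to an audit log here */

        if (prev_ProcessUtility)
            prev_ProcessUtility(parsetree, queryString, params,
                                isTopLevel, dest, completionTag);
        else
            standard_ProcessUtility(parsetree, queryString, params,
                                    isTopLevel, dest, completionTag);
    }

    void
    _PG_init(void)
    {
        /* Save any hook already installed so we chain, not clobber. */
        prev_ProcessUtility = ProcessUtility_hook;
        ProcessUtility_hook = my_ProcessUtility;
    }

A module loaded via shared_preload_libraries would then see every
utility statement, which is what makes the auditing use case mentioned
upthread possible.]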
Itagaki Takahiro escreveu:
> Here is a patch to add ProcessUtility_hook to handle all DDL commands
> in user modules. (ProcessUtility_hook_20091021.patch)
> It adds a global variable 'ProcessUtility_hook' and exports the
> original function as 'standard_ProcessUtility'.
>
I've reviewed your patch. It applies cleanly and compiles without
warnings. It lacks documentation. Could you update it?

The functionality is divided into two parts. The first part is a hook
in the utility module. The idea is to capture the commands that don't
pass through the executor. I'm afraid that that hook will be used only
for capturing non-DML queries. If so, why don't we hack tcop/postgres.c
and grab those queries from the same point where we log statements?
This feature is similar to the executor one, but in the executor case
we use the plan for other things. Other uses are (i) using that hook to
filter out some non-DML commands and (ii) developing some non-DML
replication mechanism for trigger-based replication solutions.

The second part is to use that hook to capture non-DML commands for the
pg_stat_statements module. Do we need to have rows = 0 for non-DML
commands? Maybe NULL could be an appropriate value. The PREPARE command
stopped counting the number of rows. Should we count the rows in the
EXECUTE command or in the PREPARE command? The other command that
doesn't count properly is COPY. Could you fix that?

I'm concerned about storing some commands that expose passwords, like
CREATE ROLE foo PASSWORD 'secret'; I know that the queries are only
shown to superusers, but maybe we should add this information to the
documentation or provide a mechanism to exclude those commands.

I don't know if it is worth the trouble adding some code to capture
VACUUM and ANALYZE commands called inside autovacuum.

--
Euler Taveira de Oliveira
http://www.timbira.com/
Euler Taveira de Oliveira <euler@timbira.com> wrote:
> The functionality is divided into two parts. The first part is a hook
> in the utility module. The idea is to capture the commands that don't
> pass through the executor. I'm afraid that that hook will be used
> only for capturing non-DML queries. If so, why don't we hack
> tcop/postgres.c and grab those queries from the same point where we
> log statements?

DDL can be used from user-defined functions. That is the same reason we
have executor hooks instead of tcop hooks.

> The second part is to use that hook to capture non-DML commands for
> the pg_stat_statements module.

- I fixed a bug: it should handle only top-level (isTopLevel) commands.
- A new parameter pg_stat_statements.track_ddl (boolean) is added to
  enable or disable the feature.

> Do we need to have rows = 0 for non-DML commands?
> Maybe NULL could be an appropriate value.

Yes, but it requires additional management to handle 0 and NULL
separately. I don't think it is worth doing.

> The PREPARE command stopped counting the number of rows. Should we
> count the rows in the EXECUTE command or in the PREPARE command?

It requires major rewrites of the EXECUTE command to pass the number of
affected rows to the caller. I doubt it is worth fixing because almost
all drivers use protocol-level prepared statements instead of
PREPARE+EXECUTE.

> The other command that doesn't count properly is COPY. Could you fix
> that?

I added code for it.

> I'm concerned about storing some commands that expose passwords, like
> CREATE ROLE foo PASSWORD 'secret'; I know that the queries are only
> shown to superusers, but maybe we should add this information to the
> documentation or provide a mechanism to exclude those commands.

I think there is no additional problem there because we can see the
'secret' command using pg_stat_activity even now.

> I don't know if it is worth the trouble adding some code to capture
> VACUUM and ANALYZE commands called inside autovacuum.

I'd like to exclude VACUUM and ANALYZE issued by autovacuum, because
pg_stat_statements should not be filled with those commands.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
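[The on/off switch described above would be registered from _PG_init()
with the usual custom-GUC machinery; roughly like this. A sketch only:
the parameter keeps the track_ddl name it has at this point in the
thread, and DefineCustomBoolVariable's argument list has varied across
releases (older ones lack the check_hook argument):

    #include "postgres.h"
    #include "utils/guc.h"

    static bool pgss_track_ddl = true;

    /* Called from _PG_init(): register the utility-tracking switch. */
    static void
    pgss_define_guc(void)
    {
        DefineCustomBoolVariable("pg_stat_statements.track_ddl",
                                 "Selects whether DDL commands are tracked.",
                                 NULL,            /* no long description */
                                 &pgss_track_ddl,
                                 true,            /* enabled by default */
                                 PGC_SUSET,       /* superusers may change */
                                 0,               /* no flags */
                                 NULL,            /* check_hook */
                                 NULL,            /* assign_hook */
                                 NULL);           /* show_hook */
    }
]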
I have applied this patch, with only a minor wording improvement:

    Specify <literal>on</> to track DDL commands, which excludes <command>SELECT</>,
                                                  ^^^^^^^^^^^^^^

Thanks.

---------------------------------------------------------------------------

Itagaki Takahiro wrote:
> Euler Taveira de Oliveira <euler@timbira.com> wrote:
>
> > The functionality is divided into two parts. The first part is a
> > hook in the utility module. The idea is to capture the commands
> > that don't pass through the executor. I'm afraid that that hook
> > will be used only for capturing non-DML queries. If so, why don't
> > we hack tcop/postgres.c and grab those queries from the same point
> > where we log statements?
>
> DDL can be used from user-defined functions. That is the same reason
> we have executor hooks instead of tcop hooks.
>
> > The second part is to use that hook to capture non-DML commands for
> > the pg_stat_statements module.
>
> - I fixed a bug: it should handle only top-level (isTopLevel)
>   commands.
> - A new parameter pg_stat_statements.track_ddl (boolean) is added to
>   enable or disable the feature.
>
> > Do we need to have rows = 0 for non-DML commands?
> > Maybe NULL could be an appropriate value.
>
> Yes, but it requires additional management to handle 0 and NULL
> separately. I don't think it is worth doing.
>
> > The PREPARE command stopped counting the number of rows. Should we
> > count the rows in the EXECUTE command or in the PREPARE command?
>
> It requires major rewrites of the EXECUTE command to pass the number
> of affected rows to the caller. I doubt it is worth fixing because
> almost all drivers use protocol-level prepared statements instead of
> PREPARE+EXECUTE.
>
> > The other command that doesn't count properly is COPY. Could you
> > fix that?
>
> I added code for it.
>
> > I'm concerned about storing some commands that expose passwords,
> > like CREATE ROLE foo PASSWORD 'secret'; I know that the queries are
> > only shown to superusers, but maybe we should add this information
> > to the documentation or provide a mechanism to exclude those
> > commands.
>
> I think there is no additional problem there because we can see the
> 'secret' command using pg_stat_activity even now.
>
> > I don't know if it is worth the trouble adding some code to capture
> > VACUUM and ANALYZE commands called inside autovacuum.
>
> I'd like to exclude VACUUM and ANALYZE issued by autovacuum, because
> pg_stat_statements should not be filled with those commands.
>
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center
>
> [ Attachment, skipping... ]

--
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB                      http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes:
> I have applied this patch, with only a minor wording improvement:

Uh, we weren't even done reviewing this, were we?

			regards, tom lane
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > I have applied this patch, with only a minor wording improvement:
>
> Uh, we weren't even done reviewing this, were we?

Uh, I am new to this commitfest wiki thing, but it did have a review by
Euler Taveira de Oliveira:

	https://commitfest.postgresql.org/action/patch_view?id=196

and the author replied. Is there more that needs to be done?

--
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB                      http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> Uh, we weren't even done reviewing this, were we?

> Uh, I am new to this commitfest wiki thing, but it did have a review
> by Euler Taveira de Oliveira:
> 	https://commitfest.postgresql.org/action/patch_view?id=196
> and the author replied. Is there more that needs to be done?

It wasn't marked Ready For Committer, so presumably the reviewer wasn't
done with it. I know I hadn't looked at it at all, because I was
waiting for the commitfest review process to finish.

			regards, tom lane
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom Lane wrote:
> >> Uh, we weren't even done reviewing this, were we?
>
> > Uh, I am new to this commitfest wiki thing, but it did have a
> > review by Euler Taveira de Oliveira:
> > 	https://commitfest.postgresql.org/action/patch_view?id=196
> > and the author replied. Is there more that needs to be done?
>
> It wasn't marked Ready For Committer, so presumably the reviewer
> wasn't done with it. I know I hadn't looked at it at all, because
> I was waiting for the commitfest review process to finish.

So, if someone writes a patch, and it is reviewed, and the patch author
updates the patch and replies, it still should be reviewed again before
being committed? I was unclear on that. The updated patch only appeared
today, so maybe it was ready, but the commit fest manager has to
indicate that in the status before I review/apply it? Should I revert
the patch?

So there is nothing for me to do to help? The only two patches I see as
ready for committer are HS and SR; not going to touch those. ;-)

Also, we are two weeks into the commit fest and we have more unapplied
patches than applied ones.

--
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB                      http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
I wrote:
> It wasn't marked Ready For Committer, so presumably the reviewer
> wasn't done with it. I know I hadn't looked at it at all, because
> I was waiting for the commitfest review process to finish.

... and now that I have, I find at least four highly questionable
things about it:

1. The placement of the hook. Why is it three lines down in
ProcessUtility? It's probably reasonable to have the Assert first,
but I don't see why the hook function should have the ability to
editorialize on the behavior of everything about ProcessUtility
*except* the read-only-xact check.

2. The naming and documentation of the added GUC setting for
pg_stat_statements. "track_ddl" seems pretty bizarre to me because
there are many utility statements that no one would call DDL. COPY,
for example, is certainly not DDL. Why not call it "track_utility"?

3. The enable-condition test in pgss_ProcessUtility. Is it really
appropriate to be gating this by isTopLevel? I should think that
the nested_level check in pgss_enabled would be sufficient and
more likely to do what's expected.

4. The special case for CopyStmt. That's just weird, and it adds
a maintenance requirement we don't need. I don't see a really good
argument why COPY (alone among utility statements) deserves to have
a rowcount tracked by pg_stat_statements, but even if you want that
it'd be better to rely on examining the completionTag after the fact.
The fact that the tag is "COPY nnnn" is part of the user-visible API
for COPY and won't change lightly. The division of labor between
ProcessUtility and copy.c is far more volatile, but this patch has
injected itself into that.

			regards, tom lane
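[For context on point 3: pg_stat_statements maintains its own executor
nesting depth, and the pgss_enabled gate Tom refers to looks roughly
like this, paraphrased from the module's source of that era; pgss_track
and the PGSS_TRACK_* constants are its existing "track" setting:

    /*
     * nested_level is incremented by the module's ExecutorRun hook and
     * decremented on the way out, so it is zero only for top-level
     * queries.
     */
    static int nested_level = 0;

    /* Track everything, or only statements at nesting depth zero. */
    #define pgss_enabled() \
        (pgss_track == PGSS_TRACK_ALL || \
         (pgss_track == PGSS_TRACK_TOP && nested_level == 0))
]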
Bruce Momjian <bruce@momjian.us> writes:
> So, if someone writes a patch, and it is reviewed, and the patch
> author updates the patch and replies, it still should be reviewed
> again before being committed?

Well, that's for the reviewer to say --- if the update satisfies his
concerns, he should sign off on it; if not, not. I've tried to avoid
pre-empting that process.

> Also, we are two weeks into the commit fest and we have more
> unapplied patches than applied ones.

Yup. Lots of unfinished reviews out there. Robert spent a good deal of
effort in the last two fests trying to light fires under reviewers; do
you want to take up that cudgel? I think wholesale commits of things
that haven't finished review is mostly going to send a signal that the
review process doesn't matter, which is *not* the signal I think we
should send.

			regards, tom lane
OK, reverted and placed back in "Needs Review" status.

---------------------------------------------------------------------------

Tom Lane wrote:
> I wrote:
> > It wasn't marked Ready For Committer, so presumably the reviewer
> > wasn't done with it. I know I hadn't looked at it at all, because
> > I was waiting for the commitfest review process to finish.
>
> ... and now that I have, I find at least four highly questionable
> things about it:
>
> 1. The placement of the hook. Why is it three lines down in
> ProcessUtility? It's probably reasonable to have the Assert first,
> but I don't see why the hook function should have the ability to
> editorialize on the behavior of everything about ProcessUtility
> *except* the read-only-xact check.
>
> 2. The naming and documentation of the added GUC setting for
> pg_stat_statements. "track_ddl" seems pretty bizarre to me because
> there are many utility statements that no one would call DDL. COPY,
> for example, is certainly not DDL. Why not call it "track_utility"?
>
> 3. The enable-condition test in pgss_ProcessUtility. Is it really
> appropriate to be gating this by isTopLevel? I should think that
> the nested_level check in pgss_enabled would be sufficient and
> more likely to do what's expected.
>
> 4. The special case for CopyStmt. That's just weird, and it adds
> a maintenance requirement we don't need. I don't see a really good
> argument why COPY (alone among utility statements) deserves to have
> a rowcount tracked by pg_stat_statements, but even if you want that
> it'd be better to rely on examining the completionTag after the fact.
> The fact that the tag is "COPY nnnn" is part of the user-visible API
> for COPY and won't change lightly. The division of labor between
> ProcessUtility and copy.c is far more volatile, but this patch has
> injected itself into that.
>
> 			regards, tom lane

--
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB                      http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > So, if someone writes a patch, and it is reviewed, and the patch
> > author updates the patch and replies, it still should be reviewed
> > again before being committed?
>
> Well, that's for the reviewer to say --- if the update satisfies his
> concerns, he should sign off on it; if not, not. I've tried to avoid
> pre-empting that process.

OK, so the reviewer knows he has to reply to the author's comments, OK.

> > Also, we are two weeks into the commit fest and we have more
> > unapplied patches than applied ones.
>
> Yup. Lots of unfinished reviews out there. Robert spent a good deal
> of effort in the last two fests trying to light fires under
> reviewers; do you want to take up that cudgel? I think wholesale
> commits of things

I am afraid I am then duplicating work the commit fest manager is
doing, and if someone is bugged by me and the CF manager, they might
get upset.

> that haven't finished review is mostly going to send a signal that
> the review process doesn't matter, which is *not* the signal I think
> we should send.

True. Maybe I am best focusing on open issues like the threading and
psql -1 patches I worked on today. There is certainly enough of that
stuff to keep me busy. I thought I could help with the commit fest
load, but now I am unsure. That non-commit-fest stuff has to be done
too, so maybe managing that will help.

--
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB                      http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote:
> So, if someone writes a patch, and it is reviewed, and the patch
> author updates the patch and replies, it still should be reviewed
> again before being committed?

That's generally how things were expected to happen--to at least
double-check that the proposed fixes match what was expected. I don't
think it was spelled out very well in the past though.

> Also, we are two weeks into the commit fest and we have more
> unapplied patches than applied ones.

Considering that one of those was a holiday week with a lot of schedule
disruption preceding it, I don't know how much faster things could have
moved. There were large chunks of the reviewer volunteers who wanted
only jobs they could finish before the holiday, and others who weren't
available at all until afterwards. And I don't know about everyone
else, but I took all four days off and started today behind by several
hundred list messages. I was planning to start nagging again tomorrow,
hoping that most would be caught up from any holiday time off too by
then.

Right now, of the 16 patches listed in "Needs Review" besides the ECPG
ones Michael is working through, the breakdown is like this:

Not reviewed at all yet: 6
Reviewed once, updated, waiting for re-review: 10

So the bulk of the problem for keeping the pipeline moving is in these
re-reviews holding up transitions to "Ready for Committer". I've had
some discussion with Robert about how to better distinguish in the
management app when the reviewer has work they're expected to do vs.
when they think they're done with things. We're converging toward a
more clear set of written guidelines to provide for managing future
CommitFests as part of that; right now there are a few too many fuzzy
parts for my liking.

If the need here is to speed up how fast things are fed to committers,
we can certainly do that. The current process still favors having
reviewers do as much as possible first, as shown by all the stuff
sitting in the re-review queue. The work we're waiting on them for
could be done by the committers instead if we want to shorten the whole
process a bit. I don't think that's really what you want though.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com
Greg Smith wrote:
> If the need here is to speed up how fast things are fed to
> committers, we can certainly do that. The current process still
> favors having reviewers do as much as possible first, as shown by all
> the stuff sitting in the re-review queue. The work we're waiting on
> them for could be done by the committers instead if we want to
> shorten the whole process a bit. I don't think that's really what you
> want though.

As I have observed before, I think we need some infrastructure to help
committers claim items, so we don't duplicate work.

Right now the only items marked "ready for committer" are Streaming
Replication and Hot Standby, which I imagine Heikki will be handling.

I'm going to look at the YAML format for EXPLAIN patch shortly.

cheers

andrew
Andrew Dunstan <andrew@dunslane.net> writes:
> As I have observed before, I think we need some infrastructure to
> help committers claim items, so we don't duplicate work.

Robert acknowledged the need for a "claimed by committer" field in the
fest application, but he hasn't got round to it yet. In the meantime
I've been adding a "Taking this one..." type of comment to an entry I
want to claim.

> I'm going to look at the YAML format for EXPLAIN patch shortly.

Do we have consensus yet that we want YAML? It seemed, well, yet
another format without all that much advantage over what's there.

			regards, tom lane
On Nov 30, 2009, at 8:08 PM, Tom Lane wrote:
>> I'm going to look at the YAML format for EXPLAIN patch shortly.
>
> Do we have consensus yet that we want YAML? It seemed, well,
> yet another format without all that much advantage over what's
> there.

Legibility++

David
Tom Lane wrote:
>> I'm going to look at the YAML format for EXPLAIN patch shortly.
>
> Do we have consensus yet that we want YAML? It seemed, well,
> yet another format without all that much advantage over what's
> there.

I think you and I are the two main skeptics ;-) But we seem to be
rather in the minority, and it's not terribly invasive from what I have
seen so far.

cheers

andrew
On Mon, Nov 30, 2009 at 10:16 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Considering that one of those was a holiday week with a lot of
> schedule disruption preceding it, I don't know how much faster things
> could have moved. There were large chunks of the reviewer volunteers
> who wanted only jobs they could finish before the holiday, and others
> who weren't available at all until afterwards. And I don't know about
> everyone else, but I took all four days off and started today behind
> by several hundred list messages. I was planning to start nagging
> again tomorrow, hoping that most would be caught up from any holiday
> time off too by then.
>
> Right now, of the 16 patches listed in "Needs Review" besides the
> ECPG ones Michael is working through, the breakdown is like this:
>
> Not reviewed at all yet: 6
> Reviewed once, updated, waiting for re-review: 10
>
> So the bulk of the problem for keeping the pipeline moving is in
> these re-reviews holding up transitions to "Ready for Committer".
> I've had some discussion with Robert about how to better distinguish
> in the management app when the reviewer has work they're expected to
> do vs. when they think they're done with things. We're converging
> toward a more clear set of written guidelines to provide for managing
> future CommitFests as part of that; right now there are a few too
> many fuzzy parts for my liking.
>
> If the need here is to speed up how fast things are fed to
> committers, we can certainly do that. The current process still
> favors having reviewers do as much as possible first, as shown by all
> the stuff sitting in the re-review queue. The work we're waiting on
> them for could be done by the committers instead if we want to
> shorten the whole process a bit. I don't think that's really what you
> want though.

I think the pressure has to be applied all through the process. People
who haven't provided a review by now are certainly overdue for a polite
poke, Thanksgiving or no Thanksgiving. If the first review doesn't
happen until the third week of the CommitFest, how is the patch going
to get a fair shake by the end of the fourth one? I mean, if that
happens to a small number of patches, OK, but when it's 20% of what's
in the CommitFest, it seems like it's bound to lead to a huge crunch at
the end.

In any case, unlike last CommitFest, where the problem was a lack of
adequate committer activity, it's pretty clear that the problem this
CommitFest has been a combination of slow reviews and slow updates by
patch authors. I've been keeping a loose eye on the patch queue and my
impression is that there has rarely been more than 1 patch other than
HS + SR in the "Ready for Committer" state, and many times zero. That
means the pipeline is stalling, and that's bad.

...Robert
Tom Lane escreveu:
> Bruce Momjian <bruce@momjian.us> writes:
>> So, if someone writes a patch, and it is reviewed, and the patch
>> author updates the patch and replies, it still should be reviewed
>> again before being committed?
>
> Well, that's for the reviewer to say --- if the update satisfies his
> concerns, he should sign off on it; if not, not. I've tried to avoid
> pre-empting that process.

That's correct. I didn't have time to review the new patch yet. :( I'll
do it later today. IIRC Tom had some objections (during the last CF) to
the way the patch was proposed and suggested changes. Let's see if
Takahiro-san did everything that was suggested.

--
Euler Taveira de Oliveira
http://www.timbira.com/
On Mon, Nov 30, 2009 at 11:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> As I have observed before, I think we need some infrastructure to
>> help committers claim items, so we don't duplicate work.
>
> Robert acknowledged the need for a "claimed by committer" field in
> the fest application, but he hasn't got round to it yet. In the
> meantime I've been adding a "Taking this one..." type of comment to
> an entry I want to claim.

Sorry I haven't gotten around to this. Beyond being a little burned
out, I have been a little bit under the weather and a little occupied
with life apart from PostgreSQL, as if there were such a thing. Anyway,
one of the concerns I have about this is that adding another field to
the commitfest_view page seems as though it will create some layout
issues - the leftmost column will get squished. I could (a) go ahead
and do it anyway or (b) do it, but modify the layout in some
unspecified way so that it doesn't impact the format as much or of
course (c) not do it. Any thoughts?

I would also like to clarify the use case for this a little bit more.
Is this just to track patches which committers are in the process of
committing (or have committed)? Or would a committer potentially set
this on some patch that was still being reviewed, and if so would that
mean "don't review this any more because I'm taking over" or "I'm
planning to pick this up when the review process completes" or
something else?

Thanks,

...Robert
Robert Haas wrote:
> On Mon, Nov 30, 2009 at 11:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Andrew Dunstan <andrew@dunslane.net> writes:
> > > As I have observed before, I think we need some infrastructure to
> > > help committers claim items, so we don't duplicate work.
> >
> > Robert acknowledged the need for a "claimed by committer" field in
> > the fest application, but he hasn't got round to it yet. In the
> > meantime I've been adding a "Taking this one..." type of comment to
> > an entry I want to claim.
>
> Sorry I haven't gotten around to this. Beyond being a little burned
> out, I have been a little bit under the weather and a little occupied
> with life apart from PostgreSQL, as if there were such a thing.
> Anyway, one of the concerns I have about this is that adding another
> field to the commitfest_view page seems as though it will create some
> layout issues - the leftmost column will get squished. I could (a) go
> ahead and do it anyway or (b) do it, but modify the layout in some
> unspecified way so that it doesn't impact the format as much or of
> course (c) not do it. Any thoughts?
>
> I would also like to clarify the use case for this a little bit more.
> Is this just to track patches which committers are in the process of
> committing (or have committed)? Or would a committer potentially set
> this on some patch that was still being reviewed, and if so would
> that mean "don't review this any more because I'm taking over" or
> "I'm planning to pick this up when the review process completes" or
> something else?

I am thinking we can just add a new status, "claimed by committer", and
not bother about adding a new column with the committer name.

--
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB                      http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote:
>> I would also like to clarify the use case for this a little bit
>> more. Is this just to track patches which committers are in the
>> process of committing (or have committed)? Or would a committer
>> potentially set this on some patch that was still being reviewed,
>> and if so would that mean "don't review this any more because I'm
>> taking over" or "I'm planning to pick this up when the review
>> process completes" or something else?
>
> I am thinking we can just add a new status, "claimed by committer",
> and not bother about adding a new column with the committer name.

I think it would be more flexible and useful to be able to "claim" it
before the review is finished. It would mean in effect "I am following
this and intend to commit it when it's ready".

cheers

andrew
On Tue, Dec 1, 2009 at 8:27 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Bruce Momjian wrote:
>>> I would also like to clarify the use case for this a little bit
>>> more. Is this just to track patches which committers are in the
>>> process of committing (or have committed)? Or would a committer
>>> potentially set this on some patch that was still being reviewed,
>>> and if so would that mean "don't review this any more because I'm
>>> taking over" or "I'm planning to pick this up when the review
>>> process completes" or something else?
>>
>> I am thinking we can just add a new status, "claimed by committer",
>> and not bother about adding a new column with the committer name.
>
> I think it would be more flexible and useful to be able to "claim" it
> before the review is finished. It would mean in effect "I am
> following this and intend to commit it when it's ready".

If we went with Bruce's interpretation, we could have a "committer"
field that only appears when the status is "Claimed by Committer" or
"Committed", and the contents of that field could be displayed in
parentheses in the status column, like this: Claimed by Committer (Tom
Lane).

If we went with Andrew's interpretation, we would need a completely
separate column, because there wouldn't be any logical relationship
between the status field and the committer field.

Any other votes? Tom?

On a possibly related note, I am not totally sure that we want to
enshrine the principle that committers categorically won't touch
patches that are not yet marked Ready for Committer. For major patches
like SE-PostgreSQL or the partitioning stuff, early committer
involvement is an extremely important ingredient for success. And, I
have an uncomfortable feeling about having Tom, Bruce, and Andrew all
intentionally sitting on the bench waiting for reviews to complete
while the days tick away. On the other hand, I also agree with Tom's
point that, if completing reviews doesn't affect whether the patch gets
in, there's less incentive for people to review.

...Robert
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Nov 30, 2009 at 11:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert acknowledged the need for a "claimed by committer" field in
>> the fest application, but he hasn't got round to it yet.

> Sorry I haven't gotten around to this. Beyond being a little burned
> out, I have been a little bit under the weather and a little occupied
> with life apart from PostgreSQL, as if there were such a thing.
> Anyway, one of the concerns I have about this is that adding another
> field to the commitfest_view page seems as though it will create some
> layout issues - the leftmost column will get squished. I could (a) go
> ahead and do it anyway or (b) do it, but modify the layout in some
> unspecified way so that it doesn't impact the format as much or of
> course (c) not do it. Any thoughts?

I would be satisfied if there were a "claimed by" field in the
per-patch detail page, which is where you'd have to go to set it
anyway. If you want you could add an additional status value "claimed
by committer" so it'd be visible in the main page.

> I would also like to clarify the use case for this a little bit more.

It's to keep committers from treading on each others' toes. Right now,
if say Andrew is working over a patch with intent to commit, there's no
visibility of that fact in the fest status. I would imagine that a
patch should not normally get into this state until it's been marked
"ready for committer" by the reviewer. Except perhaps in cases where
the reviewer and committer are the same person.

			regards, tom lane
Robert Haas <robertmhaas@gmail.com> writes:
> If we went with Bruce's interpretation, we could have a "committer"
> field that only appears when the status is "Claimed by Committer" or
> "Committed", and the contents of that field could be displayed in
> parentheses in the status column, like this: Claimed by Committer
> (Tom Lane).

> If we went with Andrew's interpretation, we would need a completely
> separate column, because there wouldn't be any logical relationship
> between the status field and the committer field.

> Any other votes? Tom?

I'm happy with Andrew's interpretation --- I just want a separate text
field for inserting a committer's name. I don't want any magic behavior
of that field.

> On a possibly related note, I am not totally sure that we want to
> enshrine the principle that committers categorically won't touch
> patches that are not yet marked Ready for Committer.

No, but I think that should be the default assumption once a reviewer
has been assigned. If the reviewer doesn't totally fall down on the
job, he/she should be allowed to finish reviewing.

			regards, tom lane
On Tue, Dec 1, 2009 at 9:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> If we went with Bruce's interpretation, we could have a "committer"
>> field that only appears when the status is "Claimed by Committer" or
>> "Committed", and the contents of that field could be displayed in
>> parentheses in the status column, like this: Claimed by Committer
>> (Tom Lane).
>
>> If we went with Andrew's interpretation, we would need a completely
>> separate column, because there wouldn't be any logical relationship
>> between the status field and the committer field.
>
>> Any other votes? Tom?
>
> I'm happy with Andrew's interpretation --- I just want a separate
> text field for inserting a committer's name. I don't want any magic
> behavior of that field.

OK, I'll add a separate text field for the committer's name, but for
now it won't display on the summary page, just the detail page.

...Robert
BTW, if you have time for any purely cosmetic details ... the way the
CommitFest field on a patch detail page works has always struck me as
weird. It's a data field, and so if it has any behavior at all, that
behavior ought to involve modifying its value. But what it actually is
is a navigation link. I think you ought to drop it down to a plain text
field and add a "Back to CommitFest" link to the page header, similar
to the way navigation works on other dependent pages.

			regards, tom lane
Robert Haas wrote:
> OK, I'll add a separate text field for the committer's name, but for
> now it won't display on the summary page, just the detail page.

Perhaps it could go underneath the reviewer names, maybe in a different
color. (And yes, like many of us I suck at GUI design, so feel free to
discount any suggestions I make in that area).

cheers

andrew
On 11/30/09 8:17 PM, Andrew Dunstan wrote:
> Do we have consensus yet that we want YAML? It seemed, well,
> yet another format without all that much advantage over what's
> there.

Well, what's the code count? What dependencies, if any, does it add?

--Josh
Josh Berkus wrote:
> On 11/30/09 8:17 PM, Andrew Dunstan wrote:
>> Do we have consensus yet that we want YAML? It seemed, well,
>> yet another format without all that much advantage over what's
>> there.
>
> Well, what's the code count? What dependencies, if any, does it add?

The patch itself is quite small. There are no extra external
dependencies.

YAML and JSON are pretty much interchangeable for our purposes.
According to Wikipedia, "Both functionally and syntactically, JSON is
effectively a subset of YAML." See
<http://en.wikipedia.org/wiki/JSON#YAML> So the YAML parsers should be
able to handle the JSON output. The only thing we'd be buying with this
patch is making some people who prefer reading the YAML syntax a bit
happier. For machine readability we'd be gaining precisely nothing.

I guess the question is this: when are we going to say "No more output
formats. We have enough."?

One consideration is this: the more formats we support, the dumber the
output will be. Already the XML output is arguably dumber than it
should be, because XML elements are two-dimensional (they can have
property lists (attributes) and child elements) but JSON/YAML nodes are
one-dimensional, so we have made some things that one might normally
expect to be attributes in XML into child elements. While adding YAML
won't impose any additional burden of that kind, because its semantics
are so close to those of JSON, other output formats well might.

cheers

andrew
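[To make the subset relationship concrete, here is an illustrative
fragment, not actual EXPLAIN output. The JSON spelling is itself valid
YAML "flow style", so a YAML 1.2 parser reads both of the following
into the same structure:

    {"Plan": {"Node Type": "Seq Scan", "Relation Name": "accounts"}}

    Plan:
      Node Type: "Seq Scan"
      Relation Name: "accounts"

The proposed FORMAT YAML output is essentially the second spelling; the
information content of the two is identical.]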
Andrew Dunstan <andrew@dunslane.net> writes:
> YAML and JSON are pretty much interchangeable for our purposes.
> According to Wikipedia, "Both functionally and syntactically, JSON is
> effectively a subset of YAML." See
> <http://en.wikipedia.org/wiki/JSON#YAML> So the YAML parsers should
> be able to handle the JSON output. The only thing we'd be buying with
> this patch is making some people who prefer reading the YAML syntax a
> bit happier. For machine readability we'd be gaining precisely
> nothing.

Hmm. So the argument for it is "let's make a machine-readable format
more human-readable"? I'm not getting the point. People should look at
the regular text output.

> One consideration is this: the more formats we support, the dumber
> the output will be. Already the XML output is arguably dumber than it
> should be, because XML elements are two-dimensional (they can have
> property lists (attributes) and child elements) but JSON/YAML nodes
> are one-dimensional, so we have made some things that one might
> normally expect to be attributes in XML into child elements. While
> adding YAML won't impose any additional burden of that kind, because
> its semantics are so close to those of JSON, other output formats
> well might.

I tend to look at it the other way around: having to support output
formats that have significantly different data models is a Good Thing
because it forces you to design sufficiently general code mechanisms.
If YAML had yet another data model it might actually be a useful
exercise to get the code to handle that. However, if it's not teaching
us anything we didn't learn from JSON, there's no gain from that
viewpoint either.

			regards, tom lane
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> YAML...
>
> Hmm. So the argument for it is "let's make a machine-readable format
> more human-readable"? I'm not getting the point. People should look
> at the regular text output.

IMHO YAML beats the regular text format for human-readability - at
least for people with narrow terminal windows, and for novices.

Greg posted examples comparing regular-text vs yaml vs json here:
http://archives.postgresql.org/pgsql-hackers/2009-08/msg02090.php

I think it's more human-readable for novices since it explicitly spells
out what values refer to startup values vs totals.

I think it's more human-readable to me because the current text format
frequently wraps for me on even a modestly complex query, and I find
scrolling down easier than scrolling both ways. None of the other
machine-intended formats seem to suit that purpose well because they're
dominated by a lot of markup.

That said, though, it's not that big a deal.
All,

If some people want it, and there's no significant maintenance burden
associated with YAML output, then why not?

Mind you, if it was practical, I'd suggest that YAML ... and all
additional Explain formats ... should be a contrib module. Anything
other than XML and JSON will be fairly marginal.

--Josh Berkus
On Wed, 2009-12-02 at 10:45 -0800, Josh Berkus wrote:
> All,
>
> If some people want it, and there's no significant maintenance burden
> associated with YAML output, then why not?
>
> Mind you, if it was practical, I'd suggest that YAML ... and all
> additional Explain formats ... should be a contrib module. Anything
> other than XML and JSON will be fairly marginal.

That would be my take... have explain kick out XML (or whatever) and
then parse it into anything you want. That way it isn't an additional
burden on core.

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
If the world pushes look it in the eye and GRR. Then push back harder. - Salamander
2009/12/3 Ron Mayer <rm_pg@cheapcomplexdevices.com>:
> Tom Lane wrote:
>> Hmm. So the argument for it is "let's make a machine-readable format
>> more human-readable"? I'm not getting the point. People should look
>> at the regular text output.
>
> IMHO YAML beats the regular text format for human-readability -
> at least for people with narrow terminal windows, and for novices.

Agreed. Calling the regular text output of EXPLAIN "human readable" is
an exaggeration.

Cheers,
BJ
Ron Mayer <rm_pg@cheapcomplexdevices.com> writes:
> Tom Lane wrote:
>> Hmm. So the argument for it is "let's make a machine-readable format
>> more human-readable"? I'm not getting the point. People should look
>> at the regular text output.

> IMHO YAML beats the regular text format for human-readability -
> at least for people with narrow terminal windows, and for novices.
> Greg posted examples comparing regular-text vs yaml vs json here:
> http://archives.postgresql.org/pgsql-hackers/2009-08/msg02090.php

Mph. Maybe I've been looking at the traditional format too long, but I
don't find the YAML version better --- it's so verbose that you could
only see a small fraction of a query at a time.

The main strike against the traditional format IME is exactly what Greg
alludes to in that message: it's prone to being rendered totally
unreadable by email line-wrapping. However, I'm unconvinced that YAML
would be any better; it looks like it's still critically dependent on
the location and amount of whitespace in order to be readable. The
lines might be a bit shorter on average, but you're still going to hit
a narrow window's right margin pretty quick in any complicated plan. In
any case, the real killer is email clients that feel no obligation to
preserve whitespace layout at all, and this would certainly not look
much better after that treatment.

			regards, tom lane
On Tue, Dec 1, 2009 at 9:43 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Dec 1, 2009 at 9:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> If we went with Bruce's interpretation, we could have a "committer"
>>> field that only appears when the status is "Claimed by Committer"
>>> or "Committed", and the contents of that field could be displayed
>>> in parentheses in the status column, like this: Claimed by
>>> Committer (Tom Lane).
>>>
>>> If we went with Andrew's interpretation, we would need a completely
>>> separate column, because there wouldn't be any logical relationship
>>> between the status field and the committer field.
>>>
>>> Any other votes? Tom?
>>
>> I'm happy with Andrew's interpretation --- I just want a separate
>> text field for inserting a committer's name. I don't want any magic
>> behavior of that field.
>
> OK, I'll add a separate text field for the committer's name, but for
> now it won't display on the summary page, just the detail page.

Done.

...Robert
Robert Haas wrote:
>>> I'm happy with Andrew's interpretation --- I just want a separate
>>> text field for inserting a committer's name. I don't want any magic
>>> behavior of that field.
>>
>> OK, I'll add a separate text field for the committer's name, but for
>> now it won't display on the summary page, just the detail page.
>
> Done.

Cool. Thanks.

cheers

andrew
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> ... and now that I have, I find at least four highly questionable
> things about it:
>
> 1. The placement of the hook. Why is it three lines down in
> ProcessUtility? It's probably reasonable to have the Assert first,
> but I don't see why the hook function should have the ability to
> editorialize on the behavior of everything about ProcessUtility
> *except* the read-only-xact check.

I moved the initialization of completionTag into
standard_ProcessUtility.

> 2. The naming and documentation of the added GUC setting for
> pg_stat_statements. "track_ddl" seems pretty bizarre to me because
> there are many utility statements that no one would call DDL. COPY,
> for example, is certainly not DDL. Why not call it "track_utility"?

OK, fixed.

> 3. The enable-condition test in pgss_ProcessUtility. Is it really
> appropriate to be gating this by isTopLevel? I should think that
> the nested_level check in pgss_enabled would be sufficient and
> more likely to do what's expected.

I removed the isTopLevel check. I was worried about auto-generated
utility commands; generated subcommands are called with the same query
string as the top query. Doesn't that confuse the statistics?

> 4. The special case for CopyStmt. That's just weird, and it adds
> a maintenance requirement we don't need. I don't see a really good
> argument why COPY (alone among utility statements) deserves to have
> a rowcount tracked by pg_stat_statements, but even if you want that
> it'd be better to rely on examining the completionTag after the fact.
> The fact that the tag is "COPY nnnn" is part of the user-visible API
> for COPY and won't change lightly. The division of labor between
> ProcessUtility and copy.c is far more volatile, but this patch has
> injected itself into that.

OK, fixed. I had thought a string-based interface was not desirable,
but it should be a stable API. COPY and INSERT/UPDATE/DELETE (as used
by EXECUTE) are counted by pg_stat_statements, but SELECT via EXECUTE
is impossible.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
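[Examining the tag after the fact, as Tom suggested, needs only the
stable "COPY nnnn" format. A hedged sketch of the idea, not the patch's
actual code:

    #include <stdio.h>
    #include <string.h>

    /*
     * Derive an affected-row count from a utility command's
     * completionTag. Only COPY reports one ("COPY nnnn"); every other
     * utility command yields zero.
     */
    static unsigned long
    rows_from_completion_tag(const char *completionTag)
    {
        unsigned long rows = 0;

        if (completionTag != NULL &&
            strncmp(completionTag, "COPY ", 5) == 0)
            sscanf(completionTag + 5, "%lu", &rows);

        return rows;
    }

Because the parsing happens after standard_ProcessUtility returns, the
module no longer needs any special case inside copy.c itself.]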
On Wed, Dec 2, 2009 at 4:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Ron Mayer <rm_pg@cheapcomplexdevices.com> writes:
>> Tom Lane wrote:
>>> Hmm. So the argument for it is "let's make a machine-readable
>>> format more human-readable"? I'm not getting the point. People
>>> should look at the regular text output.
>
>> IMHO YAML beats the regular text format for human-readability -
>> at least for people with narrow terminal windows, and for novices.
>> Greg posted examples comparing regular-text vs yaml vs json here:
>> http://archives.postgresql.org/pgsql-hackers/2009-08/msg02090.php
>
> Mph. Maybe I've been looking at the traditional format too long,
> but I don't find the YAML version better --- it's so verbose that
> you could only see a small fraction of a query at a time.

I've been thinking about this a little more and I think I am going to
vote (if I get a vote) to reject this patch. I think Andrew hit the
crucial point upthread: the latest version of YAML is a superset of
JSON, which means that the only possible use cases for this patch are:

- people using older YAML parsers that can't handle the latest version
(because if they can handle the latest version then they can just use
that on the existing JSON format), and
- people who want to use the YAML format as a substitute for text
format on grounds of readability.

The first target doesn't seem worth aiming at. Admittedly the latest
version of the YAML spec, 1.2 - whose stated goal is JSON
compatibility - is only two months old, so there may be many YAML users
who don't have parsers that are completely JSON-compatible. But
presumably this problem will resolve itself over time.

With respect to the second one, I am not going to argue that the
current text format is ideal in all ways. In particular, on complex
plans, I find it difficult to match up the plans for the inner and
outer sides of any given node, which may be far enough apart vertically
that it's difficult to tell exactly which bits are in the same column.
Furthermore, the main output row for each node is wider than I would
like, especially when using EXPLAIN ANALYZE. Even on relatively simple
plans, I have to maximize my terminal window to forestall wrapping, and
on complex plans, wrapping is inevitable even if I do maximize the
window. For all of that, in the nearly two years I've been reading
pgsql-hackers, I can't remember one complaint about the visual
presentation of the EXPLAIN output. It's possible that my memory is
faulty, but I think I would remember if there had been very many.

On top of that, if you did want YAML for easier readability, what
aspect of the output is more readable in YAML than it is in text
format? The only answer I can think of is that you like having each
data element on a separate line, so that the plan is much longer but
somewhat narrower. But if that's what you want, the JSON output is
almost as good - the only difference is a bit of extra punctuation.

...Robert
> On top of that, if you did want YAML for easier readability, what
> aspect of the output is more readable in YAML than it is in text
> format? The only answer I can think of is that you like having each
> data element on a separate line, so that the plan is much longer but
> somewhat narrower. But if that's what you want, the JSON output is
> almost as good - the only difference is a bit of extra punctuation.

"almost as good" ... I agree with Kevin that it's more readable.

The whole patch just adds 144 lines. It doesn't look to me like there's
significant maintenance burden involved, but of course I need to defer
to the more experienced. It's even possible that we could reduce the
size of the patch still further if we really looked at it as just a
differently punctuated JSON.

Having compared the JSON and YAML output formats, I think having YAML
as a 2nd human-readable format might be valuable, even though it adds
nothing to machine-processing.

Again, if there were a sensible way to do YAML as a contrib module, I'd
go for that, but there isn't.

--Josh Berkus
On Fri, Dec 4, 2009 at 7:42 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> On top of that, if you did want YAML for easier readability, what
>> aspect of the output is more readable in YAML than it is in text
>> format? The only answer I can think of is that you like having each
>> data element on a separate line, so that the plan is much longer but
>> somewhat narrower. But if that's what you want, the JSON output is
>> almost as good - the only difference is a bit of extra punctuation.
>
> "almost as good" ... I agree with Kevin that it's more readable.
>
> The whole patch just adds 144 lines. It doesn't look to me like
> there's significant maintenance burden involved, but of course I need
> to defer to the more experienced. It's even possible that we could
> reduce the size of the patch still further if we really looked at it
> as just a differently punctuated JSON.
>
> Having compared the JSON and YAML output formats, I think having YAML
> as a 2nd human-readable format might be valuable, even though it adds
> nothing to machine-processing.
>
> Again, if there were a sensible way to do YAML as a contrib module,
> I'd go for that, but there isn't.

I don't think the maintenance burden is the issue, per se. It's more
that I don't like the idea of putting in a bunch of formats that are
only trivially different from each other, and if there were ever a case
where we should reject a new format because it is too similar to an
existing one, this seems to be it. If that's a bad reason for rejecting
a new format, then I don't have a second one, but we may end up with a
lot of formats - and that WILL be a maintenance burden, especially if
we ever want to make non-trivial extensions to the output format.
Frankly, just adding new fields (perhaps controlled by new options) is
never going to be that big of a deal, but there will certainly come a
day when someone wants to do something more novel, like dumping
parse-tree representations of expressions or something along the lines
of Tom Raney's visual explain tool, which dumped out every path the
planner considered. I don't really want to be the person who has to
tell the person who writes that patch "sorry, but we have to reject
your patch until it supports every one of our numerous slightly
different output formats".

One possibility for contrib-module-izing this, and perhaps other output
formats that people might like to see, is to write a function that
takes the JSON or XML output as input and does the appropriate
translation. For something like YAML, whose semantics are so close to
JSON's, this should be pretty simple. One of the reasons why I was hot
to get JSON support into the initial version of machine-readable
EXPLAIN is because JSON maps very cleanly onto the type of data
structures that are common in scripting languages: everything is lists
and hashes, nested inside each other, and text and numeric scalars. So
you can read a JSON object into a script written in Perl, PHP, Python,
Ruby, JavaScript, and probably half a dozen other languages and get a
native object. From there, it's very easy to write the data back out
in whatever format you happen to prefer just by walking the data
structure. I suspect that a JSON-to-YAML converter in Perl would be
less than 50 lines. (The XML format can also be transformed using
things like XSLT, but I'm less familiar with those tools.)

...Robert
Josh Berkus wrote:
>> ... YAML for easier readability ...
>
> "almost as good" ... I agree with Kevin that it's more readable.
>
> Again, if there were a sensible way to do YAML as a contrib module,
> I'd go for that, but there isn't.

Perhaps that's a direction this could take? What would it take for this
feature to be a demo/example for some future module system?

It seems like there have been a few recent features related to
decorating output (UTF8 tables, YAML explain, \d... updates). While
there's no great way to make this a contrib module today, would it make
sense to add such hooks for an eventual module system?
Ron Mayer escreveu:
> While there's no great way to make this a contrib module today, would
> it make sense to add such hooks for an eventual module system?

I don't think so. It's easier to code a converter tool.

--
Euler Taveira de Oliveira
http://www.timbira.com/
Josh Berkus <josh@agliodbs.com> wrote:
> Having compared the JSON and YAML output formats, I think having YAML
> as a 2nd human-readable format might be valuable, even though it adds
> nothing to machine-processing.

Sure. YAML is human-readable and can print more information that would
be too verbose in the one-line text format. +1 for adding YAML format
to EXPLAIN.

> Again, if there were a sensible way to do YAML as a contrib module,
> I'd go for that, but there isn't.

Some ideas for exporting formatters:

1. Provide a register_explain_format(name, handler) function. Contrib
   modules can register their handlers when LOADed.

2. Provide a method to specify a filter function. XML output is passed
   to the function and probably converted with XSLT.

BTW, an extensible formatter is surely useful, but what will we do
about the XML and JSON formats when we have it? Will we remove them
from core? If we have a plan in that direction, we should complete it
before the 8.5.0 release. It will be hard to revert them after the
release.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
On Sun, Dec 6, 2009 at 8:34 PM, Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp> wrote:
> Josh Berkus <josh@agliodbs.com> wrote:
>> Having compared the JSON and YAML output formats, I think having YAML as
>> a 2nd human-readable format might be valuable, even though it adds
>> nothing to machine-processing.
>
> Sure. YAML is human readable and can carry more information than would
> fit comfortably in the one-line text format. +1 to add YAML format to
> EXPLAIN.
>
>> Again, if there were a sensible way to do YAML as a contrib module, I'd
>> go for that, but there isn't.
>
> Some ideas for exporting formatters:
> 1. Provide a register_explain_format(name, handler) function.
>    Contrib modules can register their handlers when LOADed.
> 2. Provide a method to specify a filter function. XML output is
>    passed to the function and probably converted with XSLT.
>
> BTW, an extensible formatter is surely useful, but what will we do about
> the XML and JSON formats when we have it? Will we remove them from core?
> If we have a plan in that direction, we should complete it before the
> 8.5.0 release. It will be hard to revert them after the release.

I sent a previous message about this, but just realized I forgot to copy the list.

I thought about the possibility of a pluggable system for formats when I wrote the original machine-readable explain patch. It seemed to me at the time that if we went this route any contrib module would have to duplicate significant portions of explain.c, no matter where we put the hooks. We'd presumably feel obliged to update the contrib module when making any changes to explain.c, so I think the maintenance burden would actually be higher that way than with everything in core.

The main point here for me is that the JSON format is already parseable by YAML parsers, and can probably be turned into YAML using a very short Perl script - possibly even using a sed script. I think that it's overkill to support two formats that are that similar.

Tallying up the votes on this patch, we have Tom Lane, Andrew Dunstan, Josh Drake, and myself arguing against including this in core, and Josh Berkus and Itagaki Takahiro arguing for it. Ron Mayer seemed somewhat in favor of it in his message on-list but sent a message off-list taking the opposite position. Brendan Jurd cast aspersions on the human-readability of the text format but didn't explicitly take a position on the core issue, and Euler Taveira de Oliveira suggested that writing a converter would be easier than writing a module framework (which I think is almost certainly true) but also didn't explicitly take a position on the core issue. Overall, that sounds to me like a slight preponderance of "no" votes. Anyone think I've mis-tallied or want to weigh in?

...Robert
Robert Haas wrote:
> The main point here for me is that the JSON format is already
> parseable by YAML parsers, and can probably be turned into YAML using
> a very short Perl script - possibly even using a sed script. I think
> that it's overkill to support two formats that are that similar.

It's not the case that JSON merely can be turned into YAML, or that it just happens to be parseable by YAML parsers. While there was some possible divergence in earlier versions, under the YAML 1.2 spec a JSON document *is* a YAML document already. JSON is actually a subset of YAML that uses one of the many possible YAML styles--basically, YAML accepts anything in JSON format, along with others. This means that by providing JSON output, we've *already* provided YAML output, too. Just not the nice-looking output people tend to associate with YAML.

Accordingly, there is really no basis for this patch to exist from the perspective of helping a typical tool author. If you want to parse YAML robustly, you're going to grab someone's parsing library to do it rather than writing it yourself, and if you do that it will accept the existing JSON output just fine too. Basically this patch lives or dies by whether it looks so much nicer to people as to justify its code weight.

Given the above, I don't understand why writing this patch was deemed worthwhile in the first place, but I hate to tell people they can't have something they find visually appealing just because I don't think it's an improvement. Consider me a neutral vote, although I suspect the above may sway some people who were on the fence toward disapproval.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com
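That subset relationship is easy to spot-check with any YAML library. For instance, a quick test in Python with the third-party PyYAML package, using a plan fragment invented for the demo, could look like this:

    import json

    import yaml  # third-party PyYAML

    # A fragment shaped like EXPLAIN (FORMAT JSON) output; invented for
    # the demo, not captured from a real server.
    doc = '{"Plan": {"Node Type": "Seq Scan", "Relation Name": "foo"}}'

    # The YAML parser accepts the JSON text as-is and produces the same
    # native structure the JSON parser does.
    assert yaml.safe_load(doc) == json.loads(doc)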
Greg Smith <greg@2ndquadrant.com> writes: > Given the above, I don't understand why writing this patch was deemed > worthwhile in the first place, It was written and submitted by one person who did not bother to ask first whether anyone else thought it was worthwhile. So its presence on the CF list should not be taken as evidence that there's consensus for it. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> It was written and submitted by one person who did not bother to ask
> first whether anyone else thought it was worthwhile. So its presence
> on the CF list should not be taken as evidence that there's consensus
> for it.

Should we have a "Needs Discussion" phase before "Needs Review"? Reviewers, including me, think patches with needs-review status are worthwhile. In contrast, contributors often register their patches to the CF without discussion simply because there was no response; they cannot tell whether no response means silent approval or not.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
2009/12/7 Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp>:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It was written and submitted by one person who did not bother to ask
>> first whether anyone else thought it was worthwhile. So its presence
>> on the CF list should not be taken as evidence that there's consensus
>> for it.
>
> Should we have a "Needs Discussion" phase before "Needs Review"?
> Reviewers, including me, think patches with needs-review status are
> worthwhile. In contrast, contributors often register their patches
> to the CF without discussion simply because there was no response; they
> cannot tell whether no response means silent approval or not.

+1. Sometimes a reviewer waits for consensus in the community while someone else is waiting for a review, because the patch is marked as "Needs Review".

Regards,
--
Hitoshi Harada
Hi,

Greg Smith <greg@2ndquadrant.com> writes:
> Robert Haas wrote:
>> The main point here for me is that the JSON format is already
>> parseable by YAML parsers, and can probably be turned into YAML using
>> a very short Perl script - possibly even using a sed script. I think
>> that it's overkill to support two formats that are that similar.
>
> It's not the case that JSON merely can be turned into YAML, or that it
> just happens to be parseable by YAML parsers. While there was some
> possible divergence in earlier versions, under the YAML 1.2 spec a JSON
> document *is* a YAML document already. JSON is actually a subset of YAML
> that uses one of the many possible YAML styles--basically, YAML accepts
> anything in JSON format, along with others. This means that by providing
> JSON output, we've *already* provided YAML output, too. Just not the
> nice-looking output people tend to associate with YAML.

Well, we have JSON, and we agreed it was a good idea to have it. Now JSON is a subset of YAML, and some would prefer another YAML style (me included).

If the problem is supporting 2 formats in core rather than 3, what about replacing the current JSON support with the YAML one? At a later point we could even have JSON support back by having the YAML printer able to output different YAML styles, but I guess that's not where we are now.

Vote: +1 for YAML even if that means dropping JSON.

Regards,
--
dim
On mån, 2009-12-07 at 17:14 +0900, Hitoshi Harada wrote:
> 2009/12/7 Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp>:
> > Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> It was written and submitted by one person who did not bother to ask
> >> first whether anyone else thought it was worthwhile. So its presence
> >> on the CF list should not be taken as evidence that there's consensus
> >> for it.
> >
> > Should we have a "Needs Discussion" phase before "Needs Review"?
> > Reviewers, including me, think patches with needs-review status are
> > worthwhile. In contrast, contributors often register their patches
> > to the CF without discussion simply because there was no response; they
> > cannot tell whether no response means silent approval or not.
>
> +1. Sometimes a reviewer waits for consensus in the community while
> someone else is waiting for a review, because the patch is marked as
> "Needs Review".

Yes, I would have had use for this myself a couple of times.
> Tallying up the votes on this patch

First, I would hope that you don't overlook the author of the patch (me) as an "aye" vote. :) Additionally, if you are going to tally, please consider the positive responses from people on -hackers and -general when the patch was first posted and there was a question about what the format would look like.

YAML is designed to be machine-parseable (just like XML) *AND* human-readable (ostensibly like our current output). JSON tries to overcome the shortcomings of XML, but doesn't quite get it right. YAML does: it's small, intuitive, and only as verbose as needed. Human readable, yet structured enough to be read by machines as well.

As for the arguments about making an extensible system to support other formats, I think this is putting the cart before the horse. While this feature is only in CVS at the moment, I've not heard a single person state another format they would like to see. I cannot think of one myself (machine-parseable, anyway: I'd love to see an "abbreviated" version that takes out the explain bits of explain analyze and changes rows= to R=, loops= to L=, etc.).

This is a small patch, easy to maintain, and useful to more people than just myself, judging from the responses to the earlier thread.

ObPartingShot: I hope that those in the "machine readable output should never be seen by human eyes" camp will support my upcoming patch to remove all extra whitespace, including newlines, from the XML format. :)

--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200912070727
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
* Dimitri Fontaine: > Well we have JSON and agreed it was a good idea to have it. Now JSON is > a subset of YAML and some would prefer another YAML style (me included). YAML is rather difficult to parse, and most parsers assume a trusted document source because they support arbitrary class instantiations (and it's not possible to switch this off, or it's rather involved to do so). Plain JSON doesn't have this issue. -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
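For a concrete illustration of the class-instantiation hazard Florian describes, here is a deliberately hostile document fed to Python's PyYAML. The restricted loader refuses the tag, while the library's explicitly unsafe loader (in recent PyYAML versions) would execute the command as a side effect of parsing:

    import yaml  # third-party PyYAML

    # A hostile document using a language-specific tag; loaders that
    # honor such tags run os.system() merely by parsing the text.
    hostile = "!!python/object/apply:os.system ['touch /tmp/pwned']"

    try:
        yaml.safe_load(hostile)  # restricted loader: unknown tags refused
    except yaml.constructor.ConstructorError as err:
        print("rejected:", err)

    # yaml.unsafe_load(hostile) would run the shell command instead.

Whether a given client-side YAML library defaults to the safe or the general-purpose behavior is exactly the point of contention here.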
Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It was written and submitted by one person who did not bother to ask
>> first whether anyone else thought it was worthwhile. So its presence
>> on the CF list should not be taken as evidence that there's consensus
>> for it.

> Should we have a "Needs Discussion" phase before "Needs Review"?
> Reviewers, including me, think patches with needs-review status are
> worthwhile. In contrast, contributors often register their patches
> to the CF without discussion simply because there was no response; they
> cannot tell whether no response means silent approval or not.

Hm, I guess the question would be: what is the condition for getting out of that state? It's clear who is supposed to move a patch out of 'Needs Review', 'Waiting for Author', or 'Ready for Committer' respectively. I don't know who's got the authority to decide that something has or has not achieved community consensus. Right at the moment we handle this sort of problem in a very informal way, but if it's going to become part of the commitfest state for a patch I think we need to be a bit less informal.

regards, tom lane
Dimitri Fontaine wrote:
> If the problem is supporting 2 formats in core rather than 3, what about
> replacing the current JSON support with the YAML one?

That's a step backwards. By providing JSON format, we've also satisfied people who want YAML. Ripping out JSON would mean we *only* support YAML. There are far more JSON parsers than YAML parsers, which is why I thought the current code committed was good enough.

Anyway, the fact that I have a technical opinion suggests to me I've caught up with the discussion now, so let's talk about where we're at. I think that the ongoing discussion here is likely to consume more resources than the expected future maintenance of this small bit of code. I believe the following to be true:

- The patch is small to apply
- It would also be easy to remove in the future should a more modular EXPLAIN implementation replace it
- Having one more "legacy" format to satisfy would actually improve the odds that a future modular EXPLAIN would be robustly designed
- There is no way a modular explain will be written any time soon
- While it's questionable whether it's strictly a majority on voting here, it's close, which suggests there is plenty of support for wanting this feature
- Since nothing is removed, the people who aren't in favor of this format aren't negatively impacted by it being committed

All that suggests to me that we've cleared the usual "do we want it?" hurdles that would normally go along with deciding whether a patch should go to a committer or not. That leaves code quality, then. Are there any open issues with that? There are some notes about line-breaks in the CF app. Once we have a patch with those issues resolved, this should go to a committer for final veto power on its inclusion, and then we're done here.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com
Greg Smith wrote:
> That's a step backwards. By providing JSON format, we've also satisfied
> people who want YAML. Ripping out JSON would mean we *only* support
> YAML. There are far more JSON parsers than YAML parsers, which is
> why I thought the current code committed was good enough.

XML parsers are common enough that, IMHO, the other computer-readable formats can't be that important from a computer-readability perspective, leaving their main benefit as being human friendly.

I like YAML output; but I think the most compelling argument against the patch is that if so many people want different formats, it may be a good use case for external modules. And far more than yaml output, I'd like to see a flexible module system with an equivalent of "cpan install yaml" or "gem install yaml".

I suppose one could argue that instead of YAML we design a different human-oriented format for loosely structured data; but that seems even harder.
Florian Weimer escribió: > * Dimitri Fontaine: > > > Well we have JSON and agreed it was a good idea to have it. Now JSON is > > a subset of YAML and some would prefer another YAML style (me included). > > YAML is rather difficult to parse, and most parsers assume a trusted > document source because they support arbitrary class instantiations > (and it's not possible to switch this off, or it's rather involved to > do so). That's irrelevant because EXPLAIN YAML is already a trusted document source. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Robert Haas escribió: > Tallying up the votes on this patch, we have Tom Lane, Andrew Dunstan, > Josh Drake, and myself arguing against including this in core, and > Josh Berkus and Itagaki Takahiro arguing for it. Ron Mayer seemed > somewhat in favor of it in his message on-list but sent a message > off-list taking the opposite position. Brendan Jurd cast aspersions > on the human-readability of the text format but didn't explicitly take > a position on the core issue, Please add my vote for YAML, because of the human-unreadability of the default text format. In fact I'd argue we could rip out the default text format and replace it with this one. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Mon, 2009-12-07 at 14:52 -0300, Alvaro Herrera wrote:
> Robert Haas escribió:
>
>> Tallying up the votes on this patch, we have Tom Lane, Andrew Dunstan,
>> Josh Drake, and myself arguing against including this in core, and
>> Josh Berkus and Itagaki Takahiro arguing for it. Ron Mayer seemed
>> somewhat in favor of it in his message on-list but sent a message
>> off-list taking the opposite position. Brendan Jurd cast aspersions
>> on the human-readability of the text format but didn't explicitly take
>> a position on the core issue,
>
> Please add my vote for YAML, because of the human-unreadability of the
> default text format. In fact I'd argue we could rip out the default
> text format and replace it with this one.

That would be almost correct. My position is that we should have one format that is parsed. The current "text" explain output certainly does not qualify. The XML one or YAML one does. Pick one, then parse it into whatever you want from the client.

Sincerely,

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
If the world pushes look it in the eye and GRR. Then push back harder. - Salamander
On Dec 7, 2009, at 9:52 AM, Alvaro Herrera wrote: >> Tallying up the votes on this patch, we have Tom Lane, Andrew Dunstan, >> Josh Drake, and myself arguing against including this in core, and >> Josh Berkus and Itagaki Takahiro arguing for it. Ron Mayer seemed >> somewhat in favor of it in his message on-list but sent a message >> off-list taking the opposite position. Brendan Jurd cast aspersions >> on the human-readability of the text format but didn't explicitly take >> a position on the core issue, > > Please add my vote for YAML, because of the human-unreadability of the > default text format. In fact I'd argue we could rip out the default > text format and replace it with this one. +1 David
All,

What it's sounding like is that we ought to have a plug-in (both for postmaster and for psql) which allows the calling of an external library to parse explain output. Then people could convert XML/JSON into whatever they wanted.

--Josh Berkus
Josh Berkus wrote:
> All,
>
> What it's sounding like is that we ought to have a plug-in (both for
> postmaster and for psql) which allows the calling of an external library
> to parse explain output. Then people could convert XML/JSON into
> whatever they wanted.

Not everything is sanely convertible into some sort of plugin. A plugin mechanism for this would be FAR more trouble than it is worth, IMNSHO.

We are massively over-egging this pudding (as a culinary blogger you should appreciate this analogy).

cheers

andrew
Josh Berkus wrote:
> What it's sounding like is that we ought to have a plug-in (both for
> postmaster and for psql) which allows the calling of an external library
> to parse explain output. Then people could convert XML/JSON into
> whatever they wanted.

It would be kinder to just reject the YAML patch than to make it wait for this.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com
> Not everything is sanely convertible into some sort of plugin. A plugin
> mechanism for this would be FAR more trouble than it is worth, IMNSHO.
>
> We are massively over-egging this pudding (as a culinary blogger you
> should appreciate this analogy).

OK, then let's just accept it. It's small, has a maintainer, is useful to some people, and doesn't create any weird complications. I think, given the knowledge that YAML is now a subdialect of JSON, it could potentially be made smaller, but I can't say how at the moment.

--Josh Berkus
Josh Berkus wrote:
>> Not everything is sanely convertible into some sort of plugin. A plugin
>> mechanism for this would be FAR more trouble than it is worth, IMNSHO.
>>
>> We are massively over-egging this pudding (as a culinary blogger you
>> should appreciate this analogy).
>
> OK, then let's just accept it. It's small, has a maintainer, is useful
> to some people, and doesn't create any weird complications. I think,
> given the knowledge that YAML is now a subdialect of JSON, it could
> potentially be made smaller, but I can't say how at the moment.

Actually, it's the other way: JSON is a subset of YAML.

I was in fact prepared to commit this patch, despite some significant misgivings about its wisdom, mainly because it does have such a low impact. But then other people raised objections. I'm not sure how strong those objections are, though.

I must say that while the YAML output might look a bit nicer than the JSON output, the difference strikes me as mostly marginal. But I guess it's like beauty and obscenity, something in the eye of the beholder. De gustibus non est disputandum.

cheers

andrew
Andrew Dunstan <andrew@dunslane.net> writes: > I was in fact prepared to commit this patch, despite some significant > misgivings about its wisdom, mainly because it does have such a low > impact. But then other people raised objections. I'm not sure how strong > those objections are, though. The "lite" version posted by Itagaki-san on 11/30 seems short enough that maybe we should just stop arguing and apply it. There were some other versions that fooled around with existing logic, which I was a lot less happy about because of the difficulty of being sure that nothing was broken. I definitely don't think we should get involved with trying to create support for plugin formatters or anything like that --- the amount of effort required seems far out of proportion to the benefit. regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >> I was in fact prepared to commit this patch, despite some significant >> misgivings about its wisdom, mainly because it does have such a low >> impact. But then other people raised objections. I'm not sure how strong >> those objections are, though. >> > > The "lite" version posted by Itagaki-san on 11/30 seems short enough > that maybe we should just stop arguing and apply it. There were some > other versions that fooled around with existing logic, which I was a lot > less happy about because of the difficulty of being sure that nothing > was broken. > Well, I guess he can commit it himself now ;-) > I definitely don't think we should get involved with trying to create > support for plugin formatters or anything like that --- the amount of > effort required seems far out of proportion to the benefit. > > right. cheers andrew
* Alvaro Herrera: > Florian Weimer escribió: >> * Dimitri Fontaine: >> >> > Well we have JSON and agreed it was a good idea to have it. Now JSON is >> > a subset of YAML and some would prefer another YAML style (me included). >> >> YAML is rather difficult to parse, and most parsers assume a trusted >> document source because they support arbitrary class instantiations >> (and it's not possible to switch this off, or it's rather involved to >> do so). > > That's irrelevant because EXPLAIN YAML is already a trusted document > source. Uhm, really? Do I have to expect code execution on the client when I connect to a database? I hope not. -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
On Mon, Dec 07, 2009 at 07:07:13PM -0500, Andrew Dunstan wrote:
> Josh Berkus wrote:
>>> Not everything is sanely convertible into some sort of plugin. A plugin
>>> mechanism for this would be FAR more trouble than it is worth, IMNSHO.
>>>
>>> We are massively over-egging this pudding (as a culinary blogger you
>>> should appreciate this analogy).
>>
>> OK, then let's just accept it. It's small, has a maintainer, is useful
>> to some people, and doesn't create any weird complications. I think,
>> given the knowledge that YAML is now a subdialect of JSON, it could
>> potentially be made smaller, but I can't say how at the moment.
>
> Actually, it's the other way: JSON is a subset of YAML.

I've no contribution to the main topic, but I'd like to point out that the "JSON is a subset of YAML" meme is not without controversy:

http://search.cpan.org/~mlehmann/JSON-XS-2.26/XS.pm#JSON_and_YAML

It may not be relevant in your use-case, but I thought it worth a mention.

Tim.
Tim Bunce wrote: > > I've no contribution to the main topic, but I'd like to point out that > the "JSON is a subset of YAML" meme is not without controversy: > > http://search.cpan.org/~mlehmann/JSON-XS-2.26/XS.pm#JSON_and_YAML > > It may not be relevant in your use-case, but I thought it worth a mention. > > > Ouch. Thanks for bringing it to our attention. Well, if we're going to commit this, as now appears likely, we should have some language lawyers go over our code for both YAML and JSON with a fine tooth comb to make sure what we are producing is strictly According To Hoyle. cheers andrew
On Tue, Dec 8, 2009 at 9:13 AM, Andrew Dunstan <andrew@dunslane.net> wrote: > Well, if we're going to commit this, as now appears likely, we should have > some language lawyers go over our code for both YAML and JSON with a fine > tooth comb to make sure what we are producing is strictly According To > Hoyle. +1. I'm a little concerned about the bit about the YAML specification changing, too, but at least if we can ensure that we're compliant with the spec that is current at the time the code goes in we have a leg to stand on. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > +1. I'm a little concerned about the bit about the YAML specification > changing, too, but at least if we can ensure that we're compliant with > the spec that is current at the time the code goes in we have a leg to > stand on. If the spec is in flux, that seems like More Than Sufficient reason to reject the patch for the time being. It can be resubmitted when it's no longer shooting at a moving target. regards, tom lane
> If the spec is in flux, that seems like More Than Sufficient reason
> to reject the patch for the time being. It can be resubmitted when
> it's no longer shooting at a moving target.

Saying that it is in flux is a bit of a stretch. Even if it were, the parts that do change are nothing that will affect us. We're doing dirt-simple YAML (and JSON) generation. Basically, 'name: value' pairs plus some list building via indents and dashes. I'm completely not worried about our usage ever falling afoul of future YAML or JSON spec changes.

(This goes for the person on this list concerned about the output being too hard to parse. Yes, YAML has lots of tiny corner cases and elaborate syntax, but we're not using any of those, so parsing should be quite possible for any YAML parser out there.)

--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200912081104
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
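To make "dirt-simple" concrete: the style in question is nothing but nested name/value mappings and dash-prefixed lists. A block-style dump from any YAML library shows the flavor; this Python sketch uses the third-party PyYAML package on a hand-invented, plan-shaped structure:

    import yaml  # third-party PyYAML

    # A hand-invented structure shaped roughly like an EXPLAIN plan tree.
    plan = {"Plan": {
        "Node Type": "Nested Loop",
        "Total Cost": 24.35,
        "Plans": [
            {"Node Type": "Seq Scan", "Relation Name": "t1"},
            {"Node Type": "Index Scan", "Relation Name": "t2"},
        ],
    }}

    # Block style: "name: value" mappings, "- " list items, nesting by
    # indentation -- no other YAML machinery involved.
    print(yaml.safe_dump(plan, default_flow_style=False))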
Can I ask for a final decision on whether the YAML formatter should be applied or rejected? As far as I read the discussion, we can apply it because several users want it and we don't have a plan to support extensible formatters in the core.

Greg Smith <greg@2ndquadrant.com> wrote:
> - The patch is small to apply
> - Having one more "legacy" format to satisfy would actually improve the
>   odds that a future modular EXPLAIN would be robustly designed
> - While it's questionable whether it's strictly a majority on voting
>   here, it's close, which suggests there is plenty of support for wanting
>   this feature
> - Since nothing is removed, the people who aren't in favor of this format
>   aren't negatively impacted by it being committed

Comments from additional discussion:
- YAML format is the second (or, for some people, the best) human-readable format for EXPLAIN output.
- JSON is not always a subset of YAML. It is not recommended to generate/parse YAML with a JSON generator/parser or vice versa.
- We won't add any core hooks or plug-ins to format EXPLAIN output. Client tools may convert the output if needed (e.g., using XSLT).
- The spec of YAML might be changed in the future, but we use only the basic parts, which are unlikely to change.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center
Takahiro Itagaki wrote:
> Can I ask for a final decision on whether the YAML formatter should be
> applied or rejected? As far as I read the discussion, we can apply it
> because several users want it and we don't have a plan to support
> extensible formatters in the core.

The path I thought made sense at this point was to mark the patch ready for a committer, since it sounds like everyone is done with it now, and have another committer besides yourself do a final review as part of that. At this point, I think we've justified the feature and confirmed that it works. Given the controversy, I think another set of eyes to make sure it's not going to be a maintenance headache moving forward should (as usual) have the final say on whether the code goes in or not, because that's the only drawback to committing it that I see at this point.

To be clear about which version we're talking about:
http://archives.postgresql.org/message-id/20091130123456.4A03.52131E4D@oss.ntt.co.jp
is the candidate for commit that includes the cleanup you've already done.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com
Greg Smith wrote:
> Takahiro Itagaki wrote:
>> Can I ask for a final decision on whether the YAML formatter should be
>> applied or rejected? As far as I read the discussion, we can apply it
>> because several users want it and we don't have a plan to support
>> extensible formatters in the core.
>
> The path I thought made sense at this point was to mark the patch
> ready for a committer, since it sounds like everyone is done with it
> now, and have another committer besides yourself do a final review as
> part of that.

That brings us back to where this conversation started ;-) I'll pick it up.

cheers

andrew
It looks like the last round of issues on this patch only came from Tom's concerns, so I'm not sure if anyone but Tom (or a similarly picky alternate committer) is going to find anything else in a re-review. It looks to me like all the issues were sorted out anyway. Euler, you're off the hook for this one; it looks "ready for committer" to me. -- Greg Smith 2ndQuadrant Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.com
Greg Smith <greg@2ndquadrant.com> writes: > To be clear about which version we're talking about: > http://archives.postgresql.org/message-id/20091130123456.4A03.52131E4D@oss.ntt.co.jp > is the candidate for commit that includes the cleanup you've already done. I suppose this is subject to the same auto_explain problem already noticed for JSON, but I would suggest that we commit this first and then fix both together, rather than create merge issues. regards, tom lane
On Tue, Dec 8, 2009 at 10:12 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> It looks like the last round of issues on this patch only came from Tom's
> concerns, so I'm not sure if anyone but Tom (or a similarly picky alternate
> committer) is going to find anything else in a re-review. It looks to me
> like all the issues were sorted out anyway. Euler, you're off the hook for
> this one; it looks "ready for committer" to me.

Since Itagaki Takahiro is now a committer, I sort of assumed he would be committing his own patches. I am not really sure what the etiquette is in this area, but there seems to be an implication here that someone else will be committing this, which isn't necessarily my understanding of how it works. Certainly, unless someone has a contrary opinion, I think he can go ahead if he wishes. On the other hand, if it's helpful, I'm more than happy to pick up this one and/or EXPLAIN BUFFERS for final review and commit.

...Robert
Robert Haas wrote: > Since Itagaki Takahiro is now a committer, I sort of assumed he would > be committing his own patches. Maybe, but I wasn't going to be the one to suggest that Tom get cut out of the loop after he raised a list of issues with the patch already. I think the situation for EXPLAIN BUFFERS is much simpler, given that the last round of feedback was only quibbling over the line formatting and docs. What I think is a reasonable general guideline is to route submitted patches back to their author to commit only when the patch has been recently free of code issues during its review. If we've already had another committer chime in with concerns, they should probably confirm themselves that the updated version is suitable to commit, and do so instead of the author. That just seems like a reasonable risk-reduction workflow approach to me, similar to how the "sign-off" practice works on some other projects. -- Greg Smith 2ndQuadrant Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.com
Why does this patch #ifdef out the _PG_fini code in pg_stat_statements?

Where you check for INSERT, UPDATE, and DELETE return codes in pgss_ProcessUtility, I think this deserves a comment explaining that these could occur as a result of EXECUTE. It wasn't obvious to me, anyway.

It seems to me that the current hook placement does not address this complaint:

>> 1. The placement of the hook. Why is it three lines down in
>> ProcessUtility? It's probably reasonable to have the Assert first,
>> but I don't see why the hook function should have the ability to
>> editorialize on the behavior of everything about ProcessUtility
>> *except* the read-only-xact check.

...Robert
Robert Haas <robertmhaas@gmail.com> wrote:

> Why does this patch #ifdef out the _PG_fini code in pg_stat_statements?

That's because _PG_fini is never called in current releases. We could remove it completely, but I'd like to leave it for future releases where _PG_fini callback is re-enabled. Or, removing #ifdef (enabling _PG_fini function) is also harmless.

> Where you check for INSERT, UPDATE, and DELETE return codes in
> pgss_ProcessUtility, I think this deserves a comment explaining that
> these could occur as a result of EXECUTE. It wasn't obvious to me,
> anyway.

Like this?

/*
 * Parse command tag to retrieve the number of affected rows.
 * COPY command returns COPY tag. EXECUTE command might return INSERT,
 * UPDATE, or DELETE tags, but we cannot retrieve the number of rows
 * for SELECT. We assume other commands always return 0 row.
 */

> It seems to me that the current hook placement does not address this complaint:
>>> I don't see why the hook function should have the ability to
>>> editorialize on the behavior of everything about ProcessUtility
>>> *except* the read-only-xact check.

I moved the initialization code of completionTag as the complaint suggests, but check_xact_readonly() should be called before the hook. Am I missing something?

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center
On Wed, Dec 9, 2009 at 9:33 PM, Takahiro Itagaki <itagaki.takahiro@oss.ntt.co.jp> wrote: > Robert Haas <robertmhaas@gmail.com> wrote: > >> Why does this patch #ifdef out the _PG_fini code in pg_stat_statements? > > That's because _PG_fini is never called in current releases. > We could remove it completely, but I'd like to leave it for future > releases where _PG_fini callback is re-enabled. > Or, removing #ifdef (enabling _PG_fini function) is also harmless. I guess my vote is for leaving it alone, but I might not know what I'm talking about. :-) >> Where you check for INSERT, UPDATE, and DELETE return codes in >> pgss_ProcessUtility, I think this deserves a comment explaining that >> these could occur as a result of EXECUTE. It wasn't obvious to me, >> anyway. > > Like this? > /* > * Parse command tag to retrieve the number of affected rows. > * COPY command returns COPY tag. EXECUTE command might return INSERT, > * UPDATE, or DELETE tags, but we cannot retrieve the number of rows > * for SELECT. We assume other commands always return 0 row. > */ I'm confused by the "we cannot retrieve the number of rows for SELECT" part. Can you clarify that? >> It seems to me that the current hook placement does not address this complaint >> >> I don't see why the hook function should have the ability to >> >> editorialize on the behavior of everything about ProcessUtility >> >> *except* the read-only-xact check. > > I moved the initialization code of completionTag as the comment, > but check_xact_readonly() should be called before the hook. > Am I missing something? Beats me. I interpreted Tom's remark to be saying the hook call should come first, but I'm not sure which way is actually better. ...Robert
Robert Haas <robertmhaas@gmail.com> wrote: > > Like this? > > /* > > * Parse command tag to retrieve the number of affected rows. > > * COPY command returns COPY tag. EXECUTE command might return INSERT, > > * UPDATE, or DELETE tags, but we cannot retrieve the number of rows > > * for SELECT. We assume other commands always return 0 row. > > */ > > I'm confused by the "we cannot retrieve the number of rows for SELECT" > part. Can you clarify that? Ah, I meant the SELECT was "EXECUTE of SELECT". If I use internal structure names, the explanation will be: ---- EXECUTE command returns INSERT, UPDATE, DELETE, or SELECT tags. We can retrieve the number of rows from INSERT, UPDATE, and DELETE tags, but cannot from SELECT tag because the tag doesn't contain row numbers and also EState->es_processed is unavailable for EXECUTE commands. ---- Regards, --- Takahiro Itagaki NTT Open Source Software Center
On Wed, Dec 9, 2009 at 10:14 PM, Takahiro Itagaki <itagaki.takahiro@oss.ntt.co.jp> wrote: >> I'm confused by the "we cannot retrieve the number of rows for SELECT" >> part. Can you clarify that? > > Ah, I meant the SELECT was "EXECUTE of SELECT". > > If I use internal structure names, the explanation will be: > ---- > EXECUTE command returns INSERT, UPDATE, DELETE, or SELECT tags. > We can retrieve the number of rows from INSERT, UPDATE, and DELETE tags, > but cannot from SELECT tag because the tag doesn't contain row numbers > and also EState->es_processed is unavailable for EXECUTE commands. > ---- OK, that makes sense. It might read a little better this way: The EXECUTE command returns INSERT, UPDATE, DELETE, or SELECT tags. We can retrieve the number of rows from INSERT, UPDATE, and DELETE tags, but the SELECT doesn't contain row numbers. We also can't get it from EState->es_processed, because that is unavailable for EXECUTE commands. That seems like a rather unfortunate limitation though... ...Robert
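The tag-parsing rule being settled on here - take the count from the tail of an INSERT/UPDATE/DELETE/COPY completion tag, and fall back to zero for SELECT and everything else - can be sketched compactly. The real logic is C inside pg_stat_statements; the Python below is a hypothetical illustration only:

    def rows_from_completion_tag(tag: str) -> int:
        """Best-effort row count from a command completion tag.

        Illustration only; the real logic lives in C in pg_stat_statements.
        """
        parts = tag.split()
        if not parts:
            return 0
        # "INSERT <oid> <rows>" carries the count in the third field.
        if parts[0] == "INSERT" and len(parts) == 3:
            return int(parts[2])
        # "UPDATE <rows>", "DELETE <rows>", "COPY <rows>".
        if parts[0] in ("UPDATE", "DELETE", "COPY") and len(parts) == 2:
            return int(parts[1])
        # A bare "SELECT" tag (e.g. from EXECUTE of a prepared SELECT)
        # carries no usable count; assume zero, per the discussion above.
        return 0

    assert rows_from_completion_tag("INSERT 0 5") == 5
    assert rows_from_completion_tag("COPY 1000") == 1000
    assert rows_from_completion_tag("SELECT") == 0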
Tom Lane wrote: > Greg Smith <greg@2ndquadrant.com> writes: > >> To be clear about which version we're talking about: >> http://archives.postgresql.org/message-id/20091130123456.4A03.52131E4D@oss.ntt.co.jp >> is the candidate for commit that includes the cleanup you've already done. >> > > I suppose this is subject to the same auto_explain problem already > noticed for JSON, but I would suggest that we commit this first and > then fix both together, rather than create merge issues. > > > OK, I've committed the YAML stuff, so who is working on the auto-explain bug? Robert? I'd like that fixed before I add in the query text to auto-explain output. cheers andrew
On Thu, Dec 10, 2009 at 8:39 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > OK, I've committed the YAML stuff, so who is working on the auto-explain > bug? Robert? I'd like that fixed before I add in the query text to > auto-explain output. I'm going to propose fixing this in what seems to me to be the simplest possible way, per the attached. Anyone want to argue? ...Robert
Attachment
Robert Haas <robertmhaas@gmail.com> wrote:

> On Thu, Dec 10, 2009 at 8:39 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> > OK, I've committed the YAML stuff, so who is working on the auto-explain
> > bug? Robert?
>
> I'm going to propose fixing this in what seems to me to be the
> simplest possible way, per the attached. Anyone want to argue?

+1 to the fix. Typical usage of the explain functions will be:

1. ExplainInitState()
2. (set up ExplainState)
3. ExplainBeginOutput()
4. ExplainXXX() except ExplainQuery()
5. ExplainEndOutput()

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center
I applied this patch after a little bit of editorialization concerning the points mentioned in this discussion:

Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Dec 9, 2009 at 9:33 PM, Takahiro Itagaki
> <itagaki.takahiro@oss.ntt.co.jp> wrote:
>> Robert Haas <robertmhaas@gmail.com> wrote:
>>> Why does this patch #ifdef out the _PG_fini code in pg_stat_statements?
>>
>> That's because _PG_fini is never called in current releases.
>> We could remove it completely, but I'd like to leave it for future
>> releases where _PG_fini callback is re-enabled.
>> Or, removing #ifdef (enabling _PG_fini function) is also harmless.

> I guess my vote is for leaving it alone, but I might not know what I'm
> talking about. :-)

I removed this change. I'm not convinced that taking out _PG_fini is appropriate in the first place, but if it is, we should do it in all contrib modules together, not just randomly as side-effects of unrelated patches.

>>> Where you check for INSERT, UPDATE, and DELETE return codes in
>>> pgss_ProcessUtility, I think this deserves a comment explaining that
>>> these could occur as a result of EXECUTE. It wasn't obvious to me,
>>> anyway.
>>
>> Like this?
>> /*
>>  * Parse command tag to retrieve the number of affected rows.
>>  * COPY command returns COPY tag. EXECUTE command might return INSERT,
>>  * UPDATE, or DELETE tags, but we cannot retrieve the number of rows
>>  * for SELECT. We assume other commands always return 0 row.
>>  */

I took this out, too. It's inconsistent and IMHO the wrong thing anyway. If a regular DML statement is EXECUTE'd, the associated rowcounts will be tallied by the executor hooks, so this was double-counting the effects. The only way it wouldn't be double-counting is if you have tracking of nested statements turned off; but if you do, I don't see why you'd expect them to get counted anyway in this one particular case.

>>> It seems to me that the current hook placement does not address this complaint:
>>>> I don't see why the hook function should have the ability to
>>>> editorialize on the behavior of everything about ProcessUtility
>>>> *except* the read-only-xact check.

Nope, it didn't. The point here is that one of the purposes of a hook is to allow a loadable module to editorialize on the behavior of the hooked function, and that read-only check is part of the behavior of ProcessUtility. I doubt that bypassing the test would be a particularly smart thing for somebody to do, but it is for example reasonable to want to throw a warning or do nothing at all instead of allowing the error to occur. So that should be inside standard_ProcessUtility.

regards, tom lane