Thread: Commitfest Status: Sudden Death Overtime

Commitfest Status: Sudden Death Overtime

From
Josh Berkus
Date:
All,

As you can probably tell, we are not ready to end the commitfest.  (I
think Robert gave me this CF to show why even talking about a one-week
mini-fest is a fantasy.  If so, successful.).

I've booted the patches which obviously aren't going to be immediately
ready.  Nine patches are ready for committer.  The rest fall into the
following buckets:

KAGAI's PATCHES

These are complex and need review by advanced hackers.  They are also
often interdependant.  As such, I'd like to automatically move his
patches to the next CF and ask that hackers who are engaged in working
on them continue to do so between CFs.

PATCHES WHICH I MOVED TO THE NEXT CF 9-2011:

* Collect frequency statistics and selectivity estimation for arrays
* Allow multiple Postgres clusters running on the same machine to
distinguish themselves in the event log
* lazy vxid locks

PATCHES WHICH HAVE BEEN UPDATED AND NEED FURTHER REVIEW:

* Cascade Replication

PATCHES STILL UNDER ACTIVE DISCUSSION:

* Add ability to constrain backend temporary file space

PATCHES I CAN'T FIGURE OUT THE STATUS OF:

* pg_comments system view
* Bugfix for XPATH() if expression returns a scalar value

So, down to a fairly limited list, except that we need a lot of committing.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Commitfest Status: Sudden Death Overtime

From
Florian Pflug
Date:
On Jul15, 2011, at 23:05 , Josh Berkus wrote:
> * Bugfix for XPATH() if expression returns a scalar value

Well, Peter Eisentraut seemed to disagree with my approach initially,
and seemed to prefer a separate function for XPATH expressions which
return a scalar value.
 http://archives.postgresql.org/pgsql-hackers/2011-06/msg00401.php

I considered that, but came to the conclusion that it has problems
of it's own, described here:
 http://archives.postgresql.org/pgsql-hackers/2011-06/msg00608.php

Peter stopped responding at that point, so I assumed that my argument
convinced him.

Radoslaw complained about the fact the results of scalar values
come back escaped from XPATH() with the patch applied (without it,
an empty array is returned) and wanted that changed - Basically the
same objection he had to my other patch which made sure text nodes
are properly escaped (The fine print here is that text nodes *aren't*
scalar values, they're nodes. What fun.). He did upgrade that other
patch to "Ready for Committer" despite his objections, though. I
don't know whether he wanted to do the same with this one or not,
and my inquiry was left unanswered
 http://archives.postgresql.org/pgsql-hackers/2011-07/msg00783.php

I also don't know much code review the patch has received. I didn't
receive any complaints, but whether that reflects the quality of
the patch or the quantity of review I leave for someone else to judge.

I dunno what the best way forward is, but I'd hate to see this being
bumped to the next commit-fest.

best regards,
Florian Pflug



Re: Commitfest Status: Sudden Death Overtime

From
Robert Haas
Date:
On Fri, Jul 15, 2011 at 5:05 PM, Josh Berkus <josh@agliodbs.com> wrote:
> As you can probably tell, we are not ready to end the commitfest.  (I
> think Robert gave me this CF to show why even talking about a one-week
> mini-fest is a fantasy.  If so, successful.).

Heh, heh.  Well, it wasn't that premeditated, but I'm always glad to
reach a meeting of the minds.  :-)

> These are complex and need review by advanced hackers.  They are also
> often interdependant.  As such, I'd like to automatically move his
> patches to the next CF and ask that hackers who are engaged in working
> on them continue to do so between CFs.

There are only two patches left and I think we really ought to try to
take a crack at doing something with them.  Yeb is working on the
userspace access vector cache patch, which I think is going drag on
longer than we want keep the CommitFest open, but I'm OK with giving
it a day or two to shake out.  Aside from the benefit of reviewing the
actual patch, I see that Yeb is pushing on KaiGai to do further work
on the documentation, which I think is of great value.

I will review the other patch - on shared object labels - RSN.

> PATCHES WHICH I MOVED TO THE NEXT CF 9-2011:
>
> * Collect frequency statistics and selectivity estimation for arrays
> * Allow multiple Postgres clusters running on the same machine to
> distinguish themselves in the event log
> * lazy vxid locks

I'm not clear on your criteria for moving these patches, as they seem
to be in somewhat different states.  The first one is WIP, and it
doesn't really matter whether you move it to the next CommitFest or
just mark it returned with feedback.  There is active development work
going on there, so it's not going to get committed at this point.  But
the other two are basically just things we didn't get around to
reviewing.

With respect to the lazy vxid lock patch, it looks to me like Jeff has
done a bit of review, and I think the real question here is whether or
not we want to go ahead and make that change (with some stylistic and
cosmetic corrections) or wait until we have a more complex solution to
the spinlock contention problems mapped out.  On a pgbench run with 8
clients on a 32-core machine, I see about a 2% speedup from that patch
on pgbench -S, and it grows to 8% at 32 clients.  At 80 clients (but
still just a 32-core box), the results bounce around more, but taking
the median of three five-minute runs, it's an 11% improvement.  To me,
that's enough to make it worth applying, especially considering that
what is 11% on today's master is, in raw TPS, equivalent to maybe 30%
of yesterday's master (prior to the fast relation lock patch being
applied).  More, it seems pretty clear that this is the conceptually
right thing to do, even if it's going to require some work elsewhere
to file down all the rough edges thus exposed.  If someone objects to
that, then OK, we should talk about that: but so far I don't think
anyone has expressed strong opposition: in which case I'd like to fix
it up and get it in.

With respect to the event-log patch, MauMau has resubmitted that to
the next CommitFest, so that's fine as far as it goes.  But it might
make sense to see if we can twist Magnus's arm into re-reviewing it
now while it's fresh in his mind, rather than waiting another two or
three months.

> PATCHES WHICH HAVE BEEN UPDATED AND NEED FURTHER REVIEW:
>
> * Cascade Replication

No update.

> PATCHES STILL UNDER ACTIVE DISCUSSION:
>
> * Add ability to constrain backend temporary file space

Now committed, by Tom.

> PATCHES I CAN'T FIGURE OUT THE STATUS OF:
>
> * pg_comments system view

I reviewed this and marked it Returned with Feedback.

> * Bugfix for XPATH() if expression returns a scalar value

No update.

> So, down to a fairly limited list, except that we need a lot of committing.

We're down to three patches that fall into this category: two XML
patches by Florian, which have been somewhat derailed by an argument
about whether values of type xml should, in fact, be required to be
valid xml (I think you know my vote on this one...); and an
error-reporting patch that Tom weighed in on over the weekend.  This
last suffers from the issue that it's not quite clear whether Tom is
going to do the work or whether he's expecting the submitter to do it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Commitfest Status: Sudden Death Overtime

From
Simon Riggs
Date:
On Fri, Jul 15, 2011 at 10:05 PM, Josh Berkus <josh@agliodbs.com> wrote:

> PATCHES WHICH HAVE BEEN UPDATED AND NEED FURTHER REVIEW:
>
> * Cascade Replication

Sorry, too busy reviewing to take note of this. I expect to commit
when its tested and ready.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Commitfest Status: Sudden Death Overtime

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> We're down to three patches that fall into this category: two XML
> patches by Florian, which have been somewhat derailed by an argument
> about whether values of type xml should, in fact, be required to be
> valid xml (I think you know my vote on this one...);

Yeah, it's not clear to me either which of those are still reasonable
candidates for committing as-is.

> and an
> error-reporting patch that Tom weighed in on over the weekend.  This
> last suffers from the issue that it's not quite clear whether Tom is
> going to do the work or whether he's expecting the submitter to do it.

If you mean the business about allowing GUCs in postgresql.conf to be
applied even if there are semantic errors elsewhere, I'm just as happy
to let Alexey or Florian have a go at it first, if they want.  The real
question at the moment is do we have consensus about changing that?
Because if we do, the submitted patch is certainly not something to
commit as-is, and should be marked Returned With Feedback.
        regards, tom lane


Re: Commitfest Status: Sudden Death Overtime

From
Josh Berkus
Date:
Simon,

>> * Cascade Replication
> 
> Sorry, too busy reviewing to take note of this. I expect to commit
> when its tested and ready.

So, would that be this commitfest, or next commitfest?


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Commitfest Status: Sudden Death Overtime

From
Josh Berkus
Date:
Robert,

>> * Collect frequency statistics and selectivity estimation for arrays
>> * Allow multiple Postgres clusters running on the same machine to
>> distinguish themselves in the event log
>> * lazy vxid locks
> 
> I'm not clear on your criteria for moving these patches, as they seem
> to be in somewhat different states.

For the array criteria, it took me until the last week of the CF to find
a reviewer. That reviewer found a number of issues, and the submitter
submitted an updated patch.  It looks quite likely that work on that
patch will get updated in the next couple weeks but not immediately.

For the eventlog, MauMau had submitted an updated patch, but all of our
Windows hackers had let me know that they were unavailable for the next
couple weeks for review or commit.

For lazy VXID locks, I'd the impression that we had an ongoing
discussion about whether we were going to apply it or not, and you were
on vacation for the last week of the CF.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Commitfest Status: Sudden Death Overtime

From
Robert Haas
Date:
On Mon, Jul 18, 2011 at 4:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> and an
>> error-reporting patch that Tom weighed in on over the weekend.  This
>> last suffers from the issue that it's not quite clear whether Tom is
>> going to do the work or whether he's expecting the submitter to do it.
>
> If you mean the business about allowing GUCs in postgresql.conf to be
> applied even if there are semantic errors elsewhere, I'm just as happy
> to let Alexey or Florian have a go at it first, if they want.  The real
> question at the moment is do we have consensus about changing that?
> Because if we do, the submitted patch is certainly not something to
> commit as-is, and should be marked Returned With Feedback.

I'm not totally convinced.  The proposed patch is pretty small, and
seems to stand on its own two feet.  I don't hear anyone objecting to
your proposed plan, but OTOH it doesn't strike me as such a good plan
that we should reject all other improvements in the meantime.  Maybe
I'm missing something...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Commitfest Status: Sudden Death Overtime

From
Florian Pflug
Date:
On Jul18, 2011, at 22:19 , Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> We're down to three patches that fall into this category: two XML
>> patches by Florian, which have been somewhat derailed by an argument
>> about whether values of type xml should, in fact, be required to be
>> valid xml (I think you know my vote on this one...);
> 
> Yeah, it's not clear to me either which of those are still reasonable
> candidates for committing as-is.

There are actually three XML-related patches from me in the queue.
I'll try to give an overview of their states - as perceived by me,
of course. If people want to comment on this, I suggest to do that
that either on the existing threads from these patches or on new ones,
but not on this one - lest confusion ensues.

* Bugfix for XPATH() if text or attribute nodes are selected
https://commitfest.postgresql.org/action/patch_view?id=580

Makes XPATH() return valid XML if text or attribute are selected.

I'm not aware of any issues with that one, other than Radoslaw's
unhappiness about the change of behaviour. Since the current behaviour
is very clearly broken (as in dump, reload, ka-Woom), I'm not prepared
to accept that as a show-stopper.

The question about whether values of type XML should or should
not be in fact valid XML is a red herring. That question has long
ago been settles by the XML type's input function, which has a pretty
clear and consistent idea about what constitutes a valid value of type
XML. The patch simply makes XPATH() abide by those rules, instead of
making up rules of it's own pretty nilly-willy.

* Bugfix for XPATH() if expression returns a scalar value https://commitfest.postgresql.org/action/patch_view?id=565

Makes XPATH() support XPath expressions which compute a scalar value,
not a node set (i.e. apply a top-level function such as name()).
Currently, we return an empty array in that case.

Peter Eisentraut initially seemed to prefer creating separate functions
for that (XPATH_STRING, ...). I argued against that, on the grounds that
that'd make it impossible to evaluate user-supplied XPath expression (since
you wouldn't know which function to use). I haven't heard back from him
after that argument. Radoslaw liked the patch, but wanted the quoting
removed - same theme as with the other patch.

* XML error handling improvement to fix XPATH bug https://commitfest.postgresql.org/action/patch_view?id=579

Like that first patch, it fixes a bug where XPATH() returns invalid
XML. The cause is completely different though. Here, libxml is (partially)
at fault - it's parsing methods always return a document instance if
a document is *structurally* valid, even if it contains semantic error
like invalid namespace references. But it isn't fully prepared to
actually handle these inconsistent documents - for example, when asked
to render a namespace URI as text, it unconditionally assumes it doesn't
have to escape it, because it may not contain special characters anway.
Which, if course, isn't necessarily true for structurally valid but
semantically invalid documents...
valid...

Fixing that wasn't possible without a major rewrite of the XML support's
error handling - one must use the modern version of libxml's error handling
infrastructure to actually be able to detect these semantic error reliably
and distinguish them from mere warnings.

Noah Misch has reviewed that patch pretty extensively (Thanks again, Noah!),
and I've resolved all his concerns regarding the implementation. I haven't
seen a single argument *against* applying this so far.

best regards,
Florian Pflug




Re: Commitfest Status: Sudden Death Overtime

From
Florian Pflug
Date:
On Jul18, 2011, at 22:19 , Tom Lane wrote:
>> and an
>> error-reporting patch that Tom weighed in on over the weekend.  This
>> last suffers from the issue that it's not quite clear whether Tom is
>> going to do the work or whether he's expecting the submitter to do it.
> 
> If you mean the business about allowing GUCs in postgresql.conf to be
> applied even if there are semantic errors elsewhere, I'm just as happy
> to let Alexey or Florian have a go at it first, if they want.  The real
> question at the moment is do we have consensus about changing that?
> Because if we do, the submitted patch is certainly not something to
> commit as-is, and should be marked Returned With Feedback.

Just to avoid false expectations here: I'd be happy to review further
versions of this patch, but I won't write them.

best regards,
Florian Pflug



Re: Commitfest Status: Sudden Death Overtime

From
Jeff Davis
Date:
On Mon, 2011-07-18 at 15:59 -0400, Robert Haas wrote:
> On a pgbench run with 8
> clients on a 32-core machine, I see about a 2% speedup from that patch
> on pgbench -S, and it grows to 8% at 32 clients.  At 80 clients (but
> still just a 32-core box), the results bounce around more, but taking
> the median of three five-minute runs, it's an 11% improvement.  To me,
> that's enough to make it worth applying, especially considering that
> what is 11% on today's master is, in raw TPS, equivalent to maybe 30%
> of yesterday's master (prior to the fast relation lock patch being
> applied).  More, it seems pretty clear that this is the conceptually
> right thing to do, even if it's going to require some work elsewhere
> to file down all the rough edges thus exposed.  If someone objects to
> that, then OK, we should talk about that: but so far I don't think
> anyone has expressed strong opposition: in which case I'd like to fix
> it up and get it in.

Agreed. I certainly like the concept of the lazy vxid patch.

Regards,Jeff Davis



Re: Commitfest Status: Sudden Death Overtime

From
Yeb Havinga
Date:
On 2011-07-18 21:59, Robert Haas wrote:
> There are only two patches left and I think we really ought to try to
> take a crack at doing something with them.  Yeb is working on the
> userspace access vector cache patch, which I think is going drag on
> longer than we want keep the CommitFest open, but I'm OK with giving
> it a day or two to shake out.
At the end if this day I'm nearing the 'my head a splode' moment, which 
is more caused by trying to get my brain around selinux and it's 
postgresql policy, than the actual patch to review. I've verified that 
the patch works as indicated by KaiGai-san, but reading and 
understanding the code to say anything useful about it will take a few 
more hours, which will be tomorrow.

regards,
Yeb



Re: Commitfest Status: Sudden Death Overtime

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jul 18, 2011 at 4:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If you mean the business about allowing GUCs in postgresql.conf to be
>> applied even if there are semantic errors elsewhere, I'm just as happy
>> to let Alexey or Florian have a go at it first, if they want. �The real
>> question at the moment is do we have consensus about changing that?
>> Because if we do, the submitted patch is certainly not something to
>> commit as-is, and should be marked Returned With Feedback.

> I'm not totally convinced.  The proposed patch is pretty small, and
> seems to stand on its own two feet.  I don't hear anyone objecting to
> your proposed plan, but OTOH it doesn't strike me as such a good plan
> that we should reject all other improvements in the meantime.  Maybe
> I'm missing something...

To me, the proposed patch adds another layer of contortionism on top of
code that's already logically messy.  I find it pretty ugly, and would
prefer to try to simplify the code before not after we attempt to deal
with the feature the patch wants to add.
        regards, tom lane


Re: Commitfest Status: Sudden Death Overtime

From
Alvaro Herrera
Date:
Excerpts from Tom Lane's message of mar jul 19 12:09:24 -0400 2011:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Mon, Jul 18, 2011 at 4:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> If you mean the business about allowing GUCs in postgresql.conf to be
> >> applied even if there are semantic errors elsewhere, I'm just as happy
> >> to let Alexey or Florian have a go at it first, if they want. The real
> >> question at the moment is do we have consensus about changing that?
> >> Because if we do, the submitted patch is certainly not something to
> >> commit as-is, and should be marked Returned With Feedback.
> 
> > I'm not totally convinced.  The proposed patch is pretty small, and
> > seems to stand on its own two feet.  I don't hear anyone objecting to
> > your proposed plan, but OTOH it doesn't strike me as such a good plan
> > that we should reject all other improvements in the meantime.  Maybe
> > I'm missing something...
> 
> To me, the proposed patch adds another layer of contortionism on top of
> code that's already logically messy.  I find it pretty ugly, and would
> prefer to try to simplify the code before not after we attempt to deal
> with the feature the patch wants to add.

+1.  Alexey stated that he would get back on this patch for reworks.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Commitfest Status: Sudden Death Overtime

From
Tom Lane
Date:
[ having now read the relevant threads a bit more carefully ... ]

Florian Pflug <fgp@phlo.org> writes:
> On Jul18, 2011, at 22:19 , Tom Lane wrote:
>> Yeah, it's not clear to me either which of those are still reasonable
>> candidates for committing as-is.

> There are actually three XML-related patches from me in the queue.
> I'll try to give an overview of their states - as perceived by me,
> of course. If people want to comment on this, I suggest to do that
> that either on the existing threads from these patches or on new ones,
> but not on this one - lest confusion ensues.

> * Bugfix for XPATH() if text or attribute nodes are selected
>   https://commitfest.postgresql.org/action/patch_view?id=580

> Makes XPATH() return valid XML if text or attribute are selected.

> I'm not aware of any issues with that one, other than Radoslaw's
> unhappiness about the change of behaviour. Since the current behaviour
> is very clearly broken (as in dump, reload, ka-Woom), I'm not prepared
> to accept that as a show-stopper.

I think we ought to apply this one.  I agree that returning invalid XML
is not acceptable.

> * Bugfix for XPATH() if expression returns a scalar value
>   https://commitfest.postgresql.org/action/patch_view?id=565

> Makes XPATH() support XPath expressions which compute a scalar value,
> not a node set (i.e. apply a top-level function such as name()).
> Currently, we return an empty array in that case.

> Peter Eisentraut initially seemed to prefer creating separate functions
> for that (XPATH_STRING, ...). I argued against that, on the grounds that
> that'd make it impossible to evaluate user-supplied XPath expression (since
> you wouldn't know which function to use). I haven't heard back from him
> after that argument.

I find your argument against XPATH_STRING() a bit unconvincing.
The use case for that is not where you are trying to evaluate an
unknown, arbitrary XPath expression; it's where you are evaluating an
expression that you *know* returns string (ie, it has name() or some
other string returning function at top level) and you'd like a string,
thanks, not something that you have to de-escape in order to get a
string.

Of course this use-case could also be addressed by providing a de-escape
function, but I don't really think that de_escape(xpath(...)[1]) is a
particularly friendly or efficient notation.

Now, these statements are not arguments against the patch --- as is,
XPATH() is entirely useless for expressions that return scalars, and
with the patch it at least does something usable.  So I think we should
apply it.  But there is room here for more function(s) to provide more
convenient functionality for the special case of expressions returning
strings.  I'd be inclined to provide xpath_number and xpath_bool too,
although those are less essential since a cast will get the job done.

There's some code-style aspects I don't care for in the submitted patch,
but I'll go clean those up and commit it.
        regards, tom lane


Re: Commitfest Status: Sudden Death Overtime

From
Florian Pflug
Date:
On Jul20, 2011, at 23:35 , Tom Lane wrote:
> I find your argument against XPATH_STRING() a bit unconvincing.
> The use case for that is not where you are trying to evaluate an
> unknown, arbitrary XPath expression; it's where you are evaluating an
> expression that you *know* returns string (ie, it has name() or some
> other string returning function at top level) and you'd like a string,
> thanks, not something that you have to de-escape in order to get a
> string.

Hm, maybe I didn't make myself clear. I'm not against having
special-case functions like XPATH_STRING() at all, as long as there's
a general-purpuse function like XPATH() that deal with every expression
you throw at it.

There's a small additional concern, though, which is that there's an
XPath 2.0 spec out there, and it modifies the type system and data model
rather heavily. So before we go adding functions, it'd probably be wise
to check that we're not painting ourselves into a corner.

> Of course this use-case could also be addressed by providing a de-escape
> function, but I don't really think that de_escape(xpath(...)[1]) is a
> particularly friendly or efficient notation.

Yeah, that's pretty ugly. Having XMLESCAPE/XMLUNESCAPE (with types
TEXT -> XML and XML -> TEXT) would probably still be a good idea though,
even if it no replacement for XPATH_STRING().

> Now, these statements are not arguments against the patch --- as is,
> XPATH() is entirely useless for expressions that return scalars, and
> with the patch it at least does something usable. So I think we should
> apply it.

Cool, so we're on the same page.

> But there is room here for more function(s) to provide more
> convenient functionality for the special case of expressions returning
> strings.  I'd be inclined to provide xpath_number and xpath_bool too,
> although those are less essential since a cast will get the job done.

No objection here. I do have a number of XML-related stuff that I'd like
to do for 9.2, actually. I'll write up a proposal once this commit-fest
is over, and I'll include XPATH_STRINGS and friends.

> There's some code-style aspects I don't care for in the submitted patch,
> but I'll go clean those up and commit it.

I'd offer to fix them, but I somehow have the feeling that you're already
in the middle of it ;-)

best regards,
Florian Pflug




Re: Commitfest Status: Sudden Death Overtime

From
Tom Lane
Date:
Florian Pflug <fgp@phlo.org> writes:
> There's a small additional concern, though, which is that there's an
> XPath 2.0 spec out there, and it modifies the type system and data model
> rather heavily. So before we go adding functions, it'd probably be wise
> to check that we're not painting ourselves into a corner.

Good point.  Has anybody here looked at that?
        regards, tom lane


Re: Commitfest Status: Sudden Death Overtime

From
Hannu Krosing
Date:
On Thu, 2011-07-21 at 00:21 +0200, Florian Pflug wrote:
> There's a small additional concern, though, which is that there's an
> XPath 2.0 spec out there, and it modifies the type system and data model
> rather heavily. So before we go adding functions, it'd probably be wise
> to check that we're not painting ourselves into a corner.

Why not just write  XPATH2() function conforming to XPath 2.0 spec if
the new spec is substancially different ?


-- 
-------
Hannu Krosing
PostgreSQL Infinite Scalability and Performance Consultant
PG Admin Book: http://www.2ndQuadrant.com/books/