Thread: CommitFest status summary 2010-01-27

CommitFest status summary 2010-01-27

From
Robert Haas
Date:
We have 20 remaining patches to deal with for the 2010-01 CommitFest.
I've attempted to break down the status of these patches below.  The
good news is that almost half of the remaining patches are either
already marked as Ready for Committer, or have already been reviewed
once and will likely be marked Ready for Committer when they are
re-reviewed.  If you are the reviewer for one of these patches, please
re-review and mark it either Waiting on Author if you find additional
issues or Ready for Committer if it looks good.  The
somewhat-less-good news is that we still have at least four patches
that have not been reviewed at all, and one of those (plpython3) has
been very difficult to find a reviewer for.  Some of the perl patches
have not yet been reviewed either, but I'm a little less concerned
about those because it seems that Andrew is working on those anyway:
still, if anyone feels inclined to volunteer, I believe Andrew has
said he would appreciate another pair of eyes.

Overall, I would say we're doing fairly well at working through this:
but most of the big patches are still ahead of us.

...Robert

Ready for Committer (4)
===================
Faster CREATE DATABASE by delaying fsync
More frame options in window functions
Typed tables
libpq new connect function (PQconnectParams)

Waiting for Re-Review (5)
=====================
Writeable CTEs
Listen / Notify rewrite
Prevent renaming on multiple inherited column
listagg aggregate
quoting psql variables (needs reverse review)

Waiting for Initial Review (4)
==========================
Remove obscure permission checks in FindConversion()
Provide rowcount for utility SELECTs
knngist (WIP)
plpython3 - no reviewer yet

Waiting for Updated Patch (2)
=========================
Remove gcc dependency in definition of "inline" functions
xpath non-nodeset result enabling

Other (5)
=====
Fix large object support in pg_dump
- Tom Lane suggested a new approach, nobody has coded it

rbtree
- Needs both more review and some updates from the author.

Add on_perl_init and proper destruction to plperl
Add on_trusted_init and on_untrusted_init to plperl
Package namespace and Safe init cleanup for plperl
- No reviewer listed, but Andrew Dunstan is working on these.


Re: CommitFest status summary 2010-01-27

From
Andrew Dunstan
Date:

Robert Haas wrote:
> Some of the perl patches
> have not yet been reviewed either, but I'm a little less concerned
> about those because it seems that Andrew is working on those anyway:
> still, if anyone feels inclined to volunteer, I believe Andrew has
> said he would appreciate another pair of eyes.
>
>   

I have to be away from base for a couple of weeks starting Sunday in 
family business. That will impact my ability to deal with those 
remaining patches. In any case, I would still like more eyes on these 
last pieces, which are a bit more complex and certainly more 
controversial than the last few.

cheers

andrew


Re: CommitFest status summary 2010-01-27

From
Greg Smith
Date:
Robert Haas wrote:
> Waiting for Initial Review (4)
> ==========================
> plpython3 - no reviewer yet
>   

This whole feature seems quite interesting, and I'm really impressed at 
how much work James has put into rethinking what a Python PL should look 
like.  But isn't the fact that there isn't even a reviewer for it yet 
evidence it needs more time to develop a bit of a community first before 
being considered for core?  This seems to me like the sort of patch 
you'd want to play with for a month or two before you could really have 
an informed opinion about how working with it flows compared to the 
existing implementation, and that's not going to be possible here.

> Provide rowcount for utility SELECTs

I think I can write a review for this one right now based on the 
discussion that's already happened:

-Multiple people think the feature is valuable and it seems worth exploring
-The right way to handle the protocol change here is still not clear
-There are any number of subtle ways clients might be impacted by this 
change, and nailing that down and determining who might break is an 
extremely wide ranging bit of QA work that's barely been exploring yet

That last part screams "returned with feedback" for something submitted 
to the last CF before beta to me.  As a candidate for 9.1-alpha1 where 
there's plenty of time to figure out what it breaks and revert if that 
turns out to be bad?  That sounds like a much better time to be fiddling 
with something in this area.

I would like to see the following in order to make this patch have a 
shot at being comittable:

1) Provide some examples showing how the feature would work in practice 
and of use-cases for it.
2) To start talking about what's going to break, some notes about what 
this does to the regression tests would be nice.
3) Demonstrate with example sessions the claims that there are no 
backward compatibility issues here.  I read "when you mix old server 
with new clients or new server with old client, it will just work as 
before", but until I see a session proving that I have to assume that's 
based on code inspections rather than actual tests, and therefore not 
necessarily true.  (Nothing personal, Zoltan--just done way too much QA 
in the last year to believe anything I'm told about code without a 
matching demo).
4) Investigate and be explicit about the potential breakage here both 
for libpq clients and at least one additional driver too.  If I saw a 
demonstration that this didn't break the JDBC driver, for example, I'd 
feel a lot better about the patch.

Expecting a community reviewer to do all that just to confirm this code 
change is worthwhile seems a pretty big stretch to me, and I wouldn't 
consider it very likely that set of work could get finished in time for 
this CF.

-- 
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com



Re: CommitFest status summary 2010-01-27

From
Boszormenyi Zoltan
Date:
Hi,

Greg Smith írta:
>> Provide rowcount for utility SELECTs
>
> I think I can write a review for this one right now based on the
> discussion that's already happened:
>
> -Multiple people think the feature is valuable and it seems worth
> exploring
> -The right way to handle the protocol change here is still not clear
> -There are any number of subtle ways clients might be impacted by this
> change, and nailing that down and determining who might break is an
> extremely wide ranging bit of QA work that's barely been exploring yet
>
> That last part screams "returned with feedback" for something
> submitted to the last CF before beta to me.  As a candidate for
> 9.1-alpha1 where there's plenty of time to figure out what it breaks
> and revert if that turns out to be bad?  That sounds like a much
> better time to be fiddling with something in this area.

I understand your position, this is a subtle change that might
turn out to break clients, indeed.

> I would like to see the following in order to make this patch have a
> shot at being comittable:
>
> 1) Provide some examples showing how the feature would work in
> practice and of use-cases for it.

I'll try to explore all the things affected by this change
and reflect them in a regression test.

> 2) To start talking about what's going to break, some notes about what
> this does to the regression tests would be nice.

Is there a way to run a regression test under src/test/regress so the
command status string is also written into the *.out file? It doesn't
seem to me so because all the current *.out files only contain
results for tuple-returning statements and
src/test/regress/pg_regress_main.c
runs psql in quiet mode.

So, first I suggest dropping the "-q" option from the psql command line
in psql_start_test() in pg_regress_main.c to be able to see the command
strings.

> 3) Demonstrate with example sessions the claims that there are no
> backward compatibility issues here.  I read "when you mix old server
> with new clients or new server with old client, it will just work as
> before", but until I see a session proving that I have to assume
> that's based on code inspections rather than actual tests, and
> therefore not necessarily true.  (Nothing personal, Zoltan--just done
> way too much QA in the last year to believe anything I'm told about
> code without a matching demo).
> 4) Investigate and be explicit about the potential breakage here both
> for libpq clients and at least one additional driver too.  If I saw a
> demonstration that this didn't break the JDBC driver, for example, I'd
> feel a lot better about the patch.

I thought the libpq change was obvious.
strncmp(status, "SELECT ", 7) /* one space at the end */
doesn't match "SELECT" (no spaces). So:
1. old server sends "SELECT", the code in the new libpq client  gets a "doesn't match", PQcmdTuples() returns "".
2. new server sends "SELECT N", old libpq client doesn't look  for strings starting with "SELECT", PQcmdTuples()
returns""
 

I looked at the latest JDBC source (currently it's postgresql-jdbc-8.4-701)
these are the places I found where the command status interpreted
in core/v3/QueryExecutorImpl.java:
   private String receiveCommandStatus() throws IOException {       //TODO: better handle the msg len       int l_len =
pgStream.ReceiveInteger4();      //read l_len -5 bytes (-4 for l_len and -1 for trailing \0)       String status =
pgStream.ReceiveString(l_len- 5);       //now read and discard the trailing \0       pgStream.Receive(1);
 
       if (logger.logDebug())           logger.debug(" <=BE CommandStatus(" + status + ")");
       return status;   }

and
   private void interpretCommandStatus(String status, ResultHandler
handler) {       int update_count = 0;       long insert_oid = 0;
       if (status.startsWith("INSERT") || status.startsWith("UPDATE")
|| status.startsWith("DELETE") || status.startsWith("MOVE"))       {           try           {
update_count= Integer.parseInt(status.substring(1 +
 
status.lastIndexOf(' ')));               if (status.startsWith("INSERT"))                   insert_oid =
Long.parseLong(status.substring(1+
 
status.indexOf(' '),                                               status.lastIndexOf(' ')));           }
catch(NumberFormatException nfe)           {               handler.handleError(new PSQLException(GT.tr("Unable to
 
interpret the update count in command completion tag: {0}.", status),
PSQLState.CONNECTION_FAILURE));               return ;           }       }
       handler.handleCommandStatus(status, update_count, insert_oid);   }

receiveCommandStatus() reads everything up to and including
the zero termination byte. interpretCommandStatus() handles
only the old strings expected to carry the rowcount.
This wouldn't break for the new "SELECT N" string as it
falls into the latest (here nonexisting) "else" branch, effectively
ignoring "SELECT" or "SELECT N".

The version of the same in ./core/v2/QueryExecutorImpl.java is:
   protected void processResults(Query originalQuery, ResultHandler
handler, int maxRows, int flags) throws IOException {   ...           case 'C':  // Command Status               String
status= pgStream.ReceiveString();
 
               if (logger.logDebug())                   logger.debug(" <=BE CommandStatus(" + status + ")");
               if (fields != null)               {                   handler.handleResultRows(originalQuery, fields,
tuples, null);                   fields = null;
                   if (bothRowsAndStatus)                       interpretCommandStatus(status, handler);
}              else               {                   interpretCommandStatus(status, handler);               }
 
               break;
...
   private void interpretCommandStatus(String status, ResultHandler
handler) throws IOException {       int update_count = 0;       long insert_oid = 0;
       if (status.equals("BEGIN"))

protoConnection.setTransactionState(ProtocolConnection.TRANSACTION_OPEN);       else if (status.equals("COMMIT") ||
status.equals("ROLLBACK"))

protoConnection.setTransactionState(ProtocolConnection.TRANSACTION_IDLE);       else if (status.startsWith("INSERT")
||
status.startsWith("UPDATE") || status.startsWith("DELETE") ||
status.startsWith("MOVE"))       {           try           {               update_count =
Integer.parseInt(status.substring(1+
 
status.lastIndexOf(' ')));               if (status.startsWith("INSERT"))                   insert_oid =
Long.parseLong(status.substring(1+
 
status.indexOf(' '),                                               status.lastIndexOf(' ')));           }
catch(NumberFormatException nfe)           {               handler.handleError(new PSQLException(GT.tr("Unable to
 
interpret the update count in command completion tag: {0}.", status),
PSQLState.CONNECTION_FAILURE));               return ;           }       }
       handler.handleCommandStatus(status, update_count, insert_oid);   }

This also handles only the old strings expected to carry a rowcount.
Reading the code, both v2 and v3 would return 0 for this patch in
the server. Handling the new case needs a new condition, something like:      || (status.startsWith("SELECT") &&
!status.equals("SELECT"))

The question is whether new versions of psqlODBC and the old
ones shipped in unixODBC handle the change well.

> Expecting a community reviewer to do all that just to confirm this
> code change is worthwhile seems a pretty big stretch to me,

True.

> and I wouldn't consider it very likely that set of work could get
> finished in time for this CF.
>

Also true.

Best regards,
Zoltán Böszörményi


-- 
Bible has answers for everything. Proof:
"But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil." (Matthew 5:37) - basics of digital technology.
"May your kingdom come" - superficial description of plate tectonics

----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/


Re: CommitFest status summary 2010-01-27

From
Robert Haas
Date:
On Wed, Jan 27, 2010 at 11:13 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Robert Haas wrote:
>> plpython3 - no reviewer yet
>
> This whole feature seems quite interesting, and I'm really impressed at how
> much work James has put into rethinking what a Python PL should look like.
>  But isn't the fact that there isn't even a reviewer for it yet evidence it
> needs more time to develop a bit of a community first before being
> considered for core?

I agree.  I think this needs to live outside of core for the time
being.   I don't think we can commit to maintaining a second PL/python
implementation in core because two or three people are excited about
it.  It may be really great, and if there are some small changes to
core that are needed to support this living outside of core, then I
think we should consider those.  But committing the whole patch does
not seem like a wise idea to me.  We are then on the hook to maintain
it, essentially forever, and it's not clear that there is enough
community support for this for us to be certain of that.  If the
community around this grows, we can certainly revisit for 9.1.

>> Provide rowcount for utility SELECTs
>
> I think I can write a review for this one right now based on the discussion
> that's already happened:
>
> -Multiple people think the feature is valuable and it seems worth exploring
> -The right way to handle the protocol change here is still not clear
> -There are any number of subtle ways clients might be impacted by this
> change, and nailing that down and determining who might break is an
> extremely wide ranging bit of QA work that's barely been exploring yet
>
> That last part screams "returned with feedback" for something submitted to
> the last CF before beta to me.  As a candidate for 9.1-alpha1 where there's
> plenty of time to figure out what it breaks and revert if that turns out to
> be bad?  That sounds like a much better time to be fiddling with something
> in this area.

I feel a bit differently about this.  No matter when we make a change
like this, there will be some risk of breaking clients.  Many of those
clients may be proprietary, closed-source software, and we have no way
of estimating how many such clients there are in total nor what
percentage of them are likely to be broken by this change.   Looking
at a few of the open source clients and trying to judge whether they
will break may be worthwhile, but I think the primary thing we need
here is a better understanding of exactly which commands this change
will affect and some thought about how plausible it is that someone
could be depending on those tags.

In particular, it seems to me that it's rather unlikely that anyone is
depending on the command tag from an operation like CREATE TABLE AS or
SELECT INTO, because isn't it always going to be SELECT?

Furthermore, if we are going to ever change this in an incompatible
way that may break clients, isn't 9.0 exactly the right time to do
that?  If that doesn't scream "big changes coming, proceed with
caution", I don't know what would.

...Robert


Re: CommitFest status summary 2010-01-27

From
Alvaro Herrera
Date:
Robert Haas escribió:

> Furthermore, if we are going to ever change this in an incompatible
> way that may break clients, isn't 9.0 exactly the right time to do
> that?  If that doesn't scream "big changes coming, proceed with
> caution", I don't know what would.

I agree with this -- if we're ever going to change this, it must be now.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: CommitFest status summary 2010-01-27

From
Merlin Moncure
Date:
On Wed, Jan 27, 2010 at 4:05 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Waiting for Re-Review (5)
> =====================
> Writeable CTEs

Set this ready for commit.   For such a small patch, it's a wonderful
feature.  I think it deserves a fair shot on this 'fest.
insert/returning/subquery is far and away one of the most requested
features, and this patch delivers the goods in a very elegant way.

merlin


Re: CommitFest status summary 2010-01-27

From
Boszormenyi Zoltan
Date:
Hi,

Boszormenyi Zoltan írta:
> Greg Smith írta:
>   
>> 4) Investigate and be explicit about the potential breakage here both
>> for libpq clients and at least one additional driver too.  If I saw a
>> demonstration that this didn't break the JDBC driver, for example, I'd
>> feel a lot better about the patch.
>>     
> ... (JDBC discussed to be non-vulnerable)
> The question is whether new versions of psqlODBC and the old
> ones shipped in unixODBC handle the change well.
>   

I looked at the unixODBC PG driver sources. Both the "old" and "new"
versions return rowcount for STMT_TYPE_SELECT as the number of
tuples returned, it doesn't look at the command status. But they both seems
to be broken for INSERTs, as the source interprets the number found
after the first ' ' (space) character, they would return 0 for WITHOUT OIDS
case. I am talking about these files:   unixODBC-x.y.z/Drivers/PostgreSQL/results.c
unixODBC-x.y.z/Drivers/Postgre7.1/results.c
Look at the SQLRowCount() function.

The current psqlODBC driver versions do it in a similar way.
They don't look at the actual command tag, if there is a space character
in the command status string after trimming it, the string after the space
gets interpreted with atoi(). This code also ignores that INSERT returns
2 values, the first value will be returned for rowcount.

This means that the more recent ODBC drivers seem to start returning
rowcount for utility SELECTs with this protocol change.
I haven't tested it though.

So, the latest JDBC won't change behaviour without code changes,
ODBC may or may not, depending on the version.

Best regards,
Zoltán Böszörményi

-- 
Bible has answers for everything. Proof:
"But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil." (Matthew 5:37) - basics of digital technology.
"May your kingdom come" - superficial description of plate tectonics

----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/