Thread: status of remaining patches

status of remaining patches

From
Robert Haas
Date:
Here's an attempt on my part to summarize the status of the remaining patches.

* SE-PostgreSQL.  Generally positive feedback from Heikki.  New
version expected Monday 3/9, with changes to walker.c as requested by
Heikki.  Rest of patch reviewable in the meantime.

http://archives.postgresql.org/pgsql-hackers/2009-03/msg00192.php

* GIN fast insert.  Tom Lane committed some planner changes that make
it possible for an AM to not support index scans, and posted the
remaining patch.  No one other than me has spoken in favor of retaing
support for index scans, so maybe Teodor should just apply the rest of
this (perhaps with the minor wordsmithing I suggested in the second
message linked below, or something similar).  Or if not, then we
should decide that this will wait for 8.5 - it's time to fish or cut
bait.

http://archives.postgresql.org/pgsql-hackers/2009-03/msg00220.php
http://archives.postgresql.org/pgsql-hackers/2009-03/msg00224.php

* B-Tree emulation for GIN.  Teodor posted a new version of this patch
and is awaiting a response to a few questions he had.

http://archives.postgresql.org/pgsql-hackers/2009-03/msg00198.php

* Improve Performance of Multi-Batch Hash Join for Skewed Data Sets.
Tom Lane reviewed this patch, and Ramon Lawrence responded, but it's
not clear to me where we go from here.

http://archives.postgresql.org/pgsql-hackers/2009-03/msg00273.php

* Proposal of PITR performance improvement. Fujii Masao posted an
updated version of this patch.  I believe it has yet to be reviewed by
a committer.

http://archives.postgresql.org/pgsql-hackers/2009-03/msg00064.php

* Reducing some DDL Locks to ShareLock.  A substantial part of this
was committed, and there hasn't been a new version of this patch in
three months, so I think it should be bounced at this point.  But I
don't want to do that myself unless someone at least makes some kind
of noise of agreement.  Can I get a +1, or two?

...Robert


Re: status of remaining patches

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> [ much snipped ]

> * GIN fast insert.  Tom Lane committed some planner changes that make
> it possible for an AM to not support index scans, and posted the
> remaining patch.  No one other than me has spoken in favor of retaing
> support for index scans, so maybe Teodor should just apply the rest of
> this (perhaps with the minor wordsmithing I suggested in the second
> message linked below, or something similar).  Or if not, then we
> should decide that this will wait for 8.5 - it's time to fish or cut
> bait.

My feeling about it is that the insert speed improvement is worth
the loss of simple indexscan support.  Perhaps in 8.5 or later we can
think of some reasonable approach that will let simple indexscans be
put back into GIN, but there's not time left for that for 8.4.

I'm not sure the patch is actually ready to commit --- the documentation
certainly needs more work, and I've not read any of the code except for
the planner bits.  But I think it's probably close, if we can get past
this basic question of what the scope of the patch is.

> * Improve Performance of Multi-Batch Hash Join for Skewed Data Sets.
> Tom Lane reviewed this patch, and Ramon Lawrence responded, but it's
> not clear to me where we go from here.

I don't think this one is that far away either.  I've been holding Bryce
and Ramon's feet to the fire on the issue of possible downside, but so
far there's not really much evidence of any *actual* as opposed to
theoretical downside.  What bothers me more after having looked at the
patch is that it's got the executor taking decisions that rightfully
belong in the planner (mainly because the planner should know whether
IM will be used in order to assign a correct cost estimate).  That
probably won't take much work to fix though, it's just moving some code
and creating more path/plan node fields to carry the results.

> * Proposal of PITR performance improvement. Fujii Masao posted an
> updated version of this patch.  I believe it has yet to be reviewed by
> a committer.

Has it been reviewed by anybody?  There's no trace of reviewing work
on the commitfest page.  Personally I've been ignoring it on the
assumption that someone else should review it first.

> * Reducing some DDL Locks to ShareLock.  A substantial part of this
> was committed, and there hasn't been a new version of this patch in
> three months, so I think it should be bounced at this point.  But I
> don't want to do that myself unless someone at least makes some kind
> of noise of agreement.  Can I get a +1, or two?

Certainly Simon has been given more than enough time to do something
about this patch.  It should probably go to "returned with feedback"
until a new version does surface.  (IIRC, the part that got committed
was just some minor catalog infrastructure tweaking, it wasn't
substantial in terms of addressing the real goals of the patch.)
        regards, tom lane


Re: status of remaining patches

From
Robert Haas
Date:
On Sat, Mar 7, 2009 at 9:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> * Proposal of PITR performance improvement. Fujii Masao posted an
>> updated version of this patch.  I believe it has yet to be reviewed by
>> a committer.
>
> Has it been reviewed by anybody?  There's no trace of reviewing work
> on the commitfest page.  Personally I've been ignoring it on the
> assumption that someone else should review it first.

Yes.

The original patch was submitted by Koichi Suzuki - quite a few other
people have looked at it and provided comments.  Simon Riggs was
assigned as the original reviewer, but for some reason Dave Page
removed his name from the wiki a few days ago (I'm fixing this now).
Actually, this patch has been reviewed by a whole slough of people.

V1:
http://archives.postgresql.org/message-id/87fxmhc7sc.fsf@oxford.xeocode.com
(Greg Stark)
http://archives.postgresql.org/message-id/490703B6.9060203@enterprisedb.com
(Heikki Linnakangas - I forgot about this when I made my previous
statement that it hadn't been looked at by a committer yet; this was a
while ago)
http://archives.postgresql.org/message-id/1225269154.3971.278.camel@ebony.2ndQuadrant
(Simon Riggs)

V2:
http://archives.postgresql.org/message-id/1228145754.20796.311.camel@hp_dx2400_1
(Simon Riggs)
http://archives.postgresql.org/message-id/3f0b79eb0812022255v45cbbfaau292f5320620eddb9@mail.gmail.com
(Fujii Masao)

V3:
http://archives.postgresql.org/message-id/87hc4qrw5e.fsf@oxford.xeocode.com
(quick comment from Greg Stark, no formal review)

V4:
http://archives.postgresql.org/message-id/20090114101659.89CD.52131E4D@oss.ntt.co.jp
(Itagaki Takahiro)
http://archives.postgresql.org/message-id/8763k3lgyy.fsf@oxford.xeocode.com
(Greg Stark)

As far as I can tell the patch author has responded to all comments
and pretty much done everything right.  I haven't even looked at it
enough to understand what it does or why I should care, but AFAICS
it's had more interest and more reviewing than 90% of what was
submitted for this CommitFest.

...Robert


Re: status of remaining patches

From
Robert Haas
Date:
On Sat, Mar 7, 2009 at 9:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> * GIN fast insert.  Tom Lane committed some planner changes that make
>> it possible for an AM to not support index scans, and posted the
>> remaining patch.  No one other than me has spoken in favor of retaing
>> support for index scans, so maybe Teodor should just apply the rest of
>> this (perhaps with the minor wordsmithing I suggested in the second
>> message linked below, or something similar).  Or if not, then we
>> should decide that this will wait for 8.5 - it's time to fish or cut
>> bait.
>
> My feeling about it is that the insert speed improvement is worth
> the loss of simple indexscan support.  Perhaps in 8.5 or later we can
> think of some reasonable approach that will let simple indexscans be
> put back into GIN, but there's not time left for that for 8.4.
>
> I'm not sure the patch is actually ready to commit --- the documentation
> certainly needs more work, and I've not read any of the code except for
> the planner bits.  But I think it's probably close, if we can get past
> this basic question of what the scope of the patch is.

How do we get past that question?

>> * Improve Performance of Multi-Batch Hash Join for Skewed Data Sets.
>> Tom Lane reviewed this patch, and Ramon Lawrence responded, but it's
>> not clear to me where we go from here.
>
> I don't think this one is that far away either.  I've been holding Bryce
> and Ramon's feet to the fire on the issue of possible downside, but so
> far there's not really much evidence of any *actual* as opposed to
> theoretical downside.  What bothers me more after having looked at the
> patch is that it's got the executor taking decisions that rightfully
> belong in the planner (mainly because the planner should know whether
> IM will be used in order to assign a correct cost estimate).  That
> probably won't take much work to fix though, it's just moving some code
> and creating more path/plan node fields to carry the results.

So are you going to do that and commit it, or do you want them to
rework it further?

...Robert


Re: status of remaining patches

From
Alvaro Herrera
Date:
Robert Haas escribió:

> As far as I can tell the patch author has responded to all comments
> and pretty much done everything right.  I haven't even looked at it
> enough to understand what it does or why I should care, but AFAICS
> it's had more interest and more reviewing than 90% of what was
> submitted for this CommitFest.

I noticed but never put in email that it adds a #include postgres.h to
some other header file, which should just be removed.

I also noticed that the followup version posted by someone else than the
OP did not include the readahead.c file (I tried it with the one last
posted by Koichi Suzuki and it seemed to compile with no problems,
although I didn't run it)

I then started wondering about the API; why are all the callers checking
the readahead module if there's space and then adding stuff in?
ISTM they should just attempt to add the element, and readahead.c be in
charge of determining whether there is space or not, internally.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: status of remaining patches

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
>> I don't think this one is that far away either.  I've been holding Bryce
>> and Ramon's feet to the fire on the issue of possible downside, but so
>> far there's not really much evidence of any *actual* as opposed to
>> theoretical downside.  

> What sorts of operations would we test which could potentially show a 
> performance downside?  I have to admit I don't really understand what 
> use-cases this patch is meant to improve.

The patch is meant to improve performance in cases where the outer
relation's key distribution is heavily skewed, by introducing a fast
path for keys matching the outer's most common values (MCVs).  But it
does that by potentially sacrificing performance for non MCV keys.
So the case that's of concern is where the distribution is just skewed
enough to trigger the patch's behavioral change, but you don't actually
get a win because there are too many non-MCV keys.

Note that as it's coded, the outer relation's skew is what triggers the
behavioral change.  It's not real clear to me how skew in the inner
relation's distribution affects things.
        regards, tom lane


Re: status of remaining patches

From
Josh Berkus
Date:
Tom,

> I don't think this one is that far away either.  I've been holding Bryce
> and Ramon's feet to the fire on the issue of possible downside, but so
> far there's not really much evidence of any *actual* as opposed to
> theoretical downside.  

What sorts of operations would we test which could potentially show a 
performance downside?  I have to admit I don't really understand what 
use-cases this patch is meant to improve.

--Josh


Re: status of remaining patches

From
Josh Berkus
Date:
Robert,

> The original patch was submitted by Koichi Suzuki - quite a few other
> people have looked at it and provided comments.  Simon Riggs was
> assigned as the original reviewer, but for some reason Dave Page
> removed his name from the wiki a few days ago (I'm fixing this now).
> Actually, this patch has been reviewed by a whole slough of people.

That's because Simon has said that he no longer has time to review it.

--Josh



Re: status of remaining patches

From
Dave Page
Date:
On Sun, Mar 8, 2009 at 5:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> The original patch was submitted by Koichi Suzuki - quite a few other
> people have looked at it and provided comments.  Simon Riggs was
> assigned as the original reviewer, but for some reason Dave Page
> removed his name from the wiki a few days ago (I'm fixing this now).

That was because Simon no longer has time to review it.


--
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com


Re: status of remaining patches

From
Robert Haas
Date:
On Mar 9, 2009, at 4:53 AM, Dave Page <dpage@pgadmin.org> wrote:

> On Sun, Mar 8, 2009 at 5:03 AM, Robert Haas <robertmhaas@gmail.com>  
> wrote:
>>
>> The original patch was submitted by Koichi Suzuki - quite a few other
>> people have looked at it and provided comments.  Simon Riggs was
>> assigned as the original reviewer, but for some reason Dave Page
>> removed his name from the wiki a few days ago (I'm fixing this now).
>
> That was because Simon no longer has time to review it.

Ah, OK, makes sense. Sorry, was not aware.

...Robert