Thread: GSoC 2014 - mentors, students and admins

GSoC 2014 - mentors, students and admins

From
Thom Brown
Date:
Hi all,

Application to Google Summer of Code 2014 can be made as of next
Monday (3rd Feb), and then there will be a 12 day window in which to
submit an application.

I'd like to gauge interest from both mentors and students as to
whether we'll want to do this.

And I'd be fine with being admin again this year, unless there's
anyone else who would like to take up the mantle?

Who would be up for mentoring this year?  And are there any project
ideas folk would like to suggest?

Thanks

Thom


Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
Atri Sharma
Date:



On Tue, Jan 28, 2014 at 11:16 PM, Atri Sharma <atri.jiit@gmail.com> wrote:
 


On Tue, Jan 28, 2014 at 11:04 PM, Thom Brown <thom@linux.com> wrote:
Hi all,

Application to Google Summer of Code 2014 can be made as of next
Monday (3rd Feb), and then there will be a 12 day window in which to
submit an application.

I'd like to gauge interest from both mentors and students as to
whether we'll want to do this.

And I'd be fine with being admin again this year, unless there's
anyone else who would like to take up the mantle?

Who would be up for mentoring this year?  And are there any project
ideas folk would like to suggest?
 

Hi,

I would like to bring up the addition of MADlib algorithms again this year.

Also, some work on foreign table constraints could be helpful.


Hi,

Also, can a project on an extension be counted as a GSoC 2014 project under the PostgreSQL organization?

I was thinking of adding support for writable FDW operations to JDBC_FDW, if possible.

Regards,

Atri
--
Regards,
 
Atri
l'apprenant

Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
Atri Sharma
Date:



On Tue, Jan 28, 2014 at 11:04 PM, Thom Brown <thom@linux.com> wrote:
Hi all,

Application to Google Summer of Code 2014 can be made as of next
Monday (3rd Feb), and then there will be a 12 day window in which to
submit an application.

I'd like to gauge interest from both mentors and students as to
whether we'll want to do this.

And I'd be fine with being admin again this year, unless there's
anyone else who would like to take up the mantle?

Who would be up for mentoring this year?  And are there any project
ideas folk would like to suggest?
 

Hi,

I would like to bring up the addition of MADlib algorithms again this year.

Also, some work on foreign table constraints could be helpful.

Regards,

Atri

--
Regards,
 
Atri
l'apprenant

Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
Josh Berkus
Date:
On 01/28/2014 09:46 AM, Atri Sharma wrote:
> I would like to bring up the addition of MADlib algorithms again this year.
> 
> Also, some work on foreign table constraints could be helpful.

We can only take MADlib this year if we have confirmed mentors who are
MADlib committers before the end of the application period (Feb 15).  We
can't have a repeat of last year.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
Stephen Frost
Date:
Thom,

* Thom Brown (thom@linux.com) wrote:
> Application to Google Summer of Code 2014 can be made as of next
> Monday (3rd Feb), and then there will be a 12 day window in which to
> submit an application.

This is just for PG to be a participating organization, right?  There's
a while before mentors and students get involved, as I understand it.

> I'd like to gauge interest from both mentors and students as to
> whether we'll want to do this.

Yes.

> And I'd be fine with being admin again this year, unless there's
> anyone else who would like to take up the mantle?

Having you do it works for me. :)

> Who would be up for mentoring this year?  And are there any project
> ideas folk would like to suggest?

I'm interested in mentoring and, unlike previous years, I've been
collecting a personal list of things that I'd like to see worked on for
PG which could be GSoC projects and will provide such in the next few
days to this list (unless there's a different list that people want such
posted to..?).

    Thanks,

        Stephen

Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
Maxence AHLOUCHE
Date:
Hi all,

I'd be very interested in taking part in GSoC with PostgreSQL as a student.
Being quite busy at the moment (exam time), I haven't had time to think of a subject yet, even though I've seen some topics I find interesting.

2014-01-28 Josh Berkus <josh@agliodbs.com>
On 01/28/2014 09:46 AM, Atri Sharma wrote:
> I would like to bring up the addition of MADlib algorithms again this year.
We can only take MADlib this year if we have confirmed mentors who are
MADlib committers before the end of the application period (Feb 15).  We
can't have a repeat of last year.

Indeed, I was unlucky last year when I tried to apply for a MADlib subject, mostly due to their lack of responsiveness. Even though their subjects interest me, I don't intend to try my luck again.

Atri Sharma said:
Also, some work on foreign table constraints could be helpful.
 
What kind of work do you have in mind?

Stephen Frost said:
I'm interested in mentoring and, unlike previous years, I've been
collecting a personal list of things that I'd like to see worked on for
PG which could be GSoC projects and will provide such in the next few
days to this list (unless there's a different list that people want such
posted to..?).
IMO, the best place would be the wiki page for GSoC (http://wiki.postgresql.org/wiki/GSoC_2014); that would save interested students (including me :) ) from having to look for possible subjects in lots of different places.


Regards,
Maxence
--
Maxence Ahlouche
06 06 66 97 00

Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
Thom Brown
Date:
On 28 January 2014 19:43, Stephen Frost <sfrost@snowman.net> wrote:
> Thom,
>
> * Thom Brown (thom@linux.com) wrote:
>> Application to Google Summer of Code 2014 can be made as of next
>> Monday (3rd Feb), and then there will be a 12 day window in which to
>> submit an application.
>
> This is just for PG to be a participating organization, right?  There's
> a while before mentors and students get invovled, as I understand it.

Yes, correct.  Students and mentors don't need to be signed up until April.

>> Who would be up for mentoring this year?  And are there any project
>> ideas folk would like to suggest?
>
> I'm interested in mentoring and, unlike previous years, I've been
> collecting a personal list of things that I'd like to see worked on for
> PG which could be GSoC projects and will provide such in the next few
> days to this list (unless there's a different list that people want such
> posted to..?).

That's great.  I don't see any problem with posting suggestions here,
although I'd suggest refraining from going in-depth as that can come
later.  If there's enough interest and agreement, we'll go ahead and
apply.

Thom


Re: GSoC 2014 - mentors, students and admins

From
Heikki Linnakangas
Date:
On 01/28/2014 07:34 PM, Thom Brown wrote:
> And I'd be fine with being admin again this year, unless there's
> anyone else who would like to take up the mantle?

Please do, thanks!

> Who would be up for mentoring this year?

I can mentor.

- Heikki


Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
David Fetter
Date:
On Tue, Jan 28, 2014 at 05:34:21PM +0000, Thom Brown wrote:
> Hi all,
>
> Application to Google Summer of Code 2014 can be made as of next
> Monday (3rd Feb), and then there will be a 12 day window in which to
> submit an application.
>
> I'd like to gauge interest from both mentors and students as to
> whether we'll want to do this.
>
> And I'd be fine with being admin again this year, unless there's
> anyone else who would like to take up the mantle?

Thanks for your hard work administering last year, and thanks even
more for taking this on in light of that experience :)

> Who would be up for mentoring this year?  And are there any project
> ideas folk would like to suggest?

I'd be delighted to mentor.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: GSoC 2014 - mentors, students and admins

From
Alexander Korotkov
Date:
Hi!

On Tue, Jan 28, 2014 at 9:34 PM, Thom Brown <thom@linux.com> wrote:
And I'd be fine with being admin again this year, unless there's
anyone else who would like to take up the mantle?
 
Thanks for your work. I would like to see you as admin again this year.

Who would be up for mentoring this year?  And are there any project
ideas folk would like to suggest?
 
I would like to be a mentor.

------
With best regards,
Alexander Korotkov.

Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
Andreas 'ads' Scherbaum
Date:
Hi,

On 01/28/2014 06:46 PM, Atri Sharma wrote:
> On Tue, Jan 28, 2014 at 11:04 PM, Thom Brown <thom@linux.com
> <mailto:thom@linux.com>> wrote:
>
>     Hi all,
>
>     Application to Google Summer of Code 2014 can be made as of next
>     Monday (3rd Feb), and then there will be a 12 day window in which to
>     submit an application.
>
>     I'd like to gauge interest from both mentors and students as to
>     whether we'll want to do this.
>
>     And I'd be fine with being admin again this year, unless there's
>     anyone else who would like to take up the mantle?
>
>     Who would be up for mentoring this year?  And are there any project
>     ideas folk would like to suggest?
>
> I would like to bring up the addition of MADlib algorithms again this year.

I've spoken with the MADlib team at Pivotal and they are OK with supporting
this proposal. Therefore I offer to mentor this.


Regards,

--
                Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project


Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
Atri Sharma
Date:
Awesome. I can be an assistant mentor for this one if possible, or I could mentor some other project.

On Tuesday, February 25, 2014, Andreas 'ads' Scherbaum <adsmail@wars-nicht.de> wrote:

Hi,

On 01/28/2014 06:46 PM, Atri Sharma wrote:
On Tue, Jan 28, 2014 at 11:04 PM, Thom Brown <thom@linux.com
<mailto:thom@linux.com>> wrote:

    Hi all,

    Application to Google Summer of Code 2014 can be made as of next
    Monday (3rd Feb), and then there will be a 12 day window in which to
    submit an application.

    I'd like to gauge interest from both mentors and students as to
    whether we'll want to do this.

    And I'd be fine with being admin again this year, unless there's
    anyone else who would like to take up the mantle?

    Who would be up for mentoring this year?  And are there any project
    ideas folk would like to suggest?

I would like to bring up the addition of MADlib algorithms again this year.

I've spoken with the MADlib team at Pivotal and they are OK with supporting this proposal. Therefore I offer to mentor this.


Regards,

--
                                Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project


--
Regards,
 
Atri
l'apprenant

Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
Thom Brown
Date:
On 25 February 2014 13:28, Andreas 'ads' Scherbaum <adsmail@wars-nicht.de> wrote:

Hi,


On 01/28/2014 06:46 PM, Atri Sharma wrote:
On Tue, Jan 28, 2014 at 11:04 PM, Thom Brown <thom@linux.com
<mailto:thom@linux.com>> wrote:

    Hi all,

    Application to Google Summer of Code 2014 can be made as of next
    Monday (3rd Feb), and then there will be a 12 day window in which to
    submit an application.

    I'd like to gauge interest from both mentors and students as to
    whether we'll want to do this.

    And I'd be fine with being admin again this year, unless there's
    anyone else who would like to take up the mantle?

    Who would be up for mentoring this year?  And are there any project
    ideas folk would like to suggest?

I would like to bring up the addition of MADlib algorithms again this year.

I've spoken with the MADlib team at Pivotal and they are OK with supporting this proposal. Therefore I offer to mentor this.

Are there any more prospective mentors?  We'll want some folk to act as back-up mentors too to ensure projects can still be completed should any mentor become unavailable.

--
Thom

Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
David Fetter
Date:
On Thu, Feb 27, 2014 at 07:54:13PM +0000, Thom Brown wrote:
> On 25 February 2014 13:28, Andreas 'ads' Scherbaum <adsmail@wars-nicht.de> wrote:
> > I've spoken with the MADlib team at Pivotal and they are OK with
> > supporting this proposal. Therefore I offer to mentor this.
>
> Are there any more prospective mentors?  We'll want some folk to act
> as back-up mentors too to ensure projects can still be completed
> should any mentor become unavailable.

For MADlib, no.  Are you asking for mentors in general?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
Thom Brown
Date:
On 27 February 2014 21:08, David Fetter <david@fetter.org> wrote:
On Thu, Feb 27, 2014 at 07:54:13PM +0000, Thom Brown wrote:
> On 25 February 2014 13:28, Andreas 'ads' Scherbaum <adsmail@wars-nicht.de> wrote:
> > I've spoken with the MADlib team at Pivotal and they are OK with
> > supporting this proposal. Therefore I offer to mentor this.
>
> Are there any more prospective mentors?  We'll want some folk to act
> as back-up mentors too to ensure projects can still be completed
> should any mentor become unavailable.

For MADlib, no.  Are you asking for mentors in general?

Ah yes, I should clarify.  Yes, mentors in general.

--
Thom

Re: GSoC 2014 - mentors, students and admins

From
Greg Stark
Date:
On Tue, Jan 28, 2014 at 5:34 PM, Thom Brown <thom@linux.com> wrote:
> Who would be up for mentoring this year?  And are there any project
> ideas folk would like to suggest?

I mentored in the past and felt I didn't do a very good job because I
didn't really understand the project the student was working on.

There's precisely one project that I feel I would be competent to
mentor at this point. Making hash indexes WAL recoverable. This is
something that's easy to define the scope of and easy to determine if
the student is on track and easy to measure when finished. It's
something where as far as I can tell all the mentor work will be
purely technical advice.

Also it's something the project really really needs and is perfectly
sized for a GSOC project IMHO. Also it's a great project for a student
who might be interested in working on Postgres in the future since it
requires learning all our idiosyncratic build and source conventions
but doesn't require huge or controversial architectural changes.

I fear a number of items in the Wiki seem unrealistically large
projects for GSOC IMNSHO.

--
greg


Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
Karol Trzcionka
Date:
On 27.02.2014 22:25, Thom Brown wrote:
On 27 February 2014 21:08, David Fetter <david@fetter.org> wrote:
For MADlib, no.  Are you asking for mentors in general?

Ah yes, I should clarify.  Yes, mentors in general.
In general I can help, but I'm not sure whether I'm too fresh in pgsql ;) However, after GSoC as a student, I can try "the other side".
Regards,
Karol

Re: GSoC 2014 - mentors, students and admins

From
Tan Tran
Date:
Hi Greg, pgsql-advocacy, and pgsql-hackers,

I'm interested in doing my GSoC project on this idea. I'm new to indexing and WAL, which I haven't encountered in my classes, but it sounds interesting and valuable to PostgreSQL. So here's my draft proposal. Would you mind giving your opinion and corrections? With your help I'll add some technical detail to my plans.

Thanks,
Tan Tran

Introduction
In write-ahead logging (WAL), all modifications to a database are written to a write-ahead log before being flushed to disk at periodic checkpoints. This method saves I/O operations, enables continuous backup, and, in the case of database failure, guarantees that committed transactions can be recovered. In PostgreSQL's implementation, transactions are written to XLog, which is divided into 16MB files ("segments") that together comprise a complete history of transactions. Transactions are continually appended to the latest segment, while checkpointing continually archives segments up until the last checkpoint. Internally, a suite of XLog structures and functions interfaces with the various resource managers so they can log enough data to restore ("redo") their state in case of failure.
Another PostgreSQL feature is the creation of indexes on an invariant custom field; for example, on the LastName of a Person even though the primary key is ID. These custom indexes speed up row lookups. PostgreSQL currently supports four index types: B-tree, GiST, GIN, and hash. The former three are WAL-recoverable, but hash indexes are not.

2. Proposal
As a GSoC student, I will implement WAL recovery of hash indexes using the other index types’ WAL code as a guide. Roughly, I will:
- Devise a way to store and retrieve hashing data within the XLog data structures. 
- In the existing skeleton for hash_redo(XLogRecPtr lsn, XLogRecord *record) in hash.c, branch to code for the various redo operations: creating an index, inserting into an index, deleting an index, and page operations (split, delete, update?).
- Code each branch by drawing on examples from btree_redo, gin_redo, and gist_redo, the existing XLog code of the other index types.

Benefits
Hash index searching is O(1), which is asymptotically faster than the O(log n) searching of a B-tree, and does not require custom indexing functions like GIN and GiST inherently do. Therefore it is desirable for rows that will only be retrieved on an equality comparison. However, two things currently stand in the way of its popular use. From the PostgreSQL documentation,
“Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash if there were unwritten changes. Also, changes to hash indexes are not replicated over streaming or file-based replication after the initial base backup, so they give wrong answers to queries that subsequently use them. For these reasons, hash index use is presently discouraged.”
My project would solve the first problem, after which I would like to stay on and fix the second.

To be written: Quantifiable Results, Schedule, Completeness Criteria, Bio


On Feb 28, 2014, at 6:21 AM, Greg Stark <stark@mit.edu> wrote:

On Tue, Jan 28, 2014 at 5:34 PM, Thom Brown <thom@linux.com> wrote:
Who would be up for mentoring this year?  And are there any project
ideas folk would like to suggest?

I mentored in the past and felt I didn't do a very good job because I
didn't really understand the project the student was working on.

There's precisely one project that I feel I would be competent to
mentor at this point. Making hash indexes WAL recoverable. This is
something that's easy to define the scope of and easy to determine if
the student is on track and easy to measure when finished. It's
something where as far as I can tell all the mentor work will be
purely technical advice.

Also it's something the project really really needs and is perfectly
sized for a GSOC project IMHO. Also it's a great project for a student
who might be interested in working on Postgres in the future since it
requires learning all our idiosyncratic build and source conventions
but doesn't require huge or controversial architectural changes.

I fear a number of items in the Wiki seem unrealistically large
projects for GSOC IMNSHO.

--
greg



Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
Tan Tran
Date:
Earlier I posted an email to this thread that I realize "hijacked" the discussion. Please continue replying here instead.
 
On Feb 28, 2014, at 6:59 AM, Karol Trzcionka <karlikt@gmail.com> wrote:

On 27.02.2014 22:25, Thom Brown wrote:
On 27 February 2014 21:08, David Fetter <david@fetter.org> wrote:
For MADlib, no.  Are you asking for mentors in general?

Ah yes, I should clarify.  Yes, mentors in general.
In general I can help, but I'm not sure whether I'm too fresh in pgsql ;) However, after GSoC as a student, I can try "the other side".
Regards,
Karol

GSoC on WAL-logging hash indexes

From
Tan Tran
Date:
Hi all,

Earlier I posted this in the wrong thread. Please excuse the double posting.

Tan Tran

Begin forwarded message:

From: Tan Tran <tankimtran@gmail.com>
Subject: Re: [HACKERS] GSoC 2014 - mentors, students and admins
Date: March 2, 2014 at 5:03:14 AM PST
To: Greg Stark <stark@mit.edu>
Cc: pgsql-advocacy <pgsql-advocacy@postgresql.org>, PostgreSQL-development <pgsql-hackers@postgresql.org>

Hi Greg, pgsql-advocacy, and pgsql-hackers,

I'm interested in doing my GSoC project on this idea. I'm new to indexing and WAL, which I haven't encountered in my classes, but it sounds interesting and valuable to PostgreSQL. So here's my draft proposal. Would you mind giving your opinion and corrections? With your help I'll add some technical detail to my plans.

Thanks,
Tan Tran

Introduction
In write-ahead logging (WAL), all modifications to a database are written to a write-ahead log before being flushed to disk at periodic checkpoints. This method saves I/O operations, enables continuous backup, and, in the case of database failure, guarantees that committed transactions can be recovered. In PostgreSQL's implementation, transactions are written to XLog, which is divided into 16MB files ("segments") that together comprise a complete history of transactions. Transactions are continually appended to the latest segment, while checkpointing continually archives segments up until the last checkpoint. Internally, a suite of XLog structures and functions interfaces with the various resource managers so they can log enough data to restore ("redo") their state in case of failure.
Another PostgreSQL feature is the creation of indexes on an invariant custom field; for example, on the LastName of a Person even though the primary key is ID. These custom indexes speed up row lookups. PostgreSQL currently supports four index types: B-tree, GiST, GIN, and hash. The former three are WAL-recoverable, but hash indexes are not.

2. Proposal
As a GSoC student, I will implement WAL recovery of hash indexes using the other index types’ WAL code as a guide. Roughly, I will:
- Devise a way to store and retrieve hashing data within the XLog data structures. 
- In the existing skeleton for hash_redo(XLogRecPtr lsn, XLogRecord *record) in hash.c, branch to code for the various redo operations: creating an index, inserting into an index, deleting an index, and page operations (split, delete, update?); a rough sketch of this dispatch follows below.
- Code each branch by drawing on examples from btree_redo, gin_redo, and gist_redo, the existing XLog code of the other index types.
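
For concreteness, here is a minimal sketch of how such a hash_redo might dispatch, modeled loosely on the existing btree_redo. The XLOG_HASH_* constants and hash_xlog_* helpers are names I'm inventing for illustration; defining them, and the WAL records behind them, is exactly the work proposed above.

    void
    hash_redo(XLogRecPtr lsn, XLogRecord *record)
    {
        uint8   info = record->xl_info & ~XLR_INFO_MASK;

        switch (info)
        {
            case XLOG_HASH_CREATE_INDEX:    /* hypothetical record type */
                hash_xlog_create_index(lsn, record);
                break;
            case XLOG_HASH_INSERT:          /* hypothetical record type */
                hash_xlog_insert(lsn, record);
                break;
            case XLOG_HASH_SPLIT:           /* hypothetical record type */
                hash_xlog_split(lsn, record);
                break;
            default:
                elog(PANIC, "hash_redo: unknown op code %u", info);
        }
    }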

Benefits
Hash index searching is O(1), which is asymptotically faster than the O(log n) searching of a B-tree, and does not require custom indexing functions like GIN and GiST inherently do. Therefore it is desirable for rows that will only be retrieved on an equality comparison. However, two things currently stand in the way of its popular use. From the PostgreSQL documentation,
“Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash if there were unwritten changes. Also, changes to hash indexes are not replicated over streaming or file-based replication after the initial base backup, so they give wrong answers to queries that subsequently use them. For these reasons, hash index use is presently discouraged.”
My project would solve the first problem, after which I would like to stay on and fix the second.

To be written: Quantifiable Results, Schedule, Completeness Criteria, Bio


On Feb 28, 2014, at 6:21 AM, Greg Stark <stark@mit.edu> wrote:

On Tue, Jan 28, 2014 at 5:34 PM, Thom Brown <thom@linux.com> wrote:
Who would be up for mentoring this year?  And are there any project
ideas folk would like to suggest?

I mentored in the past and felt I didn't do a very good job because I
didn't really understand the project the student was working on.

There's precisely one project that I feel I would be competent to
mentor at this point. Making hash indexes WAL recoverable. This is
something that's easy to define the scope of and easy to determine if
the student is on track and easy to measure when finished. It's
something where as far as I can tell all the mentor work will be
purely technical advice.

Also it's something the project really really needs and is perfectly
sized for a GSOC project IMHO. Also it's a great project for a student
who might be interested in working on Postgres in the future since it
requires learning all our idiosyncratic build and source conventions
but doesn't require huge or controversial architectural changes.

I fear a number of items in the Wiki seem unrealistically large
projects for GSOC IMNSHO.

--
greg




Re: GSoC on WAL-logging hash indexes

From
Robert Haas
Date:
On Sun, Mar 2, 2014 at 2:38 PM, Tan Tran <tankimtran@gmail.com> wrote:
> 2. Proposal
> As a GSoC student, I will implement WAL recovery of hash indexes using the
> other index types' WAL code as a guide. Roughly, I will:
> - Devise a way to store and retrieve hashing data within the XLog data
> structures.
> - In the existing skeleton for hash_redo(XLogRecPtr lsn, XLogRecord *record)
> in hash.c, branch to code for the various redo operations: creating an
> index, inserting into an index, deleting an index, and page operations
> (split, delete, update?).
> - Code each branch by drawing on examples from btree_redo, gin_redo, and
> gist_redo, the existing XLog code of the other index types.

Unfortunately, I don't believe that it's possible to do this easily
today because of the way bucket splits are handled.  I wrote about
this previously here, with an idea for solving the problem:

http://www.postgresql.org/message-id/CA+TgmoZyMoJSrFxHXQ06G8jhjXQcsKvDiHB_8z_7nc7hj7iHYQ@mail.gmail.com

Sadly, no one responded.  :-(

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: GSoC on WAL-logging hash indexes

From
Tan Tran
Date:
Thanks for alerting me to your previous idea. While I don't know enough about PostgreSQL internals to judge its merits yet, I'll write some pseudocode based on it in my proposal, and I'll relegate it to a "reach" proposal alongside a more straightforward one.

Tan

On Mon, Mar 3, 2014 at 8:12 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Sun, Mar 2, 2014 at 2:38 PM, Tan Tran <tankimtran@gmail.com> wrote:
> 2. Proposal
> As a GSoC student, I will implement WAL recovery of hash indexes using the
> other index types' WAL code as a guide. Roughly, I will:
> - Devise a way to store and retrieve hashing data within the XLog data
> structures.
> - In the existing skeleton for hash_redo(XLogRecPtr lsn, XLogRecord *record)
> in hash.c, branch to code for the various redo operations: creating an
> index, inserting into an index, deleting an index, and page operations
> (split, delete, update?).
> - Code each branch by drawing on examples from btree_redo, gin_redo, and
> gist_redo, the existing XLog code of the other index types.

Unfortunately, I don't believe that it's possible to do this easily
today because of the way bucket splits are handled.  I wrote about
this previously here, with an idea for solving the problem:

http://www.postgresql.org/message-id/CA+TgmoZyMoJSrFxHXQ06G8jhjXQcsKvDiHB_8z_7nc7hj7iHYQ@mail.gmail.com

Sadly, no one responded.  :-(

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: GSoC on WAL-logging hash indexes

From
Jeff Janes
Date:
On Mon, Mar 3, 2014 at 8:12 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Unfortunately, I don't believe that it's possible to do this easily
today because of the way bucket splits are handled.  I wrote about
this previously here, with an idea for solving the problem:

http://www.postgresql.org/message-id/CA+TgmoZyMoJSrFxHXQ06G8jhjXQcsKvDiHB_8z_7nc7hj7iHYQ@mail.gmail.com

Sadly, no one responded.  :-(

On Mon, Mar 3, 2014 at 9:39 AM, Tan Tran <tankimtran@gmail.com> wrote:
Thanks for alerting me to your previous idea. While I don't know enough about PostgreSQL internals to judge its merits yet, I'll write some pseudocode based on it in my proposal, and I'll relegate it to a "reach" proposal alongside a more straightforward one.

Tan


Hi Tan,

I'm not familiar with the inner workings of GSoC, but I don't know if this can be relegated to a "stretch" goal.  WAL logging is an all-or-nothing thing.  I think that, to be applied to the codebase (which I assume is the goal of GSoC), all actions need to be implemented.  That is probably why this has remained open so long: there is no incremental way to get the code written.  (But I would like to see it get done; I don't want to discourage that.)

Cheers,

Jeff

Re: GSoC on WAL-logging hash indexes

From
Greg Stark
Date:
On Mon, Mar 3, 2014 at 4:12 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Unfortunately, I don't believe that it's possible to do this easily
> today because of the way bucket splits are handled.  I wrote about
> this previously here, with an idea for solving the problem:

We could just tackle this in the same incomplete, buggy way that
btrees tackled it for years until Heikki fixed them, and the way GIN
and GiST still do, I believe. Namely, just emit xlog records for each
page individually and, during replay, remember when you have an
"incomplete split" and complain if recovery ends with any still
incomplete. That would be unfortunate, as we'd be adding new cases of
this just as Heikki and company are making progress eliminating the
ones we already had, but that's surely better than having no recovery
at all.
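
In rough code, with all the names invented for illustration, that
replay bookkeeping might look something like this (the old btree
replay code kept a similar list of incomplete actions):

    typedef struct HashIncompleteSplit
    {
        BlockNumber old_bucket;
        BlockNumber new_bucket;
    } HashIncompleteSplit;

    static List *incomplete_splits = NIL;

    /* replay saw the first record of a split */
    static void
    remember_incomplete_split(BlockNumber old_bucket, BlockNumber new_bucket)
    {
        HashIncompleteSplit *split = palloc(sizeof(HashIncompleteSplit));

        split->old_bucket = old_bucket;
        split->new_bucket = new_bucket;
        incomplete_splits = lappend(incomplete_splits, split);
    }

    /* replay saw the record that completes a split */
    static void
    forget_incomplete_split(BlockNumber old_bucket, BlockNumber new_bucket)
    {
        ListCell   *cell;

        foreach(cell, incomplete_splits)
        {
            HashIncompleteSplit *split = (HashIncompleteSplit *) lfirst(cell);

            if (split->old_bucket == old_bucket &&
                split->new_bucket == new_bucket)
            {
                incomplete_splits = list_delete_ptr(incomplete_splits, split);
                break;
            }
        }
    }

    /* at end of recovery, complain about anything left over */
    static void
    hash_xlog_cleanup(void)
    {
        if (incomplete_splits != NIL)
            elog(WARNING, "hash: %d incomplete split(s) at end of recovery",
                 list_length(incomplete_splits));
    }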



--
greg


Re: GSoC on WAL-logging hash indexes

From
Heikki Linnakangas
Date:
(de-CC'ing pgsql-advocacy)

On 03/06/2014 04:03 AM, Greg Stark wrote:
> On Mon, Mar 3, 2014 at 4:12 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Unfortunately, I don't believe that it's possible to do this easily
>> today because of the way bucket splits are handled.  I wrote about
>> this previously here, with an idea for solving the problem:
>
> We could just tackle this in the same incomplete, buggy way that
> btrees tackled it for years until Heikki fixed them, and the way GIN
> and GiST still do, I believe. Namely, just emit xlog records for each
> page individually and, during replay, remember when you have an
> "incomplete split" and complain if recovery ends with any still
> incomplete. That would be unfortunate, as we'd be adding new cases of
> this just as Heikki and company are making progress eliminating the
> ones we already had, but that's surely better than having no recovery
> at all.

Grmph. Indeed.

Looking at Robert's idea from November:

> I have thought about doing some work on this, although I don't know
> when I'll ever have the time.  As far as I can see, the basic problem
> is that we need a better way of splitting buckets.  Right now,
> splitting a bucket means locking it, scanning the entire bucket chain
> and moving ~1/2 the tuples to the new bucket, and then unlocking it.
> Since this is a long operation, we have to use heavyweight locks to
> protect the buckets, which is bad for concurrency.  Since it involves
> a modification to an arbitrarily large number of pages that has to be
> atomic, it's not possible to WAL-log it sensibly  -- and in fact, I
> think this is really the major obstacle to being able to implement
> WAL-logging for hash indexes.

I don't think it's necessary to improve concurrency just to get 
WAL-logging. Better concurrency is a worthy goal of its own, of course, 
but it's a separate concern.

For crash safety, the key is to make sure you perform the whole 
operation in small atomic steps, and you have a way of knowing where to 
continue after crash (this is the same whether you do the cleanup 
immediately at the end of recovery, which I want to avoid, or lazily 
afterwards). But you can hold locks across those small atomic steps, to 
ensure concurrency-safety.

> I think we could work around this problem as follows.   Suppose we
> have buckets 1..N and the split will create bucket N+1.  We need a
> flag that can be set and cleared on the metapage, call it
> split-in-progress, and a flag that can optionally be set on particular
> index tuples, call it moved-by-split.  We will allow scans of all
> buckets and insertions into all buckets while the split is in
> progress, but (as now) we will not allow more than one split to be in
> progress at the same time.

Hmm, unless I'm reading the code wrong, it *does* allow more than one 
split to be in progress at the same time today.

> [description of how to use the moved-by-split to allow scans to run concurrently with the split]

I guess that would work, although you didn't actually describe how to 
continue a split after a crash. But it's a lot simpler if you don't also 
try to make it more concurrent:

---
When you start splitting a bucket, first acquire a heavy-weight lock on 
the old and new buckets. Allocate the required number of pages, before 
changing anything on-disk, so that you can easily back out if you run 
out of disk space. So far, this is how splitting works today.

Then you update the metapage to show that the bucket has been split and 
initialize the new bucket's primary page (as one atomic WAL-logged 
operation). Also mark the new bucket's primary page with a 
split-in-progress flag. That's used later by scans to detect if the 
split was interrupted.

Now you scan the old bucket, and move all the tuples belonging to the 
new bucket. That needs to be done as a series of small atomic WAL-logged 
operations, each involving a small number of old and new pages (one WAL 
record for each moved tuple is the simplest, but you can do some 
batching for efficiency). After you're done, clear the split-in-progress 
flag in the new bucket's primary page (WAL-log that), and release the locks.

In a scan, if the bucket you're about to scan has the split-in-progress 
flag set, that indicates that the split was interrupted by a crash. You 
won't see the flag as set if a concurrent split is in progress, because 
you will block on the lock it's holding on the bucket. If you see the 
flag as set, you share-lock and scan both buckets, the old and the new.

If you see the split-in-progress flag in the bucket you're about to 
insert to, you don't need to do anything special. Just insert the tuple 
to the new bucket as normal. Before splitting the new bucket again, 
however, the previous split needs to be finished, or things will get 
complicated. To do that, acquire the locks on the old and the new 
bucket, and scan the old bucket for any remaining tuples that belong to 
the new bucket and move them, and finally clear the split-in-progress flag.
---

This is similar to your description, as you scan both buckets if you see 
an interrupted split, but doesn't require the per-tuple moved-by-split 
flag, or waiting out scans. Also, I put the split-in-progress flag in 
the new bucket's primary page instead of the metapage. That allows 
multiple splits to be in progress at the same time.
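
In C-like pseudocode, with every helper name invented for
illustration, the whole sequence would be roughly:

    static void
    split_bucket(Relation rel, Bucket obucket, Bucket nbucket)
    {
        /* heavy-weight locks on both buckets, as splitting takes today */
        lock_bucket_exclusive(rel, obucket);
        lock_bucket_exclusive(rel, nbucket);

        /* allocate all required pages up front, so running out of disk
         * space backs out cleanly before anything is changed on-disk */
        allocate_pages_for_bucket(rel, nbucket);

        /* one atomic WAL-logged operation: update the metapage and
         * initialize the new bucket's primary page with the
         * split-in-progress flag set */
        begin_split_logged(rel, obucket, nbucket);

        /* move tuples as a series of small atomic WAL-logged steps;
         * a crash here leaves the flag set, and the split is finished
         * later by whoever next needs to split the new bucket */
        while (more_tuples_to_move(rel, obucket, nbucket))
            move_tuple_batch_logged(rel, obucket, nbucket);

        /* clear split-in-progress (WAL-logged) and release the locks */
        finish_split_logged(rel, nbucket);
        unlock_bucket(rel, nbucket);
        unlock_bucket(rel, obucket);
    }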

- Heikki



Re: GSoC on WAL-logging hash indexes

From
Robert Haas
Date:
On Thu, Mar 6, 2014 at 8:11 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> I don't think it's necessary to improve concurrency just to get WAL-logging.
> Better concurrency is a worthy goal of its own, of course, but it's a
> separate concern.

To some extent, I agree, but only to some extent.  To make hash
indexes generally usable, we really need to solve both problems.  When
I got rid of just some of the heavyweight locking in commit
76837c1507cb5a5a0048046233568447729e66dd, the results were pretty
dramatic at higher levels of concurrency:

http://www.postgresql.org/message-id/CA+Tgmoaf=nOJxLyzGcbrrY+pe-0VLL0vfHi6tjdM3fFtVwsOmw@mail.gmail.com

But there was still an awful lot of contention inside the heavyweight
lock manager, and I don't think hash indexes are going to be ready for
prime time until we solve that problem.

> This is similar to your description, as you scan both buckets if you see an
> interrupted split, but doesn't require the per-tuple moved-by-split flag, or
> waiting out scans. Also, I put the split-in-progress flag in the new
> bucket's primary page instead of the metapage. That allows multiple splits
> to be in progress at the same time.

Putting the split-in-progress flag in the new bucket's primary page
makes a lot of sense.  I don't have any problem with dumping the rest
of it for a first cut if we have a different long-term plan for how to
improve concurrency, but I don't see much point in going to a lot of
work to implement a system for WAL logging if we're going to end up
having to afterwards throw it out and start from scratch to get rid of
the heavyweight locks - and it's not obvious to me how what you have
here could be extended to do that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GSoC on WAL-logging hash indexes

From
Jeff Janes
Date:
On Thu, Mar 6, 2014 at 11:34 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Putting the split-in-progress flag in the new bucket's primary page
makes a lot of sense.  I don't have any problem with dumping the rest
of it for a first cut if we have a different long-term plan for how to
improve concurrency, but I don't see much point in going to a lot of
work to implement a system for WAL logging if we're going to end up
having to afterwards throw it out and start from scratch to get rid of
the heavyweight locks - and it's not obvious to me how what you have
here could be extended to do that.

+1   I don't think we have to improve concurrency at the same time as WAL logging, but we at least have to implement WAL logging in a way that doesn't foreclose future improvements to concurrency.

I've been tempted to implement a new type of hash index that allows both WAL and high concurrency, simply by disallowing bucket splits.  At the index creation time you use a storage parameter to specify the number of buckets, and that is that. If you mis-planned, build a new index with more buckets, possibly concurrently, and drop the too-small one.

I think it would be easier to add bucket splitting to something like this than it would be to add WAL logging and concurrency to what we already have--mostly because I think that route would be more amenable to incremental programming efforts, and also because if people had an actually useful hash index type they would use it and see that it worked well (assuming of course that it does work well), and then be motivated to improve it.

If this could be done as an extension I would just go (attempt to) do it.  But since WAL isn't pluggable, it can't go in as an extension.

Cheers,

Jeff

Re: GSoC on WAL-logging hash indexes

From
Robert Haas
Date:
On Thu, Mar 6, 2014 at 3:44 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Thu, Mar 6, 2014 at 11:34 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Putting the split-in-progress flag in the new bucket's primary page
>> makes a lot of sense.  I don't have any problem with dumping the rest
>> of it for a first cut if we have a different long-term plan for how to
>> improve concurrency, but I don't see much point in going to a lot of
>> work to implement a system for WAL logging if we're going to end up
>> having to afterwards throw it out and start from scratch to get rid of
>> the heavyweight locks - and it's not obvious to me how what you have
>> here could be extended to do that.
>
> +1   I don't think we have to improve concurrency at the same time as WAL
> logging, but we at least have to implement WAL logging in a way that doesn't
> foreclose future improvements to concurrency.
>
> I've been tempted to implement a new type of hash index that allows both WAL
> and high concurrency, simply by disallowing bucket splits.  At the index
> creation time you use a storage parameter to specify the number of buckets,
> and that is that. If you mis-planned, build a new index with more buckets,
> possibly concurrently, and drop the too-small one.

Yeah, we could certainly do something like that.  It sort of sucks,
though.  I mean, it's probably pretty easy to know that starting with
the default 2 buckets is not going to be enough; most people will at
least be smart enough to start with, say, 1024.  But are you going to
know whether you need 32768 or 1048576 or 33554432?  A lot of people
won't, and we have more than enough reasons for performance to degrade
over time as it is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GSoC on WAL-logging hash indexes

From
Greg Stark
Date:
On Thu, Mar 6, 2014 at 11:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I've been tempted to implement a new type of hash index that allows both WAL
>> and high concurrency, simply by disallowing bucket splits.  At the index
>> creation time you use a storage parameter to specify the number of buckets,
>> and that is that. If you mis-planned, build a new index with more buckets,
>> possibly concurrently, and drop the too-small one.
>
> Yeah, we could certainly do something like that.  It sort of sucks,
> though.  I mean, it's probably pretty easy to know that starting with
> the default 2 buckets is not going to be enough; most people will at
> least be smart enough to start with, say, 1024.  But are you going to
> know whether you need 32768 or 1048576 or 33554432?  A lot of people
> won't, and we have more than enough reasons for performance to degrade
> over time as it is.

The other thought I had was that you can do things lazily in vacuum.
So when you probe you need to check multiple pages until vacuum comes
along and rehashes everything.

-- 
greg



Re: GSoC on WAL-logging hash indexes

From
Heikki Linnakangas
Date:
On 03/06/2014 09:34 PM, Robert Haas wrote:
> On Thu, Mar 6, 2014 at 8:11 AM, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>> I don't think it's necessary to improve concurrency just to get WAL-logging.
>> Better concurrency is a worthy goal of its own, of course, but it's a
>> separate concern.
>
> To some extent, I agree, but only to some extent.  To make hash
> indexes generally usable, we really need to solve both problems.  When
> I got rid of just some of the heavyweight locking in commit
> 76837c1507cb5a5a0048046233568447729e66dd, the results were pretty
> dramatic at higher levels of concurrency:
>
> http://www.postgresql.org/message-id/CA+Tgmoaf=nOJxLyzGcbrrY+pe-0VLL0vfHi6tjdM3fFtVwsOmw@mail.gmail.com
>
> But there was still an awful lot of contention inside the heavyweight
> lock manager, and I don't think hash indexes are going to be ready for
> prime time until we solve that problem.

Hmm. You suggested ensuring that a scan always has at least a pin, and 
split takes a vacuum-lock. That ought to work. There's no need for the 
more complicated maneuvers you described, ISTM that you can just replace 
the heavy-weight share lock with holding a pin on the primary page of 
the bucket, and an exclusive lock with a vacuum-lock. Note that 
_hash_expandtable already takes the exclusive lock conditionally, ie. if 
it doesn't get the lock immediately it just gives up. We could do the 
same with the cleanup lock.

Vacuum could also be enhanced. It currently takes an exclusive lock on 
the bucket, then removes any dead tuples and finally "squeezes" the 
bucket by moving tuples to earlier pages. But you only really need the 
exclusive lock for the squeeze-phase. You could do the dead-tuple 
removal without the bucket-lock, and only grab it for the squeeze-phase. 
And the squeezing is optional, so you could just skip that if you can't 
get the lock. But that's a separate patch as well.

One more thing we could do to make hash indexes more scalable, 
independent of the above: Cache the information in the metapage in 
backend-private memory. Then you (usually) wouldn't need to access the 
metapage at all when scanning. Store a copy of the bitmask for that 
bucket in the primary page, and when scanning, check that it matches the 
cached value. If not, refresh the cached metapage and try again.
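
A minimal sketch of the caching part, assuming for simplicity one 
index per backend (a real patch would key the cache by relation, and 
the bitmask copy in the primary page doesn't exist today either):

    static HashMetaPageData cached_meta;    /* backend-private copy */
    static bool cached_meta_valid = false;

    static HashMetaPageData *
    get_cached_metapage(Relation rel)
    {
        if (!cached_meta_valid)
        {
            Buffer  metabuf;

            metabuf = _hash_getbuf(rel, HASH_METAPAGE, HASH_READ,
                                   LH_META_PAGE);
            memcpy(&cached_meta, HashPageGetMeta(BufferGetPage(metabuf)),
                   sizeof(HashMetaPageData));
            _hash_relbuf(rel, metabuf);
            cached_meta_valid = true;
        }
        return &cached_meta;
    }

A scan would pick its bucket from the cached copy, compare the bitmask 
stored in the bucket's primary page against the cached one, and on a 
mismatch invalidate the cache and retry.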


So, there seem to be a few fairly simple and independent improvements 
to be made:

1. Replace the heavy-weight lock with pin & vacuum-lock.
2. Make it crash-safe, by adding WAL-logging
3. Only acquire the exclusive-lock (vacuum-lock after step 1) in VACUUM 
for the squeeze phase.
4. Cache the metapage.

We still don't know if it's going to be any better than B-tree after all 
that's done, but the only way to find out is to go ahead and implement it.

This seems like a great GSoC project to me. We have a pretty good idea 
of what we want to accomplish. It's uncontroversial: I don't think 
anyone is going to object to improving hash indexes (one could argue 
that it's a waste of time, but that's different from objecting to the 
idea). And it consists of a few mostly independent parts, so it's 
possible to do incrementally which makes it easier to track progress, 
and we'll probably have something useful at the end of the summer even 
if it doesn't all get finished.

- Heikki



Re: GSoC on WAL-logging hash indexes

From
Robert Haas
Date:
On Thu, Mar 6, 2014 at 7:07 PM, Greg Stark <stark@mit.edu> wrote:
> On Thu, Mar 6, 2014 at 11:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> I've been tempted to implement a new type of hash index that allows both WAL
>>> and high concurrency, simply by disallowing bucket splits.  At the index
>>> creation time you use a storage parameter to specify the number of buckets,
>>> and that is that. If you mis-planned, build a new index with more buckets,
>>> possibly concurrently, and drop the too-small one.
>>
>> Yeah, we could certainly do something like that.  It sort of sucks,
>> though.  I mean, it's probably pretty easy to know that starting with
>> the default 2 buckets is not going to be enough; most people will at
>> least be smart enough to start with, say, 1024.  But are you going to
>> know whether you need 32768 or 1048576 or 33554432?  A lot of people
>> won't, and we have more than enough reasons for performance to degrade
>> over time as it is.
>
> The other thought I had was that you can do things lazily in vacuum.
> So when you probe you need to check multiple pages until vacuum comes
> along and rehashes everything.

I was thinking about this, too.  Cleaning up the old bucket after the
split is pretty similar to vacuuming.  And lo and behold, vacuum also
needs to lock the entire bucket.  AFAICT, there are two reasons for
this.  First, when we resume a suspended scan, we assume that the next
match on the page, if any, will occur on the same page at an offset
greater than the offset where we found the previous match.  The code
copes with the possibility of concurrent insertions by refinding the last
item we returned and scanning forward from there, but assumes the
pointer we're refinding can't have moved to a lower offset.  I think
this could be easily fixed - at essentially no cost - by changing
hashgettuple so that, if the forward scan errors out, it tries
scanning backward rather than just giving up.  Second, vacuum compacts
each bucket that it modifies using _hash_squeezebucket, which scans
from the two ends of the index in toward the middle, filling any free
space on earlier pages by pulling tuples from the end of the bucket
chain.  This is a little thornier.  We could change vacuum so that it
only removes TIDs from the individual pages, without actually trying
to free up pages, but that seems undesirable.

However, I think we might be able to get by with making bucket
compaction less aggressive, without eliminating it altogether.
Suppose that whenever we remove items from a page, we consider
consolidating the page with its immediate predecessor and successor in
the bucket chain.  This means our utilization will be a little over
50% in the worst case where we have a full page, a page with one item,
another full page, another page with one item, and so on.  But that's
not all bad, because compacting a bucket chain to the minimum possible
size when it may well be about to suffer more inserts isn't
necessarily a good thing anyway.  Also, doing this means that we don't
need to lock out scans from the entire bucket chain in order to
compact.  It's sufficient to make sure that nobody's in the middle of
scanning the two pages we want to merge.

That turns out not to be possible right now.  When a scan is
suspended, we still hold a pin on the page we're scanning.  But when
_hash_readnext or _hash_readprev walks the bucket chain, it drops both
lock and pin on the current buffer and then pins and locks the new
buffer.  That, however, seems like it could easily be changed: drop
lock on current buffer, acquire pin on new buffer, drop pin on current
buffer, lock new buffer.  Then, if we want to merge two buffers, it's
enough to lock both of them for cleanup.  To avoid any risk of
deadlock, and also to avoid waiting for a long-running suspended scan
to wake up and complete, we can do this conditionally; if we fail to
get either cleanup lock, then just don't merge the pages; take an
exclusive lock only and remove whatever you need to remove, leaving
the rest.  Merging pages is only a performance optimization, so if it
fails now and then, no real harm done.
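
In sketch form, the reordered page-step would be something like this
(the function shape is invented; the real change belongs inside
_hash_readnext and _hash_readprev):

    static void
    step_to_next_page(Relation rel, Buffer *bufp, BlockNumber nextblkno)
    {
        Buffer  nextbuf;

        /* drop the lock, but keep the pin, on the current page */
        LockBuffer(*bufp, BUFFER_LOCK_UNLOCK);

        /* pin the next page before letting go of the current one,
         * so the scan always holds at least one pin */
        nextbuf = ReadBuffer(rel, nextblkno);
        ReleaseBuffer(*bufp);

        LockBuffer(nextbuf, BUFFER_LOCK_SHARE);
        *bufp = nextbuf;
    }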

(A side benefit of this approach is that we could opportunistically
attempt to compact pages containing dead items any time we can manage
a ConditionalLockBufferForCleanup() on the page, sort of like what HOT
pruning does for heap blocks.  We could even, after pruning away dead
items, attempt to merge the page with siblings, so that even without
vacuum the bucket chains can gradually shrink if the index tuples are
discovered to be pointing to dead heap tuples.)
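
The conditional merge itself would look something like this, where
merge_page_contents is a placeholder and the WAL-logging of the merge
is omitted; the caller is assumed to already hold a pin on both pages:

    static bool
    try_merge_pages(Buffer buf1, Buffer buf2)
    {
        /* cleanup lock = exclusive lock plus proof ours is the only pin */
        if (!ConditionalLockBufferForCleanup(buf1))
            return false;
        if (!ConditionalLockBufferForCleanup(buf2))
        {
            LockBuffer(buf1, BUFFER_LOCK_UNLOCK);
            return false;
        }

        merge_page_contents(buf1, buf2);    /* placeholder */

        LockBuffer(buf2, BUFFER_LOCK_UNLOCK);
        LockBuffer(buf1, BUFFER_LOCK_UNLOCK);
        return true;
    }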

With the above changes, vacuuming a bucket can happen without taking
the heavyweight lock in exclusive mode, leaving bucket splitting as
the only operation that requires it.  And we could perhaps use
basically the same algorithm to clean the tuples out of the old bucket
after the split.  The problem is that, when we're vacuuming, we know
that no scan currently in progress can still care about any of the
tuples we're removing.  I suppose a SnapshotAny scan might care, but
we don't and don't need to support that for hash indexes, I think, or
at least not concurrently.  When we're cleaning up after a bucket
split, however, we don't know whether there are any pre-split scans
still in flight.  Given the locking changes described above, I had the
notion that we could solve that problem by starting a new scan of the
bucket, locking each buffer for cleanup.  If, contrary to present
practice, each scan always kept a pin, when we got to the end of the
bucket chain, we'd know that any remaining scans started after ours,
and thus had the latest information.  There are at least two problems
with this idea.  First, it might make post-split cleanup take a really
long time, if somebody's got a cursor paused in the middle of a bucket
chain somewhere.  Second, hash scans can be backed up, which is either
an undetected-deadlock hazard or a returning-wrong-answers hazard
depending on the exactly how you implement this.

So I'm still a bit stumped about how to implement the "wait out scans"
portion of the design I sketched out previously.  In essence, that's
what the heavyweight locking is implementing today, for both vacuum
and bucket splits, and shuffling the work around between those two
code paths in some way may be a good idea, but the core issue is that
if we're not going to use heavyweight locks for waiting out scans,
then we need some other mechanism.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GSoC on WAL-logging hash indexes

From
Robert Haas
Date:
[ I just sent a reply to Greg Stark's email which touches on some of
the same points you mentioned here; that was mostly written last
night, and finished this morning before seeing this.  ]

On Fri, Mar 7, 2014 at 4:34 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> Hmm. You suggested ensuring that a scan always has at least a pin, and split
> takes a vacuum-lock. That ought to work. There's no need for the more
> complicated maneuvers you described, ISTM that you can just replace the
> heavy-weight share lock with holding a pin on the primary page of the
> bucket, and an exclusive lock with a vacuum-lock. Note that
> _hash_expandtable already takes the exclusive lock conditionally, ie. if it
> doesn't get the lock immediately it just gives up. We could do the same with
> the cleanup lock.

We could try that.  I assume you mean do *just* what you describe
here, without the split-in-progress or moved-by-split flags I
suggested.  The only issue I see with that is that instead of everyone
piling up on the heavyweight lock, a wait which is interruptible,
they'd all pile up on the buffer content lwlock, a wait which isn't.
And splitting a bucket can involve an arbitrary number of I/O
operations, so that's kind of unappealing.  Even checkpoints would be
blocked until the bucket split completed, which seems unfortunate.

But I like the idea of teaching each scan to retain a pin on the
primary buffer page.  If we then do the split the way I proposed
before, we can implement the "outwait concurrent scans" step by taking
a cleanup lock on the primary buffer page.  In this design, we only
need to acquire and release the cleanup lock.  Once we get the cleanup
lock on the primary buffer page, even for an instant, we know that
there are no remaining scans in the bucket that need the pre-split
tuples that have now been moved to the new bucket.  We can then remove
tuples with a lesser lock (or keep the stronger lock if we wish to
re-pack).
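
The waiting itself is then tiny in code terms; something like this,
with the surrounding context invented for illustration:

    static void
    wait_out_presplit_scans(Relation rel, BlockNumber primary_blkno)
    {
        Buffer  buf = ReadBuffer(rel, primary_blkno);

        /* blocks until ours is the only pin on the page */
        LockBufferForCleanup(buf);

        /* any scan that started before the split has now moved past
         * the primary page, so it cannot need the moved tuples */
        UnlockReleaseBuffer(buf);
    }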

> Vacuum could also be enhanced. It currently takes an exclusive lock on the
> bucket, then removes any dead tuples and finally "squeezes" the bucket by
> moving tuples to earlier pages. But you only really need the exclusive lock
> for the squeeze-phase. You could do the dead-tuple removal without the
> bucket-lock, and only grab it for the squeeze-phase. And the squeezing is
> optional, so you could just skip that if you can't get the lock. But that's
> a separate patch as well.

Yeah, I think this would be a useful improvement.

> One more thing we could do to make hash indexes more scalable, independent
> of the above: Cache the information in the metapage in backend-private
> memory. Then you (usually) wouldn't need to access the metapage at all when
> scanning. Store a copy of the bitmask for that bucket in the primary page,
> and when scanning, check that it matches the cached value. If not, refresh
> the cached metapage and try again.

I don't know whether this would be a useful improvement.

> So, there seem to be a few fairly simple and independent improvements to be
> made:
>
> 1. Replace the heavy-weight lock with pin & vacuum-lock.
> 2. Make it crash-safe, by adding WAL-logging
> 3. Only acquire the exclusive-lock (vacuum-lock after step 1) in VACUUM for
> the squeeze phase.
> 4. Cache the metapage.
>
> We still don't know if it's going to be any better than B-tree after all
> that's done, but the only way to find out is to go ahead and implement it.

I'm of the opinion that we ought to start by making splits and
vacuuming more concurrent, and then do that stuff.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GSoC on WAL-logging hash indexes

From
Heikki Linnakangas
Date:
On 03/07/2014 03:48 PM, Robert Haas wrote:
>> So, there seems to be a few fairly simple and independent improvements to be
>> made:
>>
>> 1. Replace the heavy-weight lock with pin & vacuum-lock.
>> 2. Make it crash-safe, by adding WAL-logging
>> 3. Only acquire the exclusive-lock (vacuum-lock after step 1) in VACUUM for
>> the squeeze phase.
>> 4. Cache the metapage.
>>
>> We still don't know if it's going to be any better than B-tree after all
>> that's done, but the only way to find out is to go ahead and implement it.
> I'm of the opinion that we ought to start by making splits and
> vacuuming more concurrent, and then do that stuff.

Priorities are a matter of opinion, but note that for many applications 
the concurrency of splits and vacuuming is pretty much irrelevant (at 
least after we do bullet point 3 above). Like, if the index is mostly 
read-only, or at least doesn't grow much. You'd still want it to be 
crash-safe (2), and you might care a lot about the performance of scans 
(1 and 4), but if splits and vacuum hold an exclusive lock for a few 
seconds, that's OK if you only need to do it once in a blue moon.

- Heikki



Re: GSoC on WAL-logging hash indexes

From
"ktm@rice.edu"
Date:
On Thu, Mar 06, 2014 at 06:14:21PM -0500, Robert Haas wrote:
> On Thu, Mar 6, 2014 at 3:44 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> >
> > I've been tempted to implement a new type of hash index that allows both WAL
> > and high concurrency, simply by disallowing bucket splits.  At the index
> > creation time you use a storage parameter to specify the number of buckets,
> > and that is that. If you mis-planned, build a new index with more buckets,
> > possibly concurrently, and drop the too-small one.
> 
> Yeah, we could certainly do something like that.  It sort of sucks,
> though.  I mean, it's probably pretty easy to know that starting with
> the default 2 buckets is not going to be enough; most people will at
> least be smart enough to start with, say, 1024.  But are you going to
> know whether you need 32768 or 1048576 or 33554432?  A lot of people
> won't, and we have more than enough reasons for performance to degrade
> over time as it is.
> 
It would be useful to have a storage parameter for the target size of
the index, even if it is not exact, to use in the initial index build
to avoid the flurry of I/O caused by bucket splits as the index grows.
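
(Something along the lines of a hypothetical CREATE INDEX ... USING
hash (col) WITH (expected_rows = 100000000), from which the initial
bucket count could be derived, would cover both the fixed-bucket idea
above and this softer pre-sizing.)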

Regards,
Ken



Re: GSoC on WAL-logging hash indexes

From
Heikki Linnakangas
Date:
On 03/07/2014 03:48 PM, Robert Haas wrote:
> On Fri, Mar 7, 2014 at 4:34 AM, Heikki Linnakangas
> <hlinnakangas@vmware.com>  wrote:
>> Hmm. You suggested ensuring that a scan always has at least a pin, and split
>> takes a vacuum-lock. That ought to work. There's no need for the more
>> complicated maneuvers you described, ISTM that you can just replace the
>> heavy-weight share lock with holding a pin on the primary page of the
>> bucket, and an exclusive lock with a vacuum-lock. Note that
>> _hash_expandtable already takes the exclusive lock conditionally, ie. if it
>> doesn't get the lock immediately it just gives up. We could do the same with
>> the cleanup lock.
> We could try that.  I assume you mean do *just* what you describe
> here, without the split-in-progress or moved-by-split flags I
> suggested.

Yep.

> The only issue I see with that is that instead of everyone
> piling up on the heavyweight lock, a wait which is interruptible,
> they'd all pile up on the buffer content lwlock, a wait which isn't.
> And splitting a bucket can involve an arbitrary number of I/O
> operations, so that's kind of unappealing.  Even checkpoints would be
> blocked until the bucket split completed, which seems unfortunate.

Hmm. I doubt that's a big deal in practice, although I agree it's a bit 
ugly.

Once we solve the crash-safety of splits, we actually have the option of 
doing the split in many small steps, even when there's no crash 
involved. You could for example grab the vacuum-lock, move all the 
tuples in the first 5 pages, and then release the lock to give other 
backends that are queued up a chance to do their scans/insertions. Then 
re-acquire the lock, and continue where you left off. Or just bail out and 
let the next vacuum or insertion finish it later.
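
A hedged sketch of what that incremental loop might look like (the
batch-moving helper, hash_split_move_batch, is invented; the locking
calls exist today):

    int     nmoved;

    /* Split one bucket in small, WAL-logged steps, yielding the
     * cleanup lock between batches so queued scans and insertions
     * can make progress. */
    for (;;)
    {
        if (!ConditionalLockBufferForCleanup(old_bucket_buf))
            break;          /* bail out; a later vacuum or insertion
                             * finishes the split */

        /* hypothetical helper: move up to five pages' worth of
         * tuples from the old bucket to the new one */
        nmoved = hash_split_move_batch(rel, old_bucket, new_bucket, 5);

        LockBuffer(old_bucket_buf, BUFFER_LOCK_UNLOCK);

        if (nmoved == 0)
            break;          /* split is complete */
    }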

- Heikki



Re: GSoC 2014 - mentors, students and admins

From
Thom Brown
Date:
All students and mentors (and backup mentors) should now register for
this year's GSoC.  Students only have until Friday next week (up until
21st March 19:00 UTC) to apply.

Thanks

Thom


Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
Stephen Frost
Date:
* Stephen Frost (sfrost@snowman.net) wrote:
> I'm interested in mentoring and, unlike previous years, I've been
> collecting a personal list of things that I'd like to see worked on for
> PG which could be GSoC projects and will provide such in the next few
> days to this list (unless there's a different list that people want such
> posted to..?).

Alright, I've updated the wiki page here:

https://wiki.postgresql.org/wiki/GSoC_2014

with my thoughts (finally - sorry about the delay).  Looks like other
folks have been updating it too, which is great.  Hopefully we can
encourage some students to go check it out and try to pick up one of
those projects...

    Thanks,

        Stephen


Re: [pgsql-advocacy] GSoC 2014 - mentors, students and admins

From
Fabrízio de Royes Mello
Date:

On Mon, Mar 17, 2014 at 10:43 PM, Stephen Frost <sfrost@snowman.net> wrote:
>
> * Stephen Frost (sfrost@snowman.net) wrote:
> > I'm interested in mentoring and, unlike previous years, I've been
> > collecting a personal list of things that I'd like to see worked on for
> > PG which could be GSoC projects and will provide such in the next few
> > days to this list (unless there's a different list that people want such
> > posted to..?).
>
> Alright, I've updated the wiki page here:
>
> https://wiki.postgresql.org/wiki/GSoC_2014
>
> with my thoughts (finally- sorry about the delay).  Looks like other
> folks have been updating it too, which is great.  Hopefully we can
> encourage some students to go check it out and try to pick up one of
> those projects...
>

Folks, if this doesn't cause trouble, I added one more item:
* Allow an unlogged table to be changed to logged

We already discussed this feature a while ago [1].

Regards,

[1] http://www.postgresql.org/message-id/CAFcNs+peg3VPG2=v6Lu3vfCDP8mt7cs6-RMMXxjxWNLREgSRVQ@mail.gmail.com

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello

Re: GSoC 2014 - mentors, students and admins

From
Thom Brown
Date:
Hi all,

There is 1 day left to get submissions in, so students should ensure
that they submit their proposals as soon as possible.  No submissions
will be accepted beyond the deadline of 19:00 UTC tomorrow (Friday
21st March).

Regards

Thom


Re: GSoC on WAL-logging hash indexes

From
Peter Geoghegan
Date:
On Mon, Mar 3, 2014 at 8:12 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> As a GSoC student, I will implement WAL recovery of hash indexes using the
>> other index types' WAL code as a guide.

Frankly, I'm skeptical of the idea that hash indexes will ever really
be useful. I realize that that's a counter-intuitive conclusion, but
there are many things we could do to improve B-Tree CPU costs to make
them closer to those of hash indexes, without making them any less
flexible. I myself would much rather work on that, and intend to.

The O(1) cost seems attractive when you consider that that only
requires that we read one index page from disk to service any given
index scan, but in fact B-Trees almost always only require the same.
They are of course also much more flexible. The concurrency
characteristics of B-Trees are a lot better understood. I sincerely
suggest that we forget about conventional hash table type indexes. I
fear they're a lost cause.

--
Peter Geoghegan


Re: GSoC on WAL-logging hash indexes

From
"ktm@rice.edu"
Date:
On Wed, Apr 30, 2014 at 12:26:20AM -0700, Peter Geoghegan wrote:
> On Mon, Mar 3, 2014 at 8:12 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> >> As a GSoC student, I will implement WAL recovery of hash indexes using the
> >> other index types' WAL code as a guide.
>
> Frankly, I'm skeptical of the idea that hash indexes will ever really
> be useful. I realize that that's a counter-intuitive conclusion, but
> there are many things we could do to improve B-Tree CPU costs to make
> them closer to those of hash indexes, without making them any less
> flexible. I myself would much rather work on that, and intend to.
>
> The O(1) cost seems attractive when you consider that that only
> requires that we read one index page from disk to service any given
> index scan, but in fact B-Trees almost always only require the same.
> They are of course also much more flexible. The concurrency
> characteristics of B-Trees are a lot better understood. I sincerely
> suggest that we forget about conventional hash table type indexes. I
> fear they're a lost cause.
>
> --
> Peter Geoghegan
>
Hi Peter,

I do not think that CPU costs matter as much as the O(1) probe to
get a result value, specifically for very large indexes/tables where
even caching the upper levels of a B-tree index would kill your
working set in memory. I know, I know, everyone has so much memory
and can just buy more... but this does matter. I also think that
development of hash indexes has been stalled waiting for WAL
logging. For example, hash indexes can almost trivially become
more space-efficient as they grow in size by utilizing the page
number to represent the prefix bits of the hash value for a bucket.
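
For reference, the existing hash-value-to-bucket mapping in
src/backend/access/hash/hashutil.c is what makes that possible:
because a bucket's number already determines some bits of every hash
value stored in it, those bits are redundant within the page.

    Bucket
    _hash_hashkey2bucket(uint32 hashkey, uint32 maxbucket,
                         uint32 highmask, uint32 lowmask)
    {
        Bucket      bucket;

        bucket = hashkey & highmask;
        if (bucket > maxbucket)
            bucket = bucket & lowmask;

        return bucket;
    }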

My 2 cents.
Ken


Re: GSoC on WAL-logging hash indexes

From
Peter Geoghegan
Date:
On Wed, Apr 30, 2014 at 5:55 AM, ktm@rice.edu <ktm@rice.edu> wrote:
> I do not think that CPU costs matter as much as the O(1) probe to
> get a result value specifically for very large indexes/tables where
> even caching the upper levels of a B-tree index would kill your
> working set in memory. I know, I know, everyone has so much memory
> and can just buy more... but this does matter.

Have you actually investigated how little memory it takes to store the
inner pages? It's typically less than 1% of the entire index. AFAIK,
hash indexes are not used much in any other system. I think MySQL has
them, and SQL Server 2014 has special in-memory hash table indexes for
in-memory tables, but that's all I can find on Google.
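
(Back-of-envelope, assuming the default 8 KB pages and index tuples of
a few dozen bytes: each inner page fans out to roughly 200 leaf pages,
so the inner levels total about 1/200th of the index, i.e. around
0.5%. Even a 40 GB, billion-row btree would have on the order of
200 MB of inner pages.)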


--
Peter Geoghegan


Re: GSoC on WAL-logging hash indexes

From
Robert Haas
Date:
On Wed, Apr 30, 2014 at 12:54 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Wed, Apr 30, 2014 at 5:55 AM, ktm@rice.edu <ktm@rice.edu> wrote:
>> I do not think that CPU costs matter as much as the O(1) probe to
>> get a result value specifically for very large indexes/tables where
>> even caching the upper levels of a B-tree index would kill your
>> working set in memory. I know, I know, everyone has so much memory
>> and can just buy more... but this does matter.
>
> Have you actually investigated how little memory it takes to store the
> inner pages? It's typically less than 1% of the entire index. AFAIK,
> hash indexes are not used much in any other system. I think MySQL has
> them, and SQL Server 2014 has special in-memory hash table indexes for
> in memory tables, but that's all I can find on Google.

I thought the theoretical advantage of hash indexes wasn't that they
were smaller but that you avoided a central contention point (the
btree root).

Of course our current hash indexes have *more* not less contention
than btree but I'm pretty comfortable chalking that up to quality of
implementation rather than anything intrinsic.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: GSoC on WAL-logging hash indexes

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I thought the theoretical advantage of hash indexes wasn't that they
> were smaller but that you avoided a central contention point (the
> btree root).

> Of course our current hash indexes have *more* not less contention
> than btree but I'm pretty comfortable chalking that up to quality of
> implementation rather than anything intrinsic.

The long and the short of it is that there are *lots* of implementation
deficiencies in our hash indexes.  There's no real way to know whether
they'd be competitive if all those things were rectified, except by doing
the work to fix 'em.  And it's hard to justify putting much effort into
hash indexes so long as there's an elephant in the room the size of "no
WAL support".  So I'm in favor of getting that fixed, if we have somebody
who's willing to do it.  It might lead to good things later; and even if
it doesn't, the lack of WAL support is an embarrassment.

            regards, tom lane


Re: GSoC on WAL-logging hash indexes

From
Peter Geoghegan
Date:
On Wed, Apr 30, 2014 at 10:11 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I thought the theoretical advantage of hash indexes wasn't that they
> were smaller but that you avoided a central contention point (the
> btree root).

The B-Tree root isn't really a central contention point at all. The
locking/latching protocol that nbtree uses is remarkably
concurrency-friendly. In the real world, there is pretty much no
exclusive locking of the root page's buffer.

> Of course our current hash indexes have *more* not less contention
> than btree but I'm pretty comfortable chalking that up to quality of
> implementation rather than anything intrinsic.

I am not convinced of that.

--
Peter Geoghegan


Re: GSoC on WAL-logging hash indexes

From
Peter Geoghegan
Date:
On Wed, Apr 30, 2014 at 11:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>  It might lead to good things later; and even if
> it doesn't, the lack of WAL support is an embarrassment.

I don't think it will, but I do agree that the current state of
affairs is an embarrassment.

--
Peter Geoghegan


Re: GSoC on WAL-logging hash indexes

From
Jeff Janes
Date:
On Wed, Apr 30, 2014 at 12:26 AM, Peter Geoghegan <pg@heroku.com> wrote:
On Mon, Mar 3, 2014 at 8:12 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> As a GSoC student, I will implement WAL recovery of hash indexes using the
>> other index types' WAL code as a guide.

Frankly, I'm skeptical of the idea that hash indexes will ever really
be useful. I realize that that's a counter-intuitive conclusion, but
there are many things we could do to improve B-Tree CPU costs to make
them closer to those of hash indexes, without making them any less
flexible. I myself would much rather work on that, and intend to.

If we don't put in the work to make them useful, then they won't ever become useful.

If we do put in the effort (and it would be considerable) then I think they will be.  But you may be correct that the effort required would perhaps be better used in making btree even better.  I don't think we can conclude that definitively without putting in the work to do the experiment.

One advantage of hash indexes is that the code is simple enough for someone to actually understand it in a summer. Whether it would still be like that after WAL logging was implemented, I don't know.

The O(1) cost seems attractive when you consider that that only
requires that we read one index page from disk to service any given
index scan, but in fact B-Trees almost always only require the same.
They are of course also much more flexible. The concurrency
characteristics of B-Trees are a lot better understood.

Not sure what you mean there.  The concurrency issues of hash indexes have a lot less that needs to be understood.  I think I understand them pretty well (unlike B-tree); I just don't know what to do with that knowledge.
 
I sincerely
suggest that we forget about conventional hash table type indexes. I
fear they're a lost cause.

I understand that those are the only ones worth fighting for. :)

Cheers,

Jeff

Re: GSoC on WAL-logging hash indexes

From
Josh Berkus
Date:
All,

(dropping pgsql-advocacy off the cc list)

On 04/30/2014 10:11 AM, Robert Haas wrote:
> On Wed, Apr 30, 2014 at 12:54 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> On Wed, Apr 30, 2014 at 5:55 AM, ktm@rice.edu <ktm@rice.edu> wrote:
>>> I do not think that CPU costs matter as much as the O(1) probe to
>>> get a result value specifically for very large indexes/tables where
>>> even caching the upper levels of a B-tree index would kill your
>>> working set in memory. I know, I know, everyone has so much memory
>>> and can just buy more... but this does matter.
>>
>> Have you actually investigated how little memory it takes to store the
>> inner pages? It's typically less than 1% of the entire index. AFAIK,
>> hash indexes are not used much in any other system. I think MySQL has
>> them, and SQL Server 2014 has special in-memory hash table indexes for
>> in memory tables, but that's all I can find on Google.

Hash indexes are more important for MySQL because they have
index-organized tables.

> I thought the theoretical advantage of hash indexes wasn't that they
> were smaller but that you avoided a central contention point (the
> btree root).

Yes. And being smaller isn't insignificant; think of billion-row tables
with fairly random access over the whole table.  Also, *theoretically*,
a hash index could avoid the rebalancing issues which cause our btree
indexes to become bloated and need a REINDEX with certain update patterns.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: GSoC on WAL-logging hash indexes

From
Jeff Janes
Date:
On Wed, Apr 30, 2014 at 11:02 AM, Peter Geoghegan <pg@heroku.com> wrote:
On Wed, Apr 30, 2014 at 10:11 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I thought the theoretical advantage of hash indexes wasn't that they
> were smaller but that you avoided a central contention point (the
> btree root).

The B-Tree root isn't really a central contention point at all. The
locking/latching protocol that nbtree uses is remarkably
concurrency-friendly. In the real world, there is pretty much no
exclusive locking of the root page's buffer.

I've seen the simple pinning and unpinning of the root page (or the fast root, whatever the first page we bother to pin on a regular basis is called) be a point of contention.  When one index dominates the entire system workload, that one page also drives contention on the spin lock that protects the lwlock that share-protects whichever buffer mapping partition happens to contain it.
 
Cheers,

Jeff

Re: GSoC on WAL-logging hash indexes

From
Andres Freund
Date:
On 2014-04-30 11:10:22 -0700, Jeff Janes wrote:
> I've seen the simple pinning and unpinning of the root page (or the fast root,
> whatever the first page we bother to pin on a regular basis is called) be a
> point of contention.  When one index dominates the entire system workload, that
> one page also drives contention on the spin lock that protects the lwlock that
> share-protects whichever buffer mapping partition happens to contain it.

To quite some degree that's an implementation deficiency of our lwlocks
though. I've seen *massive* improvements with my lwlock patch for that
problem. Additionally we need to get rid of the spinlock around
pin/unpin.
That said, even after those optimizations, there remains a significant
amount of cacheline bouncing. That's much easier to avoid for something
like hash indexes than btrees.

I think another advantage is that hash indexes can be *much* smaller
than btree when the individual rows are wide. I wonder though if we
couldn't solve that better by introducing "transforms" around the
looked-up data, e.g. allowing a hash(indexed_column) to be used
*transparently*. If you currently do that by hand, a lot of work has
to be done in every query...
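
(For context, the manual version of that today would presumably be an
expression index, e.g. one on hashtext(col), plus rewriting every
query to something like WHERE hashtext(col) = hashtext('x') AND
col = 'x' to guard against collisions; that rewrite is exactly the
per-query work a transparent transform would eliminate.)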

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: GSoC on WAL-logging hash indexes

From
Robert Haas
Date:
On Wed, Apr 30, 2014 at 2:02 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> Of course our current hash indexes have *more* not less contention
>> than btree but I'm pretty comfortable chalking that up to quality of
>> implementation rather than anything intrinsic.
>
> I am not convinced of that.

In commit 76837c1507cb5a5a0048046233568447729e66dd, I removed part
(basically half) of the heavyweight locking used by hash indexes, and
performance improved rather dramatically:

http://www.postgresql.org/message-id/CA+Tgmoaf=nOJxLyzGcbrrY+pe-0VLL0vfHi6tjdM3fFtVwsOmw@mail.gmail.com

I think it's quite reasonable to suppose that getting rid of the
remaining heavyweight locking will help just as much.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GSoC on WAL-logging hash indexes

From
Peter Geoghegan
Date:
On Wed, Apr 30, 2014 at 11:03 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> If we don't put in the work to make them useful, then they won't ever become
> useful.
>
> If we do put in the effort (and it would be considerable) then I think they
> will be.  But you may be correct that the effort required would perhaps be
> better used in making btree even more better.  I don't think we can conclude
> that definitively without putting in the work to do the experiment.

My argument doesn't hinge on there being more important work to do.
Rather, I simply don't think that there is ever going to be a
compelling reason to use hash indexes in production. Apart from the
obvious inflexibility, consider what it takes to make index creation
fast - insertion-style building of indexes is much slower. Consider
multi-key indexes.

Now, I'm not telling anyone what to work on, and if someone wants to
make hash indexes WAL-logged to plug that hole, don't let me stop you.
It probably makes sense as a project to learn more about Postgres
internals. However, it would be unfair to not speak up given my
misgivings around the practical utility of hash indexes.

-- 
Peter Geoghegan



Re: GSoC on WAL-logging hash indexes

From
Greg Stark
Date:
I think the key question was: if someone wanted to develop a good hash
index, would they start from our current hash index, or would they be
better off starting from a fresh codebase? If the former, then we
should add WAL logging and throw GSoC students and other people who
ask for projects at it. If the latter, then we should throw out the
current codebase and let people develop extensions that add new hash
index code until someone comes up with a good design we want to move
to core.

But imnsho doing nothing is a bad idea. We should have long ago either
added WAL logging or removed the index type. We shouldn't have left a
foot-gun that large lying around for so long.

Incidentally something else I've considered would be having a WAL
record type saying "relation corrupted" and emitting one the first
time a hash index is touched after a checkpoint. That could provide a
general mechanism that might be useful for unlogged operations (and
might be combinable with the infrastructure for unlogged tables). But
it seems like a better tool for other objects than hash indexes.

Another quick fix would be having a GUC allow_unrecoverable_relations
which defaulted to false. Creating a hash index (and presumably
unlogged tables) would error out with a hint to set that to true so at
least users who create them would have to know what they were in for.
Right now we're seeing a lot of users who create hash indexes and then
complain about corrupted standbys.

I could do the legwork on either the GUC or moving hash indexes to an
extension if there's a consensus. But I suspect either will be quite
controversial...



Re: GSoC on WAL-logging hash indexes

From
Stephen Frost
Date:
* Greg Stark (stark@mit.edu) wrote:
> But imnsho doing nothing is a bad idea. We should have long ago either
> added WAL logging or removed the index type. We shouldn't have left a
> foot-gun that large lying around for so long.

I certainly encourage you to work on it. :)

> Incidentally something else I've considered would be having a WAL
> record type saying "relation corrupted" and emitting one the first
> time a hash index is touched after a checkpoint. That could provide a
> general mechanism that might be useful for unlogged operations (and
> might be combinable with the infrastructure for unlogged tables). But
> it seems like a better tool for other objects than hash indexes.

Ugh, please do not make unlogged tables suffer for this.  I feel it is
*quite* clear when you say "UNLOGGED" or "TEMP" in the CREATE statement
that you know what you're gonna get.  Perhaps we should require
'UNLOGGED INDEX' to be passed in when someone creates a 'hash' index
instead.

> Another quick fix would be having a GUC allow_unrecoverable_relations
> which defaulted to false. Creating a hash index (and presumably
> unlogged tables) would error out with a hint to set that to true so at
> least users who create them would have to know what they were in for.

-1

> Right now we're seeing a lot of users who create hash indexes and then
> complain about corrupted standbys.

Uh, we are?  If you're saying that $employer is, great, but please
clarify that as the rest of us aren't 'in the loop' as far as that goes
and we see the various list/IRC traffic instead, and I can't remember
ever seeing such a complaint in either place (but I'll admit, my memory
isn't the best).

> I could do the legwork on either the GUC or moving hash indexes to an
> extension if there's a consensus. But I suspect either will be quite
> controversial...

-1 on the GUC.  I'd be alright with finding a way to make it clear, upon
creation of the hash index, that it's not WAL'd (maybe even just throw a
WARNING, it'd be better than nothing..).  Trying to rip out the current
hash index wouldn't be great, imv.  If you'd like to build an extension
which provides hash indexes- I say go for it?  Nothing stopping you as
far as that's concerned.
Thanks,
    Stephen

Re: GSoC on WAL-logging hash indexes

From
Jeff Janes
Date:
On Wed, Apr 30, 2014 at 11:19 AM, Peter Geoghegan <pg@heroku.com> wrote:
On Wed, Apr 30, 2014 at 11:03 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> If we don't put in the work to make them useful, then they won't ever become
> useful.
>
> If we do put in the effort (and it would be considerable) then I think they
> will be.  But you may be correct that the effort required would perhaps be
> better used in making btree even more better.  I don't think we can conclude
> that definitively without putting in the work to do the experiment.

My argument doesn't hinge on there being more important work to do.
Rather, I simply don't think that there is ever going to be a
compelling reason to use hash indexes in production.

I have an indexed text column with an average length of 50, a stddev length of 15, and a pronounced right skew.  Currently, the longest value in it is 811.  But inevitably someone will need to insert something longer than 2712.  When that day comes, I will drop the btree index and add a hash index (unless we remove that limitation from btree indexes in the meantime).  It lets me sleep at night knowing that I have that option today, even if it would complicate crash recovery.
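
(2712 bytes is the maximum btree index-row size with the default 8 KB
block size, chosen so that a leaf page can always hold at least three
tuples. A hash index stores only the 32-bit hash code of the value, so
it has no such limit.)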
 

Apart from the
obvious inflexibility, consider what it takes to make index creation
fast - insertion-style building of indexes is much slower. Consider
multi-key indexes.

I'm pretty sure hash indexes already implement a bulk creation fast path.  In any case, I've never noticed them being slow, and I've tested some pretty big ones.

 
Now, I'm not telling anyone what to work on, and if someone wants to
make hash indexes WAL-logged to plug that hole, don't let me stop you.
It probably makes sense as a project to learn more about Postgres
internals. However, it would be unfair to not speak up given my
misgivings around the practical utility of hash indexes.

Sure, and we all have our own opinions on that.  Should we summarize them somewhere easier to follow than a long email thread but more detailed than a TODO entry?  Whatever happened with the GSoC people?  That should be well under way by now, is anyone working on it?  Are the discussions of their efforts on-list, or is it between them and their mentors?

Cheers,

Jeff

Re: GSoC on WAL-logging hash indexes

From
Jeff Janes
Date:
On Wed, Apr 30, 2014 at 12:16 PM, Greg Stark <stark@mit.edu> wrote:
I think the key question was: if someone wanted to develop a good hash
index, would they start from our current hash index, or would they be
better off starting from a fresh codebase?

If it were me I'd start with the current code.  It would be nice if one could just fork the code to have a new type of index (say "hash2") which is initially identical, but I never figured out how to do that.  

 
If the former then we
should add WAL logging and throw GSOC and other people who ask for
projects at it. If the latter then we should throw out the current
codebase and let people develop extensions that add new hash index
code until someone comes up with a good design we want to move to
core.

Extensions have no hooks into the WAL system, and I'm not optimistic that that will ever change.  Relegating a fundamentally new index to be an extension virtually *guarantees* that it will never be WAL logged.

 

Incidentally something else I've considered would be having a WAL
record type saying "relation corrupted" and emitting one the first
time a hash index is touched after a checkpoint. That could provide a
general mechanism that might be useful for unlogged operations (and
might be combinable with the infrastructure for unlogged tables). But
it seems like a better tool for other objects than hash indexes.

+1.

I often lament that unlogged tables cannot be used on a standby or a test server that was derived from a hot backup.  In the case of unlogged tables, this does mean we would need a way to checkpoint them with a non-shutdown checkpoint, though.  I don't know if that would need a different type of unlogged table, or a different type of checkpoint.

Cheers,

Jeff

Re: [pgsql-advocacy] GSoC on WAL-logging hash indexes

From
Darren Duncan
Date:
Is there a good reason for this thread being copied to the advocacy list?  It
seems to me to be on topic only for hackers. -- Darren Duncan



Re: GSoC on WAL-logging hash indexes

From
Alvaro Herrera
Date:
Stephen Frost wrote:
> * Greg Stark (stark@mit.edu) wrote:

> > I could do the legwork on either the GUC or moving hash indexes to an
> > extension if there's a consensus. But I suspect either will be quite
> > controversial...

> If you'd like to build an extension which provides hash indexes- I say
> go for it?  Nothing stopping you as far as that's concerned.

Extensions can't write/replay WAL, so it's not clear to me how this
would work.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: GSoC on WAL-logging hash indexes

From
Tom Lane
Date:
Greg Stark <stark@mit.edu> writes:
> But imnsho doing nothing is a bad idea. We should have long ago either
> added WAL logging or removed the index type. We shouldn't have left a
> foot-gun that large lying around for so long.

We can't remove the hash index type, nor move it to an extension,
because it is the operator classes for the built-in hash index AM
that tell the planner and executor how to do hashing for arbitrary
datatypes.  And we certainly do not want to give up hashing-based
query plans, whatever you may think of hash indexes.

We really oughta fix the WAL situation, not just band-aid around it.
        regards, tom lane



Re: GSoC on WAL-logging hash indexes

From
Vik Fearing
Date:
On 04/30/2014 11:41 PM, Tom Lane wrote:
> We really oughta fix the WAL situation, not just band-aid around it.

After re-reading this thread, it is not clear that anyone is going to
work on it so I'll just ask:

Is anyone working on this?

If not, I'd like to put it on my plate.

-- 
Vik




Re: GSoC on WAL-logging hash indexes

From
Michael Paquier
Date:
<div dir="ltr"><br /><div class="gmail_extra"><br /><br /><div class="gmail_quote">On Thu, Jun 19, 2014 at 6:40 PM, Vik
Fearing<span dir="ltr"><<a href="mailto:vik.fearing@dalibo.com"
target="_blank">vik.fearing@dalibo.com</a>></span>wrote:<br /><blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px#ccc solid;padding-left:1ex"><div class="">On 04/30/2014 11:41 PM, Tom Lane wrote:<br /> > We
reallyoughta fix the WAL situation, not just band-aid around it.<br /><br /></div>After re-reading this thread, it is
notclear that anyone is going to<br /> work on it so I'll just ask:<br /><br /> Is anyone working on this?<br /><br />
Ifnot, I'd like to put it on my plate.<br /></blockquote><div class="h5">Vik, did you get time to look at that
finally?<br/></div></div></div><div class="gmail_extra">Regards,<br /></div><div class="gmail_extra">-- <br
/>Michael<br/></div></div> 

Re: GSoC on WAL-logging hash indexes

From
Vik Fearing
Date:
On 08/20/2014 02:43 AM, Michael Paquier wrote:
> 
> 
> 
> On Thu, Jun 19, 2014 at 6:40 PM, Vik Fearing <vik.fearing@dalibo.com
> <mailto:vik.fearing@dalibo.com>> wrote:
> 
>     On 04/30/2014 11:41 PM, Tom Lane wrote:
>     > We really oughta fix the WAL situation, not just band-aid around it.
> 
>     After re-reading this thread, it is not clear that anyone is going to
>     work on it so I'll just ask:
> 
>     Is anyone working on this?
> 
>     If not, I'd like to put it on my plate.
> 
> Vik, did you get time to look at that finally?

Yes, I am (very slowly) working on this.  I've got a decent learning
curve for WAL replay to get over, and I figured this can't be urgent
considering how many years it's been like this, so I'm sort of taking
my time.
-- 
Vik