Thread: Fast CLUSTER

Fast CLUSTER

From
"Simon Riggs"
Date:
I've used the heap_sync() API call to improve performance of CLUSTER by
avoiding WAL when archive_command is not set.

I've refactored the heap_sync() call very slightly, to hide some of the
complexity that was previously exposed to COPY. It now syncs a TOAST
relation automatically if one exists for the table. That also fixes a
latent bug in CREATE TABLE AS SELECT which would have meant that the
TOAST table wasn't properly synced (my bug), effectively caused by the
original API design.

Patch applies cleanly, passes make check.

No user interface changes. Performance docs updated also to reflect this
change and slightly improve directly related wording.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com


Attachment

Re: Fast CLUSTER

From
Alvaro Herrera
Date:
Simon Riggs wrote:
>
> I've used the heap_sync() API call to improve performance of CLUSTER by
> avoiding WAL when archive_command is not set.

Cool.  I noticed that the SGML seems broken here:

> --- 908,925 ----
>       will perform more slowly when <varname>archive_command</varname>
>       is set, as a result of their needing to write large amounts of WAL.
>       This applies to the following commands:
> !     <itemizedlist>
> !      <listitem><para><command>CREATE TABLE AS SELECT</command></para></listitem>
> !      <listitem><para><command>CREATE INDEX</command></para></listitem>
> !      <listitem><para><command>ALTER TABLE SET TABLESPACE</command></para></listitem>
> !      <listitem><para><command>CLUSTER</command></para></listitem>
> !      <listitem><para><command>COPY</command>, when it is executed after one of
> !       these commands, yet in the same transaction:
> !       <itemizedlist>
> !        <listitem><para><command>CREATE TABLE</command></para></listitem>
> !        <listitem><para><command>TRUNCATE</command></para></listitem>
> !       </itemizedlist>
> !   </itemizedlist>
>     </sect2>

You need to close the <listitem> and <para> opened in the COPY mention.

> +
> + static void
> + heap_sync_relation(Relation rel)
> + {
> +     if (!rel->rd_istemp)

No comment in this function?


--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Fast CLUSTER

From
"Simon Riggs"
Date:
On Tue, 2007-02-20 at 14:38 -0300, Alvaro Herrera wrote:

> Cool.  I noticed that the SGML seems broken here:

Corrected.

> You need to close the <listitem> and <para> opened in the COPY mention.
>
> > +
> > + static void
> > + heap_sync_relation(Relation rel)
> > + {
> > +     if (!rel->rd_istemp)
>
> No comment in this function?

I've added more comments as you suggest.

Thanks for the review.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com


Attachment

Re: Fast CLUSTER

From
"Simon Riggs"
Date:
On Thu, 2007-02-22 at 22:49 +0000, Simon Riggs wrote:
> On Tue, 2007-02-20 at 14:38 -0300, Alvaro Herrera wrote:
>
> > Cool.  I noticed that the SGML seems broken here:
>
> Corrected.
>
> > You need to close the <listitem> and <para> opened in the COPY mention.
> >
> > > +
> > > + static void
> > > + heap_sync_relation(Relation rel)
> > > + {
> > > +     if (!rel->rd_istemp)
> >
> > No comment in this function?
>
> I've added more comments as you suggest.
>
> Thanks for the review.

Could we add this to the unapplied patches queue? It seems to have been
missed off the list. Thanks.

There is a probable conflict with Heikki's recent CLUSTER patch, so I'm
happy to re-write this patch after that has been applied. So its OK to
stick it at the bottom of the queue.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: Fast CLUSTER

From
Bruce Momjian
Date:
Simon Riggs wrote:
> On Thu, 2007-02-22 at 22:49 +0000, Simon Riggs wrote:
> > On Tue, 2007-02-20 at 14:38 -0300, Alvaro Herrera wrote:
> >
> > > Cool.  I noticed that the SGML seems broken here:
> >
> > Corrected.
> >
> > > You need to close the <listitem> and <para> opened in the COPY mention.
> > >
> > > > +
> > > > + static void
> > > > + heap_sync_relation(Relation rel)
> > > > + {
> > > > +     if (!rel->rd_istemp)
> > >
> > > No comment in this function?
> >
> > I've added more comments as you suggest.
> >
> > Thanks for the review.
>
> Could we add this to the unapplied patches queue? It seems to have been
> missed off the list. Thanks.

Done.

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Fast CLUSTER

From
Bruce Momjian
Date:
Your patch has been added to the PostgreSQL unapplied patches list at:

    http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------


Simon Riggs wrote:
> On Tue, 2007-02-20 at 14:38 -0300, Alvaro Herrera wrote:
>
> > Cool.  I noticed that the SGML seems broken here:
>
> Corrected.
>
> > You need to close the <listitem> and <para> opened in the COPY mention.
> >
> > > +
> > > + static void
> > > + heap_sync_relation(Relation rel)
> > > + {
> > > +     if (!rel->rd_istemp)
> >
> > No comment in this function?
>
> I've added more comments as you suggest.
>
> Thanks for the review.
>
> --
>   Simon Riggs
>   EnterpriseDB   http://www.enterprisedb.com
>

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Fast CLUSTER

From
"Simon Riggs"
Date:
On Tue, 2007-03-27 at 10:28 -0400, Bruce Momjian wrote:
> Simon Riggs wrote:
> > On Thu, 2007-02-22 at 22:49 +0000, Simon Riggs wrote:
> > Could we add this to the unapplied patches queue? It seems to have been
> > missed off the list. Thanks.
>
> Done.

Many thanks

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: Fast CLUSTER

From
Tom Lane
Date:
"Simon Riggs" <simon@2ndquadrant.com> writes:
> [ make CLUSTER skip WAL when possible ]

Applied with some editorialization.

            regards, tom lane