Re: Should we warn against using too many partitions? - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: Should we warn against using too many partitions?
Msg-id 20190609042142.GH3079@telsasoft.com
In response to Re: Should we warn against using too many partitions?  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Should we warn against using too many partitions?
List pgsql-hackers
On Sun, Jun 09, 2019 at 01:15:09PM +1200, David Rowley wrote:
> Thanks for having another look.
> 
> On Sat, 8 Jun 2019 at 18:39, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > +    to keep exists in that partition then that means having to resort to using
> > +    <command>DELETE</command> instead of removing the partition.
> > +   </para>
> > +
> > +   <para>
> > +    Choosing the target number of partitions by which the table should be
> > +    divided into is also a critical decision to make.  Not having enough
> >
> > Should be: ".. target number .. into which .. should be divided .."
> 
> I've changed "by" to "into". I think that's what you mean, otherwise,
> you've lost me.

I meant it should say "into which it should be divided" and not "by which it
should be divided INTO", which has too many prepositions.  This is still an
issue:

+    Choosing the target number of partitions into which the table should be
+    divided into is also a critical decision to make.  Not having enough

> > +    partitions may mean that indexes remain too large and that data locality
> > +    remains poor which could result in poor cache hit ratios.  However,
> >
> > Change the 2nd remains to "is" and the second poor to "low" ?

> > +    consumption during both query planning and execution.  It's also important
> > +    to consider what changes may occur in the future when choosing how to
> > +    partition your table.  For example, if you choose to have one partition
> >
> > Remove "when choosing ..."?  Or say:
> 
> I don't see how that would make sense.

I suggested it because otherwise it can read as: "in the future when choosing ...".

> > +    per customer and you currently have a small number of large customers,
> > +    what will the implications be if in several years you obtain a large
> > +    number of small customers.  In this case, it may be better to choose to
> > +    partition by <literal>HASH</literal> and choose a reasonable number of
> > +    partitions rather than trying to partition by <literal>LIST</literal> and
> > +    hoping that the number of customers does not increase significantly over
> > +    time.
> > +   </para>
> >
> > It's an unusual thing for which to hope :)
> 
> I have reworded this slightly which may help with that.

I didn't mean there was any issue with this, just that it's amusing to find
oneself in the unfortunate position of hoping that one's company doesn't end up
with many customers.

> > +    processing time is spent during query execution.  With either of these two
> > +    types of workload it is important to make the right decisions early as
> >
> > early COMMA
> 
> I'm not really sure what you mean here as I don't see any comma in
> that text. I guess you want me to add one? But I'm confused as you
> seemed to ask me to remove a comma there in your previous review.

I meant that a comma should be added, both then and now, like:

|    these two types of workload, it is important to make the right decisions
|    early, as re-partitioning large quantities of data can be ...

Thanks,
Justin


