Re: Should we warn against using too many partitions? - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: Should we warn against using too many partitions? |
Date | |
Msg-id | CAKJS1f_fvWA3yXPJZNiomXGRn3173W3d5YBsBvxtwsN=ffH+5g@mail.gmail.com |
In response to | Re: Should we warn against using too many partitions? (Amit Langote <amitlangote09@gmail.com>) |
Responses | Re: Should we warn against using too many partitions? |
List | pgsql-hackers |
Thanks for these suggestions.

On Fri, 7 Jun 2019 at 19:00, Amit Langote <amitlangote09@gmail.com> wrote:
> Some rewording suggestions.
>
> 1.
>
> + ... Removal of unwanted data is also a factor to consider when
> + planning your partitioning strategy as an entire partition can be removed
> + fairly quickly. However, if data that you want to keep exists in that
> + partition then that means having to resort to using
> + <command>DELETE</command> instead of removing the partition.
>
> Not sure if the 2nd sentence is necessary or perhaps should be
> rewritten in a way that helps to design to benefit from this.
>
> Maybe:
>
> ... Removal of unwanted data is also a factor to consider when
> planning your partitioning strategy as an entire partition can be
> removed fairly quickly, especially if the partition keys are chosen
> such that all data that can be deleted together are grouped into
> separate partitions.

It seems like a good idea to change this to have this mention the
benefits rather than the drawbacks. I've reworded it, but not using
your exact words as it seems the "especially" means that a partition
can be removed faster with properly chosen partition keys, which is
not the case. I also split this out into its own paragraph since it's
talking about something quite different from the previous paragraph.

> 2.
>
> + ... For example, if you choose to have one partition
> + per customer and you currently have a small number of large customers,
> + what will the implications be if in several years you obtain a large
> + number of small customers.
>
> The sentence could be rewritten a bit. Maybe as:
>
> ... For example, choosing a design with one partition per customer,
> because you currently have a small number of large customers, will not
> scale well several years down the line when you might have a large
> number of small customers.
>
> Btw, doesn't it suffice here to say "large number of customers"
> instead of "large number of small customers"?

I'm not really trying to imply to plan for business growth here, I'm
trying to angle it as "what if your business changes". I've reworded
this slightly and it now says "what will the implications be if in
several years you instead find yourself with a large number of small
customers."

> 3.
>
> + ... In this case, it may be better to choose to
> + partition by <literal>RANGE</literal> and choose a reasonable number of
> + partitions
>
> Maybe:
>
> ... and choose reasonable number of partitions, each containing the
> data of a fixed number of customers.

Yeah, that seems better. I'll change that for the PG10 version only.

> 4.
>
> + ... It also
> + may be undesirable to have a large number of partitions as each partition
> + requires metadata about the partition to be stored in each session that
> + touches it. If each session touches a large number of partitions over a
> + period of time then the memory consumption for this may become
> + significant.
>
> It might be a good idea to reorder the sentences here to put the
> problem first and the cause later. Maybe like this:
>
> Another reason to be concerned about having a large number of
> partitions is that the server's memory consumption may grow
> significantly over a period of time, especially if many sessions touch
> large numbers of partitions. That's because each partition requires
> its own metadata that must be loaded into the local memory of each
> session that touches it.

That seems better. I've taken that text.

> 5.
>
> + With data warehouse type workloads it can make sense to use a larger
> + number of partitions than with an OLTP type workload.
>
> Is there a comma missing between "With data warehouse type workloads"
> and the rest of the sentence?

I've added one.

Patches will follow once I've addressed Justin's review.

-- 
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services