
From: Stephen Frost
Subject: Re: PostgreSQL suitable?
Date: 2017-12-19
Msg-id: 20171219172403.GO4628@tamriel.snowman.net
In response to: Re: PostgreSQL suitable? (Vincenzo Romano <vincenzo.romano@notorand.it>)
List: pgsql-general
Greetings,

* Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
> 2017-12-19 17:32 GMT+01:00 Stephen Frost <sfrost@snowman.net>:
> > * Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
> >> 2017-12-19 16:48 GMT+01:00 Stephen Frost <sfrost@snowman.net>:
> >> > * Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
> >> >> Sorry, my bad: I confused V10 with v11.
> >> But according to a discussion with Bruce Momjian, table partitioning in
> >> V10 is little more than syntactic sugar around old-fashioned table partitioning.
> >> >
> >> > Well, it's a bit more than that since there's tuple-routing, but you're
> >> > right that the partition elimination is the same as it was in earlier
> >> > versions and based on constraint exclusion.  That said, as noted in the
> >> > email you replied to, reasonable numbers of partitions aren't too bad
> >> > even with the planning cost; it's when you have many thousands of
> >> > partitions that you get into cases where planning time for queries is
> >> > really bad.
> >>
> >> When you have to handle a 100TB table, the number of partitions
> >> shouldn't be 10 or 12
> >> as seen in most examples and tests.
> >
> > Probably not, but it would depend on the data and what you're doing with
> > it.  While there are some general rules-of-thumb, there's no one right
> > answer when it comes to the size of individual partitions.
>
> I think partitioning is done in order to make tables smaller, or to
> balance table size against table count.
> Currently you have to keep in mind a limit on the number of sub-tables.
> This makes partitioning less effective.

Sure, but that doesn't make PG unsuitable for such use cases; it just means
you have to be aware of these costs and address them.
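
Just to make the v10 behaviour concrete (the table and ranges below are
made up, not taken from your setup), a declarative layout looks roughly
like this; tuples are routed to the right partition on INSERT, but at plan
time the partitions' bounds are still checked one by one, so planning cost
grows with the partition count:

    CREATE TABLE measurements (
        device_id   bigint      NOT NULL,
        recorded_at timestamptz NOT NULL,
        reading     double precision
    ) PARTITION BY RANGE (recorded_at);

    -- one partition per month; with thousands of these, planning a query
    -- against the parent means examining every one of them
    CREATE TABLE measurements_2017_12 PARTITION OF measurements
        FOR VALUES FROM ('2017-12-01') TO ('2018-01-01');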

> >> This is the same type of issue you hit with partial indexes (this is
> >> why I mentioned them earlier).
> >> The sub-table (and partial index) selection algorithm should be
> >> logarithmic or sub-linear.
> >
> > Sure, and work is being done to improve PG in that exact area, but that
> > doesn't mean it can't handle workloads like this today; you just have to
> > be aware of the costs associated with today's partitions.
> >
> >> As long as it's linear, you'll keep hearing about a "reasonable number of partitions".
> >> One thousand partitions for a 100TB table would make "manageable"
> >> 100GB sub-tables.
> >
> > Yup, and 100G tables are certainly large and a bit awkward but they can
> > be managed.
>
> Of course. But I would bet they are still considered "very large tables".

They're not small. :)

> >> I could easily be wrong, but this is an area where PG needs improvement.
> >
> > Of course, and work is being done to improve it.  What's important is
> > knowing that there's a cost to having more partitions when querying
> > through the parent when you get to a point where you have thousands of
> > partitions.  That cost may be perfectly fine in some use-cases and in
> > others it might not be, and instead you'd likely have to build logic
> > into the application layer to address it.  That's not ideal, which is
> > why there's work being done to improve PG, but it's not necessarily a
> > big deal either.  Dynamic SQL is another approach.
> >
> >> One could maybe think about multi-level partitioning, though.
> >
> > Not sure exactly what you're referring to here, but a simple multi-level
> > partitioning setup with PG wouldn't actually change the cost for
> > partition elimination today.
>
> It'd be a tree of tables, leading to logarithmic selection.
> This is why we love tree data structures.

Sure, but that doesn't matter when it comes to how constraint exclusion
works with PostgreSQL today: all partitions under a given parent are
considered when doing partition elimination.
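
To make that concrete with the same made-up table (as an alternative,
nested layout, assuming the parent from the earlier sketch without the
monthly partitions), v10 does let you nest partitions, but elimination
still ends up considering every partition underneath rather than walking
the hierarchy like a tree:

    -- first level: one partition per year, itself partitioned by month
    CREATE TABLE measurements_y2017 PARTITION OF measurements
        FOR VALUES FROM ('2017-01-01') TO ('2018-01-01')
        PARTITION BY RANGE (recorded_at);

    -- second level: one partition per month within the year
    CREATE TABLE measurements_y2017_m12 PARTITION OF measurements_y2017
        FOR VALUES FROM ('2017-12-01') TO ('2018-01-01');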

> >> > Also as noted on this thread, PG could handle this data volume, but to
> >> > be efficient there would be work to be done in normalization,
> >> > aggregation, and analyzing the system to ensure you're storing and
> >> > querying on the data efficiently.
> >>
> >> Normalization will grow the number of tables (and later joins) and
> >> you'll very likely end up with at least one table with a
> >> "gazillion" rows.
> >
> > Naturally, but that "gazillion" row table would be much smaller for
> > having the data normalized- if you don't normalize it then the gazillion
> > row table is a huge amount of duplicated data, making the entire system
> > much larger than necessary.
>
> Partitioning is done for other reasons than de-duplicating data.

Yes, I wasn't suggesting to use partitioning to de-duplicate data, I was
suggesting to use normalization for that.

> The number of rows to be analyzed would be still the same and the
> indexes over those columns would still be rather large.

While true, the indexes would be smaller than they would be without the
data being normalized.
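
As a toy example of the kind of normalization I mean (the columns here are
invented, of course): pulling a long, repeated description out into a
lookup table leaves the big table with a small integer key, and its
indexes shrink accordingly:

    -- the repeated, wide value lives once in a small lookup table
    CREATE TABLE device (
        device_id   serial PRIMARY KEY,
        description text NOT NULL UNIQUE
    );

    -- the big table stores only the 4-byte key
    CREATE TABLE reading (
        device_id   integer NOT NULL REFERENCES device (device_id),
        recorded_at timestamptz NOT NULL,
        value       double precision
    );

    -- queries join the lookup back in; that join is generally far cheaper
    -- than reading duplicated text off of disk
    SELECT d.description, r.recorded_at, r.value
      FROM reading r
      JOIN device d USING (device_id);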

> >> I fear normalization, provided it's really needed, would provide little help.
> >
> > I seriously doubt that's the case.  Normalization might reduce that
> > 100TB down to 10's of TB instead, or perhaps even smaller.
>
> A 10x duplication factor seems very bad to me.

I'm not sure why; it's not actually that uncommon.

> If that's the case then normalization would provide some
> improvement while eating resources in the process.

It's unclear what you're referring to here regarding 'eating resources'.
Clearly there's some CPU time that's consumed when doing a join, but
that's generally far cheaper than the time required to pull data off of
disk because it's not in memory.

> Storage nowadays isn't a big issue. Querying is.

Sure, and if all you had to pay was the storage cost then no one would
care, but you also have to pay the memory cost, and that's a much bigger
deal.  If you have to go out to disk constantly then you've basically
already lost when it comes to query time.

> >> With tables that big I usually do "software partitioning".
> >> I make the software aware of the partition schema so it can direct the
> >> queries to the needed (sub-)tables.
> >
> > Yes, implementing partition elimination at the application level is
> > certainly an approach that can work with PG.  Even then, ideally, you'd
> > normalize the data so that the footprint is much smaller, making more
> > efficient use of the storage and memory available.
>
> Normalizing can save resources that you later spend joining the tables back together.

You were commenting above on how normalization eats resources and now
it's saving them. :)  In my experience, the join cost is nothing
compared to the cost to pull data off of disk because you couldn't fit
it all in memory.

> It's a matter of trade-offs and it strictly depends on the queries to
> be run in my opinion.

I agree that it's a matter of trade-offs and certainly depends on the
queries being run, though none of that information is available here.

> Effective table partitioning, on the other hand, seems to me to be more important.

They're both important, imv.

> My bottom line here is: table partitioning is nice, but not yet ready
> for the big-data cases where it makes the most sense.

I agree that there's more work to be done there, certainly, but it's
also getting better and there are ways to work around the existing
limitations.
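
For the application-level routing you described, dynamic SQL is one way to
do the elimination yourself; a rough sketch (function and table names are
invented, assuming the monthly naming scheme from the earlier examples)
would be:

    -- compute the monthly sub-table name up front and query it directly,
    -- so the planner never looks at the other partitions
    CREATE OR REPLACE FUNCTION measurements_for(day date)
    RETURNS SETOF measurements
    LANGUAGE plpgsql AS $$
    BEGIN
        RETURN QUERY EXECUTE format(
            'SELECT * FROM %I
              WHERE recorded_at >= $1 AND recorded_at < $1 + interval ''1 day''',
            'measurements_' || to_char(day, 'YYYY_MM'))
        USING day;
    END;
    $$;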

Thanks!

Stephen
