Re: Auto Partitioning - Mailing list pgsql-hackers

From Zeugswetter Andreas ADI SD
Subject Re: Auto Partitioning
Date
Msg-id E1539E0ED7043848906A8FF995BDA57901E7B93D@m0143.s-mxs.net
Whole thread Raw
In response to Re: Auto Partitioning  ("Simon Riggs" <simon@2ndquadrant.com>)
Responses Re: Auto Partitioning
Re: Auto Partitioning
List pgsql-hackers
> > (But that sounds rather like pie in the sky, actually.  Which other
> > databases can do that, and how do they do it?)
>
> Oracle does it, by building a big index. Few people use it.

And others allow a different partitioning strategy for each index,
but that has the same problem of how to remove partitions without
a huge amount of index reorganization.

> There are significant problems with this idea that I have already
> raised:
> - how big would the index be?
> - how would you add and remove partitions with any kind of
> performance?
> If we partitioned on date range, that will surely increase over time.
> - the index could almost certainly never be REINDEXed because
> of space requirements and time considerations.
> - if the indexed values were monotonically increasing the RHS
> of the index would become a significant hotspot in load
> performance, assuming high volume inserts into a large table

yes

> My argument is that there are significant real-world
> disadvantages to having this feature, yet there exists a
> reasonable workaround to avoid ever needing it.

I'd say a workaround can mostly be found but not always.
But I agree, that the downsides of one large global index are
substantial enough to not make this path attractive.

> Why would we spend time building and supporting it?

What I think we would like to have is putting the append nodes into an
order that allows removing the sort node whenever that can be done. And
maybe a merge node (that replaces the append and sort node) that can
merge presorted partitions.

I have one real example where I currently need one large non unique
index in Informix.
It is a journal table that is partitioned by client timestamp,
but I need a select first 1000 (of possibly many mio rows) order by
server_timestamp in a range
that naturally sometimes needs more than one partition because client
and server timestamps diverge.
Here the merge facility would allow me to not use the global index and
still avoid sorting
millions of rows (which would not finish in time).
Problem with the global index is, that I have to delete all rows from
the oldest partition before removing it
to avoid rebuilding the global index.

Andreas


pgsql-hackers by date:

Previous
From: "Sander Steffann"
Date:
Subject: Re: What X86/X64 OS's do we need coverage for?
Next
From: Paul Lindner
Date:
Subject: Re: What X86/X64 OS's do we need coverage for?