Home > mailing lists

Re: Declarative partitioning - Mailing list pgsql-hackers

From	Corey Huinker
Subject	Re: Declarative partitioning
Date	February 16, 2016 05:08:28
Msg-id	CADkLM=foVFUb6snTLSvAFZWBCGpT3q20fi9hbsFYyVoqFTcijw@mail.gmail.com Whole thread Raw
In response to	Re: Declarative partitioning (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List	pgsql-hackers

Tree view

Also, you won't see any optimizer and executor changes. Queries will still
use the same plans as existing inheritance-based partitioned tables,
although as I mentioned, constraint exclusion won't yet kick in. That will
be fixed very shortly.

And of course, comments on syntax are welcome as well.

Thanks,
Amit

Good to know the current limitations/expectations.

Our ETL has a great number of workers that do something like this:

1. grab a file

2. based on some metadata of that file, determine the partition that that would receive ALL of the rows in that file. It's actually multiple tables, all of which are partitioned, all of which fully expect the file data to fit in exactly one partition.

3. \copy into a temp table

4. Transform the data and insert the relevant bits into each of the target partitions derived in #2.

So while ATR is a major feature of true partitioning, it's not something we'd actually need in our current production environment, but I can certainly code it that way to benchmark ATR vs "know the destination partition ahead of time" vs "insane layered range_partitioning trigger + pg_partman trigger".

Currently we don't do anything like table swapping, but I've done that enough in the past that I could probably concoct a test of that too, once it's implemented.

As for the syntax, I'm not quite sure your patch addresses the concerned I voiced earlier: specifically if the VALUES IN works for RANGE as well as LIST, but I figured that would become clearer once I tried to actually use it. Currently we have partitioning on C-collated text ranges (no, they don't ship with postgres, I had to make a custom type) something like this:

part0: (,BIG_CLIENT)
part1: [BIG_CLIENT,BIG_CLIENT]
part2: (BIG_CLIENT,L)
part3: [L,MONSTROUSLY_BIG_CLIENT)
part4: [MONSTROUSLY_BIG_CLIENT,MONSTROUSLY_BIG_CLIENT]
part5: (MONSTROUSLY_BIG_CLIENT,RANDOM_CLIENT_LATE_IN_ALPHABET]
part6: (RANDOM_CLIENT_LATE_IN_ALPHABET,)

I can't implement that with a simple VALUES LESS THAN clause, unless I happen to know 'x' in 'BIG_CLIENTx', where 'x' is the exact first character in the collation sequence, which has to be something unprintable, and that would make those who later read my code to say something unprintable. So yeah, I'm hoping there's some way to cleanly represent such ranges.

pgsql-hackers by date:

From: Amit Langote
Date: 16 February 2016, 04:39:51
Subject: Re: [PROPOSAL] VACUUM Progress Checker.

From: Merlin Moncure
Date: 16 February 2016, 05:09:23
Subject: Re: Remove or weaken hints about "effective resolution of sleep delays is 10 ms"?

Re: Declarative partitioning - Mailing list pgsql-hackers

Previous

Next