Re: Declarative partitioning - Mailing list pgsql-hackers

From Corey Huinker
Subject Re: Declarative partitioning
Date
Msg-id CADkLM=foVFUb6snTLSvAFZWBCGpT3q20fi9hbsFYyVoqFTcijw@mail.gmail.com
Whole thread Raw
In response to Re: Declarative partitioning  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers
Also, you won't see any optimizer and executor changes. Queries will still
use the same plans as existing inheritance-based partitioned tables,
although as I mentioned, constraint exclusion won't yet kick in. That will
be fixed very shortly.

And of course, comments on syntax are welcome as well.

Thanks,
Amit



Good to know the current limitations/expectations. 

Our ETL has a great number of workers that do something like this:
1. grab a file
2. based on some metadata of that file, determine the partition that that would receive ALL of the rows in that file. It's actually multiple tables, all of which are partitioned, all of which fully expect the file data to fit in exactly one partition.
3. \copy into a temp table
4. Transform the data and insert the relevant bits into each of the target partitions derived in #2.

So while ATR is a major feature of true partitioning, it's not something we'd actually need in our current production environment, but I can certainly code it that way to benchmark ATR vs "know the destination partition ahead of time" vs "insane layered range_partitioning trigger + pg_partman trigger".

Currently we don't do anything like table swapping, but I've done that enough in the past that I could probably concoct a test of that too, once it's implemented.

As for the syntax, I'm not quite sure your patch addresses the concerned I voiced earlier: specifically if the VALUES IN works for RANGE as well as LIST,  but I figured that would become clearer once I tried to actually use it. Currently we have partitioning on C-collated text ranges (no, they don't ship with postgres, I had to make a custom type) something like this:

part0: (,BIG_CLIENT)
part1: [BIG_CLIENT,BIG_CLIENT]
part2: (BIG_CLIENT,L)
part3: [L,MONSTROUSLY_BIG_CLIENT)
part4: [MONSTROUSLY_BIG_CLIENT,MONSTROUSLY_BIG_CLIENT]
part5: (MONSTROUSLY_BIG_CLIENT,RANDOM_CLIENT_LATE_IN_ALPHABET]
part6: (RANDOM_CLIENT_LATE_IN_ALPHABET,)

I can't implement that with a simple VALUES LESS THAN clause, unless I happen to know 'x' in 'BIG_CLIENTx', where 'x' is the exact first character in the collation sequence, which has to be something unprintable, and that would make those who later read my code to say something unprintable. So yeah, I'm hoping there's some way to cleanly represent such ranges.

pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Next
From: Merlin Moncure
Date:
Subject: Re: Remove or weaken hints about "effective resolution of sleep delays is 10 ms"?