Re: Creating partitions automatically at least on HASH? - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: Creating partitions automatically at least on HASH?
Date
Msg-id alpine.DEB.2.21.1908261935350.9896@lancre
Whole thread Raw
In response to Re: Creating partitions automatically at least on HASH?  (Rafia Sabih <rafia.pghackers@gmail.com>)
Responses Re: Creating partitions automatically at least on HASH?
List pgsql-hackers
Hello Rafia,

>>    CREATE TABLE Stuff (...)
>>      PARTITION BY [HASH | RANGE | LIST] (…)
>>        DO NONE -- this is the default
>>        DO [IMMEDIATE|DEFERRED] USING (…)
>>
>> Where the USING part would be generic keword value pairs, eg:
>>
>> For HASH: (MODULUS 8) and/or (NPARTS 10)
>>
>> For RANGE: (START '1970-01-01', STOP '2020-01-01', INCREMENT '1 year')
>>      and/or (START 1970, STOP 2020, NPARTS 50)
>>
>> And possibly for LIST: (IN (…), IN (…), …), or possibly some other
>> keyword.
>>
>> The "DEFERRED" could be used as an open syntax for dynamic partitioning,
>> if later someone would feel like doing it.
>>
> ISTM that "USING" is better than "WITH" because WITH is already used
>> specifically for HASH and other optional stuff in CREATE TABLE.
>>
>> The text constant would be interpreted depending on the partitioning
>> expression/column type.
>>
>> Any opinion about the overall approach?

> I happen to start a similar discussion [1] being unaware of this one 
> and there Ashutosh Sharma talked about interval partitioning in Oracle. 
> Looking
> closely it looks like we can have this automatic partitioning more
> convenient by having something similar. Basically, it is creating
> partitions on demand or lazy partitioning.

Yep, the "what" of dynamic partitioning is more or less straightforward, 
along the line you are describing.

For me there are really two questions:

  - having a extendable syntax, hence the mail I sent, which would cover
    both automatic static & dynamic partitioning and their parameters,
    given that we already have manual static, automatic static should
    be pretty easy.

  - implementing the stuff, with limited performance impact if possible
    for the dynamic case, which is non trivial.

> To explain a bit more, let's take range partition for example, first 
> parent table is created and it's interval and start and end values are 
> specified and it creates only the parent table just like it works today.

> Now, if there comes a insertion that does not belong to the existing (or 
> any, in the case of first insertion) partition(s), then the 
> corresponding partition is created,

Yep. Now, you also have to deal with race conditions issues, i.e. two 
parallel session inserting tuples that must create the same partition, and 
probably you would like to avoid a deadlock.

> I think it is extensible to other partitioning schemes as well. Also it 
> is likely to have a positive impact on the queries, because there will 
> be required partitions only and would not require to educate 
> planner/executor about many empty partitions.

Yep, but it creates other problems to solve…

-- 
Fabien.

pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: pgbench - allow to create partitioned tables
Next
From: Tom Lane
Date:
Subject: Re: Proposal: Better generation of values in GENERATED columns.