Thread: Suggestion: provide a "TRUNCATE PARTITION" command

Suggestion: provide a "TRUNCATE PARTITION" command

From
Thomas Kellerer
Date:
Hello,

I wonder if it made sense to add a "TRUNCATE PARTITION" command to Postgres?

Especially during bulk loads it's more efficient to TRUNCATE a partition if I know I want to replace all rows, rather
thandoing a DELETE.
 

Currently this requires dynamic SQL which isn't always feasible (and might get complicated quickly).

So I was thinking that a new command to allow truncating partitions by identifying the partitions by "value" rather by
namemight be helpful in that case.
 

Something along the lines of:

     truncate partitions of base_table
     for values in (...);

If the IN part allowed for sub-queries then this could be used to gather the partition keys from e.g. a staging table.
In my naive understanding, I would think the current partition pruning code (e.g. that picks the partitions when
runningDELETE) could be used to identify the target partitions and then this list would be used to TRUNCATE all
resultingpartitions.
 

What does the community think about this?

Thomas



Re: Suggestion: provide a "TRUNCATE PARTITION" command

From
legrand legrand
Date:

Re: Suggestion: provide a "TRUNCATE PARTITION" command

From
Thomas Kellerer
Date:
legrand legrand schrieb am 08.01.2021 um 14:57:> maybe a naïve plpgsql as proposed in
> https://www.postgresql-archive.org/Partitionning-support-for-Truncate-Table-WHERE-td5933642.html
> may be an answer

Yes I am aware of that (and that's what I have used so far) - I just thought it would make life easier if it didn't
requiredynamic SQL 

Thomas




Re: Suggestion: provide a "TRUNCATE PARTITION" command

From
Michael Lewis
Date:
On Fri, Jan 8, 2021 at 2:36 AM Thomas Kellerer <shammat@gmx.net> wrote:
Hello,

I wonder if it made sense to add a "TRUNCATE PARTITION" command to Postgres?

Especially during bulk loads it's more efficient to TRUNCATE a partition if I know I want to replace all rows, rather than doing a DELETE.

Currently this requires dynamic SQL which isn't always feasible (and might get complicated quickly).

So I was thinking that a new command to allow truncating partitions by identifying the partitions by "value" rather by name might be helpful in that case.

Something along the lines of:

     truncate partitions of base_table
     for values in (...);

If the IN part allowed for sub-queries then this could be used to gather the partition keys from e.g. a staging table.

For me, it seems too easily error prone such that a single typo in the IN clause may result in an entire partition being removed that wasn't supposed to be targeted. Given the user still needs to manually generate that list somehow, I don't see it as a huge effort to query the partitions and run individual commands to truncate or detach several partitions manually.

Unless it is in the SQL standard, or allows users to do something that cannot be easily done otherwise, I see it as a bell / whistle that would unnecessarily complicate the code. Writing a function that finds the partition table names and gives back a string with the text of the DDL commands that needs to be run is simple, and would encourage the user to review which tables are targeted for truncate command.

Re: Suggestion: provide a "TRUNCATE PARTITION" command

From
Thomas Kellerer
Date:
Michael Lewis schrieb am 08.01.2021 um 16:32:
> On Fri, Jan 8, 2021 at 2:36 AM Thomas Kellerer <shammat@gmx.net <mailto:shammat@gmx.net>> wrote:
>
>     Hello,
>
>     I wonder if it made sense to add a "TRUNCATE PARTITION" command to Postgres?
>
>     Especially during bulk loads it's more efficient to TRUNCATE a partition if I know I want to replace all rows,
ratherthan doing a DELETE. 
>
>     Currently this requires dynamic SQL which isn't always feasible (and might get complicated quickly).
>
>     So I was thinking that a new command to allow truncating partitions by identifying the partitions by "value"
ratherby name might be helpful in that case. 
>
>     Something along the lines of:
>
>           truncate partitions of base_table
>           for values in (...);
>
>     If the IN part allowed for sub-queries then this could be used to gather the partition keys from e.g. a staging
table.
>
>
> For me, it seems too easily error prone such that a single typo in
> the IN clause may result in an entire partition being removed that
> wasn't supposed to be targeted.

I don't see how this is more dangerous then:

     delete from base_table
     where partition_key in (...);

which would serve the same purpose, albeit less efficient.

> Given the user still needs to
> manually generate that list somehow, I don't see it as a huge effort
> to query the partitions and run individual commands to truncate or
> detach several partitions manually.

Well, the list could come from e.g. a staging table, e.g. "for values IN (select some_column from staging_table)"




Re: Suggestion: provide a "TRUNCATE PARTITION" command

From
Michael Lewis
Date:
On Fri, Jan 8, 2021 at 9:38 AM Thomas Kellerer <shammat@gmx.net> wrote:
Michael Lewis schrieb am 08.01.2021 um 16:32:
> On Fri, Jan 8, 2021 at 2:36 AM Thomas Kellerer <shammat@gmx.net <mailto:shammat@gmx.net>> wrote:
>
>     Hello,
>
>     I wonder if it made sense to add a "TRUNCATE PARTITION" command to Postgres?
>
>     Especially during bulk loads it's more efficient to TRUNCATE a partition if I know I want to replace all rows, rather than doing a DELETE.
>
>     Currently this requires dynamic SQL which isn't always feasible (and might get complicated quickly).
>
>     So I was thinking that a new command to allow truncating partitions by identifying the partitions by "value" rather by name might be helpful in that case.
>
>     Something along the lines of:
>
>           truncate partitions of base_table
>           for values in (...);
>
>     If the IN part allowed for sub-queries then this could be used to gather the partition keys from e.g. a staging table.
>
>
> For me, it seems too easily error prone such that a single typo in
> the IN clause may result in an entire partition being removed that
> wasn't supposed to be targeted.

I don't see how this is more dangerous then:

     delete from base_table
     where partition_key in (...);

which would serve the same purpose, albeit less efficient.

Delete has a rollback option, and you can dry-run to see impacted rows effectively. Truncate does not.

With delete, you are being more explicit about which rows match and need removal. By looking at the command, you know exactly what is expected to happen. With the request to find partitions based on values, you may be impacting MUCH more data than you meant to. If you think you have monthly range partitions and actually have year partitions, the truncate could be a disaster with removing more data than you intended.

It just seems like a foot gun to me, and not one that is particularly needed since the same result can be achieved easily in two steps. One to generate to explicit commands that will be run, and one to run them.

Re: Suggestion: provide a "TRUNCATE PARTITION" command

From
Thomas Kellerer
Date:
Michael Lewis schrieb am 08.01.2021 um 17:47:
>      > For me, it seems too easily error prone such that a single typo in
>      > the IN clause may result in an entire partition being removed that
>      > wasn't supposed to be targeted.
>
>     I don't see how this is more dangerous then:
>
>           delete from base_table
>           where partition_key in (...);
>
>     which would serve the same purpose, albeit less efficient.
>
>
> Delete has a rollback option, and you can dry-run to see impacted rows effectively. Truncate does not.

TRUNCATE can be rolled back as well.



Re: Suggestion: provide a "TRUNCATE PARTITION" command

From
legrand legrand
Date:
What is interesting here with the TRUNCATE WHERE (and in the proposed
plpgsql) is to offer the end user a way to perform a transparent truncate or
delete totally independent of the partitioning scheme (if any, or even if it
has changed).



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html



Re: Suggestion: provide a "TRUNCATE PARTITION" command

From
Michael Lewis
Date:


On Fri, Jan 8, 2021 at 10:12 AM Thomas Kellerer <shammat@gmx.net> wrote:
Michael Lewis schrieb am 08.01.2021 um 17:47:
>      > For me, it seems too easily error prone such that a single typo in
>      > the IN clause may result in an entire partition being removed that
>      > wasn't supposed to be targeted.
>
>     I don't see how this is more dangerous then:
>
>           delete from base_table
>           where partition_key in (...);
>
>     which would serve the same purpose, albeit less efficient.
>
>
> Delete has a rollback option, and you can dry-run to see impacted rows effectively. Truncate does not.

TRUNCATE can be rolled back as well.

My apologies. There are other concerns with concurrent transactions, but you are correct that it can be rolled back.

Still, no feedback on the effect that a truncate call is having on the DB and may be doing more than intended fairly easily. I am not in the hackers group so I couldn't say this feature would not be implemented. It just seems unlikely given the philosophies of that group.

Re: Suggestion: provide a "TRUNCATE PARTITION" command

From
Thiemo Kellner
Date:

Quoting Michael Lewis <mlewis@entrata.com>:


Still, no feedback on the effect that a truncate call is having on the DB and may be doing more than intended fairly easily. I am not in the hackers group so I couldn't say this feature would not be implemented. It just seems unlikely given the philosophies of that group.

I would not feel bad to have a more efficient option but possibly a more dangerous one. Projects/application could setup policies about what may be done in which way and what not.

S/MIME Public Key: https://oc.gelassene-pferde.biz/index.php/s/eJuAUFONag6ofnH
Signal (Safer than WhatsApp): +49 1578 7723737
Threema (Safer than WhatsApp): A76MKH3J
Handys: +41 78 947 36 21 | +49 1578 772 37 37
Attachment