Suggestion: provide a "TRUNCATE PARTITION" command

Started by Thomas Kellererabout 5 years ago10 messagesgeneral
Jump to latest
#1Thomas Kellerer
shammat@gmx.net

Hello,

I wonder if it made sense to add a "TRUNCATE PARTITION" command to Postgres?

Especially during bulk loads it's more efficient to TRUNCATE a partition if I know I want to replace all rows, rather than doing a DELETE.

Currently this requires dynamic SQL which isn't always feasible (and might get complicated quickly).

So I was thinking that a new command to allow truncating partitions by identifying the partitions by "value" rather by name might be helpful in that case.

Something along the lines of:

truncate partitions of base_table
for values in (...);

If the IN part allowed for sub-queries then this could be used to gather the partition keys from e.g. a staging table.
In my naive understanding, I would think the current partition pruning code (e.g. that picks the partitions when running DELETE) could be used to identify the target partitions and then this list would be used to TRUNCATE all resulting partitions.

What does the community think about this?

Thomas

#2legrand legrand
legrand_legrand@hotmail.com
In reply to: Thomas Kellerer (#1)
Re: Suggestion: provide a "TRUNCATE PARTITION" command
#3Thomas Kellerer
shammat@gmx.net
In reply to: legrand legrand (#2)
Re: Suggestion: provide a "TRUNCATE PARTITION" command

legrand legrand schrieb am 08.01.2021 um 14:57:> maybe a naïve plpgsql as proposed in

https://www.postgresql-archive.org/Partitionning-support-for-Truncate-Table-WHERE-td5933642.html
may be an answer

Yes I am aware of that (and that's what I have used so far) - I just thought it would make life easier if it didn't require dynamic SQL

Thomas

#4Michael Lewis
mlewis@entrata.com
In reply to: Thomas Kellerer (#1)
Re: Suggestion: provide a "TRUNCATE PARTITION" command

On Fri, Jan 8, 2021 at 2:36 AM Thomas Kellerer <shammat@gmx.net> wrote:

Hello,

I wonder if it made sense to add a "TRUNCATE PARTITION" command to
Postgres?

Especially during bulk loads it's more efficient to TRUNCATE a partition
if I know I want to replace all rows, rather than doing a DELETE.

Currently this requires dynamic SQL which isn't always feasible (and might
get complicated quickly).

So I was thinking that a new command to allow truncating partitions by
identifying the partitions by "value" rather by name might be helpful in
that case.

Something along the lines of:

truncate partitions of base_table
for values in (...);

If the IN part allowed for sub-queries then this could be used to gather
the partition keys from e.g. a staging table.

For me, it seems too easily error prone such that a single typo in the IN
clause may result in an entire partition being removed that wasn't supposed
to be targeted. Given the user still needs to manually generate that list
somehow, I don't see it as a huge effort to query the partitions and run
individual commands to truncate or detach several partitions manually.

Unless it is in the SQL standard, or allows users to do something that
cannot be easily done otherwise, I see it as a bell / whistle that would
unnecessarily complicate the code. Writing a function that finds the
partition table names and gives back a string with the text of the DDL
commands that needs to be run is simple, and would encourage the user to
review which tables are targeted for truncate command.

#5Thomas Kellerer
shammat@gmx.net
In reply to: Michael Lewis (#4)
Re: Suggestion: provide a "TRUNCATE PARTITION" command

Michael Lewis schrieb am 08.01.2021 um 16:32:

On Fri, Jan 8, 2021 at 2:36 AM Thomas Kellerer <shammat@gmx.net <mailto:shammat@gmx.net>> wrote:

Hello,

I wonder if it made sense to add a "TRUNCATE PARTITION" command to Postgres?

Especially during bulk loads it's more efficient to TRUNCATE a partition if I know I want to replace all rows, rather than doing a DELETE.

Currently this requires dynamic SQL which isn't always feasible (and might get complicated quickly).

So I was thinking that a new command to allow truncating partitions by identifying the partitions by "value" rather by name might be helpful in that case.

Something along the lines of:

     truncate partitions of base_table
     for values in (...);

If the IN part allowed for sub-queries then this could be used to gather the partition keys from e.g. a staging table.

For me, it seems too easily error prone such that a single typo in
the IN clause may result in an entire partition being removed that
wasn't supposed to be targeted.

I don't see how this is more dangerous then:

delete from base_table
where partition_key in (...);

which would serve the same purpose, albeit less efficient.

Given the user still needs to
manually generate that list somehow, I don't see it as a huge effort
to query the partitions and run individual commands to truncate or
detach several partitions manually.

Well, the list could come from e.g. a staging table, e.g. "for values IN (select some_column from staging_table)"

#6Michael Lewis
mlewis@entrata.com
In reply to: Thomas Kellerer (#5)
Re: Suggestion: provide a "TRUNCATE PARTITION" command

On Fri, Jan 8, 2021 at 9:38 AM Thomas Kellerer <shammat@gmx.net> wrote:

Michael Lewis schrieb am 08.01.2021 um 16:32:

On Fri, Jan 8, 2021 at 2:36 AM Thomas Kellerer <shammat@gmx.net <mailto:

shammat@gmx.net>> wrote:

Hello,

I wonder if it made sense to add a "TRUNCATE PARTITION" command to

Postgres?

Especially during bulk loads it's more efficient to TRUNCATE a

partition if I know I want to replace all rows, rather than doing a DELETE.

Currently this requires dynamic SQL which isn't always feasible (and

might get complicated quickly).

So I was thinking that a new command to allow truncating partitions

by identifying the partitions by "value" rather by name might be helpful in
that case.

Something along the lines of:

truncate partitions of base_table
for values in (...);

If the IN part allowed for sub-queries then this could be used to

gather the partition keys from e.g. a staging table.

For me, it seems too easily error prone such that a single typo in
the IN clause may result in an entire partition being removed that
wasn't supposed to be targeted.

I don't see how this is more dangerous then:

delete from base_table
where partition_key in (...);

which would serve the same purpose, albeit less efficient.

Delete has a rollback option, and you can dry-run to see impacted rows
effectively. Truncate does not.

With delete, you are being more explicit about which rows match and need
removal. By looking at the command, you know exactly what is expected to
happen. With the request to find partitions based on values, you may be
impacting MUCH more data than you meant to. If you think you have monthly
range partitions and actually have year partitions, the truncate could be a
disaster with removing more data than you intended.

It just seems like a foot gun to me, and not one that is particularly
needed since the same result can be achieved easily in two steps. One to
generate to explicit commands that will be run, and one to run them.

#7Thomas Kellerer
shammat@gmx.net
In reply to: Michael Lewis (#6)
Re: Suggestion: provide a "TRUNCATE PARTITION" command

Michael Lewis schrieb am 08.01.2021 um 17:47:

For me, it seems too easily error prone such that a single typo in
the IN clause may result in an entire partition being removed that
wasn't supposed to be targeted.

I don't see how this is more dangerous then:

     delete from base_table
     where partition_key in (...);

which would serve the same purpose, albeit less efficient.

Delete has a rollback option, and you can dry-run to see impacted rows effectively. Truncate does not.

TRUNCATE can be rolled back as well.

#8legrand legrand
legrand_legrand@hotmail.com
In reply to: Thomas Kellerer (#7)
Re: Suggestion: provide a "TRUNCATE PARTITION" command

What is interesting here with the TRUNCATE WHERE (and in the proposed
plpgsql) is to offer the end user a way to perform a transparent truncate or
delete totally independent of the partitioning scheme (if any, or even if it
has changed).

--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

#9Michael Lewis
mlewis@entrata.com
In reply to: Thomas Kellerer (#7)
Re: Suggestion: provide a "TRUNCATE PARTITION" command

On Fri, Jan 8, 2021 at 10:12 AM Thomas Kellerer <shammat@gmx.net> wrote:

Michael Lewis schrieb am 08.01.2021 um 17:47:

For me, it seems too easily error prone such that a single typo in
the IN clause may result in an entire partition being removed that
wasn't supposed to be targeted.

I don't see how this is more dangerous then:

delete from base_table
where partition_key in (...);

which would serve the same purpose, albeit less efficient.

Delete has a rollback option, and you can dry-run to see impacted rows

effectively. Truncate does not.

TRUNCATE can be rolled back as well.

My apologies. There are other concerns with concurrent transactions, but
you are correct that it can be rolled back.

Still, no feedback on the effect that a truncate call is having on the DB
and may be doing more than intended fairly easily. I am not in the hackers
group so I couldn't say this feature would not be implemented. It just
seems unlikely given the philosophies of that group.

#10Thiemo Kellner
thiemo@gelassene-pferde.biz
In reply to: Michael Lewis (#9)
Re: Suggestion: provide a "TRUNCATE PARTITION" command

Quoting Michael Lewis <mlewis@entrata.com>:

Still, no feedback on the effect that a truncate call is having on
the DB and may be doing more than intended fairly easily. I am not
in the hackers group so I couldn't say this feature would not be
implemented. It just seems unlikely given the philosophies of that
group.

I would not feel bad to have a more efficient option but possibly a
more dangerous one. Projects/application could setup policies about
what may be done in which way and what not.
S/MIME Public Key:
https://oc.gelassene-pferde.biz/index.php/s/eJuAUFONag6ofnH
Signal (Safer than WhatsApp): +49 1578 7723737
Threema (Safer than WhatsApp): A76MKH3J
Handys: +41 78 947 36 21 | +49 1578 772 37 37

Attachments:

smime.p7sapplication/pkcs7-signature; name=smime.p7sDownload