PoC: Simplify recovery after dropping a table by LOGGING the restore LSN
Hackers,
The concept was driven by an all too common support request. A user
accidentally dropped the wrong table. (this could also be applied to
dropping a database, etc).
If we had the LSN before the drop, this would be easier. So we actually
log the LSN when the lock is required so that we have an accurate LSN and
the recovery is much simpler.
All we are doing is inserting a simple LOG message:
Acquired drop table lock on table <relname>. Restore at <LSN>
This is a rough patch, very simple and effective. We are looking for
feedback.
Comments are appreciated!
Should we ALSO consider this for:
- DROP DATABASE
- TRUNCATE TABLE
- DELETE (only when it is without a WHERE clause?)
- UPDATE (only when it is without a WHERE clause?)
Regards
Andrey, Nikolay, Kirk
Attachments:
vPoC-0001-Simplify-recovery-after-dropping-a-table-by-giv.patchapplication/x-patch; name=vPoC-0001-Simplify-recovery-after-dropping-a-table-by-giv.patchDownload+8-1
Hi Andrey, Nikolay, and Kirk
On 08.11.24 04:46, Kirk Wolak wrote:
Hackers,
The concept was driven by an all too common support request. A user
accidentally dropped the wrong table. (this could also be applied to
dropping a database, etc).If we had the LSN before the drop, this would be easier. So we
actually log the LSN when the lock is required so that we have an
accurate LSN and the recovery is much simpler.
I've tested the patch it it works as described.
postgres=# CREATE TABLE t();
postgres=# SELECT pg_current_wal_lsn( );
pg_current_wal_lsn
--------------------
0/2A0677D8
(1 row)
postgres=# DROP TABLE t;
2024-11-26 14:12:28.915 CET [338209] LOG: Aquired drop table lock on
table t. Restore at 0/2A0677D8
2024-11-26 14:12:28.915 CET [338209] STATEMENT: DROP TABLE t;
A few comments:
1) non "permanent" tables
Since TEMPORARY TABLES are automatically dropped in the end of a
session, simply leaving a session that contains a TEMPORARY TABLE also
logs the LSN:
postgres=# CREATE TEMPORARY TABLE t_temporary AS SELECT
generate_series(1,100);
SELECT 100
postgres=# quit
2024-11-26 14:34:26.596 CET [386777] LOG: Aquired drop table lock on
table t_temporary. Restore at 0/2A149EC0
Dropping the TEMPORARY TABLE also triggers this log message
postgres=# CREATE TEMPORARY TABLE t_temporary AS SELECT
generate_series(1,100);
postgres=# DROP TABLE t_temporary;
2024-11-26 14:35:13.174 CET [387378] LOG: Aquired drop table lock on
table t_temporary. Restore at 0/2A1664E8
2024-11-26 14:35:13.174 CET [387378] STATEMENT: DROP TABLE t_temporary;
The same applies for UNLOGGED TABLES:
postgres=# CREATE UNLOGGED TABLE t_unlogged AS SELECT
generate_series(1,100);
postgres=# SELECT * FROM page_header(get_raw_page('public.t_unlogged',0));
lsn | checksum | flags | lower | upper | special | pagesize | version |
prune_xid
-----+----------+-------+-------+-------+---------+----------+---------+-----------
0/0 | 0 | 0 | 424 | 4992 | 8192 | 8192 | 4
| 0
(1 row)
postgres=# DROP TABLE t_unlogged;
2024-11-26 14:33:00.397 CET [338209] LOG: Aquired drop table lock on
table t_unlogged. Restore at 0/2A143DC8
2024-11-26 14:33:00.397 CET [338209] STATEMENT: DROP TABLE t_unlogged;
2) log message readability
For better readability, perhaps the table name should be logged with
double quotes, as in :
2024-11-26 13:25:35.525 CET [338209] ERROR: table "x" does not exist
2024-11-26 13:25:35.525 CET [338209] STATEMENT: DROP TABLE x;
3) format of existing log messages containing LSN
I haven't looked too deep into it, but the log format used by checkpoint
might be helpful for applications that need to parse log files. For
instance, a log entry triggered after a DROP DATABASE:
postgres=# CREATE DATABASE db;
postgres=# DROP DATABASE db;
2024-11-26 13:30:08.166 CET [329713] LOG: checkpoint starting:
immediate force wait
2024-11-26 13:30:08.177 CET [329713] LOG: checkpoint complete: wrote 3
buffers (0.0%), wrote 1 SLRU buffers; 0 WAL file(s) added, 0 removed, 1
recycled; write=0.002 s, sync=0.002 s, total=0.011 s; sync files=4,
longest=0.001 s, average=0.001 s; distance=4306 kB, estimate=4308 kB;
lsn=0/318AFD8, redo lsn=0/318AF80
This is a rough patch, very simple and effective. We are looking
for feedback.Comments are appreciated!
Should we ALSO consider this for:
- DROP DATABASE
- TRUNCATE TABLE
- DELETE (only when it is without a WHERE clause?)
- UPDATE (only when it is without a WHERE clause?)
+1
Thanks!
Best, Jim
On Fri, 8 Nov 2024 at 08:47, Kirk Wolak <wolakk@gmail.com> wrote:
Hackers,
The concept was driven by an all too common support request. A user accidentally dropped the wrong table. (this could also be applied to dropping a database, etc).If we had the LSN before the drop, this would be easier. So we actually log the LSN when the lock is required so that we have an accurate LSN and the recovery is much simpler.
All we are doing is inserting a simple LOG message:
Acquired drop table lock on table <relname>. Restore at <LSN>
This is indeed useful for the number of accidental data loss recovery.
Comments are appreciated!
Should we ALSO consider this for:
- DROP DATABASE
- TRUNCATE TABLE
- DELETE (only when it is without a WHERE clause?)
- UPDATE (only when it is without a WHERE clause?)
options 1 & 2 looks sane, but logging lsn for DELETE/UPDATE looks extra.
I am not convinced this change is necessary to be done inside
PostgreSQL. What stops us from logging all the same inside object
access hook defined by extension? This way we can define any rule on
when to log this.
There are a number of cases to consider, pointed out by Jim, such as
the TEMP table and the UNLOGGED table. [0]/messages/by-id/3bda6b56-16bd-48fe-8e23-5ef58a6a4e34@uni-muenster.de -- Best regards, Kirill Reshke
I want to highlight that we are logging the current WAL insert
pointer, which can be arbitrarily less than the actual LSN of the
commit record that deletes the table in case of high load. We first
acquire a deletion lock, then we assemble the xlog record and only
then we insert this record into the WAL file. So, the log message
should be something like 'Restore at lsn xxx or later'.
[0]: /messages/by-id/3bda6b56-16bd-48fe-8e23-5ef58a6a4e34@uni-muenster.de -- Best regards, Kirill Reshke
--
Best regards,
Kirill Reshke
Subject: Re: PoC: Simplify recovery after dropping a table by LOGGING the
restore LSN
Hi all,
I've implemented an improved version of this feature that addresses
Kirill's concern about logging the wrong LSN [1]/messages/by-id/CALdSSPiRDvbuPgZKAB1+BDb3Nfe-i3PYkiy=ScMZ7tU5f0toKQ@mail.gmail.com -- Best regards, Dmitry Lebedev.
The key difference: instead of logging WAL insert pointer at lock time,
this patch logs the actual commit LSN by extending XactCallback to pass
the commit LSN from the commit record.
Main improvements:
- Logs real commit LSN, not an earlier position
- Handles ROLLBACK correctly (no log for aborted drops)
- Supports SAVEPOINT, ROLLBACK TO, COMMIT AND CHAIN
- New GUC: log_object_drops (default: off)
- Includes DROP DATABASE support
- Comprehensive TAP tests
This makes the LSN reliable for PITR recovery regardless of system load.
Patch attached. Feedback welcome!
[1]: /messages/by-id/CALdSSPiRDvbuPgZKAB1+BDb3Nfe-i3PYkiy=ScMZ7tU5f0toKQ@mail.gmail.com -- Best regards, Dmitry Lebedev
/messages/by-id/CALdSSPiRDvbuPgZKAB1+BDb3Nfe-i3PYkiy=ScMZ7tU5f0toKQ@mail.gmail.com
--
Best regards,
Dmitry Lebedev
пт, 28 нояб. 2025 г. в 16:39, Kirill Reshke <reshkekirill@gmail.com>:
Show quoted text
On Fri, 8 Nov 2024 at 08:47, Kirk Wolak <wolakk@gmail.com> wrote:
Hackers,
The concept was driven by an all too common support request. A useraccidentally dropped the wrong table. (this could also be applied to
dropping a database, etc).If we had the LSN before the drop, this would be easier. So we
actually log the LSN when the lock is required so that we have an accurate
LSN and the recovery is much simpler.All we are doing is inserting a simple LOG message:
Acquired drop table lock on table <relname>. Restore at <LSN>This is indeed useful for the number of accidental data loss recovery.
Comments are appreciated!
Should we ALSO consider this for:
- DROP DATABASE
- TRUNCATE TABLE
- DELETE (only when it is without a WHERE clause?)
- UPDATE (only when it is without a WHERE clause?)options 1 & 2 looks sane, but logging lsn for DELETE/UPDATE looks extra.
I am not convinced this change is necessary to be done inside
PostgreSQL. What stops us from logging all the same inside object
access hook defined by extension? This way we can define any rule on
when to log this.There are a number of cases to consider, pointed out by Jim, such as
the TEMP table and the UNLOGGED table. [0]I want to highlight that we are logging the current WAL insert
pointer, which can be arbitrarily less than the actual LSN of the
commit record that deletes the table in case of high load. We first
acquire a deletion lock, then we assemble the xlog record and only
then we insert this record into the WAL file. So, the log message
should be something like 'Restore at lsn xxx or later'.[0]
/messages/by-id/3bda6b56-16bd-48fe-8e23-5ef58a6a4e34@uni-muenster.de
--
Best regards,
Kirill Reshke
Attachments:
log_object_drops.patchapplication/octet-stream; name=log_object_drops.patchDownload+800-20
On 1 Dec 2025, at 12:11, Дмитрий Лебедев <idemonlebedev@gmail.com> wrote:
I've implemented an improved version of this feature that addresses
Kirill's concern about logging the wrong LSN [1].
Cool! On a first glance patch looks good. I'll test it more later, for now I've fixed indentation and rebased on master to fix conflicts with recent "wait for LSN feature".
Please also avoid top posting :) Post answers below cited text. And do not post unrelevant citations at all. Mail archives a read more frequently than written.
Also, it's a good idea to number patch versions so reviewers can post feedback for some specific version. Here I used version number 5.
Thanks!
Best regards, Andrey Borodin.