vacuumlo
Hi All,
I need to run the vacuumlo command against our production database.
Being a PostgreSQL database utility, it should be 100% safe to run and should not delete/drop active data.
I have run it in our QA environment with success, but now they are having a few application issues, and I have told them that the issues cannot be related to the vacuumlo utility as it is a PostgreSQL utility.
Has anyone had issues before running this utility?
Regards
Ian
Disclaimer
The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.
This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast, a leader in email security and cyber resilience. Mimecast integrates email defenses with brand protection, security awareness training, web security, compliance and other essential capabilities. Mimecast helps protect large and small organizations from malicious activity, human error and technology failure; and to lead the movement toward building a more resilient world. To find out more, visit our website.
Hi,
On Tue, Aug 17, 2021 at 7:52 PM Ian Dauncey <Ian.Dauncey@bankzero.co.za> wrote:
I need to run the vacuumlo command against our production database.
Being a PostgresQL database utility, it should be 100% safe to run and should not delete/drop active data.
It's safe as long as you're aware of what this tool is doing. As
mentioned in https://www.postgresql.org/docs/current/vacuumlo.html:
vacuumlo is a simple utility program that will remove any “orphaned” large objects from a PostgreSQL database. An orphaned large object (LO) is considered to be any LO whose OID does not appear in any oid or lo data column of the database.
So:
I have run it in our QA environment with success, but now they are having a few application issues and I have told them that the issues cannot be related to the vacuumlo utility as it is a PostgresQL utility.
The most likely explanation is that your database somehow has large
objects that are not referenced in an "oid" or "lo" column. If that's
the case, vacuumlo will delete some of your data, because you didn't do
your part of the contract required to use that tool, which is to
properly reference large objects in those columns.
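Before running it against production, it is worth previewing what vacuumlo would delete. A sketch of such a check (vacuumlo's -n flag performs a dry run without removing anything; the table and column names in the SQL are hypothetical, stand-ins for your own lo-referencing columns):

```sql
-- From the shell, a dry run reports what would be removed:
--   vacuumlo -n -v mydb
-- Or, in SQL, count large objects not referenced from a given
-- application table; "docs(content)" is a hypothetical lo column:
SELECT count(*) AS orphan_candidates
FROM pg_largeobject_metadata m
WHERE NOT EXISTS (SELECT 1 FROM docs d WHERE d.content = m.oid);
```

If the dry run lists OIDs your application still needs, fix the references first; vacuumlo cannot tell a forgotten reference from a genuine orphan.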
Hi,
Just an update on my vacuumlo issue.
I ran vacuumlo against the pg_largeobject table without any issues, but afterwards I ran a VACUUM FULL against this table, which caused lots of issues.
Because the VACUUM FULL takes an exclusive lock on the table (which was my first mistake, as I did not stop the applications accessing the database), I had all the applications hanging. The next issue was that it started writing out WAL logs, and in the end the file system which housed the WAL logs filled up, causing the vacuum to fail.
Now the issue I have here is that the VACUUM FULL created a temporary table, and when it crashed this temporary table did not get deleted. I did rerun the VACUUM FULL against the pg_largeobject table (and yes, I did stop all the applications first). It completed successfully, but it did not drop the previous temporary table. This table is taking close to 100 GB of disk space.
If I backup and restore the database onto a different server this temporary table does not get restored.
My questions here are:
1. How do I get rid of this temporary table without a backup and restore as this is our Prod system?
2. Is there a way of finding out the name of this temp table and matching it up to files on disk?
Any help will be appreciated
Regards
Ian.
Hi
Reposting my query here
Regards
Ian
Ian Dauncey <Ian.Dauncey@bankzero.co.za> writes:
I did run the vacuumlo against the pg_largeobject table without any issues but afterwards I ran a vacuum full against this table which caused lots of issues.
Because the vacuum full takes an exclusive lock (which was my first mistake as I did not stop the applications accessing the database) on the table I had all the applications hanging. The next issue was it started writing out WAL logs and in the end the file system which housed the Wal logs filled up causing the vacuum to fail.
Now the issue I have here is that the vacuum full created a temporary table , and when it crashed this temporary table did not get deleted. I did rerun the vacuum full against the pg_largeobject table (and yes, I did stop all the applications first). It did complete successfully but it did not drop the previous temporary table. This table is taking close to 100 Gig of disk space.
I think you mean "file", not "temporary table". You're going to have
to remove the file by hand, likely, as there is not (I think) any live
reference to it in the catalogs. Do
select pg_relation_filenode(oid) from pg_class;
and then match up the numbers it prints out with the filenames you
find in the database's directory. You should find matches to everything
except the problem file(s). Once you've identified which is the orphaned
file, you can remove it. If there seem to be a lot of orphaned files
with different base names, STOP ... you probably are looking at the
wrong database or some other mistake. But if there's just one base name
that's not accounted for, and the sum of the sizes of the files with that
base name looks about right, then you've probably got it right.
I strongly suggest reading
https://www.postgresql.org/docs/current/storage.html
before you go messing with any files manually, so you know what
you are looking at.
regards, tom lane
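Tom's matching step can be made slightly more mechanical. A sketch, assuming it is run inside the affected database (per storage.html, that database's files live under base/<database oid>/ in the data directory):

```sql
-- The directory to look in: base/<this oid>/
SELECT oid AS dboid FROM pg_database WHERE datname = current_database();

-- Every filenode the catalogs account for; a base name in that
-- directory absent from this list is a candidate orphan.
SELECT pg_relation_filenode(oid) AS filenode, oid::regclass AS relation
FROM pg_class
WHERE pg_relation_filenode(oid) IS NOT NULL
ORDER BY filenode;
```

As Tom says, read the storage documentation before touching anything; also remember that a large table is split into 1 GB segments named filenode, filenode.1, filenode.2, and so on, which all share one base name.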
On Mon, 2021-08-30 at 15:08 +0000, Ian Dauncey wrote:
I did run the vacuumlo against the pg_largeobject table without any issues but afterwards
I ran a vacuum full against this table which caused lots of issues.
Because the vacuum full takes an exclusive lock (which was my first mistake as I
did not stop the applications accessing the database) on the table I had all the
applications hanging.
Right.
The next issue was it started writing out WAL logs and in the end the file system
which housed the Wal logs filled up causing the vacuum to fail.
Now the issue I have here is that the vacuum full created a temporary table , and when
it crashed this temporary table did not get deleted. I did rerun the vacuum full against
the pg_largeobject table (and yes, I did stop all the applications first).
It did complete successfully but it did not drop the previous temporary table.
This table is taking close to 100 Gig of disk space.
VACUUM (FULL) will write a compact copy of the table, then delete the old one,
so it temporarily needs more space. It may be a good idea to put "pg_wal" into
a different file system, so that WAL space cannot run out even if the data files
fill the file system. Then PostgreSQL won't crash, and the files will be removed.
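To see how much room WAL is currently using, and where it lives, something along these lines works (pg_ls_waldir() requires PostgreSQL 10 or later and superuser or pg_monitor membership):

```sql
SHOW data_directory;
SHOW max_wal_size;
-- Total size of the segments currently in pg_wal:
SELECT pg_size_pretty(sum(size)) AS wal_size FROM pg_ls_waldir();
```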
If I backup and restore the database onto a different server this temporary table does not get restored.
My question here is.
1. How do I get rid of this temporary table without a backup and restore as this is our Prod system?
You cannot :^(
If you know PostgreSQL well, and you know what you are doing, you might be able
to identify those files and manually delete them. But that's definitely not
something for the casual user: delete the wrong file, and you have to restore
the backup.
2. Is there a way of finding out the name of this temp table and matching it up to files on disk?
No, unfortunately not.
You'd have to identify the files that do *not* belong to any table, and that
is tricky.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
That can be verified by checking RELFILENODE of the table. VACUUM FULL
changes RELFILENODE:
mgogala=# select relfilenode from pg_class where relname='emp';
 relfilenode
-------------
       52178
(1 row)

mgogala=# vacuum full emp;
VACUUM
mgogala=# select relfilenode from pg_class where relname='emp';
 relfilenode
-------------
       52182
(1 row)
According to unsubstantiated rumors from
https://www.postgresql.org/docs/13/catalog-pg-class.html, the definition
of RELFILENODE column is as follows:
relfilenode oid
Name of the on-disk file of this relation; zero means this is a “mapped”
relation whose disk file name is determined by low-level state
Change of the RELFILENODE means that the relation has got a new file.
And that means that vacuum full rewrites the table, just as Laurenz has
said. Note that OID itself is NOT changed.
On 8/30/21 1:15 PM, Laurenz Albe wrote:
VACUUM (FULL) will write a compact copy of the table, then delete the old one,
so it temporarily needs more space. It may be a good idea to put "pg_wal" into
a different file system, so that WAL space cannot run out even if the data files
fill the file system. Then PostgreSQL won't crash, and the files will be removed.
--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com
On Mon, 2021-08-30 at 13:38 -0400, Mladen Gogala wrote:
According to unsubstantiated rumors from https://www.postgresql.org/docs/13/catalog-pg-class.html,
the definition of RELFILENODE column is as follows:

relfilenode oid
Name of the on-disk file of this relation; zero means this is a “mapped” relation whose disk
file name is determined by low-level state

Change of the RELFILENODE means that the relation has got a new file. And that means that vacuum full
rewrites the table, just as Laurenz has said. Note that OID itself is NOT changed.
Please don't top-post on these lists.
I am not sure what you mean with "unsubstantiated".
Note that this will be 0 for certain crucial system tables (essentially those that have to be accessed
*before* regular metadata queries can be executed). So it is better to use the "pg_relation_filenode()"
function to find the file that belongs to a certain table.
You can use that to find which files do *not* belong to any table, index, sequence or
materialized view and then delete those. But I would think twice before manually deleting
files in the data directory.
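For example, pg_class is itself a mapped relation, so the raw catalog column alone would mislead you, while the function reports the real on-disk name:

```sql
SELECT relname, relfilenode, pg_relation_filenode(oid) AS on_disk
FROM pg_class
WHERE relname = 'pg_class';
-- relfilenode shows 0 here; pg_relation_filenode() resolves the
-- actual filenode via the relation mapper.
```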
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
Hi,
You may restart the postgres services. That temp file will be deleted
automatically then.
Regards
Atul
On Monday, August 30, 2021, Ian Dauncey <Ian.Dauncey@bankzero.co.za> wrote:
On Tue, 2021-08-31 at 01:11 +0530, Atul Kumar wrote:
You may restart the postgres services. That temp file will be deleted automatically then.
No - orphaned files left behind from a crash will not be removed during a restart.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
Ok, but after restart it is safe to remove such files.
On Tuesday, August 31, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Tue, 2021-08-31 at 11:15 +0530, Atul Kumar wrote:
On Tuesday, August 31, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Tue, 2021-08-31 at 01:11 +0530, Atul Kumar wrote:
You may restart the postgres services. That temp file will be deleted automatically then.No - orphaned files left behind from a crash will not be removed during a restart.
Ok, but after restart it is safe to remove such files.
It is always safe to remove those files.
The problem is to identify them.
Remove the wrong file, and your cluster is toast.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
Hi,
Don’t you think it is a drawback of Postgres that such temporary files
don’t get deleted automatically once the transaction gets committed or
rolled back?
Could you suggest a good permanent solution for such a case?
Regards
Atul
On Tuesday, August 31, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
You are right
On Thu, Sep 2, 2021 at 10:56 AM Atul Kumar <akumar14871@gmail.com> wrote:
--
Shaish Guni
On Thu, Sep 2, 2021 at 7:11 AM, Atul Kumar <akumar14871@gmail.com> wrote:
Hi,
Don’t you think it as a drawback of postgres that such temporary files
don’t get deleted automatically once the transaction gets committed or
rolled back .
They are automatically removed in case of a proper commit or rollback. They
aren't in case of a crash.
Could you suggest a good permanent solution for such case.
You should try https://github.com/bdrouvot/pg_orphaned to find the list of
orphaned files. It does a pretty good job at it.
--
Guillaume.
Morning.
Thanks for all the replies.
What I did to remove these files was to back up the DB, drop the DB, and then restore it.
Regards
Ian