effects of nullifying bytea column on storage

Started by David Gauthieralmost 4 years ago5 messagesgeneral
Jump to latest
#1David Gauthier
dfgpostgres@gmail.com

Hi:
psql (11.5, server 11.3) on linux

I have a table with a bytea column which, of course, contains binary data.
After 60 days, I no longer need the binary data but want to retain the rest
of the record. Of course it's easy to just update the bytea column to null
for the older records. But I can almost imagine this record on disk with a
big "hole" in the middle where the bytea data used to be. Is there a PG
daemon (the vacuum ?) that will "heal the hole" in time?

Thanks !

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: David Gauthier (#1)
Re: effects of nullifying bytea column on storage

On Wednesday, May 11, 2022, David Gauthier <dfgpostgres@gmail.com> wrote:

Hi:
psql (11.5, server 11.3) on linux

I have a table with a bytea column which, of course, contains binary
data. After 60 days, I no longer need the binary data but want to retain
the rest of the record. Of course it's easy to just update the bytea
column to null for the older records. But I can almost imagine this record
on disk with a big "hole" in the middle where the bytea data used to be.
Is there a PG daemon (the vacuum ?) that will "heal the hole" in time?

The freshly written record will not have a hole - either by virtue of
variable width fields taking up basically their actual data space and also
the fact that null is treated specially in the record format.

For the old row, yes you should read up on the how and why of the vacuum
command.

David J.

#3David Gauthier
dfgpostgres@gmail.com
In reply to: David G. Johnston (#2)
Re: effects of nullifying bytea column on storage

Doesn't vacuum run automatically (or can it be set to run automatically) ?

On Wed, May 11, 2022 at 8:05 PM David G. Johnston <
david.g.johnston@gmail.com> wrote:

Show quoted text

On Wednesday, May 11, 2022, David Gauthier <dfgpostgres@gmail.com> wrote:

Hi:
psql (11.5, server 11.3) on linux

I have a table with a bytea column which, of course, contains binary
data. After 60 days, I no longer need the binary data but want to retain
the rest of the record. Of course it's easy to just update the bytea
column to null for the older records. But I can almost imagine this record
on disk with a big "hole" in the middle where the bytea data used to be.
Is there a PG daemon (the vacuum ?) that will "heal the hole" in time?

The freshly written record will not have a hole - either by virtue of
variable width fields taking up basically their actual data space and also
the fact that null is treated specially in the record format.

For the old row, yes you should read up on the how and why of the vacuum
command.

David J.

#4David G. Johnston
david.g.johnston@gmail.com
In reply to: David Gauthier (#3)
Re: effects of nullifying bytea column on storage

On Wednesday, May 11, 2022, David Gauthier <dfgpostgres@gmail.com> wrote:

Doesn't vacuum run automatically (or can it be set to run automatically) ?

https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM

David J.

#5Hugh Ranalli
hugh@whtc.ca
In reply to: David Gauthier (#1)
Re: effects of nullifying bytea column on storage

On Wed, 11 May 2022 at 20:02, David Gauthier <dfgpostgres@gmail.com> wrote:

I have a table with a bytea column which, of course, contains binary data. After 60 days, I no longer need the binary data but want to retain the rest of the record. Of course it's easy to just update the bytea column to null for the older records. But I can almost imagine this record on disk with a big "hole" in the middle where the bytea data used to be. Is there a PG daemon (the vacuum ?) that will "heal the hole" in time?

We have a similar situation, and run a weekly task that backs up the
table, sets the outdated records to NULL, truncates the original
table, then copies the records back in. With this approach the disk
space is reclaimed immediately. The table is only written to by
automated processes, so we have a weekly maintenance window where
these aren't running. This might be a possible approach.