When do vacuumed pages/tuples become available for reuse?
If an autovacuum job on a huge table runs for 5-6 hours, do its freed
pages/tuples become available for reuse immediately when they are marked
as free, or only at the end of the multi-hour vacuum operation?
rihad <rihad@mail.ru> writes:
If an autovacuum job on a huge table runs for 5-6 hours, do its freed
pages/tuples become available for reuse immediately when they are marked
as free, or only at the end of the multi-hour vacuum operation?
They'll be freed in batches, where the size of a batch depends on the
autovacuum_work_mem or maintenance_work_mem setting. The basic
work cycle is
* scan table to find dead tuples, save their TIDs in working memory;
continue until end of table or working memory full
* scan indexes to find index entries matching those TIDs, remove 'em
* go back to table and remove the previously-found tuples
* if not end of table, repeat
So a larger work-mem setting means fewer passes over the indexes,
but a longer time until space is reclaimed.
regards, tom lane
On 04/11/2019 05:48 PM, Tom Lane wrote:
rihad <rihad@mail.ru> writes:
If an autovacuum job on a huge table runs for 5-6 hours, do its freed
pages/tuples become available for reuse immediately when they are marked
as free, or only at the end of the multi-hour vacuum operation?
They'll be freed in batches, where the size of a batch depends on the
autovacuum_work_mem or maintenance_work_mem setting. The basic
work cycle is
* scan table to find dead tuples, save their TIDs in working memory;
continue until end of table or working memory full
* scan indexes to find index entries matching those TIDs, remove 'em
* go back to table and remove the previously-found tuples
* if not end of table, repeat
So a larger work-mem setting means fewer passes over the indexes,
but a longer time until space is reclaimed.
regards, tom lane
Thanks! Our autovacuum_work_mem = 1GB, so this probably means any space
would be available for reuse only at the end of the vacuum? Are there
any downsides in decreasing it to, say, 64MB? I see only pluses )
On 2019-Apr-11, rihad wrote:
Thanks! Our autovacuum_work_mem = 1GB, so this probably means any space
would be available for reuse only at the end of the vacuum? Are there any
downsides in decreasing it to, say, 64MB? I see only pluses )
Yes, each vacuum will take longer and will use much more I/O.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 04/11/2019 06:09 PM, Alvaro Herrera wrote:
On 2019-Apr-11, rihad wrote:
Thanks! Our autovacuum_work_mem = 1GB, so this probably means any space
would be available for reuse only at the end of the vacuum? Are there any
downsides in decreasing it to, say, 64MB? I see only pluses )
Yes, each vacuum will take longer and will use much more I/O.
I see, thanks.
rihad <rihad@mail.ru> writes:
Thanks! Our autovacuum_work_mem = 1GB, so this probably means any space
would be available for reuse only at the end of the vacuum?
It's six bytes per dead tuple, last I checked ... you do the math.
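(With autovacuum_work_mem = 1GB that works out to roughly 1GB / 6 bytes,
i.e. about 179 million dead tuples per batch, so unless a single run
accumulates more dead TIDs than that, the space is indeed reclaimed in
one pass at the very end.)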
Are there
any downsides in decreasing it to, say, 64MB? I see only pluses )
Well, usually people prefer to minimize the number of passes over
the indexes.
regards, tom lane
On 04/11/2019 06:20 PM, Tom Lane wrote:
rihad <rihad@mail.ru> writes:
Thanks! Our autovacuum_work_mem = 1GB, so this probably means any space
would be available for reuse only at the end of the vacuum?
It's six bytes per dead tuple, last I checked ... you do the math.
Are there
any downsides in decreasing it to, say, 64MB? I see only pluses )
Well, usually people prefer to minimize the number of passes over
the indexes.
regards, tom lane
Yup, it's just that n_dead_tuples grows by several hundred thousand (the
table sees far more updates than inserts) and disk usage grows
constantly between several-hour-long vacuum runs. Running vacuum full
isn't an option.
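Something like this shows the backlog as it builds up (stock statistics
views; "bar" stands in for the real table name):

  -- dead-tuple backlog and last autovacuum time for one table
  SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
  FROM pg_stat_user_tables
  WHERE relname = 'bar';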
On 2019-Apr-11, rihad wrote:
On 04/11/2019 06:20 PM, Tom Lane wrote:
rihad <rihad@mail.ru> writes:
Thanks! Our autovacuum_work_mem = 1GB, so this probably means any space
would be available for reuse only at the end of the vacuum?
It's six bytes per dead tuple, last I checked ... you do the math.
Are there
any downsides in decreasing it to, say, 64MB? I see only pluses )
Well, usually people prefer to minimize the number of passes over
the indexes.
Yup, it's just that n_dead_tuples grows by several hundred thousand (the
table sees far more updates than inserts) and disk usage grows
constantly between several-hour-long vacuum runs. Running vacuum full isn't
an option.
Perhaps it'd be better to vacuum this table much more often.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 04/11/2019 06:41 PM, Alvaro Herrera wrote:
On 2019-Apr-11, rihad wrote:
On 04/11/2019 06:20 PM, Tom Lane wrote:
rihad <rihad@mail.ru> writes:
Thanks! Our autovacuum_work_mem = 1GB, so this probably means any space
would be available for reuse only at the end of the vacuum?
It's six bytes per dead tuple, last I checked ... you do the math.
Are there
any downsides in decreasing it to, say, 64MB? I see only pluses )
Well, usually people prefer to minimize the number of passes over
the indexes.
Yup, it's just that n_dead_tuples grows by several hundred thousand (the
table sees far more updates than inserts) and disk usage grows
constantly between several-hour-long vacuum runs. Running vacuum full isn't
an option.
Perhaps it'd be better to vacuum this table much more often.
Each run took 5-6 hours, now it takes 2-3 hours after I've tweaked some
cost-based vacuum knobs.
On 2019-Apr-11, rihad wrote:
On 04/11/2019 06:41 PM, Alvaro Herrera wrote:
Perhaps it'd be better to vacuum this table much more often.
Each run took 5-6 hours, now it takes 2-3 hours after I've tweaked some
cost-based vacuum knobs.
But how often does it run?
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 04/11/2019 07:04 PM, Alvaro Herrera wrote:
On 2019-Apr-11, rihad wrote:
On 04/11/2019 06:41 PM, Alvaro Herrera wrote:
Perhaps it'd be better to vacuum this table much more often.
Each run took 5-6 hours, now it takes 2-3 hours after I've tweaked some
cost-based vacuum knobs.
But how often does it run?
One run starts right after another finishes. Enough n_dead_tuples
accumulate during each run to easily trigger the next.
autovacuum_vacuum_scale_factor = 0.01
autovacuum_vacuum_threshold = 50
which means to run autovac when 1% of table size + 50 rows have been
updated or deleted.
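In other words (a sketch against the stock catalogs; "bar" is a
placeholder for the real table name):

  -- dead tuples needed before autovacuum triggers:
  -- autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples
  SELECT c.relname,
         50 + 0.01 * c.reltuples AS autovacuum_trigger_point,
         s.n_dead_tup
  FROM pg_class c
  JOIN pg_stat_user_tables s ON s.relid = c.oid
  WHERE c.relname = 'bar';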
But we can't make each autovacuum run go any faster )
I've now lowered this from 20ms:
autovacuum_vacuum_cost_delay = 10ms
And increased this from 200:
autovacuum_vacuum_cost_limit = 400
to make it finish in 1 hour rather than 3 hours.
Anything more than that and we risk impacting the performance of user
queries.
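These knobs can also be overridden for just this table, leaving the
global defaults alone (a sketch; "bar" is again a placeholder):

  -- per-table override of the cost-based vacuum settings
  ALTER TABLE bar SET (
      autovacuum_vacuum_cost_delay = 10,
      autovacuum_vacuum_cost_limit = 400
  );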
On Thu, Apr 11, 2019 at 10:28 AM rihad <rihad@mail.ru> wrote:
Yup, it's just that n_dead_tuples grows by several hundred thousand (the
table sees far more updates than inserts) and disk usage grows
constantly between several-hour-long vacuum runs. Running vacuum full
isn't an option.
The disk usage doesn't reach a steady state after one or two autovacs? Or
it does, but you are just unhappy about the ratio between the steady state
size and the theoretical fully packed size?
Cheers,
Jeff
On 04/11/2019 07:40 PM, Jeff Janes wrote:
On Thu, Apr 11, 2019 at 10:28 AM rihad <rihad@mail.ru> wrote:
Yup, it's just that n_dead_tuples grows by several hundred thousand (the
table sees far more updates than inserts) and disk usage grows
constantly between several-hour-long vacuum runs. Running vacuum full
isn't an option.
The disk usage doesn't reach a steady state after one or two
autovacs? Or it does, but you are just unhappy about the ratio
between the steady state size and the theoretical fully packed size?
Cheers,
Jeff
Since we dump & restore the production DB daily into a staging environment,
the difference in size (as reported by psql's \l+) is 11GB in a freshly
restored DB as opposed to 70GB in production.
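(That's roughly 70 / 11, or about 6.4x the fully packed size.)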
On Thu, Apr 11, 2019 at 11:14 AM rihad <rihad@mail.ru> wrote:
autovacuum_vacuum_scale_factor = 0.01
autovacuum_vacuum_threshold = 50
This seems counterproductive. You need to make the vacuum more efficient,
not more frantic.
autovacuum_vacuum_cost_delay = 10ms
autovacuum_vacuum_cost_limit = 400
Anything more than that and we risk impacting the performance of user
queries.
Well, unbounded bloat will also impact the user queries--eventually. Was
this an experimental determination? Can you tell what about the autovac
most impacts the user queries, the reading or the writing?
You might just have more workload than your hardware can handle. There is
always going to be some fundamental limit, and while you can tune your way
up to that limit, you can't tune your way past it.
Cheers,
Jeff
On Thu, Apr 11, 2019 at 11:44 AM rihad <rihad@mail.ru> wrote:
On 04/11/2019 07:40 PM, Jeff Janes wrote:
The disk usage doesn't reach a steady state after one or two autovacs? Or
it does, but you are just unhappy about the ratio between the steady state
size and the theoretical fully packed size?
Cheers,
Jeff
Since we dump & restore the production DB daily into a staging environment,
the difference in size (as reported by psql's \l+) is 11GB in a freshly
restored DB as opposed to 70GB in production.
Yeah, that seems like a problem. Do you have long-lived
transactions/snapshots that are preventing vacuuming from removing dead
tuples? You can run a manual "vacuum verbose" and see how many dead but
nonremovable tuples there were, or set log_autovacuum_min_duration to some
non-negative value less than the time the autovac takes, and do the same.
(Indeed, those dumps you take daily might be the source of those long-lived
snapshots. How long does a dump take?)
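For example (a sketch; "bar" is a placeholder for the real table):

  -- log every autovacuum run, however short, cluster-wide
  ALTER SYSTEM SET log_autovacuum_min_duration = 0;
  SELECT pg_reload_conf();

  -- or vacuum one table by hand and look for the "dead but not yet
  -- removable" figure in the output
  VACUUM VERBOSE bar;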
Also, what does pg_freespace (
https://www.postgresql.org/docs/current/pgfreespacemap.html) show about the
availability of space in the table? How about pgstattuple (
https://www.postgresql.org/docs/current/pgstattuple.html)?
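Something along these lines (assuming both extensions can be installed;
"bar" is a placeholder):

  CREATE EXTENSION IF NOT EXISTS pg_freespacemap;
  -- total free space the free space map tracks for the table
  SELECT count(*) AS pages, sum(avail) AS free_bytes
  FROM pg_freespace('bar');

  CREATE EXTENSION IF NOT EXISTS pgstattuple;
  -- exact live/dead/free breakdown (reads the whole table)
  SELECT * FROM pgstattuple('bar');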
Cheers,
Jeff
On 04/11/2019 08:09 PM, Jeff Janes wrote:
On Thu, Apr 11, 2019 at 11:44 AM rihad <rihad@mail.ru> wrote:
On 04/11/2019 07:40 PM, Jeff Janes wrote:
The disk usage doesn't reach a steady state after one or two
autovacs? Or it does, but you are just unhappy about the ratio
between the steady state size and the theoretical fully packed size?
Cheers,
Jeff
Since we dump & restore the production DB daily into a staging
environment, the difference in size (as reported by psql's \l+) is
11GB in a freshly restored DB as opposed to 70GB in production.
Yeah, that seems like a problem. Do you have long-lived
transactions/snapshots that are preventing vacuuming from removing
dead tuples? You can run a manual "vacuum verbose" and see how many
dead but nonremovable tuples there were, or
set log_autovacuum_min_duration to some non-negative value less than
the time the autovac takes, and do the same.
vacuum frees tuples just fine. It's just that by the time each run
finishes, many more accumulate due to table update activity, ad nauseam.
So this unused space constantly grows. Here's a sample autovacuum run:
2019-04-11 19:39:44.450841500 [] LOG: automatic vacuum of table
"foo.public.bar": index scans: 1
2019-04-11 19:39:44.450843500 pages: 0 removed, 472095 remain, 4
skipped due to pins, 39075 skipped frozen
2019-04-11 19:39:44.450844500 tuples: 19150 removed, 2725811 remain,
465 are dead but not yet removable
2019-04-11 19:39:44.450845500 buffer usage: 62407557 hits, 6984769
misses, 116409 dirtied
2019-04-11 19:39:44.450846500 avg read rate: 16.263 MB/s, avg write
rate: 0.271 MB/s
2019-04-11 19:39:44.450847500 system usage: CPU 59.05s/115.26u sec
elapsed 3355.28 sec
(Indeed, those dumps you take daily might be the source of those
long-lived snapshots. How long does a dump take?)
The dumps are taken daily from the slave server as part of the stock
FreeBSD postgres port's periodic activity.
I don't think they impact the master server.
Also, what does pg_freespace
(https://www.postgresql.org/docs/current/pgfreespacemap.html) show
about the availability of space in the table? How about pgstattuple
(https://www.postgresql.org/docs/current/pgstattuple.html)
Thanks, I'll try those. But as I said, a freshly restored DB is only 11GB
in size, not 70 (only the public schema is used).
vacuum frees tuples just fine. It's just that by the time each run
finishes, many more accumulate due to table update activity, ad nauseam. So
this unused space constantly grows. Here's a sample autovacuum run:
2019-04-11 19:39:44.450841500 [] LOG: automatic vacuum of table
"foo.public.bar": index scans: 1
2019-04-11 19:39:44.450843500 pages: 0 removed, 472095 remain, 4 skipped
due to pins, 39075 skipped frozen
2019-04-11 19:39:44.450844500 tuples: 19150 removed, 2725811 remain, 465
are dead but not yet removable
2019-04-11 19:39:44.450845500 buffer usage: 62407557 hits, 6984769
misses, 116409 dirtied
2019-04-11 19:39:44.450846500 avg read rate: 16.263 MB/s, avg write
rate: 0.271 MB/s
2019-04-11 19:39:44.450847500 system usage: CPU 59.05s/115.26u sec
elapsed 3355.28 sec
Maybe I am off base, but those read/write rates seem very low. Is this
running on spinning disks? Also, less than half a million rows remain and 2.7
million dead but not removed in this autovacuum. It seems to indicate that
autovacuum is not as aggressive as it needs to be. Have you verified that
it actually blocks normal activity when autovacuum does more work each pass? If
you are hardware bound, could you use partitioning to allow autovacuum to
work on the partitions separately and perhaps keep up?
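For instance, a minimal hash-partitioning sketch (PostgreSQL 11 or later;
table and column names are made up, and whether a sensible partition key
exists depends on the schema):

  -- hypothetical: split the hot table so each partition is vacuumed
  -- independently, keeping individual runs short
  CREATE TABLE bar_partitioned (
      id      bigint NOT NULL,
      payload text
  ) PARTITION BY HASH (id);

  CREATE TABLE bar_p0 PARTITION OF bar_partitioned
      FOR VALUES WITH (MODULUS 4, REMAINDER 0);
  CREATE TABLE bar_p1 PARTITION OF bar_partitioned
      FOR VALUES WITH (MODULUS 4, REMAINDER 1);
  -- ...remaining partitions follow the same pattern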
On 2019-Apr-11, rihad wrote:
2019-04-11 19:39:44.450844500 tuples: 19150 removed, 2725811 remain, 465 are dead but not yet removable
What Jeff said. This vacuum spent a lot of time, only to remove a measly
19k tuples, but 2.7M dead tuples remained ... probably because you have
long-running transactions preventing vacuum from removing them. Maybe
that was pg_dump.
I suspect what you really need is to keep an eye on the age of your oldest
transactions. Don't leave anything "idle in transaction".
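Something like this shows who is holding transactions open (stock
pg_stat_activity columns):

  -- open transactions, oldest first
  SELECT pid, state, xact_start, now() - xact_start AS xact_age
  FROM pg_stat_activity
  WHERE xact_start IS NOT NULL
  ORDER BY xact_start;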
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2019-04-11 19:39:44.450844500 tuples: 19150 removed, 2725811 remain,
465 are dead but not yet removable
What Jeff said. This vacuum spent a lot of time, only to remove a measly
19k tuples, but 2.7M dead tuples remained ... probably because you have
long-running transactions preventing vacuum from removing them. Maybe
that was pg_dump.
Wouldn't "dead but not yet removable" be high if there were long running
transactions holding onto old row versions?
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
On 2019-Apr-11, rihad wrote:
2019-04-11 19:39:44.450844500 tuples: 19150 removed, 2725811 remain, 465 are dead but not yet removable
What Jeff said. This vacuum spent a lot of time, only to remove a measly
19k tuples, but 2.7M dead tuples remained ... probably because you have
long-running transactions preventing vacuum from removing them.
I think you misread it --- I'm pretty sure "N remain" is referring
to live tuples. Maybe we should adjust the wording to make that
clearer?
regards, tom lane