cost based vacuum (parallel)

Started by Amit Kapila · over 6 years ago · 44 messages · pgsql-hackers
#1 Amit Kapila
amit.kapila16@gmail.com

For parallel vacuum [1], we were discussing what is the best way to
divide the cost among parallel workers, but we didn't get many inputs
apart from the people who are very actively involved in patch development.
I feel that we need some more inputs before we finalize anything, so
I am starting a new thread.

The initial version of the patch has a very rudimentary way of doing
it: each parallel vacuum worker operates independently
w.r.t. vacuum delay and cost. This will lead to more I/O in the system
than the user intended. Assume that the overall I/O allowed
for the vacuum operation is X, after which it will sleep for some time,
reset the balance, and continue. With the patch, each worker will be
allowed to perform X before it sleeps, and there is also no
coordination with the master backend, which would have done
some I/O for the heap itself. So, in the worst-case scenario, there can be n
times more I/O, where n is the number of workers doing the parallel
operation. This is somewhat similar to the memory usage problem with a
parallel query, where each worker is allowed to use up to work_mem of
memory. We could say that users running a parallel operation expect
more system resources to be used, as they want to get the operation
done faster, so we are fine with this. However, I am not sure if that
is the right thing, so we should try to come up with some solution for
it; if the solution is too complex, then we can probably think of
documenting such behavior.

The two approaches to solve this problem being discussed in that
thread [1] are as follows:
(a) Allow the parallel workers and the master backend to have a shared
view of the vacuum cost related parameters (mainly VacuumCostBalance) and
allow each worker to update it and then, based on that, decide whether
it needs to sleep. Sawada-san has done a POC for this approach;
see v32-0004-PoC-shared-vacuum-cost-balance in email [2]. One
drawback of this approach could be that we allow a worker to sleep
even though the I/O has been performed by some other worker.

(b) The other idea could be that we split the I/O among workers,
similar to what we do for autovacuum workers (see
autovac_balance_cost). The basic idea would be that before launching
workers, we compute the remaining I/O (the heap operation would
have used some) after which we need to sleep, and split it equally
across workers. Here, we are primarily thinking of dividing the
VacuumCostBalance and VacuumCostLimit parameters. Once the workers
are finished, they need to let the master backend know how much I/O they
have consumed, and then the master backend can add it to its current I/O
consumed. I think we also need to rebalance the cost of the remaining
workers once some of the workers exit. Dilip has prepared a POC
patch for this, see 0002-POC-divide-vacuum-cost-limit in email [3].

I think approach-2 is better at throttling the system, as it doesn't
have the drawback of the first approach, but it might be a bit trickier
to implement.

As of now, POCs for both approaches have been developed and show
similar results, but we have only tested simple cases where each worker
has a similar amount of I/O to perform.

Thoughts?

[1]: https://commitfest.postgresql.org/25/1774/
[2]: /messages/by-id/CAD21AoAqT17QwKJ_sWOqRxNvg66wMw1oZZzf9Rt-E-zD+XOh_Q@mail.gmail.com
[3]: /messages/by-id/CAFiTN-thU-z8f04jO7xGMu5yUUpTpsBTvBrFW6EhRf-jGvEz=g@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#2 Darafei "Komяpa" Praliaskouski
me@komzpa.net
In reply to: Amit Kapila (#1)
Re: cost based vacuum (parallel)

This is somewhat similar to a memory usage problem with a
parallel query where each worker is allowed to use up to work_mem of
memory. We can say that the users using parallel operation can expect
more system resources to be used as they want to get the operation
done faster, so we are fine with this. However, I am not sure if that
is the right thing, so we should try to come up with some solution for
it and if the solution is too complex, then probably we can think of
documenting such behavior.

In cloud environments (Amazon + gp2) there's a budget on input/output
operations. If you exceed it for a long time, everything starts to feel
like you are working with a floppy disk.

For ease of configuration, I would need a "max_vacuum_disk_iops" that
would limit the number of input/output operations performed by all of the
vacuums in the system. If I set it to less than the budget refill rate, I
can be sure that no vacuum runs fast enough to impact any sibling query.

There's also value in a non-throttled VACUUM for smaller tables. On gp2
such things will be consumed out of the surge budget, and its size is
known to the sysadmin. Let's call it "max_vacuum_disk_surge_iops": if a
relation has fewer blocks than this value and the situation is blocking
in any way (antiwraparound, interactive console, ...), go ahead and run
without throttling.

For how to balance the cost: if we know the number of vacuum processes
that were running in the previous second, we can just divide the slot for
this iteration by that previous number.

To correct for overshoots, we can subtract the previous second's
overshoot from the next one's slot. That would also allow accounting for
surge budget usage and letting it refill, pausing all autovacuum for some
time after a manual one.

Accounting at a granularity finer than once per second isn't beneficial
for this use case.

Please don't forget that processing one page can become several iops
(read, write, WAL).

Does this make sense? :)

#3 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#1)
Re: cost based vacuum (parallel)

On Mon, Nov 4, 2019 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

I think approach-2 is better in throttling the system as it doesn't
have the drawback of the first approach, but it might be a bit tricky
to implement.

I might be missing something, but I think approach-2 could have the
same drawback as approach-1, depending on which index pages are loaded
in shared buffers and on the vacuum delay settings. Is that right?

Regards,

---
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#4 Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#3)
Re: cost based vacuum (parallel)

On Mon, Nov 4, 2019 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Nov 4, 2019 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

I think approach-2 is better in throttling the system as it doesn't
have the drawback of the first approach, but it might be a bit tricky
to implement.

I might be missing something but I think that there could be the
drawback of the approach-1 even on approach-2 depending on index pages
loaded on the shared buffer and the vacuum delay setting.

Can you be a bit more specific about this?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#5 Amit Kapila
amit.kapila16@gmail.com
In reply to: Darafei "Komяpa" Praliaskouski (#2)
Re: cost based vacuum (parallel)

On Mon, Nov 4, 2019 at 1:03 PM Darafei "Komяpa" Praliaskouski
<me@komzpa.net> wrote:

This is somewhat similar to a memory usage problem with a
parallel query where each worker is allowed to use up to work_mem of
memory. We can say that the users using parallel operation can expect
more system resources to be used as they want to get the operation
done faster, so we are fine with this. However, I am not sure if that
is the right thing, so we should try to come up with some solution for
it and if the solution is too complex, then probably we can think of
documenting such behavior.

In cloud environments (Amazon + gp2) there's a budget on input/output operations. If you cross it for long time, everything starts looking like you work with a floppy disk.

For the ease of configuration, I would need a "max_vacuum_disk_iops" that would limit number of input-output operations by all of the vacuums in the system. If I set it to less than value of budget refill, I can be sure than that no vacuum runs too fast to impact any sibling query.

There's also value in non-throttled VACUUM for smaller tables. On gp2 such things will be consumed out of surge budget, and its size is known to sysadmin. Let's call it "max_vacuum_disk_surge_iops" - if a relation has less blocks than this value and it's a blocking in any way situation (antiwraparound, interactive console, ...) - go on and run without throttling.

I think the need for these things can be addressed by the current
cost-based vacuum parameters. See the docs [1]. For example, if you set
vacuum_cost_delay to zero, it will allow the operation to be performed
without throttling.

For how to balance the cost: if we know a number of vacuum processes that were running in the previous second, we can just divide a slot for this iteration by that previous number.

To correct for overshots, we can subtract the previous second's overshot from next one's. That would also allow to account for surge budget usage and let it refill, pausing all autovacuum after a manual one for some time.

Precision of accounting limiting count of operations more than once a second isn't beneficial for this use case.

I think it is better if we find a way to rebalance the cost when a
worker exits rather than every second, as it won't change unless a
worker exits anyway.

[1]: https://www.postgresql.org/docs/devel/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-VACUUM-COST

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#6 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#4)
Re: cost based vacuum (parallel)

On Mon, 4 Nov 2019 at 19:26, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Nov 4, 2019 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Nov 4, 2019 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

I think approach-2 is better in throttling the system as it doesn't
have the drawback of the first approach, but it might be a bit tricky
to implement.

I might be missing something but I think that there could be the
drawback of the approach-1 even on approach-2 depending on index pages
loaded on the shared buffer and the vacuum delay setting.

Can you be a bit more specific about this?

Suppose there are two indexes: one index is entirely loaded in shared
buffers while another index isn't. The vacuum worker processing the
former index hits all pages in shared buffers, but the worker processing
the latter index reads all pages from either the OS page cache or disk.
Even if both the cost limit and the cost balance are split evenly
among workers, because the costs of page hits and page misses are
different, it's possible that one vacuum worker sleeps while the other
workers are doing I/O.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7 Jeff Janes
jeff.janes@gmail.com
In reply to: Amit Kapila (#1)
Re: cost based vacuum (parallel)

On Mon, Nov 4, 2019 at 1:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

For parallel vacuum [1], we were discussing what is the best way to
divide the cost among parallel workers but we didn't get many inputs
apart from people who are very actively involved in patch development.
I feel that we need some more inputs before we finalize anything, so
starting a new thread.

Maybe I just don't have experience with the type of system that parallel
vacuum is needed for, but if there is any meaningful IO throttling
active, then what is the point of doing the vacuum in parallel in the
first place?

Cheers,

Jeff

#8 Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#1)
Re: cost based vacuum (parallel)

Hi,

On 2019-11-04 12:24:35 +0530, Amit Kapila wrote:

For parallel vacuum [1], we were discussing what is the best way to
divide the cost among parallel workers but we didn't get many inputs
apart from people who are very actively involved in patch development.
I feel that we need some more inputs before we finalize anything, so
starting a new thread.

The initial version of the patch has a very rudimentary way of doing
it which means each parallel vacuum worker operates independently
w.r.t vacuum delay and cost.

Yea, that seems not ok for cases where vacuum delay is active.

There's also the question of when/why it is beneficial to use
parallelism when you're going to encounter IO limits in all likelihood.

This will lead to more I/O in the system
than the user has intended to do. Assume that the overall I/O allowed
for vacuum operation is X after which it will sleep for some time,
reset the balance and continue. In the patch, each worker will be
allowed to perform X before which it can sleep and also there is no
coordination for the same with master backend which would have done
some I/O for the heap. So, in the worst-case scenario, there can be n
times more I/O where n is the number of workers doing the parallel
operation. This is somewhat similar to a memory usage problem with a
parallel query where each worker is allowed to use up to work_mem of
memory. We can say that the users using parallel operation can expect
more system resources to be used as they want to get the operation
done faster, so we are fine with this. However, I am not sure if that
is the right thing, so we should try to come up with some solution for
it and if the solution is too complex, then probably we can think of
documenting such behavior.

I mean for parallel query the problem wasn't really introduced in
parallel query, it existed before - and does still - for non-parallel
queries. And there's a complex underlying planning issue. I don't think
this is a good excuse for VACUUM, where none of the complex "number of
paths considered" issues etc apply.

The two approaches to solve this problem being discussed in that
thread [1] are as follows:
(a) Allow the parallel workers and master backend to have a shared
view of vacuum cost related parameters (mainly VacuumCostBalance) and
allow each worker to update it and then based on that decide whether
it needs to sleep. Sawada-San has done the POC for this approach.
See v32-0004-PoC-shared-vacuum-cost-balance in email [2]. One
drawback of this approach could be that we allow the worker to sleep
even though the I/O has been performed by some other worker.

I don't understand this drawback.

(b) The other idea could be that we split the I/O among workers
something similar to what we do for auto vacuum workers (see
autovac_balance_cost). The basic idea would be that before launching
workers, we need to compute the remaining I/O (heap operation would
have used something) after which we need to sleep and split it equally
across workers. Here, we are primarily thinking of dividing
VacuumCostBalance and VacuumCostLimit parameters. Once the workers
are finished, they need to let master backend know how much I/O they
have consumed and then master backend can add it to it's current I/O
consumed. I think we also need to rebalance the cost of remaining
workers once some of the worker's exit. Dilip has prepared a POC
patch for this, see 0002-POC-divide-vacuum-cost-limit in email [3].

(b) doesn't strike me as advantageous. It seems quite possible that you
end up with one worker that has a lot more IO than others, leading to
unnecessary sleeps, even though the actually available IO budget has not
been used up. Quite easy to see how that'd lead to parallel VACUUM
having a lower throughput than a single threaded one.

Greetings,

Andres Freund

#9 Andres Freund
andres@anarazel.de
In reply to: Jeff Janes (#7)
Re: cost based vacuum (parallel)

Hi,

On 2019-11-04 12:59:02 -0500, Jeff Janes wrote:

On Mon, Nov 4, 2019 at 1:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

For parallel vacuum [1], we were discussing what is the best way to
divide the cost among parallel workers but we didn't get many inputs
apart from people who are very actively involved in patch development.
I feel that we need some more inputs before we finalize anything, so
starting a new thread.

Maybe a I just don't have experience in the type of system that parallel
vacuum is needed for, but if there is any meaningful IO throttling which is
active, then what is the point of doing the vacuum in parallel in the first
place?

I am wondering the same - but to be fair, it's pretty easy to run into
cases where VACUUM is CPU bound. E.g. because most pages are in
shared_buffers, and compared to the size of the indexes number of tids
that need to be pruned is fairly small (also [1]). That means a lot of
pages need to be scanned, without a whole lot of IO going on. The
problem with that is just that the defaults for vacuum throttling will
also apply here; I've never seen anybody tune vacuum_cost_page_hit = 0,
vacuum_cost_page_dirty = 0, or such (in contrast, the latter is currently
the highest cost). Nor do we reduce the cost of vacuum_cost_page_dirty
for unlogged tables.

So while it doesn't seem unreasonable to want to use cost limiting to
protect against vacuum unexpectedly causing too much, especially read,
IO, I'm doubtful it has current practical relevance.

I'm wondering how much of the benefit of parallel vacuum really is just
to work around vacuum ringbuffers often massively hurting performance
(see e.g. [2]). Surely not all, but I'd be very unsurprised if it were a
large fraction.

Greetings,

Andres Freund

[1]: I don't think the patch addresses this; IIUC it's only running index
vacuums in parallel, but it's very easy to run into being CPU
bottlenecked when vacuuming a busily updated table. heap_hot_prune
can be really expensive, especially with longer update chains (I
think it may have an O(n^2) worst case even).
[2]: /messages/by-id/20160406105716.fhk2eparljthpzp6@alap3.anarazel.de

#10 Stephen Frost
sfrost@snowman.net
In reply to: Jeff Janes (#7)
Re: cost based vacuum (parallel)

Greetings,

* Jeff Janes (jeff.janes@gmail.com) wrote:

On Mon, Nov 4, 2019 at 1:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

For parallel vacuum [1], we were discussing what is the best way to
divide the cost among parallel workers but we didn't get many inputs
apart from people who are very actively involved in patch development.
I feel that we need some more inputs before we finalize anything, so
starting a new thread.

Maybe a I just don't have experience in the type of system that parallel
vacuum is needed for, but if there is any meaningful IO throttling which is
active, then what is the point of doing the vacuum in parallel in the first
place?

With parallelization across indexes, you could have a situation where
the individual indexes are on different tablespaces with independent
i/o, therefore the parallelization ends up giving you an increase in i/o
throughput, not just additional CPU time.

Thanks,

Stephen

#11 Andres Freund
andres@anarazel.de
In reply to: Stephen Frost (#10)
Re: cost based vacuum (parallel)

Hi,

On 2019-11-04 14:06:19 -0500, Stephen Frost wrote:

* Jeff Janes (jeff.janes@gmail.com) wrote:

On Mon, Nov 4, 2019 at 1:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

For parallel vacuum [1], we were discussing what is the best way to
divide the cost among parallel workers but we didn't get many inputs
apart from people who are very actively involved in patch development.
I feel that we need some more inputs before we finalize anything, so
starting a new thread.

Maybe a I just don't have experience in the type of system that parallel
vacuum is needed for, but if there is any meaningful IO throttling which is
active, then what is the point of doing the vacuum in parallel in the first
place?

With parallelization across indexes, you could have a situation where
the individual indexes are on different tablespaces with independent
i/o, therefore the parallelization ends up giving you an increase in i/o
throughput, not just additional CPU time.

How's that related to IO throttling being active or not?

Greetings,

Andres Freund

#12 Stephen Frost
sfrost@snowman.net
In reply to: Andres Freund (#11)
Re: cost based vacuum (parallel)

Greetings,

* Andres Freund (andres@anarazel.de) wrote:

On 2019-11-04 14:06:19 -0500, Stephen Frost wrote:

* Jeff Janes (jeff.janes@gmail.com) wrote:

On Mon, Nov 4, 2019 at 1:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

For parallel vacuum [1], we were discussing what is the best way to
divide the cost among parallel workers but we didn't get many inputs
apart from people who are very actively involved in patch development.
I feel that we need some more inputs before we finalize anything, so
starting a new thread.

Maybe a I just don't have experience in the type of system that parallel
vacuum is needed for, but if there is any meaningful IO throttling which is
active, then what is the point of doing the vacuum in parallel in the first
place?

With parallelization across indexes, you could have a situation where
the individual indexes are on different tablespaces with independent
i/o, therefore the parallelization ends up giving you an increase in i/o
throughput, not just additional CPU time.

How's that related to IO throttling being active or not?

You might find that you have to throttle the IO down when operating
exclusively against one IO channel, but if you have multiple IO channels
then the acceptable IO utilization could be higher as it would be
spread across the different IO channels.

In other words, the overall i/o allowance for a given operation might be
able to be higher if it's spread across multiple i/o channels, as it
wouldn't completely consume the i/o resources of any of them, whereas
with a higher allowance and a single i/o channel, there would likely be
an impact to other operations.

Whether this is really relevant only when it comes to parallel
operations is a bit of an interesting question - these considerations
might not require actual parallel operations, as a single process might
be able to go through multiple indexes concurrently and still hit the
i/o limit that was set for it overall across the tablespaces. I don't
know that it would actually be interesting or useful to spend the effort
to make that work, though, so from a practical perspective it's
probably only interesting to think about this when talking about
parallel vacuum.

I've been wondering if the accounting system should consider the cost
per tablespace when there's multiple tablespaces involved, instead of
throttling the overall process without consideration for the
per-tablespace utilization.

Thanks,

Stephen

#13 Andres Freund
andres@anarazel.de
In reply to: Stephen Frost (#12)
Re: cost based vacuum (parallel)

Hi,

On 2019-11-04 14:33:41 -0500, Stephen Frost wrote:

* Andres Freund (andres@anarazel.de) wrote:

On 2019-11-04 14:06:19 -0500, Stephen Frost wrote:

With parallelization across indexes, you could have a situation where
the individual indexes are on different tablespaces with independent
i/o, therefore the parallelization ends up giving you an increase in i/o
throughput, not just additional CPU time.

How's that related to IO throttling being active or not?

You might find that you have to throttle the IO down when operating
exclusively against one IO channel, but if you have multiple IO channels
then the acceptable IO utilization could be higher as it would be
spread across the different IO channels.

In other words, the overall i/o allowance for a given operation might be
able to be higher if it's spread across multiple i/o channels, as it
wouldn't completely consume the i/o resources of any of them, whereas
with a higher allowance and a single i/o channel, there would likely be
an impact to other operations.

As for if this is really relevant only when it comes to parallel
operations is a bit of an interesting question- these considerations
might not require actual parallel operations as a single process might
be able to go through multiple indexes concurrently and still hit the
i/o limit that was set for it overall across the tablespaces. I don't
know that it would actually be interesting or useful to spend the effort
to make that work though, so, from a practical perspective, it's
probably only interesting to think about this when talking about
parallel vacuum.

But you could just apply different budgets for different tablespaces?
That's quite doable independent of parallelism, as we don't have tables
or indexes spanning more than one tablespace. True, you could then make
the processing of an individual vacuum faster by allowing it to utilize
multiple tablespace budgets at the same time.

I've been wondering if the accounting system should consider the cost
per tablespace when there's multiple tablespaces involved, instead of
throttling the overall process without consideration for the
per-tablespace utilization.

This all seems like a feature proposal, or two, independent of the
patch/question at hand. I think there's a good argument to be had that
we should severely overhaul the current vacuum cost limiting - it's way
way too hard to understand the bandwidth that it's allowed to
consume. But unless one of the proposals makes that measurably harder or
easier, I think we don't gain anything by entangling an already complex
patchset with something new.

Greetings,

Andres Freund

#14 Stephen Frost
sfrost@snowman.net
In reply to: Andres Freund (#13)
Re: cost based vacuum (parallel)

Greetings,

* Andres Freund (andres@anarazel.de) wrote:

On 2019-11-04 14:33:41 -0500, Stephen Frost wrote:

* Andres Freund (andres@anarazel.de) wrote:

On 2019-11-04 14:06:19 -0500, Stephen Frost wrote:

With parallelization across indexes, you could have a situation where
the individual indexes are on different tablespaces with independent
i/o, therefore the parallelization ends up giving you an increase in i/o
throughput, not just additional CPU time.

How's that related to IO throttling being active or not?

You might find that you have to throttle the IO down when operating
exclusively against one IO channel, but if you have multiple IO channels
then the acceptable IO utilization could be higher as it would be
spread across the different IO channels.

In other words, the overall i/o allowance for a given operation might be
able to be higher if it's spread across multiple i/o channels, as it
wouldn't completely consume the i/o resources of any of them, whereas
with a higher allowance and a single i/o channel, there would likely be
an impact to other operations.

As for if this is really relevant only when it comes to parallel
operations is a bit of an interesting question- these considerations
might not require actual parallel operations as a single process might
be able to go through multiple indexes concurrently and still hit the
i/o limit that was set for it overall across the tablespaces. I don't
know that it would actually be interesting or useful to spend the effort
to make that work though, so, from a practical perspective, it's
probably only interesting to think about this when talking about
parallel vacuum.

But you could just apply different budgets for different tablespaces?

Yes, that would be one approach to addressing this, though it would
change the existing meaning of those cost parameters. I'm not sure if
we think that's an issue or not- if we only have this in the case of a
parallel vacuum then it's probably fine, I'm less sure if it'd be
alright to change that on an upgrade.

That's quite doable independent of parallelism, as we don't have tables
or indexes spanning more than one tablespace. True, you could then make
the processing of an individual vacuum faster by allowing to utilize
multiple tablespace budgets at the same time.

Yes, it's possible to do this independent of parallelism, but what I was
trying to get at above is that it might not be worth the effort. When
it comes to parallel vacuum, though, I'm not sure that you can just punt
on this question, since you'll naturally end up spanning multiple
tablespaces concurrently - at least if the heap and indexes are spread
across multiple tablespaces and you're operating against more than one
of those relations at a time (which, I admit, I'm not 100% sure is
actually happening with this proposed patch set; if it isn't, then this
isn't really an issue, though that would be pretty unfortunate, as then
you can't leverage multiple i/o channels concurrently, and Jeff's
question about why you'd be doing parallel vacuum with IO throttling
becomes a pretty good one).

Thanks,

Stephen

#15 Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#8)
Re: cost based vacuum (parallel)

On Mon, Nov 4, 2019 at 11:42 PM Andres Freund <andres@anarazel.de> wrote:

The two approaches to solve this problem being discussed in that
thread [1] are as follows:
(a) Allow the parallel workers and master backend to have a shared
view of vacuum cost related parameters (mainly VacuumCostBalance) and
allow each worker to update it and then based on that decide whether
it needs to sleep. Sawada-San has done the POC for this approach.
See v32-0004-PoC-shared-vacuum-cost-balance in email [2]. One
drawback of this approach could be that we allow the worker to sleep
even though the I/O has been performed by some other worker.

I don't understand this drawback.

I think the problem could be that the system is not properly throttled
when it is supposed to be. Let me try a simple example: say we
have two workers, w-1 and w-2. w-2 is primarily doing the I/O and
w-1 is doing very little I/O, but unfortunately whenever w-1 checks, it
finds that the cost_limit has been exceeded and it goes to sleep, while
w-2 continues. In such a situation, even though we have made one
of the workers sleep for the required time, ideally the worker that
was doing the I/O should have slept. The aim is to make the system stop
doing I/O whenever the limit has been exceeded, and that might not
happen in the above situation.

(b) The other idea could be that we split the I/O among workers
something similar to what we do for auto vacuum workers (see
autovac_balance_cost). The basic idea would be that before launching
workers, we need to compute the remaining I/O (heap operation would
have used something) after which we need to sleep and split it equally
across workers. Here, we are primarily thinking of dividing
VacuumCostBalance and VacuumCostLimit parameters. Once the workers
are finished, they need to let master backend know how much I/O they
have consumed and then master backend can add it to it's current I/O
consumed. I think we also need to rebalance the cost of remaining
workers once some of the worker's exit. Dilip has prepared a POC
patch for this, see 0002-POC-divide-vacuum-cost-limit in email [3].

(b) doesn't strike me as advantageous. It seems quite possible that you
end up with one worker that has a lot more IO than others, leading to
unnecessary sleeps, even though the actually available IO budget has not
been used up.

Yeah, this is possible, but to an extent it is possible in the current
design as well, where we balance the cost among autovacuum workers.
Now, it is quite possible that the current design itself is not good
and we don't want to do the same thing in another place, but at least
we will be consistent and can explain the overall behavior.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#16Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#9)
Re: cost based vacuum (parallel)

On Mon, Nov 4, 2019 at 11:58 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2019-11-04 12:59:02 -0500, Jeff Janes wrote:

On Mon, Nov 4, 2019 at 1:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

For parallel vacuum [1], we were discussing what is the best way to
divide the cost among parallel workers but we didn't get many inputs
apart from people who are very actively involved in patch development.
I feel that we need some more inputs before we finalize anything, so
starting a new thread.

Maybe I just don't have experience with the type of system that parallel
vacuum is needed for, but if there is any meaningful IO throttling
active, then what is the point of doing the vacuum in parallel in the
first place?

I am wondering the same - but to be fair, it's pretty easy to run into
cases where VACUUM is CPU bound. E.g. because most pages are in
shared_buffers, and, compared to the size of the indexes, the number of
tids that need to be pruned is fairly small (also [1]). That means a
lot of pages need to be scanned without a whole lot of IO going on.
The problem with that is just that the defaults for vacuum throttling
will also apply here; I've never seen anybody tune vacuum_cost_page_hit
= 0, vacuum_cost_page_dirty = 0, or such (in contrast, the latter is
currently the highest cost). Nor do we reduce the cost of
vacuum_cost_page_dirty for unlogged tables.

So while it doesn't seem unreasonable to want to use cost limiting to
protect against vacuum unexpectedly causing too much, especially read,
IO, I'm doubtful it has current practical relevance.

IIUC, you mean to say that there is not much practical use in doing a
parallel vacuum if I/O throttling is enabled for the operation, is that
right?

I'm wondering how much of the benefit of parallel vacuum really is just
to work around vacuum ringbuffers often massively hurting performance
(see e.g. [2]).

Yeah, it is a good thing to check, but if anything, I think parallel
vacuum will further improve performance with larger ring buffers, as
they will make it more CPU bound.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#17Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#13)
Re: cost based vacuum (parallel)

On Tue, Nov 5, 2019 at 1:12 AM Andres Freund <andres@anarazel.de> wrote:

On 2019-11-04 14:33:41 -0500, Stephen Frost wrote:

I've been wondering if the accounting system should consider the cost
per tablespace when there's multiple tablespaces involved, instead of
throttling the overall process without consideration for the
per-tablespace utilization.

This all seems like a feature proposal, or two, independent of the
patch/question at hand. I think there's a good argument to be had that
we should severely overhaul the current vacuum cost limiting - it's way
way too hard to understand the bandwidth that it's allowed to
consume. But unless one of the proposals makes that measurably harder or
easier, I think we don't gain anything by entangling an already complex
patchset with something new.

+1. I think even if we want something like per-tablespace costing for
(parallel) vacuum, it should be done as a separate patch. It is a
whole new area where we need to define the appropriate way to achieve
it, and it would change the current vacuum costing system in a big
way, which I don't think is reasonable to do as part of a parallel
vacuum patch.
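As an aside on how non-obvious the bandwidth implied by the cost parameters is, a back-of-the-envelope sketch (the settings below are example values only; actual defaults differ between versions and between manual and autovacuum):

```python
# Rough upper bound on the read bandwidth the cost-based delay permits,
# assuming every page access is a miss. Example settings, not defaults
# for any particular PostgreSQL version.
vacuum_cost_limit = 200      # cost units accumulated before sleeping
vacuum_cost_delay_ms = 20    # sleep length once the limit is reached
vacuum_cost_page_miss = 10   # cost charged per page read from disk
page_size_bytes = 8192

pages_per_cycle = vacuum_cost_limit / vacuum_cost_page_miss   # 20 pages
cycles_per_second = 1000 / vacuum_cost_delay_ms               # 50 cycles/s
bandwidth_mb_s = (pages_per_cycle * cycles_per_second
                  * page_size_bytes) / (1024 * 1024)
print(round(bandwidth_mb_s, 2))  # roughly 7.81 MB/s of page misses
```

This ignores the time spent actually doing the work between sleeps, so real throughput is somewhat higher; the point is that none of this is visible from the GUC values themselves.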

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#18Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#16)
Re: cost based vacuum (parallel)

On Tue, Nov 5, 2019 at 2:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:


Yeah, it is a good thing to check, but if anything, I think a parallel
vacuum will further improve the performance with larger ring buffers
as it will make it more CPU bound.

I have tested the same, and the results show that by increasing the
ring buffer size we can see a performance gain. And the gain is much
greater with the parallel vacuum.

Test case:
create table test(a int, b int, c int, d int, e int, f int, g int, h int);
create index idx1 on test(a);
create index idx2 on test(b);
create index idx3 on test(c);
create index idx4 on test(d);
create index idx5 on test(e);
create index idx6 on test(f);
create index idx7 on test(g);
create index idx8 on test(h);
insert into test select i,i,i,i,i,i,i,i from generate_series(1,1000000) as i;
delete from test where a < 300000;

(I have tested the parallel vacuum and non-parallel vacuum with
different ring buffer sizes)

8 indexes
ring buffer size 246kb -> non-parallel: 7.6 seconds, parallel (2 workers): 3.9 seconds
ring buffer size 256mb -> non-parallel: 6.1 seconds, parallel (2 workers): 3.2 seconds

4 indexes
ring buffer size 246kb -> non-parallel: 4.8 seconds, parallel (2 workers): 3.2 seconds
ring buffer size 256mb -> non-parallel: 3.8 seconds, parallel (2 workers): 2.6 seconds

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#19Andres Freund
andres@anarazel.de
In reply to: Dilip Kumar (#18)
Re: cost based vacuum (parallel)

Hi,

On November 5, 2019 7:16:41 AM PST, Dilip Kumar <dilipbalaut@gmail.com> wrote:

I have tested the same, and the results show that by increasing the
ring buffer size we can see a performance gain. And the gain is much
greater with the parallel vacuum.

(I have tested the parallel vacuum and non-parallel vacuum with
different ring buffer sizes)

Thanks!

8 indexes
ring buffer size 246kb -> non-parallel: 7.6 seconds, parallel (2 workers): 3.9 seconds
ring buffer size 256mb -> non-parallel: 6.1 seconds, parallel (2 workers): 3.2 seconds

4 indexes
ring buffer size 246kb -> non-parallel: 4.8 seconds, parallel (2 workers): 3.2 seconds
ring buffer size 256mb -> non-parallel: 3.8 seconds, parallel (2 workers): 2.6 seconds

What about the case of just disabling the ring buffer logic?

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

#20Dilip Kumar
dilipbalaut@gmail.com
In reply to: Andres Freund (#19)
Re: cost based vacuum (parallel)

On Tue, Nov 5, 2019 at 8:49 PM Andres Freund <andres@anarazel.de> wrote:


What about the case of just disabling the ring buffer logic?

Repeated the same test after disabling the ring buffer logic. The
results are almost the same as with the increased ring buffer size.

Tested with 4GB of shared buffers:

8 indexes
use shared buffers -> non-parallel: 6.2 seconds, parallel (2 workers): 3.3 seconds

4 indexes
use shared buffers -> non-parallel: 3.8 seconds, parallel (2 workers): 2.7 seconds

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#21Amit Kapila
amit.kapila16@gmail.com
In reply to: Stephen Frost (#14)
#22Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#21)
#23Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#22)
#24Stephen Frost
sfrost@snowman.net
In reply to: Amit Kapila (#21)
#25Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#15)
#26Dilip Kumar
dilipbalaut@gmail.com
In reply to: Stephen Frost (#24)
#27Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#25)
#28Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#27)
#29Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#28)
#30Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#29)
#31Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#30)
#32Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#31)
#33Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#32)
#34Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#33)
#35Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#34)
#36Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#33)
#37Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#36)
#38Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#37)
#39Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#38)
#40Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#39)
#41Mahendra Singh Thalor
mahi6run@gmail.com
In reply to: Amit Kapila (#35)
#42Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#40)
#43Dilip Kumar
dilipbalaut@gmail.com
In reply to: Mahendra Singh Thalor (#41)
#44Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#42)