encouraging index-only scans

Started by Andrew Dunstanover 13 years ago64 messageshackers
Jump to latest
#1Andrew Dunstan
andrew@dunslane.net

A client is testing a migration from 9.1 to 9.2, and has found that a
large number of queries run much faster if they use index-only scans.
However, the only way he has found to get such a plan is by increasing
the seq_page_cost to insanely high levels (3.5). Is there any approved
way to encourage such scans that's a but less violent than this?

cheers

andrew

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#1)
Re: encouraging index-only scans

Andrew Dunstan <andrew@dunslane.net> writes:

A client is testing a migration from 9.1 to 9.2, and has found that a
large number of queries run much faster if they use index-only scans.
However, the only way he has found to get such a plan is by increasing
the seq_page_cost to insanely high levels (3.5). Is there any approved
way to encourage such scans that's a but less violent than this?

Is the pg_class.relallvisible estimate for the table realistic? They
might need a few more VACUUM and ANALYZE cycles to get it into the
neighborhood of reality, if not.

Keep in mind also that small values of random_page_cost necessarily
decrease the apparent advantage of index-only scans. If you think 3.5
is an "insanely high" setting, I wonder whether you haven't driven those
numbers too far in the other direction to compensate for something else.

regards, tom lane

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

#3Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#2)
Re: encouraging index-only scans

On 12/12/2012 04:32 PM, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

A client is testing a migration from 9.1 to 9.2, and has found that a
large number of queries run much faster if they use index-only scans.
However, the only way he has found to get such a plan is by increasing
the seq_page_cost to insanely high levels (3.5). Is there any approved
way to encourage such scans that's a but less violent than this?

Is the pg_class.relallvisible estimate for the table realistic? They
might need a few more VACUUM and ANALYZE cycles to get it into the
neighborhood of reality, if not.

That was the problem - I didn't know this hadn't been done.

Keep in mind also that small values of random_page_cost necessarily
decrease the apparent advantage of index-only scans. If you think 3.5
is an "insanely high" setting, I wonder whether you haven't driven those
numbers too far in the other direction to compensate for something else.

Right.

Thanks for the help.

cheers

andrew

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

#4Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#3)
Re: encouraging index-only scans

On 12/12/2012 05:12 PM, Andrew Dunstan wrote:

On 12/12/2012 04:32 PM, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

A client is testing a migration from 9.1 to 9.2, and has found that a
large number of queries run much faster if they use index-only scans.
However, the only way he has found to get such a plan is by increasing
the seq_page_cost to insanely high levels (3.5). Is there any approved
way to encourage such scans that's a but less violent than this?

Is the pg_class.relallvisible estimate for the table realistic? They
might need a few more VACUUM and ANALYZE cycles to get it into the
neighborhood of reality, if not.

That was the problem - I didn't know this hadn't been done.

Actually, the table had been analysed but not vacuumed, so this kinda
begs the question what will happen to this value on pg_upgrade? Will
people's queries suddenly get slower until autovacuum kicks in on the table?

cheers

andrew

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

#5Bruce Momjian
bruce@momjian.us
In reply to: Andrew Dunstan (#4)
Re: [PERFORM] encouraging index-only scans

On Wed, Dec 12, 2012 at 05:27:39PM -0500, Andrew Dunstan wrote:

On 12/12/2012 05:12 PM, Andrew Dunstan wrote:

On 12/12/2012 04:32 PM, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

A client is testing a migration from 9.1 to 9.2, and has found that a
large number of queries run much faster if they use index-only scans.
However, the only way he has found to get such a plan is by increasing
the seq_page_cost to insanely high levels (3.5). Is there any approved
way to encourage such scans that's a but less violent than this?

Is the pg_class.relallvisible estimate for the table realistic? They
might need a few more VACUUM and ANALYZE cycles to get it into the
neighborhood of reality, if not.

That was the problem - I didn't know this hadn't been done.

Actually, the table had been analysed but not vacuumed, so this
kinda begs the question what will happen to this value on
pg_upgrade? Will people's queries suddenly get slower until
autovacuum kicks in on the table?

[ moved to hackers list.]

Yes, this does seem like a problem for upgrades from 9.2 to 9.3? We can
have pg_dump --binary-upgrade set these, or have ANALYZE set it. I
would prefer the later.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#5)
Re: [PERFORM] encouraging index-only scans

Bruce Momjian <bruce@momjian.us> writes:

On Wed, Dec 12, 2012 at 05:27:39PM -0500, Andrew Dunstan wrote:

Actually, the table had been analysed but not vacuumed, so this
kinda begs the question what will happen to this value on
pg_upgrade? Will people's queries suddenly get slower until
autovacuum kicks in on the table?

[ moved to hackers list.]

Yes, this does seem like a problem for upgrades from 9.2 to 9.3? We can
have pg_dump --binary-upgrade set these, or have ANALYZE set it. I
would prefer the later.

ANALYZE does not set that value, and is not going to start doing so,
because it doesn't scan enough of the table to derive a trustworthy
value.

It's been clear for some time that pg_upgrade ought to do something
about transferring the "statistics" columns in pg_class to the new
cluster. This is just another example of why.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Tom Lane (#6)
Re: [PERFORM] encouraging index-only scans

On Thu, Dec 13, 2012 at 9:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Bruce Momjian <bruce@momjian.us> writes:

On Wed, Dec 12, 2012 at 05:27:39PM -0500, Andrew Dunstan wrote:

Actually, the table had been analysed but not vacuumed, so this
kinda begs the question what will happen to this value on
pg_upgrade? Will people's queries suddenly get slower until
autovacuum kicks in on the table?

[ moved to hackers list.]

Yes, this does seem like a problem for upgrades from 9.2 to 9.3? We can
have pg_dump --binary-upgrade set these, or have ANALYZE set it. I
would prefer the later.

ANALYZE does not set that value, and is not going to start doing so,
because it doesn't scan enough of the table to derive a trustworthy
value.

Should we do that though ? i.e. scan the entire map and count the
number of bits at the end of ANALYZE, like we do at the end of VACUUM
? I recently tried to optimize that code path by not recounting at the
end of the vacuum and instead track the number of all-visible bits
while scanning them in the earlier phases on vacuum. But it turned out
that its so fast to count even a million bits that its probably not
worth doing so.

It's been clear for some time that pg_upgrade ought to do something
about transferring the "statistics" columns in pg_class to the new
cluster. This is just another example of why.

+1.

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#6)
Re: [PERFORM] encouraging index-only scans

On 13 December 2012 03:51, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yes, this does seem like a problem for upgrades from 9.2 to 9.3? We can
have pg_dump --binary-upgrade set these, or have ANALYZE set it. I
would prefer the later.

ANALYZE does not set that value, and is not going to start doing so,
because it doesn't scan enough of the table to derive a trustworthy
value.

ISTM that ANALYZE doesn't need to scan the table to do this. The
vismap is now trustworthy and we can scan it separately on ANALYZE.

More to the point, since we run ANALYZE more frequently than we run
VACUUM, the value stored by the last VACUUM could be very stale.

It's been clear for some time that pg_upgrade ought to do something
about transferring the "statistics" columns in pg_class to the new
cluster. This is just another example of why.

Agreed, but that could bring other problems as well.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#8)
Re: [PERFORM] encouraging index-only scans

On Thu, Dec 13, 2012 at 09:40:40AM +0000, Simon Riggs wrote:

On 13 December 2012 03:51, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yes, this does seem like a problem for upgrades from 9.2 to 9.3? We can
have pg_dump --binary-upgrade set these, or have ANALYZE set it. I
would prefer the later.

ANALYZE does not set that value, and is not going to start doing so,
because it doesn't scan enough of the table to derive a trustworthy
value.

ISTM that ANALYZE doesn't need to scan the table to do this. The
vismap is now trustworthy and we can scan it separately on ANALYZE.

More to the point, since we run ANALYZE more frequently than we run
VACUUM, the value stored by the last VACUUM could be very stale.

Wouldn't inserts affect the relallvisible ratio, but not cause a vacuum?
Seems we should be having analyze update this independent of pg_upgrade
needing it. Also, why is this in pg_class?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

In reply to: Tom Lane (#6)
Re: [PERFORM] encouraging index-only scans

On 13 December 2012 03:51, Tom Lane <tgl@sss.pgh.pa.us> wrote:

ANALYZE does not set that value, and is not going to start doing so,
because it doesn't scan enough of the table to derive a trustworthy
value.

I'm slightly surprised by your remarks here, because the commit
message where the relallvisible column was added (commit
a2822fb9337a21f98ac4ce850bb4145acf47ca27) says:

"Add a column pg_class.relallvisible to remember the number of pages
that were all-visible according to the visibility map as of the last
VACUUM
(or ANALYZE, or some other operations that update pg_class.relpages).
Use relallvisible/relpages, instead of an arbitrary constant, to
estimate how many heap page fetches can be avoided during an
index-only scan."

Have I missed some nuance?

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Bruce Momjian
bruce@momjian.us
In reply to: Peter Geoghegan (#10)
Re: [PERFORM] encouraging index-only scans

On Thu, Dec 13, 2012 at 03:31:06PM +0000, Peter Geoghegan wrote:

On 13 December 2012 03:51, Tom Lane <tgl@sss.pgh.pa.us> wrote:

ANALYZE does not set that value, and is not going to start doing so,
because it doesn't scan enough of the table to derive a trustworthy
value.

I'm slightly surprised by your remarks here, because the commit
message where the relallvisible column was added (commit
a2822fb9337a21f98ac4ce850bb4145acf47ca27) says:

"Add a column pg_class.relallvisible to remember the number of pages
that were all-visible according to the visibility map as of the last
VACUUM
(or ANALYZE, or some other operations that update pg_class.relpages).
Use relallvisible/relpages, instead of an arbitrary constant, to
estimate how many heap page fetches can be avoided during an
index-only scan."

Have I missed some nuance?

I am looking back at this issue now and I think you are correct. The
commit you mention (Oct 7 2011) says ANALYZE updates the visibility map,
and the code matches that:

if (!inh)
vac_update_relstats(onerel,
RelationGetNumberOfBlocks(onerel),
totalrows,
--> visibilitymap_count(onerel),
hasindex,
InvalidTransactionId);

so if an index scan was not being used after an ANALYZE, it isn't a bad
allvisibile estimate but something else. This code was in PG 9.2.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#11)
Re: [PERFORM] encouraging index-only scans

On Wed, Sep 4, 2013 at 04:56:55PM -0400, Bruce Momjian wrote:

"Add a column pg_class.relallvisible to remember the number of pages
that were all-visible according to the visibility map as of the last
VACUUM
(or ANALYZE, or some other operations that update pg_class.relpages).
Use relallvisible/relpages, instead of an arbitrary constant, to
estimate how many heap page fetches can be avoided during an
index-only scan."

Have I missed some nuance?

I am looking back at this issue now and I think you are correct. The
commit you mention (Oct 7 2011) says ANALYZE updates the visibility map,
and the code matches that:

if (!inh)
vac_update_relstats(onerel,
RelationGetNumberOfBlocks(onerel),
totalrows,
--> visibilitymap_count(onerel),
hasindex,
InvalidTransactionId);

so if an index scan was not being used after an ANALYZE, it isn't a bad
allvisibile estimate but something else. This code was in PG 9.2.

Actually, I now realize it is more complex than that, and worse. There
are several questions to study to understand when pg_class.relallvisible
is updated (which is used to determine if index-only scans are a good
optimization choice), and when VM all-visible bits are set so heap pages
can be skipped during index-only scans:

1) When are VM bits set:
vacuum (non-full)
analyze (only some random pages)
2) When are massive rows added but VM bits not set:
copy
3) When are VM bits cleared:
insert/update/delete
vacuum (non-full)
4) When are VM map files cleared:
vacuum full
cluster
5) When is pg_class.relallvisible updated via a VM map file scan:
vacuum (non-full)
analyze

Vacuums run by autovacuum are driven by n_dead_tuples, which is only
update and delete. Therefore, any scenario where vacuum (non-full) is
never run will not have significant VM bits set. The only bits that
will be set will be by pages visited randomly by analyze.

The following table activities will not set proper VM bits:

vacuum full
cluster
copy
insert-only

If updates and deletes happen, there will eventually be sufficient
reason for autovacuum to vacuum the table and set proper VM bits, and
pg_class.relallvisible.

The calculus we should use to determine when we need to run vacuum has
changed with index-only scans, and I am not sure we ever fully addressed
this.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#12)
Re: [PERFORM] encouraging index-only scans

On Thu, Sep 5, 2013 at 8:14 PM, Bruce Momjian <bruce@momjian.us> wrote:

Actually, I now realize it is more complex than that, and worse. There
are several questions to study to understand when pg_class.relallvisible
is updated (which is used to determine if index-only scans are a good
optimization choice), and when VM all-visible bits are set so heap pages
can be skipped during index-only scans:

1) When are VM bits set:
vacuum (non-full)
analyze (only some random pages)

Analyze doesn't set visibility-map bits. It only updates statistics
about how many are set.

The calculus we should use to determine when we need to run vacuum has
changed with index-only scans, and I am not sure we ever fully addressed
this.

Yeah, we didn't. I think the hard part is figuring out what behavior
would be best. Counting inserts as well as updates and deletes would
be a simple approach, but I don't have much confidence in it. My
experience is that having vacuum or analyze kick in during a bulk-load
operation is a disaster. We'd kinda like to come up with a way to
make vacuum run after the bulk load is complete, maybe, but how would
we identify that time, and there are probably cases where that's not
right either.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Gavin Flower
GavinFlower@archidevsys.co.nz
In reply to: Robert Haas (#13)
Re: [PERFORM] encouraging index-only scans

On 06/09/13 13:10, Robert Haas wrote:

On Thu, Sep 5, 2013 at 8:14 PM, Bruce Momjian <bruce@momjian.us> wrote:

Actually, I now realize it is more complex than that, and worse. There
are several questions to study to understand when pg_class.relallvisible
is updated (which is used to determine if index-only scans are a good
optimization choice), and when VM all-visible bits are set so heap pages
can be skipped during index-only scans:

1) When are VM bits set:
vacuum (non-full)
analyze (only some random pages)

Analyze doesn't set visibility-map bits. It only updates statistics
about how many are set.

The calculus we should use to determine when we need to run vacuum has
changed with index-only scans, and I am not sure we ever fully addressed
this.

Yeah, we didn't. I think the hard part is figuring out what behavior
would be best. Counting inserts as well as updates and deletes would
be a simple approach, but I don't have much confidence in it. My
experience is that having vacuum or analyze kick in during a bulk-load
operation is a disaster. We'd kinda like to come up with a way to
make vacuum run after the bulk load is complete, maybe, but how would
we identify that time, and there are probably cases where that's not
right either.

How about a 'VACUUM AFTER' command (part of the 'BEGIN' transaction
syntax?) that would:

1. only be valid in a transaction
2. initiate a vacuum after the current transaction completed
3. defer any vacuum triggered due to other criteria

If the transaction was rolled back: then if there was a pending vacuum,
due to other reasons, it would then be actioned.

On normal transaction completion, then if there was a pending vacuum it
would be combined with the one in the transaction.

Still would need some method of ensuring any pending vacuum was done if
the transaction hung, or took too long.

Cheers,
Gavin

#15Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#13)
Re: [PERFORM] encouraging index-only scans

On Thu, Sep 5, 2013 at 09:10:06PM -0400, Robert Haas wrote:

On Thu, Sep 5, 2013 at 8:14 PM, Bruce Momjian <bruce@momjian.us> wrote:

Actually, I now realize it is more complex than that, and worse. There
are several questions to study to understand when pg_class.relallvisible
is updated (which is used to determine if index-only scans are a good
optimization choice), and when VM all-visible bits are set so heap pages
can be skipped during index-only scans:

1) When are VM bits set:
vacuum (non-full)
analyze (only some random pages)

Analyze doesn't set visibility-map bits. It only updates statistics
about how many are set.

Sorry, yes you are correct.

The calculus we should use to determine when we need to run vacuum has
changed with index-only scans, and I am not sure we ever fully addressed
this.

Yeah, we didn't. I think the hard part is figuring out what behavior
would be best. Counting inserts as well as updates and deletes would
be a simple approach, but I don't have much confidence in it. My
experience is that having vacuum or analyze kick in during a bulk-load
operation is a disaster. We'd kinda like to come up with a way to
make vacuum run after the bulk load is complete, maybe, but how would
we identify that time, and there are probably cases where that's not
right either.

I am unsure how we have gone a year with index-only scans and I am just
now learning that it only works well with update/delete workloads or by
running vacuum manually. I only found this out going back over January
emails. Did other people know this? Was it not considered a serious
problem?

Well, our logic has been that vacuum is only for removing expired rows.
I think we either need to improve that, or somehow make sequential scans
update the VM map, and then find a way to trigger update of
relallvisible even without inserts.

Ideas
-----

I think we need to detect tables that do not have VM bits set and try to
determine if they should be vacuumed. If a table has most of its VM
bits set, there in need to vacuum it for VM bit setting.

Autovacuum knows how many pages are in the table via its file size, and
it can scan the VM map to see how many pages are _not_ marked
all-visible. If the VM map has many pages that are _not_ marked as
all-visible, and change count since last vacuum is low, those pages
might now be all-visible and vacuum might find them. One problem is
that a long-running transaction is not going to update relallvisible
until commit, so you might be vacuuming a table that is being modified,
e.g. bulk loads. Do we have any way of detecting if a backend is
modifying a table?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bruce Momjian (#15)
Re: [PERFORM] encouraging index-only scans

Bruce Momjian escribi�:

Ideas
-----

I think we need to detect tables that do not have VM bits set and try to
determine if they should be vacuumed. If a table has most of its VM
bits set, there in need to vacuum it for VM bit setting.

I think it's shortsighted to keep thinking of autovacuum as just a way
to run VACUUM and ANALYZE. We have already discussed work items that
need to be done separately, such as truncating the last few empty pages
on a relation that was vacuumed recently. We also need to process a GIN
index' pending insertion list; and with minmax indexes I will want to
run summarization of heap page ranges.

So maybe instead of trying to think of VM bit setting as part of vacuum,
we could just keep stats about how many pages we might need to scan
because of possibly needing to set the bit, and then doing that in
autovacuum, independently from actually vacuuming the relation.

I'm not sure if we need to expose all these new maintenance actions as
SQL commands.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Alvaro Herrera (#16)
Re: [PERFORM] encouraging index-only scans

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

I'm not sure if we need to expose all these new maintenance actions as
SQL commands.

I strongly think we should, if only for diagnostic purposes. Also to
adapt to some well defined workloads that the automatic system is not
designed to handle.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Hannu Krosing
hannu@tm.ee
In reply to: Dimitri Fontaine (#17)
Re: [PERFORM] encouraging index-only scans

On 09/06/2013 09:23 AM, Dimitri Fontaine wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

I'm not sure if we need to expose all these new maintenance actions as
SQL commands.

I strongly think we should, if only for diagnostic purposes.

It would be much easier and more flexible to expose them
as pg_*() function calls, not proper "commands".

Also to
adapt to some well defined workloads that the automatic system is not
designed to handle.

+1

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic O�

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#16)
Re: [PERFORM] encouraging index-only scans

On 2013-09-06 01:22:36 -0400, Alvaro Herrera wrote:

I think it's shortsighted to keep thinking of autovacuum as just a way
to run VACUUM and ANALYZE. We have already discussed work items that
need to be done separately, such as truncating the last few empty pages
on a relation that was vacuumed recently. We also need to process a GIN
index' pending insertion list; and with minmax indexes I will want to
run summarization of heap page ranges.

Agreed.

So maybe instead of trying to think of VM bit setting as part of vacuum,
we could just keep stats about how many pages we might need to scan
because of possibly needing to set the bit, and then doing that in
autovacuum, independently from actually vacuuming the relation.

I am not sure I understand this though. What would be the point to go
and set all visible and not do the rest of the vacuuming work?

I think triggering vacuuming by scanning the visibility map for the
number of unset bits and use that as another trigger is a good idea. The
vm should ensure we're not doing superflous work.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Andres Freund
andres@anarazel.de
In reply to: Hannu Krosing (#18)
Re: [PERFORM] encouraging index-only scans

On 2013-09-06 13:38:56 +0200, Hannu Krosing wrote:

On 09/06/2013 09:23 AM, Dimitri Fontaine wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

I'm not sure if we need to expose all these new maintenance actions as
SQL commands.

I strongly think we should, if only for diagnostic purposes.

It would be much easier and more flexible to expose them
as pg_*() function calls, not proper "commands".

I don't think that's as easy as you might imagine. For much of what's
done in that context you cannot be in a transaction, you even need to be
in a toplevel statement (since we internally
CommitTransactionCommand/StartTransactionCommand).

So those pg_* commands couldn't be called (except possibly via the
fastpath function call API ...) which might restrict their usefulnes a
teensy bit ;)

So, I think extending the options passed to VACUUM - since it can take
pretty generic options these days - is a more realistic path.

Also to
adapt to some well defined workloads that the automatic system is not
designed to handle.

+1

What would you like to expose individually?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Hannu Krosing
hannu@tm.ee
In reply to: Andres Freund (#20)
#22Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#19)
#23Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#22)
#24Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#23)
#25Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#24)
#26Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Gavin Flower (#14)
#27Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Bruce Momjian (#25)
#28Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#24)
#29Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#25)
#30Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#29)
#31Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#30)
#32Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#31)
#33Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#32)
#34Jeff Janes
jeff.janes@gmail.com
In reply to: Bruce Momjian (#15)
#35Amit Kapila
amit.kapila16@gmail.com
In reply to: Jeff Janes (#34)
#36Jeff Janes
jeff.janes@gmail.com
In reply to: Amit Kapila (#35)
#37Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#33)
#38Amit Kapila
amit.kapila16@gmail.com
In reply to: Jeff Janes (#36)
#39Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Andres Freund (#31)
#40Andres Freund
andres@anarazel.de
In reply to: Jim Nasby (#39)
#41Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Andres Freund (#40)
#42Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#40)
#43Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#42)
#44Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#42)
#45Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#44)
#46Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#45)
#47Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#46)
#48Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#42)
#49Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#48)
#50Jeff Janes
jeff.janes@gmail.com
In reply to: Robert Haas (#49)
#51Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Janes (#50)
#52Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#49)
#53Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#52)
#54Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#53)
#55Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#54)
#56Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#55)
#57Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#56)
#58Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#57)
#59Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#58)
#60Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#58)
#61Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#59)
#62Jeff Janes
jeff.janes@gmail.com
In reply to: Bruce Momjian (#54)
#63Claudio Freire
klaussfreire@gmail.com
In reply to: Jeff Janes (#62)
#64Bruce Momjian
bruce@momjian.us
In reply to: Claudio Freire (#63)