Another idea for index-only scans
I have added another idea for index-only scans to the TODO list:
A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
On 8/15/07, Bruce Momjian <bruce@momjian.us> wrote:
I have added another idea for index-only scans to the TODO list:
A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.
Perhaps this is naive (or discussed and discarded... if so, I couldn't
find it, but I apologize if that's the case), but wouldn't recording
the xid of non-readonly transactions, at commit time, and at the table
level, be equivalent to the flag and remove the need for a counter?
Readers could just check the last-modification-xid at the beginning
and end of their scans to test for heap stability.
I suppose that would require a write-exclusive lock on some metadata
for each modified table during each commit... so perhaps it's a
non-starter right there.
--miker
Bruce,
I have added another idea for index-only scans to the TODO list:
A third idea would be for a heap scan to check if all rows are
visible and if so set a per-table flag which can be checked by index
scans. Any change to the table would have to clear the flag. To
detect changes during the heap scan a counter could be set at the
start and checked at the end --- if it is the same, the table has not
been modified --- any table change would increment the counter.
Seems marginal at best. Checking overlap between the index and the FSM/DSM
and only check dirty pages seems more intelligent, and able to cover a
larger number of cases.
--
--Josh
Josh Berkus
PostgreSQL @ Sun
San Francisco
"Bruce Momjian" <bruce@momjian.us> writes:
I have added another idea for index-only scans to the TODO list:
A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.
I think I would prefer to address this in the same infrastructure as the
dead-space-map. That way you're not dependent on having no updates happening
on the table at all. Any tuples on pages which contain no in-doubt tuples
could have their visibility check skipped but when you come across a tuple on
a page which has been modified since the last vacuum then you have to check
the visibility.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
A hybrid scan approach combined with this idea would fit nicely - provide results for tids that are directly visible and set a bit in a bitmap for those that need recheck and extend recheck to take a bitmap (wait - it already does :-)
- Luke
Msg is shrt cuz m on ma treo
-----Original Message-----
From: Gregory Stark [mailto:stark@enterprisedb.com]
Sent: Wednesday, August 15, 2007 02:58 PM Eastern Standard Time
To: Bruce Momjian
Cc: PostgreSQL-development
Subject: Re: [HACKERS] Another idea for index-only scans
"Bruce Momjian" <bruce@momjian.us> writes:
I have added another idea for index-only scans to the TODO list:
A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.
I think I would prefer to address this in the same infrastructure as the
dead-space-map. That way you're not dependent on having no updates happening
on the table at all. Any tuples on pages which contain no in-doubt tuples
could have their visibility check skipped but when you come across a tuple on
a page which has been modified since the last vacuum then you have to check
the visibility.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?
Import Notes
Resolved by subject fallback
Gregory Stark wrote:
"Bruce Momjian" <bruce@momjian.us> writes:
I have added another idea for index-only scans to the TODO list:
A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.I think I would prefer to address this in the same infrastructure as the
dead-space-map. That way you're not dependent on having no updates happening
on the table at all. Any tuples on pages which contain no in-doubt tuples
could have their visibility check skipped but when you come across a tuple on
a page which has been modified since the last vacuum then you have to check
the visibility.
Yea, the bitmap/page idea is already on the TODO list. This was just a
less granular idea.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
On Aug 15, 2007, at 1:54 PM, Gregory Stark wrote:
A third idea would be for a heap scan to check if all rows are
visible
and if so set a per-table flag which can be checked by index
scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the
start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.I think I would prefer to address this in the same infrastructure
as the
dead-space-map. That way you're not dependent on having no updates
happening
on the table at all. Any tuples on pages which contain no in-doubt
tuples
could have their visibility check skipped but when you come across
a tuple on
a page which has been modified since the last vacuum then you have
to check
the visibility.
The advantage to Bruce's idea is that it sounds pretty simple to
implement. While it wouldn't be of use for many general cases, it
*would* be useful for read-only tables, ie: old partitions.
--
Decibel!, aka Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote:
I have added another idea for index-only scans to the TODO list:
A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.
This sounds useful for read-only or read-mostly tables.
However, it also sounds a little dangerous. If you test your application
performance, but not thoroughly enough, you might end up with a surprise
when going into production.
Regards,
Jeff Davis
Jeff Davis wrote:
On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote:
I have added another idea for index-only scans to the TODO list:
A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.This sounds useful for read-only or read-mostly tables.
I think it's too coarse-grained to be really useful. If it was one bit
per page it could work, but one bit per relation is going to be reset
too frequently.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Thu, Aug 16, 2007 at 04:06:35PM -0400, Alvaro Herrera wrote:
Jeff Davis wrote:
On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote:
I have added another idea for index-only scans to the TODO list:
A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.This sounds useful for read-only or read-mostly tables.
I think it's too coarse-grained to be really useful. If it was one bit
per page it could work, but one bit per relation is going to be reset
too frequently.
Not for the most common use cases for table partitioning.
--
Decibel!, aka Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Alvaro Herrera <alvherre@commandprompt.com> writes:
On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote:
A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
I think it's too coarse-grained to be really useful. If it was one bit
per page it could work, but one bit per relation is going to be reset
too frequently.
Another problem it would have is that the flag would be a single point
of contention.
regards, tom lane
Decibel! wrote:
The advantage to Bruce's idea is that it sounds pretty simple to
implement. While it wouldn't be of use for many general cases, it
*would* be useful for read-only tables, ie: old partitions.
Wouldn't the mostcommon case by foreign key checks against tables that
essentially map application enums to display strings? This is a rather
common scenario. It would be nice if such tables (which are typically
small) could be retained in each backend process with a simple check
that the cached data is still valid.
James