Another idea for index-only scans

Started by Bruce Momjianover 18 years ago12 messages

bruce@momjian.us

over 18 years ago

I have added another idea for index-only scans to the TODO list:

A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Mike Rylander

mrylander@gmail.com

over 18 years ago

In reply to: Bruce Momjian (#1)

Re: Another idea for index-only scans

On 8/15/07, Bruce Momjian <bruce@momjian.us> wrote:

I have added another idea for index-only scans to the TODO list:

A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.

Perhaps this is naive (or discussed and discarded... if so, I couldn't
find it, but I apologize if that's the case), but wouldn't recording
the xid of non-readonly transactions, at commit time, and at the table
level, be equivalent to the flag and remove the need for a counter?
Readers could just check the last-modification-xid at the beginning
and end of their scans to test for heap stability.

I suppose that would require a write-exclusive lock on some metadata
for each modified table during each commit... so perhaps it's a
non-starter right there.

--miker

Josh Berkus

josh@agliodbs.com

over 18 years ago

In reply to: Bruce Momjian (#1)

Re: Another idea for index-only scans

Bruce,

I have added another idea for index-only scans to the TODO list:

A third idea would be for a heap scan to check if all rows are
visible and if so set a per-table flag which can be checked by index
scans. Any change to the table would have to clear the flag. To
detect changes during the heap scan a counter could be set at the
start and checked at the end --- if it is the same, the table has not
been modified --- any table change would increment the counter.

Seems marginal at best. Checking overlap between the index and the FSM/DSM
and only check dirty pages seems more intelligent, and able to cover a
larger number of cases.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

Gregory Stark

stark@enterprisedb.com

over 18 years ago

In reply to: Bruce Momjian (#1)

Re: Another idea for index-only scans

"Bruce Momjian" <bruce@momjian.us> writes:

I have added another idea for index-only scans to the TODO list:

A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.

I think I would prefer to address this in the same infrastructure as the
dead-space-map. That way you're not dependent on having no updates happening
on the table at all. Any tuples on pages which contain no in-doubt tuples
could have their visibility check skipped but when you come across a tuple on
a page which has been modified since the last vacuum then you have to check
the visibility.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

Luke Lonergan

LLonergan@greenplum.com

over 18 years ago

In reply to: Gregory Stark (#4)

Re: Another idea for index-only scans

A hybrid scan approach combined with this idea would fit nicely - provide results for tids that are directly visible and set a bit in a bitmap for those that need recheck and extend recheck to take a bitmap (wait - it already does :-)

- Luke

Msg is shrt cuz m on ma treo

-----Original Message-----
From: Gregory Stark [mailto:stark@enterprisedb.com]
Sent: Wednesday, August 15, 2007 02:58 PM Eastern Standard Time
To: Bruce Momjian
Cc: PostgreSQL-development
Subject: Re: [HACKERS] Another idea for index-only scans

"Bruce Momjian" <bruce@momjian.us> writes:

I have added another idea for index-only scans to the TODO list:

A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Import Notes

Resolved by subject fallback

Bruce Momjian

bruce@momjian.us

over 18 years ago

In reply to: Gregory Stark (#4)

Re: Another idea for index-only scans

Gregory Stark wrote:

"Bruce Momjian" <bruce@momjian.us> writes:

I have added another idea for index-only scans to the TODO list:

A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.

I think I would prefer to address this in the same infrastructure as the
dead-space-map. That way you're not dependent on having no updates happening
on the table at all. Any tuples on pages which contain no in-doubt tuples
could have their visibility check skipped but when you come across a tuple on
a page which has been modified since the last vacuum then you have to check
the visibility.

Yea, the bitmap/page idea is already on the TODO list. This was just a
less granular idea.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Decibel!

decibel@decibel.org

over 18 years ago

In reply to: Gregory Stark (#4)

Re: Another idea for index-only scans

On Aug 15, 2007, at 1:54 PM, Gregory Stark wrote:

A third idea would be for a heap scan to check if all rows are
visible
and if so set a per-table flag which can be checked by index
scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the
start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.

I think I would prefer to address this in the same infrastructure
as the
dead-space-map. That way you're not dependent on having no updates
happening
on the table at all. Any tuples on pages which contain no in-doubt
tuples
could have their visibility check skipped but when you come across
a tuple on
a page which has been modified since the last vacuum then you have
to check
the visibility.

The advantage to Bruce's idea is that it sounds pretty simple to
implement. While it wouldn't be of use for many general cases, it
*would* be useful for read-only tables, ie: old partitions.
--
Decibel!, aka Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

Jeff Davis

pgsql@j-davis.com

over 18 years ago

In reply to: Bruce Momjian (#1)

Re: Another idea for index-only scans

On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote:

I have added another idea for index-only scans to the TODO list:

A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.

This sounds useful for read-only or read-mostly tables.

However, it also sounds a little dangerous. If you test your application
performance, but not thoroughly enough, you might end up with a surprise
when going into production.

Regards,
Jeff Davis

Alvaro Herrera

alvherre@commandprompt.com

over 18 years ago

In reply to: Jeff Davis (#8)

Re: Another idea for index-only scans

Jeff Davis wrote:

On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote:

I have added another idea for index-only scans to the TODO list:

A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.

This sounds useful for read-only or read-mostly tables.

I think it's too coarse-grained to be really useful. If it was one bit
per page it could work, but one bit per relation is going to be reset
too frequently.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#10

Decibel!

decibel@decibel.org

over 18 years ago

In reply to: Alvaro Herrera (#9)

Re: Another idea for index-only scans

On Thu, Aug 16, 2007 at 04:06:35PM -0400, Alvaro Herrera wrote:

Jeff Davis wrote:

On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote:

I have added another idea for index-only scans to the TODO list:

A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag. To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.

This sounds useful for read-only or read-mostly tables.

I think it's too coarse-grained to be really useful. If it was one bit
per page it could work, but one bit per relation is going to be reset
too frequently.

Not for the most common use cases for table partitioning.
--
Decibel!, aka Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#11

Tom Lane

tgl@sss.pgh.pa.us

over 18 years ago

In reply to: Alvaro Herrera (#9)

Re: Another idea for index-only scans

Alvaro Herrera <alvherre@commandprompt.com> writes:

On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote:

A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.

I think it's too coarse-grained to be really useful. If it was one bit
per page it could work, but one bit per relation is going to be reset
too frequently.

Another problem it would have is that the flag would be a single point
of contention.

regards, tom lane

#12

James Mansion

james@mansionfamily.plus.com

over 18 years ago

In reply to: Decibel! (#7)

Re: Another idea for index-only scans

Decibel! wrote:

The advantage to Bruce's idea is that it sounds pretty simple to
implement. While it wouldn't be of use for many general cases, it
*would* be useful for read-only tables, ie: old partitions.

Wouldn't the mostcommon case by foreign key checks against tables that
essentially map application enums to display strings? This is a rather
common scenario. It would be nice if such tables (which are typically
small) could be retained in each backend process with a simple check
that the cached data is still valid.

James