Preventing index scans for non-recoverable index AMs

Started by Simon Riggsover 17 years ago31 messageshackers
Jump to latest
#1Simon Riggs
simon@2ndQuadrant.com

Hot Standby won't work with hash indexes because they are
non-recoverable.

We have a number of ways of dealing with this:

1. Workaround: Implement WAL for hash indexes

2. Specific Solution: make hashcostestimate() massively increase cost of
scans during recovery so that they will very seldom be picked and make
hashbeginscan() refuse scans during recovery in case they do happen

3. Generic Solution: add amisrecoverable flag to pg_am, and alter
generic index AM to check whether index is recoverable before selecting
it during planning

I was hoping to do (1), but I'm looking for a quicker solution now so we
can tie up loose ends for review.

(2) seems most appropriate, since it will last only until (1) is
complete in a later release.

Anyone see additional options or prefer alternatives?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#2Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Simon Riggs (#1)
Re: Preventing index scans for non-recoverable index AMs

On Wed, Dec 17, 2008 at 4:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Hot Standby won't work with hash indexes because they are
non-recoverable.

We have a number of ways of dealing with this:

i don't see a reason for inventing the wheel, we don't have wal for
hash indexes because makes those more slow without any benefit at
all... now there will be one...

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

#3Kenneth Marshall
ktm@rice.edu
In reply to: Jaime Casanova (#2)
Re: Preventing index scans for non-recoverable index AMs

On Wed, Dec 17, 2008 at 05:42:41PM -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 4:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Hot Standby won't work with hash indexes because they are
non-recoverable.

We have a number of ways of dealing with this:

i don't see a reason for inventing the wheel, we don't have wal for
hash indexes because makes those more slow without any benefit at
all... now there will be one...

--
Atentamente,
Jaime Casanova
Soporte y capacitaci?n de PostgreSQL
Asesor?a y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

I think having your index survive a server power outage or other
crash is a very good thing. Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

Ken

#4Simon Riggs
simon@2ndQuadrant.com
In reply to: Jaime Casanova (#2)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 17:42 -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 4:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Hot Standby won't work with hash indexes because they are
non-recoverable.

We have a number of ways of dealing with this:

i don't see a reason for inventing the wheel, we don't have wal for
hash indexes because makes those more slow without any benefit at
all... now there will be one...

Well, we're running short of time for 8.4 to put it mildly, so option
(1) is not on my radar. Even if somebody wrote WAL support for hash
indexes right now, I would be much happier with my other two suggestions
from a robustness perspective. We don't yet have a mechanism for an
index AM to say "damn, this index is screwed up, don't use it". So a
rushed implementation of WAL support would be counterproductive, ISTM.

So it's either (2), (3) or another option.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#5Simon Riggs
simon@2ndQuadrant.com
In reply to: Kenneth Marshall (#3)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 16:47 -0600, Kenneth Marshall wrote:

I think having your index survive a server power outage or other
crash is a very good thing. Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

Completely agree.

We may be outta time to make it happen.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#6Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Simon Riggs (#4)
Re: Preventing index scans for non-recoverable index AMs

On Wed, Dec 17, 2008 at 5:54 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

We don't yet have a mechanism for an
index AM to say "damn, this index is screwed up, don't use it".

mark pg_index.indisvalid and/or pg_index.indisready to false in the
hot standby node?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

#7Kenneth Marshall
ktm@rice.edu
In reply to: Simon Riggs (#5)
Re: Preventing index scans for non-recoverable index AMs

On Wed, Dec 17, 2008 at 10:58:11PM +0000, Simon Riggs wrote:

On Wed, 2008-12-17 at 16:47 -0600, Kenneth Marshall wrote:

I think having your index survive a server power outage or other
crash is a very good thing. Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

Completely agree.

We may be outta time to make it happen.

I agree. I was working on adding the WAL and ran up against the
deadline. A rushed hash WAL implementation would be worse than the
other alternatives. I plan on picking it back up after 8.4 is out
the door.

Regards,
Ken

#8Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Kenneth Marshall (#3)
Re: Preventing index scans for non-recoverable index AMs

On Wed, Dec 17, 2008 at 5:47 PM, Kenneth Marshall <ktm@rice.edu> wrote:

Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

there's such a situation?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

#9Kenneth Marshall
ktm@rice.edu
In reply to: Jaime Casanova (#8)
Re: Preventing index scans for non-recoverable index AMs

On Wed, Dec 17, 2008 at 06:07:41PM -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 5:47 PM, Kenneth Marshall <ktm@rice.edu> wrote:

Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

there's such a situation?

As of 8.4, yes.

Ken

In reply to: Kenneth Marshall (#9)
Re: Preventing index scans for non-recoverable index AMs

On Wed, Dec 17, 2008 at 05:10:40PM -0600, Kenneth Marshall wrote:

On Wed, Dec 17, 2008 at 06:07:41PM -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 5:47 PM, Kenneth Marshall <ktm@rice.edu> wrote:

Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

there's such a situation?

As of 8.4, yes.

In addition, hash indexes can index items larger than the 1/3 page
limit of btree indexes.

Ken

#11Simon Riggs
simon@2ndQuadrant.com
In reply to: Jaime Casanova (#6)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 18:01 -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 5:54 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

We don't yet have a mechanism for an
index AM to say "damn, this index is screwed up, don't use it".

mark pg_index.indisvalid and/or pg_index.indisready to false in the
hot standby node?

We can't edit the database until recovery is over, so that doesn't help
us while in recovery mode. So not an option.

It doesn't help us after recovery mode either because there is no
infrastructure (yet) for an index AM's rmgr to exact a transaction after
completion of recovery. So yes, that is the way it will be done, but
there is a non-trivial effort to allow it to happen.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#12Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Simon Riggs (#11)
Re: Preventing index scans for non-recoverable index AMs

Simon Riggs wrote:

On Wed, 2008-12-17 at 18:01 -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 5:54 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

We don't yet have a mechanism for an
index AM to say "damn, this index is screwed up, don't use it".

mark pg_index.indisvalid and/or pg_index.indisready to false in the
hot standby node?

We can't edit the database until recovery is over, so that doesn't help
us while in recovery mode. So not an option.

Maybe we should add a WAL record that's the physical representation for
"mark this index invalid", and have any transaction that modifies a hash
index write that to WAL. It should be simple code to write, because
the underlying replay is based on a regular heap update.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#13Simon Riggs
simon@2ndQuadrant.com
In reply to: Alvaro Herrera (#12)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 21:23 -0300, Alvaro Herrera wrote:

Simon Riggs wrote:

On Wed, 2008-12-17 at 18:01 -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 5:54 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

We don't yet have a mechanism for an
index AM to say "damn, this index is screwed up, don't use it".

mark pg_index.indisvalid and/or pg_index.indisready to false in the
hot standby node?

We can't edit the database until recovery is over, so that doesn't help
us while in recovery mode. So not an option.

Maybe we should add a WAL record that's the physical representation for
"mark this index invalid", and have any transaction that modifies a hash
index write that to WAL. It should be simple code to write, because
the underlying replay is based on a regular heap update.

Doesn't sound like it would work. It doesn't really matter how you
*decide* to do this, it's when you do this that counts.

What we need is a way for recovery to remember a list of pending actions
that can then be issued in a transaction at the end of recovery. As you
pointed out, the Startup process cannot issue transactions, so that
means this is harder than it should be.

Short route is to:
* allow Startup process to run transactions (when recovery finished)
* introduce another rmgr API call that gets called in its own
transaction at the end of recovery

Longer route is to
* allow startup process to queue up work following recovery
* have another process (autovac-ish) get spawned immediately after
recovery to read the list and execute - this would allow us to startup
quickly even if the rmgr decided to mark index invalid and then
completely rebuild the index.

Which sounds like a major project in itself.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#14Josh Berkus
josh@agliodbs.com
In reply to: Alvaro Herrera (#12)
Re: Preventing index scans for non-recoverable index AMs

All,

Are we really releasing an index type without recoverability for 8.4?
Will this be in /contrib?

--Josh

#15Jeff Davis
pgsql@j-davis.com
In reply to: Kenneth Marshall (#9)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 17:10 -0600, Kenneth Marshall wrote:

On Wed, Dec 17, 2008 at 06:07:41PM -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 5:47 PM, Kenneth Marshall <ktm@rice.edu> wrote:

Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

there's such a situation?

As of 8.4, yes.

My understanding was that the hash index type never supported
recoverability, and could require a rebuild on power failure.

If it's not written to WAL before the data page changes, how could it be
safe for recovery? The tuple inserts are logged, so during recovery the
tuple would be put in the table but the index would not be updated.

What am I missing?

Regards,
Jeff Davis

#16Simon Riggs
simon@2ndQuadrant.com
In reply to: Josh Berkus (#14)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 18:09 -0800, Josh Berkus wrote:

Are we really releasing an index type without recoverability for 8.4?
Will this be in /contrib?

Worse than that, I'm talking about hash indexes (which are already here,
without WAL)

I would not contemplate a new index type without WAL.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#17Simon Riggs
simon@2ndQuadrant.com
In reply to: Simon Riggs (#1)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 21:26 +0000, Simon Riggs wrote:

Hot Standby won't work with hash indexes because they are
non-recoverable.

We have a number of ways of dealing with this:

2. Specific Solution: make hashcostestimate() massively increase cost of
scans during recovery so that they will very seldom be picked and make
hashbeginscan() refuse scans during recovery in case they do happen

(2) seems most appropriate, since it will last only until (1) is
complete in a later release.

Code *fragment* enclosed here for discussion. (Not an independent patch)

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

Attachments:

hs.hashindex.v6.patchtext/x-patch; charset=utf-8; name=hs.hashindex.v6.patchDownload+23-2
#18Jeff Davis
pgsql@j-davis.com
In reply to: Jeff Davis (#15)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 18:20 -0800, Jeff Davis wrote:

On Wed, 2008-12-17 at 17:10 -0600, Kenneth Marshall wrote:

On Wed, Dec 17, 2008 at 06:07:41PM -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 5:47 PM, Kenneth Marshall <ktm@rice.edu> wrote:

Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

there's such a situation?

As of 8.4, yes.

My understanding was that the hash index type never supported
recoverability, and could require a rebuild on power failure.

If it's not written to WAL before the data page changes, how could it be
safe for recovery? The tuple inserts are logged, so during recovery the
tuple would be put in the table but the index would not be updated.

What am I missing?

On second read, it occurs to me that you may have meant: "as of 8.4,
hash indexes have never been safe" but I read it as: "as of 8.4, hash
indexes will require rebuild on crash, whereas that was unnecessary
before 8.4".

If you meant the former, you can disregard my question.

Regards,
Jeff Davis

#19Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Simon Riggs (#13)
Re: Preventing index scans for non-recoverable index AMs

Simon Riggs wrote:

On Wed, 2008-12-17 at 21:23 -0300, Alvaro Herrera wrote:

Maybe we should add a WAL record that's the physical representation for
"mark this index invalid", and have any transaction that modifies a hash
index write that to WAL. It should be simple code to write, because
the underlying replay is based on a regular heap update.

Doesn't sound like it would work. It doesn't really matter how you
*decide* to do this, it's when you do this that counts.

Hmm, it doesn't seem like you understood my suggestion ... basically I'm
saying that a hash index insert/delete should put out this WAL record:

HEAP update address-of-pg_index-tuple set indisvalid=false

(I'm just guessing at indisvalid but you get my point)

No need to remember anything. Of course, the user then needs to fix the
index after the fact.

Of course, for 8.5 we would do something smarter.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#20Simon Riggs
simon@2ndQuadrant.com
In reply to: Alvaro Herrera (#19)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 23:28 -0300, Alvaro Herrera wrote:

Hmm, it doesn't seem like you understood my suggestion ... basically I'm
saying that a hash index insert/delete should put out this WAL record:

HEAP update address-of-pg_index-tuple set indisvalid=false

(I'm just guessing at indisvalid but you get my point)

That would be simple and I'm very sorry to say I still don't think it
would work. But yes, I did misunderstand you.

In-progress hash index scans would not be prevented from executing by
the WAL record, so you might end up following a bad pointer. We probably
wouldn't want to try killing anybody using index either, since that
would end up as a complete bloodbath.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#21Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Jeff Davis (#15)
#22Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#21)
#23Peter Eisentraut
peter_e@gmx.net
In reply to: Simon Riggs (#22)
#24Simon Riggs
simon@2ndQuadrant.com
In reply to: Peter Eisentraut (#23)
#25Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Heikki Linnakangas (#21)
#26Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Pavan Deolasee (#25)
#27Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Heikki Linnakangas (#26)
#28Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Pavan Deolasee (#27)
#29Bruce Momjian
bruce@momjian.us
In reply to: Heikki Linnakangas (#28)
#30Peter Eisentraut
peter_e@gmx.net
In reply to: Bruce Momjian (#29)
#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#30)