Preventing index scans for non-recoverable index AMs

Started by Simon Riggsabout 17 years ago31 messages
#1Simon Riggs
simon@2ndQuadrant.com

Hot Standby won't work with hash indexes because they are
non-recoverable.

We have a number of ways of dealing with this:

1. Workaround: Implement WAL for hash indexes

2. Specific Solution: make hashcostestimate() massively increase cost of
scans during recovery so that they will very seldom be picked and make
hashbeginscan() refuse scans during recovery in case they do happen

3. Generic Solution: add amisrecoverable flag to pg_am, and alter
generic index AM to check whether index is recoverable before selecting
it during planning

I was hoping to do (1), but I'm looking for a quicker solution now so we
can tie up loose ends for review.

(2) seems most appropriate, since it will last only until (1) is
complete in a later release.

Anyone see additional options or prefer alternatives?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#2Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Simon Riggs (#1)
Re: Preventing index scans for non-recoverable index AMs

On Wed, Dec 17, 2008 at 4:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Hot Standby won't work with hash indexes because they are
non-recoverable.

We have a number of ways of dealing with this:

i don't see a reason for inventing the wheel, we don't have wal for
hash indexes because makes those more slow without any benefit at
all... now there will be one...

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

#3Kenneth Marshall
ktm@rice.edu
In reply to: Jaime Casanova (#2)
Re: Preventing index scans for non-recoverable index AMs

On Wed, Dec 17, 2008 at 05:42:41PM -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 4:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Hot Standby won't work with hash indexes because they are
non-recoverable.

We have a number of ways of dealing with this:

i don't see a reason for inventing the wheel, we don't have wal for
hash indexes because makes those more slow without any benefit at
all... now there will be one...

--
Atentamente,
Jaime Casanova
Soporte y capacitaci?n de PostgreSQL
Asesor?a y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

I think having your index survive a server power outage or other
crash is a very good thing. Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

Ken

#4Simon Riggs
simon@2ndQuadrant.com
In reply to: Jaime Casanova (#2)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 17:42 -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 4:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Hot Standby won't work with hash indexes because they are
non-recoverable.

We have a number of ways of dealing with this:

i don't see a reason for inventing the wheel, we don't have wal for
hash indexes because makes those more slow without any benefit at
all... now there will be one...

Well, we're running short of time for 8.4 to put it mildly, so option
(1) is not on my radar. Even if somebody wrote WAL support for hash
indexes right now, I would be much happier with my other two suggestions
from a robustness perspective. We don't yet have a mechanism for an
index AM to say "damn, this index is screwed up, don't use it". So a
rushed implementation of WAL support would be counterproductive, ISTM.

So it's either (2), (3) or another option.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#5Simon Riggs
simon@2ndQuadrant.com
In reply to: Kenneth Marshall (#3)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 16:47 -0600, Kenneth Marshall wrote:

I think having your index survive a server power outage or other
crash is a very good thing. Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

Completely agree.

We may be outta time to make it happen.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#6Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Simon Riggs (#4)
Re: Preventing index scans for non-recoverable index AMs

On Wed, Dec 17, 2008 at 5:54 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

We don't yet have a mechanism for an
index AM to say "damn, this index is screwed up, don't use it".

mark pg_index.indisvalid and/or pg_index.indisready to false in the
hot standby node?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

#7Kenneth Marshall
ktm@rice.edu
In reply to: Simon Riggs (#5)
Re: Preventing index scans for non-recoverable index AMs

On Wed, Dec 17, 2008 at 10:58:11PM +0000, Simon Riggs wrote:

On Wed, 2008-12-17 at 16:47 -0600, Kenneth Marshall wrote:

I think having your index survive a server power outage or other
crash is a very good thing. Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

Completely agree.

We may be outta time to make it happen.

I agree. I was working on adding the WAL and ran up against the
deadline. A rushed hash WAL implementation would be worse than the
other alternatives. I plan on picking it back up after 8.4 is out
the door.

Regards,
Ken

#8Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Kenneth Marshall (#3)
Re: Preventing index scans for non-recoverable index AMs

On Wed, Dec 17, 2008 at 5:47 PM, Kenneth Marshall <ktm@rice.edu> wrote:

Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

there's such a situation?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

#9Kenneth Marshall
ktm@rice.edu
In reply to: Jaime Casanova (#8)
Re: Preventing index scans for non-recoverable index AMs

On Wed, Dec 17, 2008 at 06:07:41PM -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 5:47 PM, Kenneth Marshall <ktm@rice.edu> wrote:

Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

there's such a situation?

As of 8.4, yes.

Ken

In reply to: Kenneth Marshall (#9)
Re: Preventing index scans for non-recoverable index AMs

On Wed, Dec 17, 2008 at 05:10:40PM -0600, Kenneth Marshall wrote:

On Wed, Dec 17, 2008 at 06:07:41PM -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 5:47 PM, Kenneth Marshall <ktm@rice.edu> wrote:

Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

there's such a situation?

As of 8.4, yes.

In addition, hash indexes can index items larger than the 1/3 page
limit of btree indexes.

Ken

#11Simon Riggs
simon@2ndQuadrant.com
In reply to: Jaime Casanova (#6)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 18:01 -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 5:54 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

We don't yet have a mechanism for an
index AM to say "damn, this index is screwed up, don't use it".

mark pg_index.indisvalid and/or pg_index.indisready to false in the
hot standby node?

We can't edit the database until recovery is over, so that doesn't help
us while in recovery mode. So not an option.

It doesn't help us after recovery mode either because there is no
infrastructure (yet) for an index AM's rmgr to exact a transaction after
completion of recovery. So yes, that is the way it will be done, but
there is a non-trivial effort to allow it to happen.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#12Alvaro Herrera
alvherre@commandprompt.com
In reply to: Simon Riggs (#11)
Re: Preventing index scans for non-recoverable index AMs

Simon Riggs wrote:

On Wed, 2008-12-17 at 18:01 -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 5:54 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

We don't yet have a mechanism for an
index AM to say "damn, this index is screwed up, don't use it".

mark pg_index.indisvalid and/or pg_index.indisready to false in the
hot standby node?

We can't edit the database until recovery is over, so that doesn't help
us while in recovery mode. So not an option.

Maybe we should add a WAL record that's the physical representation for
"mark this index invalid", and have any transaction that modifies a hash
index write that to WAL. It should be simple code to write, because
the underlying replay is based on a regular heap update.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#13Simon Riggs
simon@2ndQuadrant.com
In reply to: Alvaro Herrera (#12)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 21:23 -0300, Alvaro Herrera wrote:

Simon Riggs wrote:

On Wed, 2008-12-17 at 18:01 -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 5:54 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

We don't yet have a mechanism for an
index AM to say "damn, this index is screwed up, don't use it".

mark pg_index.indisvalid and/or pg_index.indisready to false in the
hot standby node?

We can't edit the database until recovery is over, so that doesn't help
us while in recovery mode. So not an option.

Maybe we should add a WAL record that's the physical representation for
"mark this index invalid", and have any transaction that modifies a hash
index write that to WAL. It should be simple code to write, because
the underlying replay is based on a regular heap update.

Doesn't sound like it would work. It doesn't really matter how you
*decide* to do this, it's when you do this that counts.

What we need is a way for recovery to remember a list of pending actions
that can then be issued in a transaction at the end of recovery. As you
pointed out, the Startup process cannot issue transactions, so that
means this is harder than it should be.

Short route is to:
* allow Startup process to run transactions (when recovery finished)
* introduce another rmgr API call that gets called in its own
transaction at the end of recovery

Longer route is to
* allow startup process to queue up work following recovery
* have another process (autovac-ish) get spawned immediately after
recovery to read the list and execute - this would allow us to startup
quickly even if the rmgr decided to mark index invalid and then
completely rebuild the index.

Which sounds like a major project in itself.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#14Josh Berkus
josh@agliodbs.com
In reply to: Alvaro Herrera (#12)
Re: Preventing index scans for non-recoverable index AMs

All,

Are we really releasing an index type without recoverability for 8.4?
Will this be in /contrib?

--Josh

#15Jeff Davis
pgsql@j-davis.com
In reply to: Kenneth Marshall (#9)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 17:10 -0600, Kenneth Marshall wrote:

On Wed, Dec 17, 2008 at 06:07:41PM -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 5:47 PM, Kenneth Marshall <ktm@rice.edu> wrote:

Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

there's such a situation?

As of 8.4, yes.

My understanding was that the hash index type never supported
recoverability, and could require a rebuild on power failure.

If it's not written to WAL before the data page changes, how could it be
safe for recovery? The tuple inserts are logged, so during recovery the
tuple would be put in the table but the index would not be updated.

What am I missing?

Regards,
Jeff Davis

#16Simon Riggs
simon@2ndQuadrant.com
In reply to: Josh Berkus (#14)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 18:09 -0800, Josh Berkus wrote:

Are we really releasing an index type without recoverability for 8.4?
Will this be in /contrib?

Worse than that, I'm talking about hash indexes (which are already here,
without WAL)

I would not contemplate a new index type without WAL.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#17Simon Riggs
simon@2ndQuadrant.com
In reply to: Simon Riggs (#1)
1 attachment(s)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 21:26 +0000, Simon Riggs wrote:

Hot Standby won't work with hash indexes because they are
non-recoverable.

We have a number of ways of dealing with this:

2. Specific Solution: make hashcostestimate() massively increase cost of
scans during recovery so that they will very seldom be picked and make
hashbeginscan() refuse scans during recovery in case they do happen

(2) seems most appropriate, since it will last only until (1) is
complete in a later release.

Code *fragment* enclosed here for discussion. (Not an independent patch)

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

Attachments:

hs.hashindex.v6.patchtext/x-patch; charset=utf-8; name=hs.hashindex.v6.patchDownload
Index: src/backend/access/hash/hash.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/hash/hash.c,v
retrieving revision 1.107
diff -c -r1.107 hash.c
*** src/backend/access/hash/hash.c	13 Nov 2008 17:42:10 -0000	1.107
--- src/backend/access/hash/hash.c	17 Dec 2008 22:46:43 -0000
***************
*** 341,346 ****
--- 341,357 ----
  	IndexScanDesc scan;
  	HashScanOpaque so;
  
+ 	/* 
+ 	 * Hash indexes are not recoverable, so should not be used
+ 	 * during recovery mode. We try to avoid this by tweaking the 
+ 	 * cost of hash index scans during recovery (see selfuncs.c),
+ 	 * but we may still get called, so specifically prevent scans here.
+ 	 * This is a reasonably ugly hack that we expect to remove again
+ 	 * as soon as hash indexes write WAL records.
+ 	 */
+ 	if (IsRecoveryProcessingMode())
+ 		elog(ERROR, "Cannot use hash index while recovery is in progress");
+ 
  	scan = RelationGetIndexScan(rel, keysz, scankey);
  	so = (HashScanOpaque) palloc(sizeof(HashScanOpaqueData));
  	so->hashso_bucket_valid = false;
Index: src/backend/utils/adt/selfuncs.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/utils/adt/selfuncs.c,v
retrieving revision 1.257
diff -c -r1.257 selfuncs.c
*** src/backend/utils/adt/selfuncs.c	23 Oct 2008 00:24:50 -0000	1.257
--- src/backend/utils/adt/selfuncs.c	17 Dec 2008 22:41:17 -0000
***************
*** 5911,5919 ****
  	Selectivity *indexSelectivity = (Selectivity *) PG_GETARG_POINTER(6);
  	double	   *indexCorrelation = (double *) PG_GETARG_POINTER(7);
  
  	genericcostestimate(root, index, indexQuals, outer_rel, 0.0,
  						indexStartupCost, indexTotalCost,
! 						indexSelectivity, indexCorrelation);
  
  	PG_RETURN_VOID();
  }
--- 5911,5929 ----
  	Selectivity *indexSelectivity = (Selectivity *) PG_GETARG_POINTER(6);
  	double	   *indexCorrelation = (double *) PG_GETARG_POINTER(7);
  
+ 	/* 
+ 	 * Hash indexes are not recoverable, so should not be used
+ 	 * during recovery mode. Dissuade attempts to use them by adding
+ 	 * the same disableCost as if enable_indexscan = off was selected.
+ 	 * This is a reasonably ugly hack that we expect to remove again
+ 	 * as soon as hash indexes write WAL records.
+ 	 */
+ 	if (IsRecoveryProcessingMode())
+ 		*indexStartupCost += 100000000.0;
+ 
  	genericcostestimate(root, index, indexQuals, outer_rel, 0.0,
  						indexStartupCost, indexTotalCost,
! 						indexSelectivity, indexCorrelation);		
  
  	PG_RETURN_VOID();
  }
#18Jeff Davis
pgsql@j-davis.com
In reply to: Jeff Davis (#15)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 18:20 -0800, Jeff Davis wrote:

On Wed, 2008-12-17 at 17:10 -0600, Kenneth Marshall wrote:

On Wed, Dec 17, 2008 at 06:07:41PM -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 5:47 PM, Kenneth Marshall <ktm@rice.edu> wrote:

Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

there's such a situation?

As of 8.4, yes.

My understanding was that the hash index type never supported
recoverability, and could require a rebuild on power failure.

If it's not written to WAL before the data page changes, how could it be
safe for recovery? The tuple inserts are logged, so during recovery the
tuple would be put in the table but the index would not be updated.

What am I missing?

On second read, it occurs to me that you may have meant: "as of 8.4,
hash indexes have never been safe" but I read it as: "as of 8.4, hash
indexes will require rebuild on crash, whereas that was unnecessary
before 8.4".

If you meant the former, you can disregard my question.

Regards,
Jeff Davis

#19Alvaro Herrera
alvherre@commandprompt.com
In reply to: Simon Riggs (#13)
Re: Preventing index scans for non-recoverable index AMs

Simon Riggs wrote:

On Wed, 2008-12-17 at 21:23 -0300, Alvaro Herrera wrote:

Maybe we should add a WAL record that's the physical representation for
"mark this index invalid", and have any transaction that modifies a hash
index write that to WAL. It should be simple code to write, because
the underlying replay is based on a regular heap update.

Doesn't sound like it would work. It doesn't really matter how you
*decide* to do this, it's when you do this that counts.

Hmm, it doesn't seem like you understood my suggestion ... basically I'm
saying that a hash index insert/delete should put out this WAL record:

HEAP update address-of-pg_index-tuple set indisvalid=false

(I'm just guessing at indisvalid but you get my point)

No need to remember anything. Of course, the user then needs to fix the
index after the fact.

Of course, for 8.5 we would do something smarter.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#20Simon Riggs
simon@2ndQuadrant.com
In reply to: Alvaro Herrera (#19)
Re: Preventing index scans for non-recoverable index AMs

On Wed, 2008-12-17 at 23:28 -0300, Alvaro Herrera wrote:

Hmm, it doesn't seem like you understood my suggestion ... basically I'm
saying that a hash index insert/delete should put out this WAL record:

HEAP update address-of-pg_index-tuple set indisvalid=false

(I'm just guessing at indisvalid but you get my point)

That would be simple and I'm very sorry to say I still don't think it
would work. But yes, I did misunderstand you.

In-progress hash index scans would not be prevented from executing by
the WAL record, so you might end up following a bad pointer. We probably
wouldn't want to try killing anybody using index either, since that
would end up as a complete bloodbath.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#21Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Jeff Davis (#15)
Re: Preventing index scans for non-recoverable index AMs

Jeff Davis wrote:

On Wed, 2008-12-17 at 17:10 -0600, Kenneth Marshall wrote:

On Wed, Dec 17, 2008 at 06:07:41PM -0500, Jaime Casanova wrote:

On Wed, Dec 17, 2008 at 5:47 PM, Kenneth Marshall <ktm@rice.edu> wrote:

Rebuilding a hash index for the case
for which it is preferred (large, large tables) would be excrutiating.

there's such a situation?

As of 8.4, yes.

My understanding was that the hash index type never supported
recoverability, and could require a rebuild on power failure.

Right, this is certainly not a new problem. It's not even a new problem
in the context of replication or hot standby, because we already have
the problem with PITR and file-based log shipping.

Also, it's not just a problem *during* the recovery. The index is just
as corrupt after the recovery has finished.

I think we should just leave it alone for 8.4, and fix it properly in a
future relase by implementing WAL-logging for hash indexes.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#22Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#21)
Re: Preventing index scans for non-recoverable index AMs

On Thu, 2008-12-18 at 08:29 +0200, Heikki Linnakangas wrote:

Right, this is certainly not a new problem. It's not even a new problem
in the context of replication or hot standby, because we already have
the problem with PITR and file-based log shipping.

Also, it's not just a problem *during* the recovery. The index is just
as corrupt after the recovery has finished.

Agreed.

I think we should just leave it alone for 8.4, and fix it properly in a
future relase by implementing WAL-logging for hash indexes.

You really think we should just leave it alone? That gives me less work,
so I will accept that if you think so. Gives me the shivers though.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#23Peter Eisentraut
peter_e@gmx.net
In reply to: Simon Riggs (#22)
Re: Preventing index scans for non-recoverable index AMs

Simon Riggs wrote:

I think we should just leave it alone for 8.4, and fix it properly in a
future relase by implementing WAL-logging for hash indexes.

You really think we should just leave it alone? That gives me less work,
so I will accept that if you think so. Gives me the shivers though.

At least you ought to feel free to treat it as an independent problem
from your current project.

#24Simon Riggs
simon@2ndQuadrant.com
In reply to: Peter Eisentraut (#23)
Re: Preventing index scans for non-recoverable index AMs

On Thu, 2008-12-18 at 13:52 +0200, Peter Eisentraut wrote:

Simon Riggs wrote:

I think we should just leave it alone for 8.4, and fix it properly in a
future relase by implementing WAL-logging for hash indexes.

You really think we should just leave it alone? That gives me less work,
so I will accept that if you think so. Gives me the shivers though.

At least you ought to feel free to treat it as an independent problem
from your current project.

OK, good. Some weight off my shoulders :-)

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#25Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Heikki Linnakangas (#21)
Re: Preventing index scans for non-recoverable index AMs

On Thu, Dec 18, 2008 at 11:59 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

Right, this is certainly not a new problem. It's not even a new problem in
the context of replication or hot standby, because we already have the
problem with PITR and file-based log shipping.

Also, it's not just a problem *during* the recovery. The index is just as
corrupt after the recovery has finished.

Just curious, how do we handle the case of corrupted hash index today
? If we can detect that the index is corrupt because of bad page
headers etc, then its still OK; we can throw an error. But what if the
hash index is used after recovery and it returns wrong tuple(s) ?

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

#26Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Pavan Deolasee (#25)
Re: Preventing index scans for non-recoverable index AMs

Pavan Deolasee wrote:

Just curious, how do we handle the case of corrupted hash index today?

We don't.

If we can detect that the index is corrupt because of bad page
headers etc, then its still OK; we can throw an error. But what if the
hash index is used after recovery and it returns wrong tuple(s) ?

You get to keep both pieces..

In short, don't use hash index, unless you're prepared to run REINDEX
manually after every crash.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#27Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Heikki Linnakangas (#26)
Re: Preventing index scans for non-recoverable index AMs

On Thu, Dec 18, 2008 at 6:02 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

In short, don't use hash index, unless you're prepared to run REINDEX
manually after every crash.

I think that should be mentioned in *bold* letters in the
documentation. The doc currently has the following: "so hash indexes
might need to be rebuilt with REINDEX after a database crash" This
isn't a strong statement. How would we handle automatic recovery where
user may not even get chance to run REINDEX before his database is
corrupted ?

Hot standby will fail miserably with hash indexes since the index
would be completely useless at the standby (but planner will
nevertheless try to use it), IMHO either hash index should not be
supported at all or should be WAL logged and properly handled in
presence of hot standby.

BTW, if there is no proven case where hash index works significantly
better than btree (that's what the doc says), why not just completely
abandon it ?

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

#28Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Pavan Deolasee (#27)
Re: Preventing index scans for non-recoverable index AMs

Pavan Deolasee wrote:

BTW, if there is no proven case where hash index works significantly
better than btree (that's what the doc says), why not just completely
abandon it ?

That has been considered many times, see archives. I believe the changes
done in 8.4 actually made it faster for some cases. And as Kenneth
pointed out hash indexes can handle keys larger than 1/3 of page size,
that b-tree can't.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#29Greg Stark
greg.stark@enterprisedb.com
In reply to: Heikki Linnakangas (#28)
Re: Preventing index scans for non-recoverable index AMs

It would be perfectly reasonable to add an amisrecoverable like Simon
described. It could automatically set indisvalid to false after a
crash and treat the index as if indisvalid is false during recovery.
That would be a lot smoother and safer than what we have now.

It might even be possible to do this with a new wal record type so it
only happens if there was a write to the index. I imagine most users
who read that warning and use hash indexes anyways are using them on
read-only tables where they know it's safe.

--
Greg

On 18 Dec 2008, at 07:51, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com

Show quoted text

wrote:

Pavan Deolasee wrote:

BTW, if there is no proven case where hash index works significantly
better than btree (that's what the doc says), why not just completely
abandon it ?

That has been considered many times, see archives. I believe the
changes done in 8.4 actually made it faster for some cases. And as
Kenneth pointed out hash indexes can handle keys larger than 1/3 of
page size, that b-tree can't.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#30Peter Eisentraut
peter_e@gmx.net
In reply to: Greg Stark (#29)
Re: Preventing index scans for non-recoverable index AMs

Greg Stark wrote:

It would be perfectly reasonable to add an amisrecoverable like Simon
described. It could automatically set indisvalid to false after a crash
and treat the index as if indisvalid is false during recovery. That
would be a lot smoother and safer than what we have now.

It might even be possible to do this with a new wal record type so it
only happens if there was a write to the index. I imagine most users who
read that warning and use hash indexes anyways are using them on
read-only tables where they know it's safe.

This is essentially Alvaro's suggestions, which Simon has already given
a counterargument to.

#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#30)
Re: Preventing index scans for non-recoverable index AMs

Peter Eisentraut <peter_e@gmx.net> writes:

Greg Stark wrote:

It would be perfectly reasonable to add an amisrecoverable like Simon
described. It could automatically set indisvalid to false after a crash
and treat the index as if indisvalid is false during recovery. That
would be a lot smoother and safer than what we have now.

It might even be possible to do this with a new wal record type so it
only happens if there was a write to the index. I imagine most users who
read that warning and use hash indexes anyways are using them on
read-only tables where they know it's safe.

This is essentially Alvaro's suggestions, which Simon has already given
a counterargument to.

The long and the short of it is that the reason hash indexes still don't
have WAL support is no one's seen fit to do the work. I do not see the
point of proposing to expend work to substitute for that work.

I think all that ought to be done here is document that hash indexes
shouldn't be used in a replication or PITR environment.

regards, tom lane