indexes no longer used after shutdown during reindexing

Started by Matt Dewover 14 years ago12 messagesgeneral
Jump to latest
#1Matt Dew
mattd@consistentstate.com

Hello all,
I have a database that was shut down, cleanly, during an 'reindex
table' command. When the database came back up, queries against that
table started doing sequential scans instead of using the indexes as
they had been up until that point.

We tried:
1) vacuuming the table (vacuum tblName)
2) reindexing the table (reindex table tblName)
3) dropping and recreating the indexes

but none of those actions helped. We ended up recreating the table by
renaming the table and doing a create table as select * from oldTable
and readding the indexes. This worked.

This problem presented itself as an application timing out. It took
several people, several hours to track this down and solve it.

Several months ago I had two other tables also stopped using their
indexes. Those times however I don't know if a database shutdown caused
the problem.

Has anyone had this problem? If so, what specifically is the cause? Is
shutting down a database during a table rebuild or vacuum an absolute no-no?

Any and all help or insight would be appreciated,
Matt

#2Scott Marlowe
scott.marlowe@gmail.com
In reply to: Matt Dew (#1)
Re: indexes no longer used after shutdown during reindexing

On Wed, Jan 11, 2012 at 10:42 AM, Matt Dew <mattd@consistentstate.com> wrote:

Hello all,
  I have a database that was shut down, cleanly, during an 'reindex table'
 command.  When the database came back up, queries against that table
started doing sequential scans instead of using the indexes as they had been
up until that point.

We tried:
1) vacuuming the table (vacuum tblName)
2) reindexing the table (reindex table tblName)
3) dropping and recreating the indexes

but none of those actions helped.   We ended up recreating the table by
renaming the table and doing a create table as select * from oldTable and
readding the indexes.  This worked.

This problem presented itself as an application timing out. It took several
people, several hours to track this down and solve it.

Several months ago I had two other tables also stopped using their indexes.
 Those times however I don't know if a database shutdown caused the problem.

Has anyone had this problem?  If so, what specifically is the cause?  Is
shutting down a database during a table rebuild or vacuum an absolute no-no?

Any and all help or insight would be appreciated,
Matt

You likely had an invalid index, I've seen that crop up when doing a
create index concurrently. Just a guess. What did or does \d of the
table and its indexes show? Look for invalid in the output.

#3Matt Dew
mattd@consistentstate.com
In reply to: Scott Marlowe (#2)
Re: indexes no longer used after shutdown during reindexing

On 01/11/2012 11:07 AM, Scott Marlowe wrote:

On Wed, Jan 11, 2012 at 10:42 AM, Matt Dew<mattd@consistentstate.com> wrote:

Hello all,
I have a database that was shut down, cleanly, during an 'reindex table'
command. When the database came back up, queries against that table
started doing sequential scans instead of using the indexes as they had been
up until that point.

We tried:
1) vacuuming the table (vacuum tblName)
2) reindexing the table (reindex table tblName)
3) dropping and recreating the indexes

but none of those actions helped. We ended up recreating the table by
renaming the table and doing a create table as select * from oldTable and
readding the indexes. This worked.

This problem presented itself as an application timing out. It took several
people, several hours to track this down and solve it.

Several months ago I had two other tables also stopped using their indexes.
Those times however I don't know if a database shutdown caused the problem.

Has anyone had this problem? If so, what specifically is the cause? Is
shutting down a database during a table rebuild or vacuum an absolute no-no?

Any and all help or insight would be appreciated,
Matt

You likely had an invalid index, I've seen that crop up when doing a
create index concurrently. Just a guess. What did or does \d of the
table and its indexes show? Look for invalid in the output.

Hi Scott,
The output of \d looked normal. Nothing weird or different than before.

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matt Dew (#1)
Re: indexes no longer used after shutdown during reindexing

Matt Dew <mattd@consistentstate.com> writes:

I have a database that was shut down, cleanly, during an 'reindex
table' command. When the database came back up, queries against that
table started doing sequential scans instead of using the indexes as
they had been up until that point.

What exactly is your definition of a "clean shutdown"? At the very
least you'd have had to abort the session running the reindex. Also,
what PG version is this, and what are the index definitions?

regards, tom lane

#5Matt Dew
mattd@consistentstate.com
In reply to: Tom Lane (#4)
Re: indexes no longer used after shutdown during reindexing

On 01/11/2012 04:29 PM, Tom Lane wrote:

Matt Dew<mattd@consistentstate.com> writes:

I have a database that was shut down, cleanly, during an 'reindex
table' command. When the database came back up, queries against that
table started doing sequential scans instead of using the indexes as
they had been up until that point.

What exactly is your definition of a "clean shutdown"? At the very
least you'd have had to abort the session running the reindex. Also,
what PG version is this, and what are the index definitions?

Is a reboot command considered a clean shutdown? It's a redhat box
which called /etc/init.d/postgresql stop, which does: pg_ctl stop -D
'$PGDATA' -s -m fast

We're using v8.3.9

"idx1" UNIQUE, btree (id)
"idx_2" btree (homeaddress)
"idx_3" btree (f3)
"idx_4" btree (lower(firstname::text) varchar_pattern_ops)
"idx_5" btree (lower(lastname::text) varchar_pattern_ops)
"idx_6" btree (lower(lastname::text) varchar_pattern_ops,
lower(firstname::text) varchar_pattern_ops, id, f5)
"idx_7" btree (s2id)
"idx_8" btree (sid, lower(memberusername::text)
varchar_pattern_ops, lower(email::text) varchar_pattern_ops, birthdate)
"idx_9" btree (id, f5) WHERE f5 = false

I'm in a rabbit hole. I dug in more and learned that that problem may
have existed before the shutdown. I believe the root problem is still
the same though; having to recreate the table to get it to use indexes.

thanks for any help,
Matt

Show quoted text

regards, tom lane

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matt Dew (#5)
Re: indexes no longer used after shutdown during reindexing

Matt Dew <mattd@consistentstate.com> writes:

On 01/11/2012 04:29 PM, Tom Lane wrote:

What exactly is your definition of a "clean shutdown"?

Is a reboot command considered a clean shutdown? It's a redhat box
which called /etc/init.d/postgresql stop, which does: pg_ctl stop -D
'$PGDATA' -s -m fast

Well, a fast-mode stop would abort the reindex operation, but that
should certainly have left the catalog entries in the same state as
before, so there's no obvious reason here why the indexes would've
stopped being used.

We're using v8.3.9

That's a tad old. Please consult
http://www.postgresql.org/docs/8.3/static/release.html
for reasons why an update might be a good idea. I don't recall any
8.3.x bugs that might be related to this, but I haven't trawled the
commit logs to see what I've forgotten, either.

I'm in a rabbit hole. I dug in more and learned that that problem may
have existed before the shutdown. I believe the root problem is still
the same though; having to recreate the table to get it to use indexes.

Hmm. If that's the case then we don't have to explain how an aborted
reindex operation could have affected the usability of the old indexes,
so I'm inclined to believe that it didn't. Which seems to mean that you
have a garden variety "why won't the planner use my index" issue, not
something unusual. If you no longer have the original table then it may
be impossible to investigate further; but if you can recreate the state
where it's not using the index(es), please see
http://wiki.postgresql.org/wiki/Slow_Query_Questions
and pursue the issue on pgsql-performance.

regards, tom lane

#7Matt Dew
mattd@consistentstate.com
In reply to: Tom Lane (#6)
Re: indexes no longer used after shutdown during reindexing

On 01/12/2012 01:21 PM, Tom Lane wrote:

Matt Dew<mattd@consistentstate.com> writes:

On 01/11/2012 04:29 PM, Tom Lane wrote:

What exactly is your definition of a "clean shutdown"?

Is a reboot command considered a clean shutdown? It's a redhat box
which called /etc/init.d/postgresql stop, which does: pg_ctl stop -D
'$PGDATA' -s -m fast

Well, a fast-mode stop would abort the reindex operation, but that
should certainly have left the catalog entries in the same state as
before, so there's no obvious reason here why the indexes would've
stopped being used.

We're using v8.3.9

That's a tad old. Please consult
http://www.postgresql.org/docs/8.3/static/release.html
for reasons why an update might be a good idea. I don't recall any
8.3.x bugs that might be related to this, but I haven't trawled the
commit logs to see what I've forgotten, either.

I'm in a rabbit hole. I dug in more and learned that that problem may
have existed before the shutdown. I believe the root problem is still
the same though; having to recreate the table to get it to use indexes.

Hmm. If that's the case then we don't have to explain how an aborted
reindex operation could have affected the usability of the old indexes,
so I'm inclined to believe that it didn't. Which seems to mean that you
have a garden variety "why won't the planner use my index" issue, not
something unusual. If you no longer have the original table then it may
be impossible to investigate further; but if you can recreate the state
where it's not using the index(es), please see
http://wiki.postgresql.org/wiki/Slow_Query_Questions
and pursue the issue on pgsql-performance.

Thanks Tom. I have the original database stored away for investigation.
This was a serious problem so we're investigatng how to prevent this in
the future. It's strange because even though it looks like this problem
did happen before the reboot, it was once in a while. After the reboot
it was everytime and the application completely stopped working.

Plus after the reboot even a simple query against the table:
select * from tbl where id=1; was/is doing sequential scans.

It's a smallish table, just under 5 million rows.

Thanks for the link. I'm using that.

It's on a netapp if that matters. (Not my choice.)

#8Matt Dew
mattd@consistentstate.com
In reply to: Tom Lane (#6)
Re: indexes no longer used after shutdown during reindexing

On 01/12/2012 01:21 PM, Tom Lane wrote:

Matt Dew<mattd@consistentstate.com> writes:

On 01/11/2012 04:29 PM, Tom Lane wrote:

What exactly is your definition of a "clean shutdown"?

Is a reboot command considered a clean shutdown? It's a redhat box
which called /etc/init.d/postgresql stop, which does: pg_ctl stop -D
'$PGDATA' -s -m fast

Well, a fast-mode stop would abort the reindex operation, but that
should certainly have left the catalog entries in the same state as
before, so there's no obvious reason here why the indexes would've
stopped being used.

We're using v8.3.9

That's a tad old. Please consult
http://www.postgresql.org/docs/8.3/static/release.html
for reasons why an update might be a good idea. I don't recall any
8.3.x bugs that might be related to this, but I haven't trawled the
commit logs to see what I've forgotten, either.

I'm in a rabbit hole. I dug in more and learned that that problem may
have existed before the shutdown. I believe the root problem is still
the same though; having to recreate the table to get it to use indexes.

Hmm. If that's the case then we don't have to explain how an aborted
reindex operation could have affected the usability of the old indexes,
so I'm inclined to believe that it didn't. Which seems to mean that you
have a garden variety "why won't the planner use my index" issue, not
something unusual. If you no longer have the original table then it may
be impossible to investigate further; but if you can recreate the state
where it's not using the index(es), please see
http://wiki.postgresql.org/wiki/Slow_Query_Questions
and pursue the issue on pgsql-performance.

regards, tom lane

Thanks Tom.

An interesting sidenote we realized. the nice system shutdown script
/etc/init.d/postgres doesn't actually wait for the db to be down, it
just waits for pg_ctl to return.

I'm guessing it's not good when the box shuts down before postgres is.

Matt

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matt Dew (#8)
Re: indexes no longer used after shutdown during reindexing

Matt Dew <mattd@consistentstate.com> writes:

An interesting sidenote we realized. the nice system shutdown script
/etc/init.d/postgres doesn't actually wait for the db to be down, it
just waits for pg_ctl to return.

By default, "pg_ctl stop" does wait for the server to shut down ...

regards, tom lane

#10Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tom Lane (#9)
Re: indexes no longer used after shutdown during reindexing

On 13.1.2012 22:20, Tom Lane wrote:

Matt Dew <mattd@consistentstate.com> writes:

An interesting sidenote we realized. the nice system shutdown script
/etc/init.d/postgres doesn't actually wait for the db to be down, it
just waits for pg_ctl to return.

By default, "pg_ctl stop" does wait for the server to shut down ...

Not really. It waits for up to 60 seconds and if the shutdown was not
successful (as there was a connected client), it prints a message to the log

pg_ctl: server does not shut down
HINT: The "-m fast" option immediately disconnects sessions
rather than

and returns 1.

If you really need to wait for shutdown, you need to add "-w" to the
command line, use "-m fast" or "-m immediate".

But even ignoring the return value should not cause corruption IMHO.

Tomas

#11Matt Dew
mattd@consistentstate.com
In reply to: Tomas Vondra (#10)
Re: indexes no longer used after shutdown during reindexing

On 01/13/2012 02:49 PM, Tomas Vondra wrote:

On 13.1.2012 22:20, Tom Lane wrote:

Matt Dew<mattd@consistentstate.com> writes:

An interesting sidenote we realized. the nice system shutdown script
/etc/init.d/postgres doesn't actually wait for the db to be down, it
just waits for pg_ctl to return.

By default, "pg_ctl stop" does wait for the server to shut down ...

Not really. It waits for up to 60 seconds and if the shutdown was not
successful (as there was a connected client), it prints a message to the log

pg_ctl: server does not shut down
HINT: The "-m fast" option immediately disconnects sessions
rather than

and returns 1.

If you really need to wait for shutdown, you need to add "-w" to the
command line, use "-m fast" or "-m immediate".

But even ignoring the return value should not cause corruption IMHO.

Thanks Tom and Tomas,
I remember -w now, but I'd long forgotten about it.

If the pg_ctl returns a 1 but the machine physically powers off, there
is a chance for corruption though right? Postgres is trying to write
stuff to disk and clean up and BAM power goes out. ?

There is a chance for corruption though if the machine physically powers
off after the pg_ctl return

#12Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Matt Dew (#11)
Re: indexes no longer used after shutdown during reindexing

On 20.1.2012 19:47, Matt Dew wrote:

On 01/13/2012 02:49 PM, Tomas Vondra wrote:

On 13.1.2012 22:20, Tom Lane wrote:

Matt Dew<mattd@consistentstate.com> writes:

An interesting sidenote we realized. the nice system shutdown script
/etc/init.d/postgres doesn't actually wait for the db to be down, it
just waits for pg_ctl to return.

By default, "pg_ctl stop" does wait for the server to shut down ...

Not really. It waits for up to 60 seconds and if the shutdown was not
successful (as there was a connected client), it prints a message to
the log

pg_ctl: server does not shut down
HINT: The "-m fast" option immediately disconnects sessions
rather than

and returns 1.

If you really need to wait for shutdown, you need to add "-w" to the
command line, use "-m fast" or "-m immediate".

But even ignoring the return value should not cause corruption IMHO.

Thanks Tom and Tomas,
I remember -w now, but I'd long forgotten about it.

If the pg_ctl returns a 1 but the machine physically powers off, there
is a chance for corruption though right? Postgres is trying to write
stuff to disk and clean up and BAM power goes out. ?

There is a chance for corruption though if the machine physically powers
off after the pg_ctl return

There are various types of corruption. If you power off the system
before the database properly shuts down, the data files will be
corrupted. But this should be fixed on the next database startup - the
database should find out it was not switched properly and perform a
recovery (replay the WAL logs).

So yes, it would be corrupted but fixed on the next startup. And thus
should not cause issues like the one you describe.

Tomas