Mysterious table that exists but doesn't exist

Started by Dale Fukami · almost 13 years ago · 8 messages · general
#1Dale Fukami
dale.fukami@gmail.com

Hi,

I'm having a problem on a standby server (streaming replication) where a
table seems to exist but is not queryable. Essentially, a select statement
(and drop/insert/etc.) fails, but \d and pg_tables show that it exists. The
table exists on the master (and is queryable), and replication is still
working, in that changes to the master table don't cause errors on the
standby and changes to other tables are replicated and verified to be on the
standby. Queries from the standby are pasted below.

I have a couple of questions that arise from this:

1) Any thoughts on what is going on here?
2) If there were corruption or something in the data for that particular
table on the standby only, would replication report a failure (i.e., be
unable to apply the binary changes), or would the binary changes still just
sort of happen over top of the bad data? Because in this case, replication is
still running without reporting any errors.
3) We managed to discover this by accident during some routine work we do
from a snapshot we'd taken of the standby drives (a 6-volume RAID 0). I had
assumed that if replication and the pg_last_xlog_receive_location
information were up to date then I was safe, but in this case replication
continued to run while the standby is essentially unusable as a failover
option, since this table is not available. Is there some other way to be
certain that a standby server is "consistent" with the master?

Thanks,
Dale

psql session output
----------------------------

live=# set search_path to someschema;
SET
live=# select * from tracked_deductibles;
ERROR: relation "tracked_deductibles" does not exist
LINE 1: select * from tracked_deductibles;
live=# select * from someschema.tracked_deductibles;
ERROR: relation "someschema.tracked_deductibles" does not exist
LINE 1: select * from someschema.tracked_deductibles;

live=# select *,'X'||tablename||'X' from pg_tables where schemaname =
'someschema' and tablename ilike '%tracked%';
 schemaname |        tablename        | tableowner | tablespace | hasindexes | hasrules | hastriggers |         ?column?
------------+-------------------------+------------+------------+------------+----------+-------------+---------------------------
 someschema | tracked_deductibles     | live       |            | t          | f        | t           | Xtracked_deductiblesX
 someschema | tracked_deductibles_log | live       |            | t          | f        | f           | Xtracked_deductibles_logX
(2 rows)

live=# \d
                      List of relations
   Schema   |            Name            |   Type   | Owner
------------+----------------------------+----------+-------
<snip>
 someschema | tracked_deductibles        | table    | live
 someschema | tracked_deductibles_id_seq | sequence | live
<snip>

#2John R Pierce
pierce@hogranch.com
In reply to: Dale Fukami (#1)
Re: Mysterious table that exists but doesn't exist

On 4/16/2013 1:29 PM, Dale Fukami wrote:

live=# select * from someschema.tracked_deductibles;
ERROR: relation "someschema.tracked_deductibles" does not exist
LINE 1: select * from someschema.tracked_deductibles;
........
live=# \d
                 List of relations
   Schema   |         Name         |  Type  | Owner
------------+----------------------+--------+-------
<snip>
 someschema | tracked_deductibles  | table  | live

someschema.... any chance that has mixed case in it? try
"someschema".tracked_deductibles

--
john r pierce 37N 122W
somewhere on the middle of the left coast

#3Dale Fukami
dale.fukami@gmail.com
In reply to: John R Pierce (#2)
Re: Mysterious table that exists but doesn't exist

Tried the select with quotes around both the schema and the table and I
get the same result ("someschema"."tracked_deductibles"). I'm pretty
certain they're both lowercase. Would the select from pg_tables show if
they were mixed case by accident?

Dale

On Tue, Apr 16, 2013 at 2:38 PM, John R Pierce <pierce@hogranch.com> wrote:


On 4/16/2013 1:29 PM, Dale Fukami wrote:

live=# select * from someschema.tracked_deductibles;
ERROR: relation "someschema.tracked_deductibles" does not exist
LINE 1: select * from someschema.tracked_deductibles;
........
live=# \d
                 List of relations
   Schema   |         Name         |  Type  | Owner
------------+----------------------+--------+-------
<snip>
 someschema | tracked_deductibles  | table  | live

someschema.... any chance that has mixed case in it? try
"someschema".tracked_deductibles

--
john r pierce 37N 122W
somewhere on the middle of the left coast

#4John R Pierce
pierce@hogranch.com
In reply to: Dale Fukami (#3)
Re: Mysterious table that exists but doesn't exist

On 4/16/2013 1:48 PM, Dale Fukami wrote:

Tried the select with the quotes around each of the schema and table
and I get the same result ("someschema"."tracked_deductibles"). I'm
pretty certain they're both lowercase. Would the select from pg_tables
show if they were mixed case by accident?

yes, it would. I was assuming "someschema" was a made-up name to
hide some business secrets.
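
A direct catalog query would also expose any stray case or whitespace in the
names, if you want to rule that out for certain (a sketch, reusing the names
from the session above):

```sql
-- any mixed case or trailing whitespace shows up in the raw catalog values
SELECT nspname, length(nspname) FROM pg_namespace WHERE nspname ILIKE '%someschema%';
SELECT relname, length(relname) FROM pg_class     WHERE relname ILIKE '%tracked%';
```

If the lengths match what you expect and the names are all lowercase, case
folding isn't the problem.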

--
john r pierce 37N 122W
somewhere on the middle of the left coast

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dale Fukami (#1)
Re: Mysterious table that exists but doesn't exist

Dale Fukami <dale.fukami@gmail.com> writes:

I'm having a problem on a standby server (streaming replication) where a
table seems to exist but is not queryable. Essentially a select statement
(and drop/insert/etc) fails but \d and pg_tables show it exists. The table
exists on the master (and is queryable) and replication is still working in
that changes to the master table don't cause errors on the standby and
changes to other tables are replicated and verified to be on the standby.

That's peculiar. The most likely theories seem to be
(1) corruption in the standby's pg_class indexes, such that an index
search for relname = 'tracked_deductibles' fails to find the row,
although it is found by seqscans; or
(2) some type of visibility issue causing SnapshotNow operations to
think the row is invalid, though it is valid to MVCC queries.

Either way, if it's working on the master, then you've had a replication
failure since the standby's files evidently don't match the master's.

What PG version is this (and which versions have been installed since
the replication was set up)? Have you had any system-level crashes on
the standby?
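
One way to probe theory (1) directly on the standby would be to compare an
index lookup against a forced sequential scan of pg_class (a sketch; the
enable_* settings are just planner toggles for the session):

```sql
-- lookup that can use the catalog index (pg_class_relname_nsp_index)
SELECT oid, relname FROM pg_class WHERE relname = 'tracked_deductibles';

-- repeat with the planner forced onto a sequential scan
SET enable_indexscan = off;
SET enable_bitmapscan = off;
SELECT oid, relname FROM pg_class WHERE relname = 'tracked_deductibles';
RESET enable_indexscan;
RESET enable_bitmapscan;
```

If the two queries disagree, that points at index corruption rather than a
visibility problem.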

2) If there were corruption or something in the data for that particular
table on the standby only, would replication report a failure (i.e., be
unable to apply the binary changes) or would the binary changes still just
sort of happen overtop the bad data?

Depends. It's not impossible that the corruption could magically
disappear, if there's a full-page update sent from the master that
overwrites whatever the problem is. Incremental updates on the same
page would not make things better, of course, and could easily make them
worse. It's unlikely that an incremental WAL replay operation would
notice a problem that evidently isn't being noticed by regular queries.

regards, tom lane


#6Dale Fukami
dale.fukami@gmail.com
In reply to: Tom Lane (#5)
Re: Mysterious table that exists but doesn't exist

On Tue, Apr 16, 2013 at 3:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dale Fukami <dale.fukami@gmail.com> writes:

I'm having a problem on a standby server (streaming replication) where a
table seems to exist but is not queryable. Essentially a select statement
(and drop/insert/etc) fails but \d and pg_tables show it exists. The

table

exists on the master (and is queryable) and replication is still working

in

that changes to the master table don't cause errors on the standby and
changes to other tables are replicated and verified to be on the standby.

That's peculiar. The most likely theories seem to be
(1) corruption in the standby's pg_class indexes, such that an index
search for relname = 'tracked_deductibles' fails to find the row,
although it is found by seqscans; or
(2) some type of visibility issue causing SnapshotNow operations to
think the row is invalid, though it is valid to MVCC queries.

Either way, if it's working on the master, then you've had a replication
failure since the standby's files evidently don't match the master's.

What PG version is this (and which versions have been installed since
the replication was set up)? Have you had any system-level crashes on
the standby?

My apologies: PostgreSQL 9.0.5 on x86_64-pc-linux-gnu, compiled by GCC
gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit
This is the only version we've had since we've set up streaming replication.

We believe we've found the moment of corruption. It appears that the db was
shut down, restarted without quite catching up to the master, and then
shut down again. We use this standby to take snapshots on a regular
basis, but it looks like this one didn't quite get into a consistent state
before the next snapshot started. Logs pasted at the end of this email.

At this point I'm most concerned about:

1) We were quite lucky, I think, to have discovered this issue. I think the
only other way we'd have noticed would have been if we'd failed over to it
and our app stopped working. I'm worried now that we'll end up in a similar
situation and won't know it for many weeks. At some point we lose the
ability to PITR, based on how many backups are kept. Is there a way to be
more confident in our standby machines?

2) Is there a way to recover from this? In the extreme case where we might
have failed over to this standby, could we somehow figure out where exactly
the corruption is and recover or are we talking about going back to an
older snapshot and using our archived WALs? Note: we've already rebuilt
this standby to be consistent with master last night. This is more a
hypothetical at this point. We've kept the bad slave for experimentation in
case anyone has thoughts on this.

Just to clear up the confusion that I had caused John. I had obscured the
previous schema name by restoring from snapshots and altering the schema
name itself. So, the queries and output are an exact copy/paste from my
terminal. I get the same results on the Live standby when using the actual
Live schema name but, obviously, it shows the Live schema name rather than
'someschema'.

Thanks,
Dale

Log snippet from what appears to be the time of corruption. We've confirmed
this by restoring snapshots just prior and just following this time range.
Note the lovely HINT that there may be corruption :)

2013-04-15 15:38:22 UTC @ DETAIL: last completed transaction was at log
time 2013-04-15 15:37:07.621579+00
2013-04-15 15:38:22 UTC @ LOG: shutting down
2013-04-15 15:38:22 UTC @ LOG: database system is shut down
2013-04-15 15:39:21 UTC @ LOG: database system was shut down in recovery
at 2013-04-15 15:38:22 UTC
2013-04-15 15:39:21 UTC [unknown]@[unknown] LOG: incomplete startup packet
cp: cannot stat `/var/lib/postgresql-archives/00000005.history': No such
file or directory
2013-04-15 15:39:21 UTC @ LOG: entering standby mode
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000088': No
such file or directory
2013-04-15 15:39:21 UTC postgres@postgres FATAL: the database system is
starting up
2013-04-15 15:39:22 UTC postgres@postgres FATAL: the database system is
starting up
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000086': No
such file or directory
2013-04-15 15:39:22 UTC @ LOG: redo starts at 1648/86ED4360
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000087': No
such file or directory
2013-04-15 15:39:23 UTC postgres@postgres FATAL: the database system is
starting up
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000088': No
such file or directory
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000089': No
such file or directory
cp: cannot stat `/var/lib/postgresql-archives/00000005000016480000008A': No
such file or directory
2013-04-15 15:39:23 UTC postgres@postgres FATAL: the database system is
starting up
2013-04-15 15:39:23 UTC @ LOG: consistent recovery state reached at
1648/8A6460E0
2013-04-15 15:39:23 UTC @ LOG: invalid record length at 1648/8A6460E0
2013-04-15 15:39:23 UTC @ LOG: database system is ready to accept read
only connections
cp: cannot stat `/var/lib/postgresql-archives/00000005000016480000008A': No
such file or directory
2013-04-15 15:39:23 UTC @ LOG: streaming replication successfully
connected to primary
2013-04-15 15:39:27 UTC @ LOG: restartpoint starting: xlog
2013-04-15 15:43:56 UTC @ LOG: restartpoint complete: wrote 13894 buffers
(2.1%); write=48.569 s, sync=219.650 s, total=269.460 s
2013-04-15 15:43:56 UTC @ LOG: recovery restart point at 1648/8909ED50
2013-04-15 15:43:56 UTC @ DETAIL: last completed transaction was at log
time 2013-04-15 15:40:48.449826+00
2013-04-15 15:43:56 UTC @ LOG: restartpoint starting: xlog time
2013-04-15 15:45:02 UTC @ LOG: received fast shutdown request
2013-04-15 15:45:02 UTC @ LOG: aborting any active transactions
2013-04-15 15:45:15 UTC @ FATAL: terminating walreceiver process due to
administrator command
2013-04-15 15:45:25 UTC @ LOG: restartpoint complete: wrote 12759 buffers
(1.9%); write=41.211 s, sync=47.518 s, total=88.759 s
2013-04-15 15:45:25 UTC @ LOG: recovery restart point at 1648/8FBEBAD8
2013-04-15 15:45:25 UTC @ DETAIL: last completed transaction was at log
time 2013-04-15 15:44:30.43801+00
2013-04-15 15:45:25 UTC @ LOG: shutting down
2013-04-15 15:45:25 UTC @ LOG: restartpoint starting: shutdown immediate
2013-04-15 15:46:02 UTC @ LOG: received immediate shutdown request
2013-04-15 15:47:24 UTC @ LOG: database system was interrupted while in
recovery at log time 2013-04-15 15:39:45 UTC
2013-04-15 15:47:24 UTC @ HINT: If this has occurred more than once some
data might be corrupted and you might need to choose an earlier recovery
target.
2013-04-15 15:47:24 UTC [unknown]@[unknown] LOG: incomplete startup packet
cp: cannot stat `/var/lib/postgresql-archives/00000005.history': No such
file or directory
2013-04-15 15:47:24 UTC @ LOG: entering standby mode
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000091': No
such file or directory
2013-04-15 15:47:24 UTC postgres@postgres FATAL: the database system is
starting up
cp: cannot stat `/var/lib/postgresql-archives/00000005000016480000008F': No
such file or directory
2013-04-15 15:47:25 UTC @ LOG: redo starts at 1648/8FBEBAD8
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000090': No
such file or directory
2013-04-15 15:47:25 UTC postgres@postgres FATAL: the database system is
starting up
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000091': No
such file or directory
2013-04-15 15:47:25 UTC postgres@postgres FATAL: the database system is
starting up
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000092': No
such file or directory
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000093': No
such file or directory
2013-04-15 15:47:26 UTC postgres@postgres FATAL: the database system is
starting up
2013-04-15 15:47:26 UTC @ LOG: restartpoint starting: xlog
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000094': No
such file or directory
2013-04-15 15:47:26 UTC postgres@postgres FATAL: the database system is
starting up
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000095': No
such file or directory
2013-04-15 15:47:27 UTC postgres@postgres FATAL: the database system is
starting up
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000096': No
such file or directory
2013-04-15 15:47:27 UTC postgres@postgres FATAL: the database system is
starting up
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000097': No
such file or directory
2013-04-15 15:47:28 UTC postgres@postgres FATAL: the database system is
starting up
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000098': No
such file or directory
2013-04-15 15:47:28 UTC postgres@postgres FATAL: the database system is
starting up
cp: cannot stat `/var/lib/postgresql-archives/000000050000164800000099': No
such file or directory
2013-04-15 15:47:29 UTC postgres@postgres FATAL: the database system is
starting up
2013-04-15 15:47:29 UTC @ LOG: consistent recovery state reached at
1648/99E05F08
2013-04-15 15:47:29 UTC @ LOG: database system is ready to accept read
only connections
cp: cannot stat `/var/lib/postgresql-archives/00000005000016480000009A': No
such file or directory
2013-04-15 15:47:30 UTC @ LOG: unexpected pageaddr 1648/794E2000 in log
file 5704, segment 154, offset 5120000
cp: cannot stat `/var/lib/postgresql-archives/00000005000016480000009A': No
such file or directory
2013-04-15 15:47:30 UTC @ LOG: streaming replication successfully
connected to primary
2013-04-15 15:52:34 UTC @ LOG: restartpoint complete: wrote 17096 buffers
(2.6%); write=39.897 s, sync=264.845 s, total=307.746 s
2013-04-15 15:52:34 UTC @ LOG: recovery restart point at 1648/9196C3B0
2013-04-15 15:52:34 UTC @ DETAIL: last completed transaction was at log
time 2013-04-15 15:48:44.049103+00
2013-04-15 15:52:34 UTC @ LOG: restartpoint starting: xlog time
2013-04-15 15:53:58 UTC @ LOG: restartpoint complete: wrote 27760 buffers
(4.2%); write=43.305 s, sync=41.135 s, total=84.489 s
2013-04-15 15:53:58 UTC @ LOG: recovery restart point at 1648/A20E13E8
2013-04-15 15:53:58 UTC @ DETAIL: last completed transaction was at log
time 2013-04-15 15:53:35.897601+00
2013-04-15 15:53:58 UTC @ LOG: restartpoint starting: xlog time
2013-04-15 15:55:04 UTC @ LOG: restartpoint complete: wrote 34572 buffers
(5.3%); write=41.229 s, sync=24.578 s, total=65.855 s
2013-04-15 15:55:04 UTC @ LOG: recovery restart point at 1648/B1293B80
2013-04-15 15:55:04 UTC @ DETAIL: last completed transaction was at log
time 2013-04-15 15:54:41.804552+00
2013-04-15 15:55:04 UTC @ LOG: restartpoint starting: xlog time
2013-04-15 15:55:58 UTC @ LOG: restartpoint complete: wrote 9283 buffers
(1.4%); write=41.391 s, sync=12.263 s, total=53.805 s
2013-04-15 15:55:58 UTC @ LOG: recovery restart point at 1648/B371CFF0
2013-04-15 15:55:58 UTC @ DETAIL: last completed transaction was at log
time 2013-04-15 15:55:35.627312+00
2013-04-15 15:55:58 UTC @ LOG: restartpoint starting: xlog
2013-04-15 15:56:51 UTC @ LOG: restartpoint complete: wrote 6459 buffers
(1.0%); write=41.648 s, sync=11.185 s, total=52.856 s
2013-04-15 15:56:51 UTC @ LOG: recovery restart point at 1648/B6C1DF00

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dale Fukami (#6)
Re: Mysterious table that exists but doesn't exist

Dale Fukami <dale.fukami@gmail.com> writes:

On Tue, Apr 16, 2013 at 3:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Either way, if it's working on the master, then you've had a replication
failure since the standby's files evidently don't match the master's.

What PG version is this (and which versions have been installed since
the replication was set up)? Have you had any system-level crashes on
the standby?

My apologies: PostgreSQL 9.0.5 on x86_64-pc-linux-gnu, compiled by GCC
gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit
This is the only version we've had since we've set up streaming replication.

We believe we've found the moment of corruption. It appears that the db was
shutdown then restarted and didn't quite catch up to the master and then
was shut down again. We use this standby to take snapshots on a regular
basis but it looks like this one didn't quite get into a consistent state
before the next one started. Logs pasted at the end of this email.

Hm ... there was a fix in 9.0.12 that might be relevant to this:
http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=5840e3181b7e6c784fdb3aff708c4dcc2dfe551d
Whether that explains it or not, 9.0.5 is getting long in the tooth;
you really need to think about an update. Especially in view of
CVE-2013-1899.

1) We were quite lucky I think to have discovered this issue. I think the
only other way we'd have noticed would have been if we'd failed over to it
and our app stopped working. I'm worried now that we'll end up in a similar
situation and won't have known it for many weeks. At some point we lose the
ability to PITR based on how many backups are kept. Is there a way to be
more confident in our standby machines?

I think you should update your PG version and then resync your standbys
(ie, apply a fresh base backup) at the next convenient maintenance
window.
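
For reference, a 9.0-era resync brackets a file-level copy of the primary's
data directory with the backup functions (a sketch; the label and the copy
step are illustrative):

```sql
-- on the primary: begin a base backup (second argument requests an
-- immediate checkpoint rather than waiting for a scheduled one)
SELECT pg_start_backup('standby_resync', true);

-- then copy the data directory to the standby at the filesystem level
-- (e.g. rsync), excluding pg_xlog and postmaster.pid, and finally:

SELECT pg_stop_backup();
```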

Just to clear up the confusion that I had caused John. I had obscured the
previous schema name by restoring from snapshots and altering the schema
name itself. So, the queries and output are an exact copy/paste from my
terminal. I get the same results on the Live standby when using the actual
Live schema name but, obviously, it shows the Live schema name rather than
'someschema'.

Well, in that case there's the question of whether you'd duplicated the
standby's state accurately ...

regards, tom lane


#8Dale Fukami
dale.fukami@gmail.com
In reply to: Tom Lane (#7)
Re: Mysterious table that exists but doesn't exist

Hm ... there was a fix in 9.0.12 that might be relevant to this:

http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=5840e3181b7e6c784fdb3aff708c4dcc2dfe551d
Whether that explains it or not, 9.0.5 is getting long in the tooth;
you really need to think about an update. Especially in view of
CVE-2013-1899.

Yeah, we've been in the process of planning the update in light of the
security fix. We're likely just making the jump to 9.2 in the process so
that'll be handled soon.

I think you should update your PG version and then resync your standbys
(ie, apply a fresh base backup) at the next convenient maintenance
window.

We did apply a fresh base backup last night to ensure our current standbys
are in a good state. I was mostly wondering if there was some better way
to continually monitor the state of our standbys, aside from checking
pg_last_xlog_receive_location, etc. I realize it's no small task but
thought maybe there was something I had missed.
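
For what it's worth, the standard progress checks only confirm WAL
positions, not on-disk consistency (a sketch of the usual 9.0 queries):

```sql
-- on the primary: current WAL insert position
SELECT pg_current_xlog_location();

-- on the standby: WAL received from the primary vs. WAL actually replayed
SELECT pg_last_xlog_receive_location(), pg_last_xlog_replay_location();
```

Matching positions show replication is keeping up, but as this thread
demonstrates, they say nothing about whether the underlying files are
intact.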

In the end, I think this just helps nudge our upgrade along :)

Thanks for all the help!
Dale