Table corrupt?

Started by Brandon Ibachover 27 years ago8 messages
#1Brandon Ibach
bibach@infomansol.com

Hey, guys...
I suppose this probably belongs in questions, but I thought I might
get a quicker answer here. I have a table in a customer's database
that has gotten quite large via lack of vacuuming (57MB). Queries
involving this table started failing in the form of the backend just
sitting there chugging away on the CPU (but not disk so much) for
hours on end.
This began about 24 hours ago, and as of about 12 hours ago, no
queries on this table work. I started a vacuum about 3 hours ago, and
it has had upper-90s percent CPU usage the whole time, and still
hasn't completed.
Any ideas on what might be going on here? And, if postgres won't
be able to access the table, is there any hope of extracting rows from
the raw database file, such that I could reconstruct the table?
Please cc responses to me directly, as I'm only on the digest list,
and thanks in advance for any advice/help!

-Brandon :)

#2Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Brandon Ibach (#1)
Re: [HACKERS] Table corrupt?

Hey, guys...
I suppose this probably belongs in questions, but I thought I might
get a quicker answer here. I have a table in a customer's database
that has gotten quite large via lack of vacuuming (57MB). Queries
involving this table started failing in the form of the backend just
sitting there chugging away on the CPU (but not disk so much) for
hours on end.
This began about 24 hours ago, and as of about 12 hours ago, no
queries on this table work. I started a vacuum about 3 hours ago, and
it has had upper-90s percent CPU usage the whole time, and still
hasn't completed.
Any ideas on what might be going on here? And, if postgres won't
be able to access the table, is there any hope of extracting rows from
the raw database file, such that I could reconstruct the table?
Please cc responses to me directly, as I'm only on the digest list,
and thanks in advance for any advice/help!

pg_dump -t tablename, drop and reload?

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
#3Brandon Ibach
bibach@infomansol.com
In reply to: Bruce Momjian (#2)
Re: [HACKERS] Table corrupt?

Bruce Momjian said:

Any ideas on what might be going on here? And, if postgres won't
be able to access the table, is there any hope of extracting rows from
the raw database file, such that I could reconstruct the table?

pg_dump -t tablename, drop and reload?

I thought pg_dump got the data out via queries through the backend?
(But, then, I could be wrong... please correct me if so...)

-Brandon :)

#4Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Brandon Ibach (#3)
Re: [HACKERS] Table corrupt?

Bruce Momjian said:

Any ideas on what might be going on here? And, if postgres won't
be able to access the table, is there any hope of extracting rows from
the raw database file, such that I could reconstruct the table?

pg_dump -t tablename, drop and reload?

I thought pg_dump got the data out via queries through the backend?
(But, then, I could be wrong... please correct me if so...)

-Brandon :)

I gets the data out via COPY, which is slightly different than a normal
query that does through the parser/optimizer/executor. It is possible
you just have a lot of extra data and it is taking time to vacuum.

If there is a real problem, I would dump the entire database and reload
it.

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
#5Brandon Ibach
bibach@infomansol.com
In reply to: Bruce Momjian (#4)
Re: [HACKERS] Table corrupt?

Bruce Momjian said:

Bruce Momjian said:

Any ideas on what might be going on here? And, if postgres won't
be able to access the table, is there any hope of extracting rows from
the raw database file, such that I could reconstruct the table?

pg_dump -t tablename, drop and reload?

I thought pg_dump got the data out via queries through the backend?
(But, then, I could be wrong... please correct me if so...)

-Brandon :)

I gets the data out via COPY, which is slightly different than a normal
query that does through the parser/optimizer/executor. It is possible
you just have a lot of extra data and it is taking time to vacuum.

Hmmm... well, the table may be 57 Meg, but then, the backend
running the vacuum has consumed 5 1/2 hours of CPU time so far, and
still going strong, so something tells me there may be something
deeper. :)

If there is a real problem, I would dump the entire database and reload
it.

Probably good advice, tho the rest of the tables seem to be just
fine. *shrug*

-Brandon :)

#6Brandon Ibach
bibach@infomansol.com
In reply to: Brandon Ibach (#5)
Re: [HACKERS] Table corrupt?

Vadim Mikheev said:

Brandon Ibach wrote:

Hmmm... well, the table may be 57 Meg, but then, the backend
running the vacuum has consumed 5 1/2 hours of CPU time so far, and
still going strong, so something tells me there may be something
deeper. :)

Did you have any indices for this table ?

Vadim

Nope... no indices at all.

-Brandon :)

#7Brandon Ibach
bibach@infomansol.com
In reply to: Brandon Ibach (#6)
Re: [HACKERS] Table corrupt?

Vadim Mikheev said:

Well, could you use gdb to attach to backend runnig vacuum ?

Vadim

Okay... using pg_dump to get the data out of the table is causing
much the same situation. I did a backtrace in gdb, but unfortnately,
I'm running a copy of postgres without debugging symbols, so it may be
of limited use. Here 'tis...

#0 0x80b7590 in WaitIO ()
#1 0x80b6e3c in BufferAlloc ()
#2 0x80b6c6d in ReadBufferWithBufferLock ()
#3 0x80b7390 in ReleaseAndReadBuffer ()
#4 0x8065b90 in heapgettup ()
#5 0x80660a5 in heap_getnext ()
#6 0x8088c61 in SeqNext ()
#7 0x8084480 in ExecScan ()
#8 0x8088ca3 in ExecSeqScan ()
#9 0x80833ce in ExecProcNode ()
#10 0x8082a61 in ExecutePlan ()
#11 0x8082644 in ExecutorRun ()
#12 0x80c1577 in ProcessQueryDesc ()
#13 0x80c15d6 in ProcessQuery ()
#14 0x80c0048 in pg_eval_dest ()
#15 0x80bff56 in pg_eval ()
#16 0x80c0ff1 in PostgresMain ()
#17 0x808f7af in main ()
#18 0x805e1ab in _start ()

I'll see about getting a trace out of a version of postgres with
debugging symbols.

-Brandon :)

#8Brandon Ibach
bibach@infomansol.com
In reply to: Brandon Ibach (#7)
Re: [HACKERS] Table corrupt?

Vadim Mikheev said:

Brandon Ibach wrote:

Vadim Mikheev said:
Okay... using pg_dump to get the data out of the table is causing
much the same situation. I did a backtrace in gdb, but unfortnately,
I'm running a copy of postgres without debugging symbols, so it may be
of limited use. Here 'tis...

#0 0x80b7590 in WaitIO ()

Did you restart postmaster after killing backend (vacuum) ?

Vadim

Nope... :) And I just thought of that possibility while sifting
through the buffer manager code (and meantime, your email arrived).
Thanks for the tip, I bet it will work.

-Brandon :)