Table has 22 million records, but backup doesn't see them

Started by Radcon Entecabout 17 years ago9 messagesgeneral
Jump to latest
#1Radcon Entec
radconentec@yahoo.com

Greetings!

I'm running PostgreSQL 8.1 under Windows XP, looking at a database hosted on a machine running PostgreSQL under Windows Server 2003.

The database has a table with three simple columns and 22 million rows.  I am trying to back up that table by itself.  However, pg_dump finishes almost instantly, obviously not backing up any data from the table.  I've tried it from the DOS command line with and without the -a (data only) option, and from inside PGAdmin.  Can anyone suggest what might cause this behavior?

I'm going to have to do a full database backup, which I've been trying to avoid because there are other large tables I don't need and don't want to waste time getting.  A full backup of this database takes something like three hours.

Thanks again!

#2Steve Crawford
scrawford@pinpointresearch.com
In reply to: Radcon Entec (#1)
Re: Table has 22 million records, but backup doesn't see them

Radcon Entec wrote:

Greetings!

I'm running PostgreSQL 8.1 under Windows XP, looking at a database
hosted on a machine running PostgreSQL under Windows Server 2003.

The database has a table with three simple columns and 22 million
rows. I am trying to back up that table by itself. However, pg_dump
finishes almost instantly, obviously not backing up any data from the
table. I've tried it from the DOS command line with and without the
-a (data only) option, and from inside PGAdmin. Can anyone suggest
what might cause this behavior?

What is the exact command and what is the output (I'll be surprised if
there is no output at all to either stdout or stderr)? Does pg_dumpall
run fine from the same machine? How about psql? Are you sure you are
hitting a base-table and not a view? Do the server logs show anything
interesting?

Cheers,
Steve

#3Steve Crawford
scrawford@pinpointresearch.com
In reply to: Radcon Entec (#1)
Re: Table has 22 million records, but backup doesn't see them

Please remember to reply-all so others can help or see the solution as
well (also, the convention on this list is to bottom-post, not top-post).

Radcon Entec wrote:

Steve,

Here's the exact command and output, taken from the DOS command window:

C:\Documents and Settings\entec>"\program
files\postgresql\8.1\bin\pg_dump" -f f
eedback.sql -v -a -t feedback -h 159.138.80.150 -U postgres Anneal
pg_dump: reading schemas
pg_dump: reading user-defined functions
pg_dump: reading user-defined types
pg_dump: reading procedural languages
pg_dump: reading user-defined aggregate functions
pg_dump: reading user-defined operators
pg_dump: reading user-defined operator classes
pg_dump: reading user-defined conversions
pg_dump: reading user-defined tables
pg_dump: reading table inheritance information
pg_dump: reading rewrite rules
pg_dump: reading type casts
pg_dump: finding inheritance relationships
pg_dump: reading column info for interesting tables
pg_dump: finding the columns and types of table "feedback"
pg_dump: finding default expressions of table "feedback"
pg_dump: flagging inherited columns in subtables
pg_dump: reading indexes
pg_dump: reading indexes for table "feedback"
pg_dump: reading constraints
pg_dump: reading triggers
pg_dump: reading dependency data
pg_dump: saving encoding
pg_dump: executing SEQUENCE SET feedback_feedback_key_seq
pg_dump: restoring data for table "feedback"
pg_dump: dumping contents of table feedback

And what was the result? Zero-size file? If not, what was in the file?

When I ran a batch file dumping the schema and about forty tables into
separate files, no problems were encountered. All of the resulting
files have reasonable sizes.

What method did you use for that process?

I haven't tried psql yet, nor pg_dumpall. I've got a full backup
running now.

Via what utility if you aren't using pg_dumpall? Also, running a full
backup won't impede testing your connection with psql.

Show quoted text

I am sure that I'm looking at a table.

I'm not sure where to find the server logs (which gives you some idea
of my knowledge of PostgreSQL administration). I'll go check them now.

RobR

------------------------------------------------------------------------
*From:* Steve Crawford <scrawford@pinpointresearch.com>
*To:* Radcon Entec <radconentec@yahoo.com>
*Cc:* pgsql-general@postgresql.org
*Sent:* Wednesday, April 8, 2009 11:25:20 AM
*Subject:* Re: [GENERAL] Table has 22 million records, but backup
doesn't see them

Radcon Entec wrote:

Greetings!

I'm running PostgreSQL 8.1 under Windows XP, looking at a database
hosted on a machine running PostgreSQL under Windows Server 2003.

The database has a table with three simple columns and 22 million
rows. I am trying to back up that table by itself. However, pg_dump
finishes almost instantly, obviously not backing up any data from the
table. I've tried it from the DOS command line with and without the
-a (data only) option, and from inside PGAdmin. Can anyone suggest
what might cause this behavior?

What is the exact command and what is the output (I'll be surprised if
there is no output at all to either stdout or stderr)? Does pg_dumpall
run fine from the same machine? How about psql? Are you sure you are
hitting a base-table and not a view? Do the server logs show anything
interesting?

Cheers,
Steve

#4Radcon Entec
radconentec@yahoo.com
In reply to: Steve Crawford (#3)
Re: Table has 22 million records, but backup doesn't see them

From: Steve Crawford scrawford@pinpointresearch.com
 
<snip>And what was the result? Zero-size file? If not, what was in the file?
</snip>

Here is the text that results from dumping my 22-million-row feedback table:

--
-- PostgreSQL database dump
--
-- Started on 2009-04-08 10:10:49 Eastern Daylight Time
SET client_encoding = 'UTF8';
SET check_function_bodies = false;
SET client_min_messages = warning;
SET search_path = public, pg_catalog;
SET default_tablespace = '';
SET default_with_oids = false;
--
-- TOC entry 1417 (class 1259 OID 7667616)
-- Dependencies: 5
-- Name: feedback; Type: TABLE; Schema: public; Owner: caps; Tablespace:
--
CREATE TABLE feedback (
feedback_key bigserial NOT NULL,
charge integer,
elapsed_time smallint,
tag_type character varying(16),
stack smallint,
tag_value real,
heating smallint,
status smallint
);
 
ALTER TABLE public.feedback OWNER TO caps;
--
-- TOC entry 1783 (class 0 OID 0)
-- Dependencies: 1416
-- Name: feedback_feedback_key_seq; Type: SEQUENCE SET; Schema: public; Owner: caps
--
SELECT pg_catalog.setval(pg_catalog.pg_get_serial_sequence('feedback', 'feedback_key'), 22326846, true);
 
--
-- TOC entry 1780 (class 0 OID 7667616)
-- Dependencies: 1417
-- Data for Name: feedback; Type: TABLE DATA; Schema: public; Owner: caps
--
COPY feedback (feedback_key, charge, elapsed_time, tag_type, stack, tag_value, heating, status) FROM stdin;
\.
 
--
-- TOC entry 1779 (class 2606 OID 7667620)
-- Dependencies: 1417 1417
-- Name: feedback_pkey; Type: CONSTRAINT; Schema: public; Owner: caps; Tablespace:
--
ALTER TABLE ONLY feedback
ADD CONSTRAINT feedback_pkey PRIMARY KEY (feedback_key);
 
--
-- TOC entry 1777 (class 1259 OID 7829003)
-- Dependencies: 1417
-- Name: feedback_charge_idx; Type: INDEX; Schema: public; Owner: caps; Tablespace:
--
CREATE INDEX feedback_charge_idx ON feedback USING btree (charge);
 
--
-- TOC entry 1514 (class 2618 OID 7667631)
-- Dependencies: 1417 1417 1418 1417 1417 1417 1417 1417 1417 1417 1417
-- Name: feedback_partition_active; Type: RULE; Schema: public; Owner: caps
--
CREATE RULE feedback_partition_active AS ON INSERT TO feedback WHERE (new.charge > 7000) DO INSTEAD INSERT INTO feedback_active (charge, elapsed_time, tag_type, stack, tag_value, heating, status) VALUES (new.charge, new.elapsed_time, new.tag_type, new.stack, new.tag_value, new.heating, new.status);
 
--
-- TOC entry 1515 (class 2618 OID 7667632)
-- Dependencies: 1417 1417 1419 1417 1417 1417 1417 1417 1417 1417 1417
-- Name: feedback_partition_archived; Type: RULE; Schema: public; Owner: caps
--
CREATE RULE feedback_partition_archived AS ON INSERT TO feedback WHERE (new.charge <= 7000) DO INSTEAD INSERT INTO feedback_archived_7000 (charge, elapsed_time, tag_type, stack, tag_value, heating, status) VALUES (new.charge, new.elapsed_time, new.tag_type, new.stack, new.tag_value, new.heating, new.status);
 
--
-- TOC entry 1782 (class 0 OID 0)
-- Dependencies: 1417
-- Name: feedback; Type: ACL; Schema: public; Owner: caps
--
REVOKE ALL ON TABLE feedback FROM PUBLIC;
REVOKE ALL ON TABLE feedback FROM caps;
GRANT ALL ON TABLE feedback TO caps;
GRANT ALL ON TABLE feedback TO anneal_operator;
GRANT ALL ON TABLE feedback TO anneal_supervisor;
GRANT ALL ON TABLE feedback TO anneal_administrator;
GRANT SELECT ON TABLE feedback TO anneal_metallurgist;
GRANT SELECT ON TABLE feedback TO anneal_guest;
 
--
-- TOC entry 1784 (class 0 OID 0)
-- Dependencies: 1416
-- Name: feedback_feedback_key_seq; Type: ACL; Schema: public; Owner: caps
--
REVOKE ALL ON TABLE feedback_feedback_key_seq FROM PUBLIC;
REVOKE ALL ON TABLE feedback_feedback_key_seq FROM caps;
GRANT ALL ON TABLE feedback_feedback_key_seq TO caps;
GRANT ALL ON TABLE feedback_feedback_key_seq TO PUBLIC;
 
-- Completed on 2009-04-08 10:10:52 Eastern Daylight Time
--
-- PostgreSQL database dump complete
--
 
I ran a DOS batch file.  Here's the first few lines:
 
"\program files\postgresql\8.1\bin\pg_dump" -f schema.sql -v -s -h 159.138.80.150 -U postgres -X disable-triggers Anneal > backup_in_pieces.log
"\program files\postgresql\8.1\bin\pg_dump" -f adhoc_query.sql -v -a -t adhoc_query -h 159.138.80.150 -U postgres -X disable-triggers Anneal
"\program files\postgresql\8.1\bin\pg_dump" -f base_cycle_compatibility.sql -v -a -t base_cycle_compatibility -h 159.138.80.150 -U postgres -X disable-triggers Anneal
"\program files\postgresql\8.1\bin\pg_dump" -f base_type.sql -v -a -t base_type -h 159.138.80.150 -U postgres -X disable-triggers Anneal

Thanks again for your help!
 
RobR

  When I ran a batch file dumping the schema and about forty tables into separate files, no problems were encountered.  All of the resulting files have reasonable sizes.
 

What method did you use for that process?

#5Radcon Entec
radconentec@yahoo.com
In reply to: Steve Crawford (#3)
Re: Table has 22 million records, but backup doesn't see them

By the way, a full backup and restore using PGAdmin and accepting all default setings worked successfully, including all 22 million feedback records.

I still would like to understand why the feedback table cannot be backed up by itself.  The technique of backing up and restoring only selected tables will be frequently useful for many customers, and I would like to know how to get around this problem if it shows up again.

RobR

#6Steve Crawford
scrawford@pinpointresearch.com
In reply to: Radcon Entec (#4)
Re: Table has 22 million records, but backup doesn't see them

Radcon Entec wrote:

Here is the text that results from dumping my 22-million-row feedback
table:

...

CREATE RULE feedback_partition_active AS ON INSERT TO feedback WHERE
(new.charge > 7000) DO INSTEAD INSERT INTO feedback_active (charge,
elapsed_time, tag_type, stack, tag_value, heating, status) VALUES
(new.charge, new.elapsed_time, new.tag_type, new.stack, new.tag_value,
new.heating, new.status);

...

CREATE RULE feedback_partition_archived AS ON INSERT TO feedback WHERE
(new.charge <= 7000) DO INSTEAD INSERT INTO feedback_archived_7000
(charge, elapsed_time, tag_type, stack, tag_value, heating, status)
VALUES (new.charge, new.elapsed_time, new.tag_type, new.stack,
new.tag_value, new.heating, new.status);

Are you certain that feedback actually contains any data or is it just
the parent table and the real data is in the child tables? What is the
output of "select count(*) from only feedback;" ?

Cheers,
Steve

#7Radcon Entec
radconentec@yahoo.com
In reply to: Steve Crawford (#6)
Re: Table has 22 million records, but backup doesn't see them

________________________________
From: Steve Crawford <scrawford@pinpointresearch.com>
To: Radcon Entec <radconentec@yahoo.com>
Cc: "pgsql-general@postgresql.org" <pgsql-general@postgresql.org>
Sent: Wednesday, April 8, 2009 1:15:55 PM
Subject: Re: [GENERAL] Table has 22 million records, but backup doesn't see them

Radcon Entec wrote:

  Here is the text that results from dumping my 22-million-row feedback table:
  ...

CREATE RULE feedback_partition_active AS ON INSERT TO feedback WHERE (new.charge > 7000) DO INSTEAD INSERT INTO feedback_active (charge, elapsed_time, tag_type, stack, tag_value, heating, status) VALUES (new.charge, new.elapsed_time, new.tag_type, new.stack, new.tag_value, new.heating, new.status);

...

CREATE RULE feedback_partition_archived AS ON INSERT TO feedback WHERE (new.charge <= 7000) DO INSTEAD INSERT INTO feedback_archived_7000 (charge, elapsed_time, tag_type, stack, tag_value, heating, status) VALUES (new.charge, new.elapsed_time, new.tag_type, new.stack, new.tag_value, new.heating, new.status);

Are you certain that feedback actually contains any data or is it just the parent table and the real data is in the child tables? What is the output of "select count(*) from only feedback;" ?

Cheers,
Steve

Steve,

You are, of course, correct.  "select count(*) from only feedback" returns 0.  I have never used (or even seen) PostgreSQL rules before. 

When I run the query "select * from feedback where charge = 23017", I get 538 records.  Adding the word "only" gives me zero records, as expected, and querying the feedback_active table gets me my 538 records.  But the feedback table only has the INSERT rules you quoted above.  I clicked on the feedback table's Rules leaf and selected "New Rule", and saw that I can create SELECT, INSERT, UPDATE and DELETE rules.  But even though I don't have a SELECT rule explicitly defined, PostgreSQL appears to be smart enough to retrieve data from the correct actual table when I think I'm selecting from the feedback table.  Is that standard behavior? 

Of course, my next step will be to read the documentation.

Thank you yet again!

RobR

#8Steve Crawford
scrawford@pinpointresearch.com
In reply to: Radcon Entec (#7)
Re: Table has 22 million records, but backup doesn't see them

Radcon Entec wrote:

------------------------------------------------------------------------
*
*
You are, of course, correct. "select count(*) from only feedback"
returns 0. I have never used (or even seen) PostgreSQL rules before.

When I run the query "select * from feedback where charge = 23017", I
get 538 records. Adding the word "only" gives me zero records, as
expected, and querying the feedback_active table gets me my 538
records. But the feedback table only has the INSERT rules you quoted
above. I clicked on the feedback table's Rules leaf and selected "New
Rule", and saw that I can create SELECT, INSERT, UPDATE and DELETE
rules. But even though I don't have a SELECT rule explicitly defined,
PostgreSQL appears to be smart enough to retrieve data from the
correct actual table when I think I'm selecting from the feedback
table. Is that standard behavior?

Of course, my next step will be to read the documentation.

You won't find the cause of your surprise reading up on rules. Read up
on inheritance, instead. It can do a lot, but a typical simple use is
table-partitioning as appears to be the case in your situation.

You start with the main (parent) table, say "events" then create a bunch
of child tables that inherit events, say events_jan, events_feb,
events_mar...

You never put actual data in "events" but create a rule or trigger that
looks at the month information and puts January events in events_jan,
February in events_feb and so on.

Select * from events is more-or-less equivalent to:
select * from events_jan union
select * from events_feb union...

Inheritance goes far beyond the simple case shown above. There's a lot
of nifty stuff you can do and a number of things that can bite you. But,
as you say, that's where the documentation comes in.

Cheers,
Steve

#9Chris
dmagick@gmail.com
In reply to: Radcon Entec (#5)
Re: Table has 22 million records, but backup doesn't see them

I still would like to understand why the feedback table cannot be backed
up by itself.

Because there is no actual data in the feedback table.

It's being stored in the "feedback_active", "feedback_archived_7000" and
your other tables instead.

--
Postgresql & php tutorials
http://www.designmagick.com/