Thread: Table has 22 million records, but backup doesn't see them

Table has 22 million records, but backup doesn't see them

From

Radcon Entec

Date:

08 April 2009, 12:16:35

Greetings!

I'm running PostgreSQL 8.1 under Windows XP, looking at a database hosted on a machine running PostgreSQL under Windows Server 2003.

The database has a table with three simple columns and 22 million rows. I am trying to back up that table by itself. However, pg_dump finishes almost instantly, obviously not backing up any data from the table. I've tried it from the DOS command line with and without the -a (data only) option, and from inside PGAdmin. Can anyone suggest what might cause this behavior?

I'm going to have to do a full database backup, which I've been trying to avoid because there are other large tables I don't need and don't want to waste time getting. A full backup of this database takes something like three hours.

Thanks again!

Re: Table has 22 million records, but backup doesn't see them

From

Steve Crawford

Date:

08 April 2009, 12:25:27

Radcon Entec wrote:

Greetings!

I'm running PostgreSQL 8.1 under Windows XP, looking at a database hosted on a machine running PostgreSQL under Windows Server 2003.

The database has a table with three simple columns and 22 million rows. I am trying to back up that table by itself. However, pg_dump finishes almost instantly, obviously not backing up any data from the table. I've tried it from the DOS command line with and without the -a (data only) option, and from inside PGAdmin. Can anyone suggest what might cause this behavior?

What is the exact command and what is the output (I'll be surprised if there is no output at all to either stdout or stderr)? Does pg_dumpall run fine from the same machine? How about psql? Are you sure you are hitting a base-table and not a view? Do the server logs show anything interesting?

Cheers,
Steve

Re: Table has 22 million records, but backup doesn't see them

From

Steve Crawford

Date:

08 April 2009, 13:15:30

Please remember to reply-all so others can help or see the solution as
well (also, the convention on this list is to bottom-post, not top-post).

Radcon Entec wrote:
> Steve,
>
> Here's the exact command and output, taken from the DOS command window:
>
> C:\Documents and Settings\entec>"\program
> files\postgresql\8.1\bin\pg_dump" -f f
> eedback.sql -v -a -t feedback -h 159.138.80.150 -U postgres Anneal
> pg_dump: reading schemas
> pg_dump: reading user-defined functions
> pg_dump: reading user-defined types
> pg_dump: reading procedural languages
> pg_dump: reading user-defined aggregate functions
> pg_dump: reading user-defined operators
> pg_dump: reading user-defined operator classes
> pg_dump: reading user-defined conversions
> pg_dump: reading user-defined tables
> pg_dump: reading table inheritance information
> pg_dump: reading rewrite rules
> pg_dump: reading type casts
> pg_dump: finding inheritance relationships
> pg_dump: reading column info for interesting tables
> pg_dump: finding the columns and types of table "feedback"
> pg_dump: finding default expressions of table "feedback"
> pg_dump: flagging inherited columns in subtables
> pg_dump: reading indexes
> pg_dump: reading indexes for table "feedback"
> pg_dump: reading constraints
> pg_dump: reading triggers
> pg_dump: reading dependency data
> pg_dump: saving encoding
> pg_dump: executing SEQUENCE SET feedback_feedback_key_seq
> pg_dump: restoring data for table "feedback"
> pg_dump: dumping contents of table feedback
And what was the result? Zero-size file? If not, what was in the file?
>
> When I ran a batch file dumping the schema and about forty tables into
> separate files, no problems were encountered.  All of the resulting
> files have reasonable sizes.
>
What method did you use for that process?
> I haven't tried psql yet, nor pg_dumpall.  I've got a full backup
> running now.
Via what utility if you aren't using pg_dumpall? Also, running a full
backup won't impede testing your connection with psql.
>
> I am sure that I'm looking at a table.
>
> I'm not sure where to find the server logs (which gives you some idea
> of my knowledge of PostgreSQL administration).  I'll go check them now.
>
> RobR
>
> ------------------------------------------------------------------------
> *From:* Steve Crawford <scrawford@pinpointresearch.com>
> *To:* Radcon Entec <radconentec@yahoo.com>
> *Cc:* pgsql-general@postgresql.org
> *Sent:* Wednesday, April 8, 2009 11:25:20 AM
> *Subject:* Re: [GENERAL] Table has 22 million records, but backup
> doesn't see them
>
> Radcon Entec wrote:
>> Greetings!
>>
>> I'm running PostgreSQL 8.1 under Windows XP, looking at a database
>> hosted on a machine running PostgreSQL under Windows Server 2003.
>>
>> The database has a table with three simple columns and 22 million
>> rows.  I am trying to back up that table by itself.  However, pg_dump
>> finishes almost instantly, obviously not backing up any data from the
>> table.  I've tried it from the DOS command line with and without the
>> -a (data only) option, and from inside PGAdmin.  Can anyone suggest
>> what might cause this behavior?
>>
> What is the exact command and what is the output (I'll be surprised if
> there is no output at all to either stdout or stderr)? Does pg_dumpall
> run fine from the same machine? How about psql? Are you sure you are
> hitting a base-table and not a view? Do the server logs show anything
> interesting?
>
> Cheers,
> Steve
>
>

Re: Table has 22 million records, but backup doesn't see them

From

Radcon Entec

Date:

08 April 2009, 13:54:22

From: Steve Crawford scrawford@pinpointresearch.com

<snip>

And what was the result? Zero-size file? If not, what was in the file?

</snip>

Here is the text that results from dumping my 22-million-row feedback table:

-- PostgreSQL database dump

-- Started on 2009-04-08 10:10:49 Eastern Daylight Time

SET client_encoding = 'UTF8';

SET check_function_bodies = false;

SET client_min_messages = warning;

SET search_path = public, pg_catalog;

SET default_tablespace = '';

SET default_with_oids = false;

-- TOC entry 1417 (class 1259 OID 7667616)

-- Dependencies: 5

-- Name: feedback; Type: TABLE; Schema: public; Owner: caps; Tablespace:

CREATE TABLE feedback (

feedback_key bigserial NOT NULL,

charge integer,

elapsed_time smallint,

tag_type character varying(16),

stack smallint,

tag_value real,

heating smallint,

status smallint

);

ALTER TABLE public.feedback OWNER TO caps;

-- TOC entry 1783 (class 0 OID 0)

-- Dependencies: 1416

-- Name: feedback_feedback_key_seq; Type: SEQUENCE SET; Schema: public; Owner: caps

SELECT pg_catalog.setval(pg_catalog.pg_get_serial_sequence('feedback', 'feedback_key'), 22326846, true);

-- TOC entry 1780 (class 0 OID 7667616)

-- Dependencies: 1417

-- Data for Name: feedback; Type: TABLE DATA; Schema: public; Owner: caps

COPY feedback (feedback_key, charge, elapsed_time, tag_type, stack, tag_value, heating, status) FROM stdin;

-- TOC entry 1779 (class 2606 OID 7667620)

-- Dependencies: 1417 1417

-- Name: feedback_pkey; Type: CONSTRAINT; Schema: public; Owner: caps; Tablespace:

ALTER TABLE ONLY feedback

ADD CONSTRAINT feedback_pkey PRIMARY KEY (feedback_key);

-- TOC entry 1777 (class 1259 OID 7829003)

-- Dependencies: 1417

-- Name: feedback_charge_idx; Type: INDEX; Schema: public; Owner: caps; Tablespace:

CREATE INDEX feedback_charge_idx ON feedback USING btree (charge);

-- TOC entry 1514 (class 2618 OID 7667631)

-- Dependencies: 1417 1417 1418 1417 1417 1417 1417 1417 1417 1417 1417

-- Name: feedback_partition_active; Type: RULE; Schema: public; Owner: caps

CREATE RULE feedback_partition_active AS ON INSERT TO feedback WHERE (new.charge > 7000) DO INSTEAD INSERT INTO feedback_active (charge, elapsed_time, tag_type, stack, tag_value, heating, status) VALUES (new.charge, new.elapsed_time, new.tag_type, new.stack, new.tag_value, new.heating, new.status);

-- TOC entry 1515 (class 2618 OID 7667632)

-- Dependencies: 1417 1417 1419 1417 1417 1417 1417 1417 1417 1417 1417

-- Name: feedback_partition_archived; Type: RULE; Schema: public; Owner: caps

CREATE RULE feedback_partition_archived AS ON INSERT TO feedback WHERE (new.charge <= 7000) DO INSTEAD INSERT INTO feedback_archived_7000 (charge, elapsed_time, tag_type, stack, tag_value, heating, status) VALUES (new.charge, new.elapsed_time, new.tag_type, new.stack, new.tag_value, new.heating, new.status);

-- TOC entry 1782 (class 0 OID 0)

-- Dependencies: 1417

-- Name: feedback; Type: ACL; Schema: public; Owner: caps

REVOKE ALL ON TABLE feedback FROM PUBLIC;

REVOKE ALL ON TABLE feedback FROM caps;

GRANT ALL ON TABLE feedback TO caps;

GRANT ALL ON TABLE feedback TO anneal_operator;

GRANT ALL ON TABLE feedback TO anneal_supervisor;

GRANT ALL ON TABLE feedback TO anneal_administrator;

GRANT SELECT ON TABLE feedback TO anneal_metallurgist;

GRANT SELECT ON TABLE feedback TO anneal_guest;

-- TOC entry 1784 (class 0 OID 0)

-- Dependencies: 1416

-- Name: feedback_feedback_key_seq; Type: ACL; Schema: public; Owner: caps

REVOKE ALL ON TABLE feedback_feedback_key_seq FROM PUBLIC;

REVOKE ALL ON TABLE feedback_feedback_key_seq FROM caps;

GRANT ALL ON TABLE feedback_feedback_key_seq TO caps;

GRANT ALL ON TABLE feedback_feedback_key_seq TO PUBLIC;

-- Completed on 2009-04-08 10:10:52 Eastern Daylight Time

-- PostgreSQL database dump complete

> When I ran a batch file dumping the schema and about forty tables into separate files, no problems were encountered. All of the resulting files have reasonable sizes.
>
What method did you use for that process?

I ran a DOS batch file. Here's the first few lines:

"\program files\postgresql\8.1\bin\pg_dump" -f schema.sql -v -s -h 159.138.80.150 -U postgres -X disable-triggers Anneal > backup_in_pieces.log
"\program files\postgresql\8.1\bin\pg_dump" -f adhoc_query.sql -v -a -t adhoc_query -h 159.138.80.150 -U postgres -X disable-triggers Anneal
"\program files\postgresql\8.1\bin\pg_dump" -f base_cycle_compatibility.sql -v -a -t base_cycle_compatibility -h 159.138.80.150 -U postgres -X disable-triggers Anneal
"\program files\postgresql\8.1\bin\pg_dump" -f base_type.sql -v -a -t base_type -h 159.138.80.150 -U postgres -X disable-triggers Anneal

Thanks again for your help!

RobR

Re: Table has 22 million records, but backup doesn't see them

From

Radcon Entec

Date:

08 April 2009, 13:58:14

By the way, a full backup and restore using PGAdmin and accepting all default setings worked successfully, including all 22 million feedback records.

I still would like to understand why the feedback table cannot be backed up by itself. The technique of backing up and restoring only selected tables will be frequently useful for many customers, and I would like to know how to get around this problem if it shows up again.

RobR

Re: Table has 22 million records, but backup doesn't see them

From

Steve Crawford

Date:

08 April 2009, 14:16:07

Radcon Entec wrote:
>
>
> Here is the text that results from dumping my 22-million-row feedback
> table:
>
> ...
>
> CREATE RULE feedback_partition_active AS ON INSERT TO feedback WHERE
> (new.charge > 7000) DO INSTEAD INSERT INTO feedback_active (charge,
> elapsed_time, tag_type, stack, tag_value, heating, status) VALUES
> (new.charge, new.elapsed_time, new.tag_type, new.stack, new.tag_value,
> new.heating, new.status);
>
> ...
>
> CREATE RULE feedback_partition_archived AS ON INSERT TO feedback WHERE
> (new.charge <= 7000) DO INSTEAD INSERT INTO feedback_archived_7000
> (charge, elapsed_time, tag_type, stack, tag_value, heating, status)
> VALUES (new.charge, new.elapsed_time, new.tag_type, new.stack,
> new.tag_value, new.heating, new.status);
>

Are you certain that feedback actually contains any data or is it just
the parent table and the real data is in the child tables? What is the
output of "select count(*) from only feedback;" ?

Cheers,
Steve

Re: Table has 22 million records, but backup doesn't see them

From

Radcon Entec

Date:

08 April 2009, 14:46:45

From: Steve Crawford <scrawford@pinpointresearch.com>
To: Radcon Entec <radconentec@yahoo.com>
Cc: "pgsql-general@postgresql.org" <pgsql-general@postgresql.org>
Sent: Wednesday, April 8, 2009 1:15:55 PM
Subject: Re: [GENERAL] Table has 22 million records, but backup doesn't see them

Radcon Entec wrote:
>
> Here is the text that results from dumping my 22-million-row feedback table:
> ...
>
> CREATE RULE feedback_partition_active AS ON INSERT TO feedback WHERE (new.charge > 7000) DO INSTEAD INSERT INTO feedback_active (charge, elapsed_time, tag_type, stack, tag_value, heating, status) VALUES (new.charge, new.elapsed_time, new.tag_type, new.stack, new.tag_value, new.heating, new.status);
>
> ...
>
> CREATE RULE feedback_partition_archived AS ON INSERT TO feedback WHERE (new.charge <= 7000) DO INSTEAD INSERT INTO feedback_archived_7000 (charge, elapsed_time, tag_type, stack, tag_value, heating, status) VALUES (new.charge, new.elapsed_time, new.tag_type, new.stack, new.tag_value, new.heating, new.status);
>

Are you certain that feedback actually contains any data or is it just the parent table and the real data is in the child tables? What is the output of "select count(*) from only feedback;" ?

Cheers,
Steve

Steve,

You are, of course, correct. "select count(*) from only feedback" returns 0. I have never used (or even seen) PostgreSQL rules before.

When I run the query "select * from feedback where charge = 23017", I get 538 records. Adding the word "only" gives me zero records, as expected, and querying the feedback_active table gets me my 538 records. But the feedback table only has the INSERT rules you quoted above. I clicked on the feedback table's Rules leaf and selected "New Rule", and saw that I can create SELECT, INSERT, UPDATE and DELETE rules. But even though I don't have a SELECT rule explicitly defined, PostgreSQL appears to be smart enough to retrieve data from the correct actual table when I think I'm selecting from the feedback table. Is that standard behavior?

Of course, my next step will be to read the documentation.

Thank you yet again!

RobR

Re: Table has 22 million records, but backup doesn't see them

From

Steve Crawford

Date:

08 April 2009, 16:15:11

Radcon Entec wrote:
> ------------------------------------------------------------------------
> *
> *
> You are, of course, correct.  "select count(*) from only feedback"
> returns 0.  I have never used (or even seen) PostgreSQL rules before.
>
> When I run the query "select * from feedback where charge = 23017", I
> get 538 records.  Adding the word "only" gives me zero records, as
> expected, and querying the feedback_active table gets me my 538
> records.  But the feedback table only has the INSERT rules you quoted
> above.  I clicked on the feedback table's Rules leaf and selected "New
> Rule", and saw that I can create SELECT, INSERT, UPDATE and DELETE
> rules.  But even though I don't have a SELECT rule explicitly defined,
> PostgreSQL appears to be smart enough to retrieve data from the
> correct actual table when I think I'm selecting from the feedback
> table.  Is that standard behavior?
>
> Of course, my next step will be to read the documentation.
>
You won't find the cause of your surprise reading up on rules. Read up
on inheritance, instead. It can do a lot, but a typical simple use is
table-partitioning as appears to be the case in your situation.

You start with the main (parent) table, say "events" then create a bunch
of child tables that inherit events, say events_jan, events_feb,
events_mar...

You never put actual data in "events" but create a rule or trigger that
looks at the month information and puts January events in events_jan,
February in events_feb and so on.

Select * from events is more-or-less equivalent to:
select * from events_jan union
select * from events_feb union...

Inheritance goes far beyond the simple case shown above. There's a lot
of nifty stuff you can do and a number of things that can bite you. But,
as you say, that's where the documentation comes in.

Cheers,
Steve

Re: Table has 22 million records, but backup doesn't see them

From

Chris

Date:

08 April 2009, 19:49:50

> I still would like to understand why the feedback table cannot be backed
> up by itself.

Because there is no actual data in the feedback table.

It's being stored in the "feedback_active", "feedback_archived_7000" and
your other tables instead.

--
Postgresql & php tutorials
http://www.designmagick.com/