Re: database is bigger after dump/restore - why? (60 GB to 109 GB) - Mailing list pgsql-general

From Aleksey Tsalolikhin
Subject Re: database is bigger after dump/restore - why? (60 GB to 109 GB)
Date
Msg-id AANLkTinXb4NC5wgqp-LW_CRhMOfNPQknPHNovNqdm1e+@mail.gmail.com
Whole thread Raw
In response to Re: database is bigger after dump/restore - why? (60 GB to 109 GB)  (Sergey Burladyan <eshkinkot@gmail.com>)
Responses Re: database is bigger after dump/restore - why? (60 GB to 109 GB)  (Alban Hertroys <dalroi@solfertje.student.utwente.nl>)
Re: database is bigger after dump/restore - why? (60 GB to 109 GB)  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: database is bigger after dump/restore - why? (60 GB to 109 GB)  (Adrian Klaver <adrian.klaver@gmail.com>)
List pgsql-general
OK, just to recap:

database A has a table that is 50 GB in size (according to:

SELECT relname as "Table",
   pg_size_pretty(pg_total_relation_size(relid)) As "Size" from
   pg_catalog.pg_statio_user_tables ORDER BY
   pg_total_relation_size(relid) DESC;
)


I pg_dump -Fc this table, which gives me a 9.8 GB file.

I then pg_restore this table into database B, which results in a
100 GB table, according to the same SQL query.


Database versions are identical:
A: PostgreSQL 8.4.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
B: PostgreSQL 8.4.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit


Character encoding of the source and target databases are identical:
UTF8.  (As reported by "psql -l".)

fillfactor is not set for this table, it is the default on both A and
B.  (As reported by "select reloptions from pg_class where
relname='tablename';".)


If I pg_dump -Fc the 100 GB table on database B, I get a 9.9 GB file.

If I do a "select count(*) from tablename",  I get identical results
on A and B (1,628,348 rows).

Adrian asked about the schema for this table.  It is the same on A and B:

                   Table "tablename"
     Column      |           Type           | Modifiers
-----------------+--------------------------+-----------
 column1        | character varying        | not null
 column2    | character varying        |
 column3        | character varying        |
 column4 | character varying        |
 column5   | timestamp with time zone |
 column6        | character varying        |
 column7   | character varying        |
 column8     | character varying        |
 column9      | character varying        |
 column10  | character varying        |
Indexes:
    "tablename_pkey" PRIMARY KEY, btree (column1)
    "tablename_column6_index" btree (column6)
    "tablename_column9_index" btree (device_dnq)
    "tablename_column8_index" btree (kdm_gmt_end)
    "tablename_column7_index" btree (kdm_gmt_start)


When I pg_dump the 50 GB table, I get a 40 GB file.

When I pg_dump the 100 GB table, I get a 40 GB file.

so looks like it's something low-level, something about how the data is stored.

i've installed the "pageinspect" contrib module as Tomas suggested but
I don't know what to do with it or what to look at.  I looked at the
manual for it but it's totally over my head right now.

What sections of the manual should I read to be able to use this
module?  (there are 2167 pages in the whole Postgres 8.4 manual and I
don't have time to study the whole thing.  :(   I just need to study
enough to understand how to use pageinspect.)

(I'm not a Postgres DBA but I am having to become one to support our database.)

Thanks again for the help.   I'll be at SCALE 9x tomorrow helping Joe
Conway and Bruce M and Richard B and company get the word out about
Postgres.

Best,
Aleksey

pgsql-general by date:

Previous
From: Amitabh Kant
Date:
Subject: Re: Partitions and indexes
Next
From: Aleksey Tsalolikhin
Date:
Subject: Re: pg_dump makes our system unusable - any way to pg_dump in the middle of the day? (postgres 8.4.4)