Re: Strange results using pg_restore - Mailing list pgsql-admin

From Kevin Grittner
Subject Re: Strange results using pg_restore
Date
Msg-id 1382045522.51765.YahooMailNeo@web162906.mail.bf1.yahoo.com
Whole thread Raw
In response to Strange results using pg_restore  (Oscar Calderon <ocalderon@solucionesaplicativas.com>)
List pgsql-admin
Oscar Calderon <ocalderon@solucionesaplicativas.com> wrote:

> Have a good day. This friday i'm going to migrate an entire
> database of a government institution in my country. Those are
> like 4 database of 2GB each one. So, i was preparing about what
> i'm going to do tomorrow. They currently have PostgreSQL 9.1
> installed from source i think, and they will update RHEL version
> to 5.7, it will be a reinstall of operative system, so before
> that i will backup the databases, copy contents of pg_hba.conf
> and so on, and after the reinstallation of RHEL i'm going to
> install PostgreSQL 9.3 from official repositories alongside
> contrib and plpython.
>
> I've already tested that on a Centos 5.7 VM and it works fine.
> But well, yesterday i was practicing creation of backups and
> restoring backups in my laptop with the next specs:
>
> Core i7 2.2 Ghz
> 8GB RAM
> 30GB empty space
> Windows 8.1

An i7 with enough RAM to fully cache the cluster, plus leave
generous work memory allocations should go pretty fast.

> In my laptop i also have PostgreSQL 9.3. First i used the sample
> database pgdellstore, what i did is that i created a database,
> and ran the SQL file of the database. After that, i created a
> backup of that database pgdellstore with pg_dump, a custom
> format, without compression:
>
> pg_dump -h localhost -p 5432 -U postgres -W -Fc -Z0 -C -d pgdellstore > pgdellstore.backup
>
> And everything as normal. After that i created a second database
> and restored my custom backup with pg_restore trying to speed up
> using -j option:
>
> pg_restore -h localhost -p 5432 -U postgres -W -d pgdellstore -j4 pgdellstore.backup

So you are taking advantage of parallel restore; that will make it
even faster.

> Also, before that, i backed up my postgresql.conf and edited the
> current one with the next parameters, according to some research
> that i did in google about performance tuning of pg_restore:
>
> shared_buffers=2GB
> maintenance_work_mem=1GB
> work_mem=128MB
> wal_buffers=16MB
> checkpoint_segments=8
> autovacuum=off
> archive_mode=off
> fsync=off
> full_page_writes=off
> checkpoint_timeout=15min
> checkpoint_completion_target=0.9
> track_counts=off
> synchronous_commit=off
> bgwriter_delay=50ms

I have no problem with doing an initial load with such settings
(and use similar settings myself), but if you do that, be sure to
run VACUUM FREEZE ANALYZE on the cluster immediately, and then
revert to more normal settings, which protect your data in the
event of a crash.

> And it restored it in 2 seconds

For what you said above, that sounds about right.

> i think, for me it was extremely fast (is a database of 25MB
> aproximately) and i had doubts, so i have some queries to check
> database tables size and i compared the results of the first
> database (the one that i created with original sql script) and
> the second one (the one that i created using a custom backup of
> the first one) and i got this:
>
> DB created with original SQL script:
>
>  schemaname |  tablename   | reltuples |  tamanio   | tamanioord
> ------------+--------------+-----------+------------+------------
>  public     | customers    |     20000 | 5016 kB    |    5136384
>  public     | orderlines   |     60350 | 4440 kB    |    4546560
>  public     | cust_hist    |     60350 | 3976 kB    |    4071424
>  public     | products     |     10000 | 1552 kB    |    1589248
>  public     | orders       |     12000 | 1384 kB    |    1417216
>  public     | inventory    |     10000 | 704 kB     |     720896
>  public     | categories   |        16 | 24 kB      |      24576
>  public     | afiliado     |         4 | 24 kB      |      24576
>  public     | pruebafechas |         0 | 8192 bytes |       8192
>  public     | reorder      |         0 | 0 bytes    |          0
> (10 filas)
>
> DB created with custom backup based on first db
>
>  schemaname |  tablename   | reltuples |  tamanio   | tamanioord
> ------------+--------------+-----------+------------+------------
>  public     | customers    |     20000 | 4992 kB    |    5111808
>  public     | orderlines   |     60350 | 4416 kB    |    4521984
>  public     | cust_hist    |     60350 | 3952 kB    |    4046848
>  public     | products     |     10000 | 1528 kB    |    1564672
>  public     | orders       |     12000 | 1360 kB    |    1392640
>  public     | inventory    |     10000 | 680 kB     |     696320
>  public     | afiliado     |         4 | 24 kB      |      24576
>  public     | categories   |        16 | 24 kB      |      24576
>  public     | pruebafechas |         0 | 8192 bytes |       8192
>  public     | reorder      |         0 | 0 bytes    |          0
> (10 filas)

It is normal for a database to be packed a little tighter after a
dump/restore.  It will expand a little as it is modified and pages
split.

> Also, my second concern is the next: 4 months ago, i did a
> routine maintenance of those production database, and first as
> usual i backup all the databases. I took one of the backups of a
> development database for testing purposes. The backup file
> weights 279MB and the database is like 1.8 GB  . So, yesterday i
> also practiced restoring this backup with pg_restore in a new
> database, with the same configuration from above. The command
used > was this:
>
> pg_restore -h localhost -p 5432 -U postgres -W --create -d debdbrest -j4 devdb.backup
>
> And it took only 1:30 mins aproximately.

Again, for the given hardware and configuration, that sounds about
right.

> For me is suspiciously fast becuse when i was reading about
> pg_restore and how to enhace performance during backup restoring,
> i found various cases in previous postgresql versions about
> pg_restore is too slow, and also, i tried the same restore of the
> same backups 2 days ago but without postgresql.conf parameters
> modified and without using -j flag, and it started and after 2
> hours it hadn't finished, so i cancelled;

Now you know why these settings and restore options are
recommended.

> also in the first try i noted that it printed a lot of output in
> the console and in this second try it didn't print anything.

Try running your test a second time, so that CREATE statements fail
on duplicate objects.  You should see output if there are any
actual errors.  Nothing printed means it worked.

> But that time reduction from more than 2 hours to 1:30 mins is
> too suspicious for me, but i don't know if, as the previous
> question that i have, you consider that this is normal (a backup
> of 279MB of a database of 1840MB restored in just 1.30 min).

It sounds reasonable to me; you've disabled the features that
ensure that there will be no data loss or corruption after a crash.
As long as you're OK with starting the restore over again from
initdb if there *is* a crash, you can buy a lot of speed.  You can
now see how some of the products that don't actually protect your
data get such good benchmarks.

Like I said, VACUUM FREEZE ANALYZE after the restore, then change
the postgreql.conf back to safe values and restart the service and
you will be fine.  Be sure to allow some time for the service to
stop -- it may need to flush a lot of data to disk at that point.
That may take a lot longer than the pg_restore run itself,
especially if you do it right away, before a timed checkpoint
occurs.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-admin by date:

Previous
From: Payal Singh
Date:
Subject: Re: PG server clustering
Next
From: "ktm@rice.edu"
Date:
Subject: Re: PG server clustering