Re: Modification of data in base folder and very large tables - Mailing list pgsql-performance

From Andrew Gierth
Subject Re: Modification of data in base folder and very large tables
Date
Msg-id 877e5czua4.fsf@news-spur.riddles.org.uk
Whole thread Raw
In response to Re: Modification of data in base folder and very large tables  (Ogden Brash <info@litika.com>)
List pgsql-performance
>>>>> "Ogden" == Ogden Brash <info@litika.com> writes:

 Ogden> I did the restore as a data only restore so that it would not
 Ogden> try to recreate any tables.

Doing data-only restores is almost always a mistake.

pg_dump/pg_restore are very careful to create things in an order that
allows the data part of the restore to run quickly: tables are created
first without any indexes or constraints, then data is loaded, then
indexes and constraints are created in bulk afterwards.

If you do a data-only restore into an existing table, then it's up to
you to avoid performance problems.

 Ogden> As an experiment, I am in the process of clustering the source
 Ogden> database tables by the primary key constraint. I am hoping that
 Ogden> if I redo the pg_dump after that, it will contain the records in
 Ogden> more-or-less primary key order and on the subsequent pg_restore
 Ogden> it should not have to spend the vast majority of the time on
 Ogden> reading and seeking.

This is a waste of time; just restore the data without the primary key
in place and then create it at the end.

 Ogden> It is surprising to me that the cluster operations (which also
 Ogden> have to churn through the entire index and all the records) are
 Ogden> going *much* faster than pg_restore.

That's because cluster, as with creation of a fresh index, can do a bulk
index build: sequentially read the table, sort the values (spilling to
temporary files if need be, but these are also read and written
sequentially), and write out the index data in one sequential pass.

-- 
Andrew (irc:RhodiumToad)



pgsql-performance by date:

Previous
From: Ogden Brash
Date:
Subject: Re: Modification of data in base folder and very large tables
Next
From: Jeff Janes
Date:
Subject: Re: Modification of data in base folder and very large tables