We are looking to distribute postgres databases to our customers along with our application. We are currently evaluating postgres version 8.4.4. The database can be of size 25 gb (compressed files fits in few dvds, the product is distributed on dvds). The pg_restore of this database takes several hours on the low end machines running windows os. The pg_restore is run during our product install, and the current install time projection is not acceptable. Our customers can purchase different databases over a period of time, and the application makes transactional updates to the databases after installation. Hence, copying the entire data folder instead of using the pg_restore is not an option, as the transactional updates will be lost.
I have read the documentation and the few posts available that discourages file copy based restore of individual databases, but, I have found a way to do this. I would appreciate if the experts can read and advise if the approach will work, given our environment and usage boundaries.
Master Postgres instance (this is where we create the data, we have complete control of this environment):
1. Create the database and populate data.
2. Set vacuum_freeze_table_age to 0 in the postgresql.conf
3. Run vacuum full - this will reset the row xid to the FrozenXid
4. Shutdown postgres and take a copy of the files for the given database.
In the deploy instance at the customer site:
1. Create the new database.
2. Shutdown postgres instance and copy the database files created in the master instance to the database specific folder.
3. Start postgres instance.
We don't use table row oids. If the cluster wide oid collides with the oid in the copied database files during subsequent ddl operations, postgres resolves this by skipping to the next available oid. There will be a delay to find the next available oid, which is acceptable in our case, as the ddl operations at the customer site are rare. And, the vacuum full with vacuum_freeze_table_age set to 0 on the master instance takes care of the xmin, allowing transactions to be visible, and for further transactions at the customer site to continue without colliding.
I have tested this and it works, and I am continuing to test it more. I would like for validation of this idea from the experts and the community to make sure I haven't overlooked something obvious that might cause issues.
Thank you,
Srini