F.45. pg_transfer
The pg_transfer
extension enables quick transfer of tables between Postgres Pro Enterprise instances.
It may sometimes be required to load large volumes of data into a database, for example, when consolidating data from regional servers into a central one. If your data needs to be uploaded to a server under heavy load, you can use a temporary database on another server to accumulate the data and then transfer it all at once when the main server load is minimal.
If you use pg_dump and pg_restore applications to transfer data, the load on the receiving server is usually higher than that on the sending server. Since the data is loaded using INSERT
or COPY
commands, it creates a significant impact on the disk subsystem. Besides, you will have to re-build indexes in the target database once all the data is loaded, which will also contribute to the server load.
To achieve a much higher load speed for read-only data, you can use the pg_transfer
extension while doing dump/restore, which allows to copy data files directly, without using COPY
/INSERT
commands. pg_transfer
also provides auxiliary functions for pg_dump and pg_restore to transfer tables together with pre-built indexes and collected statistics, so you can avoid extra load on the target server incurred by statistics re-collection.
Note
Postgres Pro Enterprise configuration and the architectures of source and target servers must provide binary-compatible file formats. When restoring data, pg_transfer
checks that source and target servers have the same alignment, page and segment sizes, etc.
F.45.1. Installation
pg_transfer
is included into Postgres Pro Enterprise. To enable pg_transfer
, create its extension in your database using the following SQL command:
CREATE EXTENSION pg_transfer;
You must create pg_transfer
extension in both source and target databases.
F.45.2. Usage
To transfer data using the pg_transfer
module, do the following:
Prepare source and target systems for data transfer
Before the actual data transfer, you first need to transfer data schema to the target server:
In the source database, mark the table to be transferred as read-only:
ALTER TABLE
table_name
SET CONSTANT;Run the
VACUUM (ANALYZE)
command to remove dead tuples and refresh statistics:VACUUM (ANALYZE)
table_name
;Take a logical dump of the data schema on the source server and restore it on the target server:
pg_dump -Fc -t
table_name
--schema-only -ftransfer_dir
/archive.outold_database
pg_restore -dnew_database
--schema-onlytransfer_dir
/archive.out
Prepare TOAST identifiers for the transfer
This step is only required if the tables to be transferred have toasted values.
Determine TOAST identifiers (
reltoastid
) in the new database:psql
new_database
-c "SELECT reltoastrelid FROM pg_class WHERE relname='table_name
';"Using the received
reltoastid
identifiers, prepare the table for the transfer and force data flush to disk:psql -d
old_database
-c "SELECT pg_transfer_freeze('table_name
'::regclass::oid,reltoastrelid
::oid);"
Transfer the data to the target system
Copy the data into a separate directory using pg_dump utility:
pg_dump -Fc -t
table_name
--transfer-dirtransfer_dir
/ -ftransfer_dir
/archive.outold_database
Restore the data in the target database:
pg_restore -d
new_database
--data-only --transfer-dirtransfer_dir
/ --copy-mode-transfertransfer_dir
/archive.outWhen source and target databases are located in the same file system, the
--copy-mode-transfer
option must be specified at least once (for either pg_dump or pg_restore command) to get an independent copy of data. When restoring data on the primary server,--generate-wal
option must be specified for pg_restore for changes to be replicated to a standby server.
F.45.3. Compatibility
The pg_transfer
extension is only supported on Linux systems.
F.45.4. Authors
Postgres Professional, Moscow, Russia