pg_dump order of rows - Mailing list pgsql-general

From jan
Subject pg_dump order of rows
Date
Msg-id 59eeded2ed45f09f2f2b979a5eb4f6f7@dafuer.de
Whole thread Raw
List pgsql-general
Hello,

today I stumbled across a interesting question about the order rows are dumped
out while exporting a database with pg_dump. I know questions like this are
around this list sometimes, but I think this is a bit more special.

First of all I know that dumping a database is a somewhat nondeterministic
process. It's hard to determine in which order objects are dumped. If my
understanding is correct when it's about dumping the rows from a specific table
the rows will appear in the dump in disk-order. This order is changed everytime
there are updates to rows in that table and how often a vacuum occurs to release
old row versions.

From some testing I "verified" this by experimentation - just created a new
table with a known order of rows and dumped it. The dump was ordered in the same
way the rows were inserted. Same again with updating some of the rows those rows
appeard at the end of the dump. At last I vacuumed the database and updated some
rows they appeared in the "spaces" the previous updated rows left behind.
Exactly what I expected :-)

Now for my question - we have a case where rows are inserted in order and are
updated rarely (about 10% of the rows and often in order) and we are seeking for
a solution to make the dump (e.g. the backup of this database) more random at
database level (by influencing the on disk order?). Obvious way would be to
update all rows randomly by software before dumping but that would be a manual
process. So anyone out there with some other ideas (I found this interesting as
I'm getting a more insight view of my favourite database)?

Thanks in advance,
Jan



pgsql-general by date:

Previous
From: jan
Date:
Subject: pg_dump order of rows
Next
From: Greg Landrum
Date:
Subject: GiST indexing question