Thread: While only running pg_dump, postgresql performs writes inside base/ directory. How can we stop this?

We are dumping a 4TB postgres database using pg_dump. The source is on a copy-on-write snapshot filesystem, and the target is an external drive.

While the pg_dump is running we get about 4-5MB/sec writes on the filesystem where postgresql data is (iostat), and specifically inside the base/ directory.

Since the only query running on this DB is the COPY initiated from pg_dump which only reads data, who is writing data to postgres datafiles?

We need to find a way to make postgres stop writing because the source copy-on-write filesystem gets filled up before the dump is finished.

Postgresql version is 8.4, vacuum is not running (autovacuum = off), OS is debian 6.
On Mon, Dec 2, 2013 at 6:22 AM, Spiros Ioannou <sivann@inaccess.com> wrote:
We are dumping a 4TB postgres database using pg_dump. The source is on a copy-on-write snapshot filesystem, and the target is an external drive.

While the pg_dump is running we get about 4-5MB/sec writes on the filesystem where postgresql data is (iostat), and specifically inside the base/ directory.

Since the only query running on this DB is the COPY initiated from pg_dump which only reads data, who is writing data to postgres datafiles?

It is probably setting "hint bits" on the tuples it is visiting.
 

We need to find a way to make postgres stop writing because the source copy-on-write filesystem gets filled up before the dump is finished.

Vacuum the system before taking the snapshot. 

Postgresql version is 8.4, vacuum is not running (autovacuum = off), OS is debian 6.

Since autovacuum is off, what are you doing about vacuuming?

Cheers,

Jeff
Hi Jeff,
autovacuum is off on the DB running on the filesystem snapshot.
What "hint bits" do you suppose it is setting? It's running only one COPY command for days. Do you have any suggestions to make it more "read only" ?


On 2 December 2013 18:14, Jeff Janes <jeff.janes@gmail.com> wrote:
On Mon, Dec 2, 2013 at 6:22 AM, Spiros Ioannou <sivann@inaccess.com> wrote:
We are dumping a 4TB postgres database using pg_dump. The source is on a copy-on-write snapshot filesystem, and the target is an external drive.

While the pg_dump is running we get about 4-5MB/sec writes on the filesystem where postgresql data is (iostat), and specifically inside the base/ directory.

Since the only query running on this DB is the COPY initiated from pg_dump which only reads data, who is writing data to postgres datafiles?

It is probably setting "hint bits" on the tuples it is visiting.
 

We need to find a way to make postgres stop writing because the source copy-on-write filesystem gets filled up before the dump is finished.

Vacuum the system before taking the snapshot. 

Postgresql version is 8.4, vacuum is not running (autovacuum = off), OS is debian 6.

Since autovacuum is off, what are you doing about vacuuming?

Cheers,

Jeff

On Tue, Dec 10, 2013 at 12:43 AM, Spiros Ioannou <sivann@inaccess.com> wrote:
Hi Jeff,
autovacuum is off on the DB running on the filesystem snapshot.


Which probably makes sense on the snapshot, but is it also off on the pre-snapshot database?
 
What "hint bits" do you suppose it is setting? It's running only one COPY command for days. Do you have any suggestions to make it more "read only" ?


When a query sees a tuple that is still listed as part of an open transaction, it needs to figure out whether that transaction has now completed, and if so whether it committed or aborted.  This can be quite a bit of work to do, so once complete it sets a hint bit locally to that tuple, so that the next visitor doesn't have to repeat the work.  I don't believe that there is any way to turn this off, unless you want to run your own custom build of PostgreSQL.

Cheers,

Jeff
- autovacuum is not off in the pre-snapshot (production) database. It auto-runs every 2 weeks to avoid transaction ID wraparound.
- I tried modifying pgdump to have a more relaxed transaction isolation (READ UNCOMMITTED) instead of the default (serializable) but this didn't help.
thanks for the info.




On 10 December 2013 21:13, Jeff Janes <jeff.janes@gmail.com> wrote:
On Tue, Dec 10, 2013 at 12:43 AM, Spiros Ioannou <sivann@inaccess.com> wrote:
Hi Jeff,
autovacuum is off on the DB running on the filesystem snapshot.


Which probably makes sense on the snapshot, but is it also off on the pre-snapshot database?
 
What "hint bits" do you suppose it is setting? It's running only one COPY command for days. Do you have any suggestions to make it more "read only" ?


When a query sees a tuple that is still listed as part of an open transaction, it needs to figure out whether that transaction has now completed, and if so whether it committed or aborted.  This can be quite a bit of work to do, so once complete it sets a hint bit locally to that tuple, so that the next visitor doesn't have to repeat the work.  I don't believe that there is any way to turn this off, unless you want to run your own custom build of PostgreSQL.

Cheers,

Jeff