Re: Pg_upgrade speed for many tables - Mailing list pgsql-hackers
From | Jeff Janes |
---|---|
Subject | Re: Pg_upgrade speed for many tables |
Date | |
Msg-id | CAMkU=1wiOoSt3gPvqyv_1zehCYfRyjTwVPFbZO0Y6q8ZDAd=tw@mail.gmail.com |
In response to | Re: Pg_upgrade speed for many tables (Bruce Momjian <bruce@momjian.us>) |
Responses | Use of fsync; was Re: Pg_upgrade speed for many tables |
List | pgsql-hackers |
On Wed, Nov 14, 2012 at 3:55 PM, Bruce Momjian <bruce@momjian.us> wrote:
> On Mon, Nov 12, 2012 at 10:29:39AM -0800, Jeff Janes wrote:
>>
>> Is turning off synchronous_commit enough? What about turning off fsync?
>
> I did some testing with the attached patch on a magnetic disk with no
> BBU that turns off fsync;

With which file system? I wouldn't expect you to see a benefit with
ext2 or ext3; it seems to be a peculiarity of ext4 that it inhibits
"group fsync" of new file creations and instead does each one serially.
Whether it is worth applying a fix that is only needed for that one
file system, I don't know. The trade-offs are not all that clear to me
yet.

> I got these results
>
>            sync_com=off  fsync=off
>         1         15.90      13.51
>      1000         26.09      24.56
>      2000         33.41      31.20
>      4000         57.39      57.74
>      8000        102.84     116.28
>     16000        189.43     207.84
>
> It shows fsync faster for < 4k, and slower for > 4k. Not sure why this
> is the cause but perhaps the buffering of the fsync is actually faster
> than doing a no-op fsync.

synchronous_commit=off turns off not only the fsync at each commit, but
also the write-to-kernel at each commit; so it is not surprising that
it is faster at large scale. I would specify both
synchronous_commit=off and fsync=off.

>> When I'm doing a pg_upgrade with thousands of tables, the shutdown
>> checkpoint after restoring the dump to the new cluster takes a very
>> long time, as the writer drains its operation table by opening and
>> individually fsync-ing thousands of files. This takes about 40 ms per
>> file, which I assume is a combination of slow lap-top disk drive, and
>> a strange deal with ext4 which makes fsyncing a recently created file
>> very slow. But even with faster hdd, this would still be a problem
>> if it works the same way, with every file needing 4 rotations to be
>> fsynced and this happens in serial.
>
> Is this with the current code that does synchronous_commit=off? If not,
> can you test to see if this is still a problem?
Yes, it is with synchronous_commit=off. (Or if it wasn't originally, it
is now, with the same result.)

Applying your fsync patch does solve the problem for me on ext4. Having
the new cluster be on ext3 rather than ext4 also solves the problem,
without the need for a patch; but it would be nice to be more friendly
to ext4, which is popular even though not recommended.

>>
>> Anyway, the reason I think turning fsync off might be reasonable is
>> that as soon as the new cluster is shut down, pg_upgrade starts
>> overwriting most of those just-fsynced file with other files from the
>> old cluster, and AFAICT makes no effort to fsync them. So until there
>> is a system-wide sync after the pg_upgrade finishes, your new cluster
>> is already in mortal danger anyway.
>
> pg_upgrade does a cluster shutdown before overwriting those files.

Right. So as far as the cluster is concerned, those files have been
fsynced. But then the next step is to go behind the cluster's back and
replace those fsynced files with different files, which may or may not
have been fsynced. This is what makes me think the new cluster is in
mortal danger. Not only have the new files perhaps not been fsynced,
but the cluster is not even aware of this fact, so you can start it up,
and then shut it down, and it still won't bother to fsync them, because
as far as it is concerned they already have been.

Given that, how much extra danger would be added by having the new
cluster schema restore run with fsync=off?

In any event, I think the documentation should caution that the upgrade
should not be deemed to be a success until after a system-wide sync has
been done. Even if we use the link rather than copy method, are we sure
that that is safe if the directories recording those links have not
been fsynced?

Cheers,

Jeff
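
The durability concern raised above (files copied or linked into place without an fsync of the file or of the directory that records it) can be illustrated with a minimal sketch. This is not pg_upgrade's code; the helper names fsync_path, copy_durably and link_durably are hypothetical, and the sketch assumes a Linux-like system where a directory can be opened read-only and passed to fsync.

```python
import os
import shutil


def fsync_path(path):
    """Open a file or directory read-only and fsync it (assumes a Unix-like OS)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def copy_durably(src, dst):
    """Copy src to dst, then flush the new file and its parent directory."""
    shutil.copyfile(src, dst)
    fsync_path(dst)  # flush the copied file's contents
    fsync_path(os.path.dirname(os.path.abspath(dst)))  # flush the directory entry


def link_durably(src, dst):
    """Hard-link src as dst, then flush the directory that records the new link."""
    os.link(src, dst)
    fsync_path(os.path.dirname(os.path.abspath(dst)))
```

Without the directory fsync, the new name may not survive a crash even if the file data did, which is the question asked above about link mode. A single system-wide sync after the upgrade would cover the same ground more coarsely.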