Thread: drop duplicate buffers in OS
Hi,

I have created a patch that drops duplicate buffers in the OS page cache, using the usage_count algorithm. I have been developing this patch since last summer. The feature seems to be a hot topic at the moment, so I am submitting it earlier than I had planned.

When a buffer's usage_count in shared_buffers is high, that buffer is unlikely to be dropped from shared_buffers. However, such buffers are not needed in the OS file cache, because postgres never reads them from there (it reads them from shared_buffers). So I created an algorithm that drops the file cache for pages that have a high usage_count in shared_buffers and are in a clean state in the OS. If a page is clean in the OS, executing posix_fadvise(POSIX_FADV_DONTNEED) frees it from the file cache without writing to the physical disk. This algorithm solves the double-buffering problem and uses memory more efficiently.

I am testing with the DBT-2 benchmark now...

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
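[The core mechanism can be sketched like this (a hypothetical helper for illustration, not the patch code itself):

    #include <fcntl.h>

    #define BLCKSZ 8192    /* PostgreSQL block size */

    /*
     * Ask the kernel to drop its page-cache copy of one block that is
     * already held (and hot) in shared_buffers.  For a clean page this
     * only releases the duplicate cache; no physical write is triggered.
     */
    static void
    drop_duplicate_os_cache(int fd, unsigned int blocknum)
    {
        (void) posix_fadvise(fd, (off_t) blocknum * BLCKSZ, (off_t) BLCKSZ,
                             POSIX_FADV_DONTNEED);
    }
]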
On Wed, Jan 15, 2014 at 1:53 AM, KONDO Mitsumasa
<kondo.mitsumasa@lab.ntt.co.jp> wrote:
> I have created a patch that drops duplicate buffers in the OS page cache,
> using the usage_count algorithm. I have been developing this patch since
> last summer. The feature seems to be a hot topic at the moment, so I am
> submitting it earlier than I had planned.
>
> When a buffer's usage_count in shared_buffers is high, that buffer is
> unlikely to be dropped from shared_buffers. However, such buffers are not
> needed in the OS file cache, because postgres never reads them from there
> (it reads them from shared_buffers). So I created an algorithm that drops
> the file cache for pages that have a high usage_count in shared_buffers
> and are in a clean state in the OS. If a page is clean in the OS,
> executing posix_fadvise(POSIX_FADV_DONTNEED) frees it from the file cache
> without writing to the physical disk. This algorithm solves the
> double-buffering problem and uses memory more efficiently.
>
> I am testing with the DBT-2 benchmark now...

The thing about this is that our usage counts for shared_buffers don't
really work right now; it's common for everything, or nearly everything,
to have a usage count of 5. So I'm reluctant to rely on that for much of
anything.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
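[For context, the saturation Robert describes falls out of PostgreSQL's clock-sweep replacement strategy, roughly as in this simplified sketch (next_buffer_in_clock_order() is a hypothetical helper; the real logic lives in PinBuffer() and StrategyGetBuffer()):

    #define BM_MAX_USAGE_COUNT 5    /* PostgreSQL's cap on usage_count */

    /* On every buffer access (cf. PinBuffer): bump the count, capped at 5. */
    if (buf->usage_count < BM_MAX_USAGE_COUNT)
        buf->usage_count++;

    /* Victim search (cf. StrategyGetBuffer's clock sweep). */
    for (;;)
    {
        buf = next_buffer_in_clock_order();     /* hypothetical helper */
        if (buf->refcount == 0)                 /* only unpinned buffers */
        {
            if (buf->usage_count > 0)
                buf->usage_count--;             /* spare it one more sweep */
            else
                return buf;                     /* evict this one */
        }
    }

On a workload where hot pages are re-pinned faster than the sweep decrements them, nearly every buffer ends up sitting at the cap, which is the effect Robert describes.]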
Can we just get the backend that dirties the page to issue the posix_fadvise(DONTNEED)?
Or have another helper that sweeps the shared buffers and does this post-first-dirty?
a.
On Wed, Jan 15, 2014 at 1:34 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jan 15, 2014 at 1:53 AM, KONDO Mitsumasa
> <kondo.mitsumasa@lab.ntt.co.jp> wrote:
>> [...]
>> So I created an algorithm that drops the file cache for pages that have
>> a high usage_count in shared_buffers and are in a clean state in the OS.
>> [...]
>
> The thing about this is that our usage counts for shared_buffers don't
> really work right now; it's common for everything, or nearly everything,
> to have a usage count of 5. So I'm reluctant to rely on that for much of
> anything.
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                         command like a king,
http://www.highrise.ca/                                   work like a slave.
(2014/01/16 21:38), Aidan Van Dyk wrote:
> Can we just get the backend that dirties the page to issue the
> posix_fadvise(DONTNEED)?

No. The patch can only free pages that are clean in the OS file cache; if a page has been dirtied, dropping it would cause a physical disk write. However, this is an experimental patch, so the behaviour may change based on future benchmark testing.

> Or have another helper that sweeps the shared buffers and does this
> post-first-dirty?

We could add a DropDuplicateOSCache() function to the checkpointer process or to another process. We could also change posix_fadvise(DONTNEED) to sync_file_range(), which causes a physical disk write of the target buffer rather than freeing the OS file cache. I am considering calling sync_file_range() with SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE while a checkpoint is executing; that could avoid the fsync freeze situation in the final part of a checkpoint.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
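[For reference, the sync_file_range() variant mentioned above would look roughly like this (Linux-specific; the flags are from sync_file_range(2), but the checkpoint integration is only an idea at this point):

    #define _GNU_SOURCE
    #include <fcntl.h>

    /*
     * Start writeback of a file range and wait for any writeback already
     * in progress, without dropping the pages from the OS cache.  This
     * contrasts with posix_fadvise(POSIX_FADV_DONTNEED), which frees
     * clean cached pages instead of writing them.
     */
    static int
    flush_range_keep_cache(int fd, off_t offset, off_t nbytes)
    {
        return sync_file_range(fd, offset, nbytes,
                               SYNC_FILE_RANGE_WAIT_BEFORE |
                               SYNC_FILE_RANGE_WRITE);
    }
]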
(2014/01/16 3:34), Robert Haas wrote:
> On Wed, Jan 15, 2014 at 1:53 AM, KONDO Mitsumasa
> <kondo.mitsumasa@lab.ntt.co.jp> wrote:
>> [...]
>
> The thing about this is that our usage counts for shared_buffers don't
> really work right now; it's common for everything, or nearly everything,
> to have a usage count of 5. So I'm reluctant to rely on that for much of
> anything.

This patch is aimed at configurations with large shared_buffers, so it may not be effective when shared_buffers is only about 10% of memory. The patch is experimental and is meant to show one example of how the double-buffering problem can be solved.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
On Wed, Jan 15, 2014 at 10:34 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jan 15, 2014 at 1:53 AM, KONDO Mitsumasa
> <kondo.mitsumasa@lab.ntt.co.jp> wrote:
>> I have created a patch that drops duplicate buffers in the OS page
>> cache, using the usage_count algorithm. I have been developing this
>> patch since last summer. The feature seems to be a hot topic at the
>> moment, so I am submitting it earlier than I had planned.
>>
>> When a buffer's usage_count in shared_buffers is high, that buffer is
>> unlikely to be dropped from shared_buffers. However, such buffers are
>> not needed in the OS file cache, because postgres never reads them from
>> there (it reads them from shared_buffers). So I created an algorithm
>> that drops the file cache for pages that have a high usage_count in
>> shared_buffers and are in a clean state in the OS. If a page is clean in
>> the OS, executing posix_fadvise(POSIX_FADV_DONTNEED) frees it from the
>> file cache without writing to the physical disk. This algorithm solves
>> the double-buffering problem and uses memory more efficiently.
>>
>> I am testing with the DBT-2 benchmark now...
Have you had any luck with it? I have reservations about this approach. Among other reasons, if the buffer is truly nailed in shared_buffers for the long term, the kernel won't see any activity on it and will be able to evict it fairly efficiently on its own.
So I'm reluctant to do a detailed review if the author cannot demonstrate a performance improvement. I'm going to mark it waiting-on-author for that reason.
> The thing about this is that our usage counts for shared_buffers don't
> really work right now; it's common for everything, or nearly
> everything, to have a usage count of 5.
I'm surprised that that is common. The only cases where I've seen it were when the database fits exactly in shared_buffers, or when the database is mostly append-only and the appends are done with inserts in a loop rather than COPY.
Cheers,
Jeff
Hi,

Attached is the latest patch. I changed one small thing in PinBuffer() in bufmgr.c; it now evicts the target clean file cache in the OS more precisely:

- if (!(buf->flags & BM_FADVED) && !(buf->flags & BM_JUST_DIRTIED))
+ if (!(buf->flags & BM_DIRTY) && !(buf->flags & BM_FADVED) && !(buf->flags & BM_JUST_DIRTIED))

(2014/01/29 8:20), Jeff Janes wrote:
> [...]
> Have you had any luck with it? I have reservations about this approach.
> Among other reasons, if the buffer is truly nailed in shared_buffers for
> the long term, the kernel won't see any activity on it and will be able
> to evict it fairly efficiently on its own.

My patch aims to avoid evicting the other useful file cache in the OS. If we do not drop the duplicate caches of pages held in shared_buffers, the kernel will eventually evict them together with other, genuinely useful file cache. But if we drop the duplicates as soon as possible, the kernel is much less likely to have to evict that other useful file cache.

> So I'm reluctant to do a detailed review if the author cannot demonstrate
> a performance improvement. I'm going to mark it waiting-on-author for
> that reason.

Will you review my patch? Thank you so much! However, my patch's performance is only a little better; the difference might be within the error range. The optimized kernel readahead patch is great: too much readahead in the OS is bad, filling the OS with file cache that is not useful. Here is the test result; the plain result was measured earlier, during the readahead patch test.

* Test server
Server: HP Proliant DL360 G7
CPU: Xeon E5640 2.66GHz (1P/4C)
Memory: 18GB (PC3-10600R-9)
Disk: 146GB (15k rpm) * 4, RAID 1+0
RAID controller: P410i/256MB
OS: RHEL 6.4 (x86_64)
FS: ext4

* DBT-2 result (WH400, SESSION=100, ideal_score=5160)
Method  | score | average | 90%tile | Maximum
------------------------------------------------
plain   | 3589  |  9.751  | 33.680  |  87.8036
patched | 3799  |  9.914  | 22.451  | 119.4259

* Main settings
shared_buffers = 2458MB
drop_duplicate_buffers = 5  // patched only

I also tested the benchmark with drop_duplicate_buffers set to 3 and 4, but did not get good results, so I will test those settings again with larger shared_buffers.

[detail settings]
http://pgstatsinfo.projects.pgfoundry.org/readahead_dbt2/normal_20140109/HTML/dbserver/param.out

* Detail results (uploading now; please wait an hour...)
[plain]
http://pgstatsinfo.projects.pgfoundry.org/readahead_dbt2/normal_20140109/HTML/index_thput.html
[patched]
http://pgstatsinfo.projects.pgfoundry.org/drop_os_cache/drop_dupulicate_cache20140129/HTML/index_thput.html

We can see faster response times during OS writeback (probably) and while CHECKPOINT is executing, because when those happen, read transactions hit the file cache more often with my patch, so response times are better than plain.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
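[Reading the diff above together with the drop_duplicate_buffers setting, the gating logic is presumably along these lines (a paraphrase for illustration only, not the patch source; in particular the usage_count comparison against the GUC is an assumption):

    /*
     * Paraphrased sketch, not the actual patch: only clean buffers that
     * are hot in shared_buffers and have not already been advised get
     * their duplicate OS cache copy dropped.
     */
    if (buf->usage_count >= drop_duplicate_buffers &&  /* assumed threshold */
        !(buf->flags & BM_DIRTY) &&                    /* clean: no write needed */
        !(buf->flags & BM_FADVED) &&                   /* not already advised */
        !(buf->flags & BM_JUST_DIRTIED))
    {
        posix_fadvise(fd, (off_t) blocknum * BLCKSZ, (off_t) BLCKSZ,
                      POSIX_FADV_DONTNEED);
        buf->flags |= BM_FADVED;                       /* remember it was dropped */
    }
]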
On Wed, Jan 29, 2014 at 2:53 AM, KONDO Mitsumasa
<kondo.mitsumasa@lab.ntt.co.jp> wrote:
> [...]
> * DBT-2 result (WH400, SESSION=100, ideal_score=5160)
> Method  | score | average | 90%tile | Maximum
> ------------------------------------------------
> plain   | 3589  |  9.751  | 33.680  |  87.8036
> patched | 3799  |  9.914  | 22.451  | 119.4259
> [...]
> We can see faster response times during OS writeback (probably) and while
> CHECKPOINT is executing, because when those happen, read transactions hit
> the file cache more often with my patch, so response times are better
> than plain.

I think it's pretty clear that these results are not good enough to justify committing this patch. To do something like this, we need a lot of confidence that it will be a win not just on one particular system or workload, but a general win across many systems and workloads. I'm not convinced that's true, and if it is true, the test results submitted thus far are nowhere near sufficient to establish it, and I can't see that changing in the next few weeks. So I think it's pretty clear that we should mark this Returned with Feedback for now.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company