Thread: Page Miss Hits
Hi, I would like to ask whether there is any way in Postgres to find the
page misses (cache misses) caused during a query's execution.

Is there something like EXPLAIN ANALYZE that also reports page hits and misses?
On Mon, 2004-08-02 at 02:11, Ioannis Theoharis wrote:
> Hi, I would like to ask whether there is any way in Postgres to find the
> page misses caused during a query's execution.
>
> Is there something like EXPLAIN ANALYZE that also reports page hits and misses?

You're making a basic assumption that is (at least currently) untrue, namely
that PostgreSQL has its own cache. It doesn't. It has a shared buffer pool
whose pages drop back into the free pool when the last referencing backend
finishes and shuts down. So PostgreSQL currently relies on the kernel to do
the caching for it.

What you need, then, is a tool that monitors kernel cache usage and its hit
rate. I'm not familiar with any, but something out there likely does that.
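For what it's worth, the statistics collector does count block hits and reads
for PostgreSQL's own shared buffers, per table, in the pg_statio_user_tables
view (in this era that requires stats_block_level = on in postgresql.conf). A
rough sketch of getting per-query numbers, using a hypothetical table name, is
to sample the counters before and after the query and compare the deltas. Note
that this only describes the shared buffer pool; a "read" here may still have
been served from the kernel cache without touching the disk.

    -- Snapshot the counters for one table before the query (table and column
    -- names are hypothetical; stats_block_level must be on).
    SELECT heap_blks_read, heap_blks_hit, idx_blks_read, idx_blks_hit
      FROM pg_statio_user_tables
     WHERE relname = 'bigtable';

    -- Run the query of interest.
    SELECT count(*) FROM bigtable WHERE some_column = 42;

    -- Snapshot again: the increase in heap_blks_read is the number of table
    -- blocks PostgreSQL had to request from the kernel (shared-buffer
    -- misses); the increase in heap_blks_hit is the number of blocks found
    -- already in shared buffers.
    SELECT heap_blks_read, heap_blks_hit, idx_blks_read, idx_blks_hit
      FROM pg_statio_user_tables
     WHERE relname = 'bigtable';

On a busy server the deltas also include other backends' activity against the
same table, so this is only reliable on an otherwise idle database.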
Scott Marlowe wrote:
> On Mon, 2004-08-02 at 02:11, Ioannis Theoharis wrote:
>> Hi, I would like to ask whether there is any way in Postgres to find the
>> page misses caused during a query's execution.
>>
>> Is there something like EXPLAIN ANALYZE that also reports page hits and misses?
>
> You're making a basic assumption that is (at least currently) untrue, namely
> that PostgreSQL has its own cache.

Are you sure of this? What is the meaning of the ARC that was recently
introduced, then?

Regards
Gaetano Mendola
Scott Marlowe wrote:
> On Mon, 2004-08-02 at 10:43, Gaetano Mendola wrote:
>> Are you sure of this? What is the meaning of the ARC that was recently
>> introduced, then?
>
> Yes I am. Test it yourself: set up a couple of backends, SELECT * FROM some
> big tables, then, one at a time, shut down the psql clients. When the last
> one closes, the shared memory goes away. Run another client, do SELECT *
> FROM the big table, and watch the client size grow from a few megabytes to
> a size large enough to hold the whole table (or however much your
> shared_buffers will hold).
>
> While someone may make ARC and the shared buffers act like a cache some day
> (it can't be that hard; most of the work is done, really), right now that's
> not how it works.
>
> ARC still helps, since it makes sure useful small datasets don't all get
> flushed from shared_buffers when a seq scan gets executed.

I'm still not convinced. Why does the last backend alive have to throw away
the memory already copied into shared memory? And again, ARC is a replacement
policy for a cache, so which cache is it?

Regards
Gaetano Mendola
>> ARC still helps, since it makes sure useful small datasets don't all get
>> flushed from shared_buffers when a seq scan gets executed.
>
> I'm still not convinced. Why does the last backend alive have to throw away
> the memory already copied into shared memory? And again, ARC is a
> replacement policy for a cache, so which cache is it?

As you know, ARC is a recent addition. I've not seen any benchmarks
demonstrating that the optimal shared_buffers setting is different today from
what it was in the past.

We know the strategy has changed, but the old buffer strategy had just as hard
a time with a very small buffer as it did with a very large one. Does that
mean the middle of the curve is still around 15k buffers, but the extremes are
handled better? Or is it something completely different?

Please feel free to benchmark 7.5 (the OSDL folks should be able to help us as
well) and report back.
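One crude number to compare between runs of such a benchmark is the overall
shared-buffer hit ratio from pg_stat_database. The sketch below assumes the
stats collector is running and that the counters were reset (or at least
noted) at the start of each run:

    -- Overall shared-buffer hit ratio per database since the counters were
    -- last reset; compare this across different shared_buffers settings.
    SELECT datname,
           blks_read,
           blks_hit,
           round(blks_hit::numeric / (blks_hit + blks_read), 3) AS hit_ratio
      FROM pg_stat_database
     WHERE blks_hit + blks_read > 0;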
Rod Taylor wrote:
>>> ARC still helps, since it makes sure useful small datasets don't all get
>>> flushed from shared_buffers when a seq scan gets executed.
>>
>> I'm still not convinced. Why does the last backend alive have to throw away
>> the memory already copied into shared memory? And again, ARC is a
>> replacement policy for a cache, so which cache is it?
>
> As you know, ARC is a recent addition. I've not seen any benchmarks
> demonstrating that the optimal shared_buffers setting is different today
> from what it was in the past.
>
> We know the strategy has changed, but the old buffer strategy had just as
> hard a time with a very small buffer as it did with a very large one. Does
> that mean the middle of the curve is still around 15k buffers, but the
> extremes are handled better? Or is it something completely different?
>
> Please feel free to benchmark 7.5 (the OSDL folks should be able to help us
> as well) and report back.

I know, I know. We were discussing whether Postgres uses its own cache or not,
and, for the OP's benefit, whether it is possible to retrieve hit and miss
information from that cache.

For benchmarks, it may be better to look not at the particular implementation
done in PostgreSQL but at the general improvements that the ARC replacement
policy introduces. If I'm not wrong, until now Postgres was using an LRU; you
can find articles such as these, which show the improvements:

http://www.almaden.ibm.com/StorageSystems/autonomic_storage/ARC/rj10284.pdf
http://www.almaden.ibm.com/cs/people/dmodha/arcfast.pdf

As you wrote, no one has yet demonstrated by brute-force benchmarking that ARC
is better, but on paper it should be.

Regards
Gaetano Mendola
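And to close the loop on the OP's question: the closest thing to hit and miss
information for the ARC-managed buffer pool that can be read from SQL is the
per-table counters in pg_statio_user_tables (again assuming block-level
statistics are enabled). Hits against the kernel cache remain invisible to
PostgreSQL and are counted as reads here:

    -- Per-table shared-buffer hit ratios since the counters were last reset.
    -- A kernel-cache hit still shows up as a "read" from PostgreSQL's point
    -- of view.
    SELECT relname,
           heap_blks_read,
           heap_blks_hit,
           round(heap_blks_hit::numeric /
                 (heap_blks_hit + heap_blks_read), 3) AS heap_hit_ratio
      FROM pg_statio_user_tables
     WHERE heap_blks_hit + heap_blks_read > 0
     ORDER BY heap_blks_read DESC;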