Re: Explain buffers display units. - Mailing list pgsql-hackers
From | Greg Stark |
---|---|
Subject | Re: Explain buffers display units. |
Date | |
Msg-id | 407d949e1002160236v4e05497ehcdb8a8e796edc94d@mail.gmail.com Whole thread Raw |
In response to | Re: Explain buffers display units. (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Explain buffers display units.
|
List | pgsql-hackers |
On Tue, Feb 16, 2010 at 2:48 AM, Robert Haas <robertmhaas@gmail.com> wrote: > Multiplying by the block size makes it sound as if all the > memory was read or used, which is simply not the case - especially for > things like buffer hits, which don't actually read or allocate any > memory at all. In which case it represents how much data would have had to have been read if it wasn't in the buffer cache which is a perfectly reasonable measurement. It's exactly what a cache profiler should be measuring. These are figures that users have to compare with their buffer cache size and with the output of iostat or other tools. Presenting them in arbitrary internal units makes that difficult. > We certainly do that for GUCs, and in that context it seems to me to > make sense. If you set your shared buffers to a gigabyte, PG will use > an additional GB of memory. But if you hit a "gigabyte" of shared > buffers, you may be examining anywhere from one 8K block over and over > again all the way up to a full GB of memory. Block hits and reads > just don't add in the same way that actual memory allocations do. Accessing the same 8kB of memory 100,1000 times is 1GB of memory bandwidth. The output of explain doesn't give you enough information to distinguish that from accessing 1GB of different data which is too bad but there's a limit to how much information we can fit in a reasonable amount of space. But 1GB of memory bandwidth is still an interesting figure even if it's the same 8kB a hundred thousand times. I think it's a lot more meaningful for a human reader than "131072". > And at any rate, what we DON'T do for GUCs is produce differing output > format for the same parameter based on the magnitude of the output > value, as you've done here. No, that's *exactly* what we do: postgres=# set work_mem = 64; SET postgres=# show work_mem;work_mem ----------64kB (1 row) postgres=# set work_mem = 1024; SET postgres=# show work_mem;work_mem ----------1MB (1 row) postgres=# set work_mem = 1048576; SET postgres=# show work_mem;work_mem ----------1GB (1 row) > We accept input in several different > formats, but there is only one canonical output formal for any > particular GUC, which is furthermore always chosen in such a way that > the exact value of the setting is preserved (again, unlike what you've > done here). I don't think the use case for GUCs is the same as for empirical measurements. Empirical results are never going to come out as a round number of megabytes so only using larger units in that case would be useless. In the case of GUCs I assume the argument was that someone should be able to copy the output into another postgresql.conf and get the same value, something which is irrelevant for empirical measurements. In any case the machine-readable form of GUC settings is not this one canonical format you describe for SHOW: postgres=# select name,setting,unit,min_val,max_val,boot_val,reset_val from pg_settings where name = 'work_mem'; name | setting | unit | min_val | max_val | boot_val | reset_val ----------+---------+------+---------+---------+----------+-----------work_mem | 1048576 | kB | 64 | 2097151 | 1024 | 1024 (1 row) This is similar to how I think the XML output should work. It should have the raw internal values with enough meta data in it that a tool can figure out how to display it or work with it. > So, you're saying we shouldn't look at the way that the pg_stat > functions format the output because somebody might write a view over > it that formats it in some different way that may or may not match > what you've done for the EXPLAIN output? What makes you think that > people don't just look at the raw numbers? I certainly have, and > there's no suggestion in the documentation that users should do > anything else. I'm not sure users need suggestions that they should format the data in whatever way they want. We still have to document the programmatic interface they use to get the raw data. > pg_stat_statements doesn't do what you're suggesting either; it, too, > presents raw numbers, and lets the user make of it what they will. > They might, for example, want to compute a hit ratio, as in the > example provided in the docs. In the case of EXPLAIN of an index > scan, they might want to estimate the number of seeks, on the theory > that an inner-indexscan is going to be all random IO. You can compute the hit ratio just fine from measurements with units. And if you're doing it in an automated way you'll want to use machine-readable output, rather than parsing the formatted text. > This doesn't seem to be a very carefully thought out proposal, because > you haven't explained how it would work for JSON or YAML output. A > format-neutral solution which we've already used for sort and hash > information (and for GUCs) is to include the unit designator in the > output.. But I generally think that trying to make the EXPLAIN output > self-documenting to the point where programs don't need any specific > knowledge of the meanings of the various properties is rather > hopeless. I think there's a lot of room between "provide the units for the measurement" and "100% self-documenting". Most tools are just going to display the measurement and want to know how to format the result nicely anyways. A fixed out-of-band schema isn't going to be able to handle more flexible cases either. Say we add dtrace support for counting arbitrary probes and including them in the explain output. The schema can't possibly list every possible probe and some of them might well be quantities of memory or time or other units. I *do* think we should have a schema decription for the structural tags -- eg. <join>, <append>, <sort> and the like. Tools should be able to know which tags they should expect to have subnodes and how many to expect. And they should be able to know which ones they should expect to have measurements. > Upon further review, I also notice that this patch seems to have > falsified the EXPLAIN documentation - both the description of the > BUFFERS option and the description of the FORMAT option are no longer > accurate Oops. Well, I would like to know if I'm in the minority and have to roll this back before I fix that. -- greg
pgsql-hackers by date: