Thread: pg_shmem_allocations view
Hi, I've more than once wanted to know what allocated shared memory in postgres installation is used for. Especially with more an more extensions around that's quite useful. Thus I've written a patch to add a new SRF/VIEW pg_get_shmem_allocations/pg_shmem_allocations that shows the contents of the shared memory index: postgres=# SELECT * FROM pg_shmem_allocations ORDER BY size DESC; key | off | size | allocated -------------------------------------+-------------+-------------+----------- Buffer Blocks | 286242528 | 17179869184 | t Buffer Descriptors | 152024800 | 134217728 | t Checkpointer Data | 17584720352 | 41943080 | t XLOG Ctl | 134234112 | 16804496 | t CLOG Ctl | 151038624 | 525312 | t | 17627719648 | 366624 | f Backend Activity Buffer | 17584379168 | 272000 | t SUBTRANS Ctl | 151563936 | 263168 | t OldSerXid SLRU Ctl | 17584225696 | 131648 | t MultiXactMember Ctl | 151892960 | 131648 | t Shared Buffer Lookup Table | 17466111712 | 131184 | t shmInvalBuffer | 17584653184 | 66104 | t Async Ctl | 17626666048 | 65856 | t MultiXactOffset Ctl | 151827104 | 65856 | t Fast Path Strong Relation Lock Data | 17583882752 | 4100 | t Backend Status Array | 17584373304 | 3672 | t PROCLOCK hash | 17583785856 | 2160 | t PREDICATELOCKTARGET hash | 17583886856 | 2160 | t PREDICATELOCK hash | 17583957632 | 2160 | t pg_stat_statements hash | 17626731952 | 2160 | t SERIALIZABLEXID hash | 17584175184 | 2160 | t LOCK hash | 17583684096 | 2160 | t Background Worker Data | 17584651184 | 1992 | t Wal Receiver Ctl | 17626663712 | 1192 | t Backend Client Host Name Buffer | 17584378064 | 1088 | t Backend Application Name Buffer | 17584376976 | 1088 | t ProcSignalSlots | 17584719472 | 864 | t Sync Scan Locations List | 17626665120 | 656 | t Async Queue Control | 17626665776 | 244 | t Control File | 134233856 | 240 | t AutoVacuum Data | 17626663432 | 224 | t BTree Vacuum State | 17626664904 | 216 | t PMSignalState | 17584719288 | 180 | t Shared MultiXact State | 152024608 | 176 | t Proc Array | 17584373192 | 108 | t PredXactList | 17584149248 | 88 | t Proc Header | 17584357360 | 88 | t Wal Sender Ctl | 17626663656 | 56 | t pg_stat_statements | 17626731904 | 48 | t Buffer Strategy Status | 17583684064 | 32 | t RWConflictPool | 17584184832 | 24 | t Prepared Transaction Table | 17584651168 | 16 | t FinishedSerializableTransactions | 17584225664 | 16 | t OldSerXidControlData | 17584357344 | 16 | t (44 rows) I think this is quite worthwile information. It'd possibly be better of in an extension, but the relevant dastructures aren't public. The attached patch doesn't contain documentation because I wasn't sure that others would be interested in the feature. Greetings, Andres Freund PS: Yes, the checkpointer's allocation is crazy. The fsync queue is sized by NBuffers which is absurd. -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Hi, On 2014-05-04 13:44:17 +0200, Andres Freund wrote: > postgres=# SELECT * FROM pg_shmem_allocations ORDER BY size DESC; > key | off | size | allocated > -------------------------------------+-------------+-------------+----------- > Buffer Blocks | 286242528 | 17179869184 | t > Buffer Descriptors | 152024800 | 134217728 | t > ... > OldSerXidControlData | 17584357344 | 16 | t > (44 rows) Thinking about this, I think it was a mistake to not add a 'name' field to dynamic shared memory's dsm_control_item. Right now it's very hard to figure out which extension allocated a dsm segment. Imo we should change that before 9.4 is out. I am not suggesting to use it to identify segments, but just as an identifier, passed in into dsm_create(). Imo there should be a corresponding pg_dynshmem_allocations to pg_shmem_allocations. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hi, On 2014-05-04 13:44:17 +0200, Andres Freund wrote: > postgres=# SELECT * FROM pg_shmem_allocations ORDER BY size DESC; > key | off | size | allocated > -------------------------------------+-------------+-------------+----------- > Buffer Blocks | 286242528 | 17179869184 | t > Buffer Descriptors | 152024800 | 134217728 | t Abhijit notified me that I've attached the wrong patch. Corrected. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 04-05-2014 08:44, Andres Freund wrote: > I've more than once wanted to know what allocated shared memory in > postgres installation is used for. Especially with more an more > extensions around that's quite useful. > A few years ago I had to provide such information an did something similar. Is it useful? Yes. However, it is a developer's feature. > On 2014-05-04 13:44:17 +0200, Andres Freund wrote: > Thinking about this, I think it was a mistake to not add a 'name' field > to dynamic shared memory's dsm_control_item. Right now it's very hard to > figure out which extension allocated a dsm segment. Imo we should change > that before 9.4 is out. I am not suggesting to use it to identify > segments, but just as an identifier, passed in into dsm_create(). > +1. > Imo there should be a corresponding pg_dynshmem_allocations to > pg_shmem_allocations. > ... or another boolean column (say 'dynamic') and just one view. -- Euler Taveira Timbira - http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte24x7 e Treinamento
On Sun, May 4, 2014 at 7:50 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-05-04 13:44:17 +0200, Andres Freund wrote: >> postgres=# SELECT * FROM pg_shmem_allocations ORDER BY size DESC; >> key | off | size | allocated >> -------------------------------------+-------------+-------------+----------- >> Buffer Blocks | 286242528 | 17179869184 | t >> Buffer Descriptors | 152024800 | 134217728 | t >> ... >> OldSerXidControlData | 17584357344 | 16 | t >> (44 rows) > > Thinking about this, I think it was a mistake to not add a 'name' field > to dynamic shared memory's dsm_control_item. Right now it's very hard to > figure out which extension allocated a dsm segment. Imo we should change > that before 9.4 is out. I am not suggesting to use it to identify > segments, but just as an identifier, passed in into dsm_create(). > > Imo there should be a corresponding pg_dynshmem_allocations to > pg_shmem_allocations. Well, right now a dsm_control_item is 8 bytes. If we add a name field of our usual 64 bytes, they'll each be 9 times bigger. We're not talking about a lot of bytes in absolute terms, but I guess I'm not in favor of an 800% size increase without somewhat more justification than you've provided here. Who is using dynamic shared memory for enough different things at this point to get confused? I'm quite in favor of having something like this for the main shared memory segment, but I think that's 9.5 material at this point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sun, May 4, 2014 at 7:50 AM, Andres Freund <andres@2ndquadrant.com> wrote: >> Thinking about this, I think it was a mistake to not add a 'name' field >> to dynamic shared memory's dsm_control_item. > Well, right now a dsm_control_item is 8 bytes. If we add a name field > of our usual 64 bytes, they'll each be 9 times bigger. And the controlled shared segment is likely to be how big exactly? It's probably not even possible for it to be smaller than a page size, 4K or so depending on the OS. I agree with Andres that a name would be a good idea; complaining about the space needed to hold it is penny-wise and pound-foolish. > I'm quite in favor of having something like this for the main shared > memory segment, but I think that's 9.5 material at this point. If you're prepared to break the current APIs later to add a name parameter (which would have to be required, if it's to be useful at all), then sure, put the question off till 9.5. regards, tom lane
On 2014-05-05 15:04:07 -0400, Robert Haas wrote: > On Sun, May 4, 2014 at 7:50 AM, Andres Freund <andres@2ndquadrant.com> wrote: > > On 2014-05-04 13:44:17 +0200, Andres Freund wrote: > >> postgres=# SELECT * FROM pg_shmem_allocations ORDER BY size DESC; > >> key | off | size | allocated > >> -------------------------------------+-------------+-------------+----------- > >> Buffer Blocks | 286242528 | 17179869184 | t > >> Buffer Descriptors | 152024800 | 134217728 | t > >> ... > >> OldSerXidControlData | 17584357344 | 16 | t > >> (44 rows) > > > > Thinking about this, I think it was a mistake to not add a 'name' field > > to dynamic shared memory's dsm_control_item. Right now it's very hard to > > figure out which extension allocated a dsm segment. Imo we should change > > that before 9.4 is out. I am not suggesting to use it to identify > > segments, but just as an identifier, passed in into dsm_create(). > > > > Imo there should be a corresponding pg_dynshmem_allocations to > > pg_shmem_allocations. > > Well, right now a dsm_control_item is 8 bytes. If we add a name field > of our usual 64 bytes, they'll each be 9 times bigger. We're not > talking about a lot of bytes in absolute terms, but I guess I'm not in > favor of an 800% size increase without somewhat more justification > than you've provided here. Who is using dynamic shared memory for > enough different things at this point to get confused? The kernel side overhead of creating a shared memory segment are so much higher that this really isn't a meaningful saving. Also, are you really considering a couple hundred bytes to be a problem? I think it's quite a sensible thing for an administrator to ask where all the memory has gone. The more users for dsm there the more important that'll get. Right now pretty much the only thing a admin could do is to poke around in /proc to see which backend has mapped the segment and try to figure out via the logs what caused it to do so. Not nice. > I'm quite in favor of having something like this for the main shared > memory segment, but I think that's 9.5 material at this point. Clearly. For one the version I posted here missed allocations which aren't done via ShmemInitStruct (lwlock main array and hash table allocations primarily). For another it's too late ;) Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2014-05-05 15:09:02 -0400, Tom Lane wrote: > > I'm quite in favor of having something like this for the main shared > > memory segment, but I think that's 9.5 material at this point. > > If you're prepared to break the current APIs later to add a name parameter > (which would have to be required, if it's to be useful at all), then sure, > put the question off till 9.5. I understood Robert to mean that it's too late for my proposed view for 9.4 - and I agree - but I wholeheartedly agree with you that we should add a name parameter to the dsm API *now*. We can just Assert() that it's nonzero if we don't think it's useful for now. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, May 5, 2014 at 3:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sun, May 4, 2014 at 7:50 AM, Andres Freund <andres@2ndquadrant.com> wrote: >>> Thinking about this, I think it was a mistake to not add a 'name' field >>> to dynamic shared memory's dsm_control_item. > >> Well, right now a dsm_control_item is 8 bytes. If we add a name field >> of our usual 64 bytes, they'll each be 9 times bigger. > > And the controlled shared segment is likely to be how big exactly? It's > probably not even possible for it to be smaller than a page size, 4K or > so depending on the OS. I agree with Andres that a name would be a good > idea; complaining about the space needed to hold it is penny-wise and > pound-foolish. The control segment is sized to support a number of dynamic shared memory segments not exceeding 64 + 2 *MaxBackends. With default settings, that currently works out to 288 segments, or 2306 bytes. So, adding a 64-byte name to each of those structures would increase the size from 2k to about 20k. So, sure, that's not a lot of memory. But I'm still not convinced that's it's very useful. What I think is going to happen is that (1) most people won't be used dynamic shared memory at all, so they won't have any use for this; (2) those people who do run an extension that uses dynamic shared memory will most likely only be running one such extension, so they won't need a name to know what the segments are being used for; and (3) if and when we eventually get parallel query, dynamic shared memory segments will be widely used, but a bunch of segments that are all named "parallel_query" or "parallel_query.$PID" isn't going to be too informative. Now, all that having been said, I recognize that human-readable names are a generally useful thing, so I'm not going to hold my breath until I turn blue if other people really want this, and it may turn out to be useful someday. But if anyone is curious whether I'm *confident* that it will be useful someday: at this point, no. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, May 5, 2014 at 3:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> And the controlled shared segment is likely to be how big exactly? It's >> probably not even possible for it to be smaller than a page size, 4K or >> so depending on the OS. I agree with Andres that a name would be a good >> idea; complaining about the space needed to hold it is penny-wise and >> pound-foolish. > ... > Now, all that having been said, I recognize that human-readable names > are a generally useful thing, so I'm not going to hold my breath until > I turn blue if other people really want this, and it may turn out to > be useful someday. But if anyone is curious whether I'm *confident* > that it will be useful someday: at this point, no. I'm not confident that it'll be useful either. But I am confident that if we don't put it in now, and decide we want it later, there will be complaints when we change the API. Better to have an ignored parameter than no parameter. regards, tom lane
On Mon, May 5, 2014 at 6:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Mon, May 5, 2014 at 3:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> And the controlled shared segment is likely to be how big exactly? It's >>> probably not even possible for it to be smaller than a page size, 4K or >>> so depending on the OS. I agree with Andres that a name would be a good >>> idea; complaining about the space needed to hold it is penny-wise and >>> pound-foolish. >> ... >> Now, all that having been said, I recognize that human-readable names >> are a generally useful thing, so I'm not going to hold my breath until >> I turn blue if other people really want this, and it may turn out to >> be useful someday. But if anyone is curious whether I'm *confident* >> that it will be useful someday: at this point, no. > > I'm not confident that it'll be useful either. But I am confident that > if we don't put it in now, and decide we want it later, there will be > complaints when we change the API. Better to have an ignored parameter > than no parameter. I'm generally skeptical of that philosophy. If we put in an ignored parameter, people may pass pointers to NULL or to garbage or to an overly-long string, and they won't know it's broken until we make it do something; at which point their code will begin to fail without warning. Speaking as an employee of a company that maintains several PostgreSQL extensions that sometimes need to be updated for newer server versions, I'd rather have a clean API break that makes the build fail than a "soft" break that supposedly lets things continue working but maybe breaks them in subtler ways. Another problem with this idea is that we might never get around to making it do anything, and then the dead parameter is just a stupid and unnecessary wart. If we're going to do anything at all here for 9.4, I recommend ignoring the fact we're in feature freeze and going whole hog: add the name, add the monitoring view, and add the monitoring view for the main shared memory segment just for good measure. That way, if we get the design wrong or something, we have a chance of getting some feedback. If we're not going to do that, then I vote for doing nothing and considering later whether to break it for 9.5, by which time we may have some evidence as to whether and how this code is really being used. Anyone who expects PostgreSQL's C API to be completely stable is going to be regularly disappointed, as most recently demonstrated by the Enormous Header Churn of the 9.3 era. I don't particularly mind being the cause of further disappointment; as long as the breakage is obvious rather than subtle, the fix usually takes about 10 minutes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 5 May 2014 21:54, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, May 5, 2014 at 3:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> On Sun, May 4, 2014 at 7:50 AM, Andres Freund <andres@2ndquadrant.com> wrote: >>>> Thinking about this, I think it was a mistake to not add a 'name' field >>>> to dynamic shared memory's dsm_control_item. >> >>> Well, right now a dsm_control_item is 8 bytes. If we add a name field >>> of our usual 64 bytes, they'll each be 9 times bigger. >> >> And the controlled shared segment is likely to be how big exactly? It's >> probably not even possible for it to be smaller than a page size, 4K or >> so depending on the OS. I agree with Andres that a name would be a good >> idea; complaining about the space needed to hold it is penny-wise and >> pound-foolish. > > The control segment is sized to support a number of dynamic shared > memory segments not exceeding 64 + 2 *MaxBackends. With default > settings, that currently works out to 288 segments, or 2306 bytes. > So, adding a 64-byte name to each of those structures would increase > the size from 2k to about 20k. > > So, sure, that's not a lot of memory. But I'm still not convinced > that's it's very useful. What I think is going to happen is that (1) > most people won't be used dynamic shared memory at all, so they won't > have any use for this; (2) those people who do run an extension that > uses dynamic shared memory will most likely only be running one such > extension, so they won't need a name to know what the segments are > being used for; and (3) if and when we eventually get parallel query, > dynamic shared memory segments will be widely used, but a bunch of > segments that are all named "parallel_query" or "parallel_query.$PID" > isn't going to be too informative. Not sure your arguments hold any water. Most people don't use most features... and so we're not allowed features that can be debugged? How do you know people will only use one extension that uses dshmem? Why would we call multiple segments the same thing?? If names are a problem, lets give them numbers. Seems a minor point. Perhaps we can allocate space for names dynamically?? Not being able to tell segments apart from each other is just daft, if we are trying to supply bug free software for the world to use. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2014-05-05 23:20:43 -0400, Robert Haas wrote: > On Mon, May 5, 2014 at 6:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I'm not confident that it'll be useful either. But I am confident that > > if we don't put it in now, and decide we want it later, there will be > > complaints when we change the API. Better to have an ignored parameter > > than no parameter. > > I'm generally skeptical of that philosophy. If we put in an ignored > parameter, people may pass pointers to NULL or to garbage or to an > overly-long string, and they won't know it's broken until we make it > do something; at which point their code will begin to fail without > warning. If it were a complex change, maybe. But I don't think that's likely here. Assert(name != NULL && strlen(name) > 0 && strlen(name) < NAMEDATALEN); should perfectly do the trick. > If we're going to do anything at all here for 9.4, I recommend > ignoring the fact we're in feature freeze and going whole hog: add the > name, add the monitoring view, and add the monitoring view for the > main shared memory segment just for good measure. We can do that as well. If there's agreement on that path I'll update the patch to also show dynamic statements. > Anyone who expects PostgreSQL's C API to be > completely stable is going to be regularly disappointed, as most > recently demonstrated by the Enormous Header Churn of the 9.3 era. I > don't particularly mind being the cause of further disappointment; as > long as the breakage is obvious rather than subtle, the fix usually > takes about 10 minutes. Didn't you complain rather loudly about that change? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, May 6, 2014 at 7:45 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> The control segment is sized to support a number of dynamic shared >> memory segments not exceeding 64 + 2 *MaxBackends. With default >> settings, that currently works out to 288 segments, or 2306 bytes. >> So, adding a 64-byte name to each of those structures would increase >> the size from 2k to about 20k. >> >> So, sure, that's not a lot of memory. But I'm still not convinced >> that's it's very useful. What I think is going to happen is that (1) >> most people won't be used dynamic shared memory at all, so they won't >> have any use for this; (2) those people who do run an extension that >> uses dynamic shared memory will most likely only be running one such >> extension, so they won't need a name to know what the segments are >> being used for; and (3) if and when we eventually get parallel query, >> dynamic shared memory segments will be widely used, but a bunch of >> segments that are all named "parallel_query" or "parallel_query.$PID" >> isn't going to be too informative. > > Not sure your arguments hold any water. I'm not, either. > Most people don't use most features... and so we're not allowed > features that can be debugged? I certainly didn't say that. > How do you know people will only use one extension that uses dshmem? I don't. If they do, that's a good argument for adding this. > Why would we call multiple segments the same thing?? It's not clear to me how someone is going to intelligently name multiple segments used by the same extension. Maybe they'll give them all the same name. Maybe they'll name them all extension_name.pid. More than likely, different extensions will use different conventions.:-( It might be sensible to add a "creator PID" field to the DSM control items. Of course, that PID might have exited, but it could still possibly be useful for debugging purposes. > If names are a problem, lets give them numbers. Seems a minor point. > Perhaps we can allocate space for names dynamically?? A static buffer, as proposed by Andres, seems a lot simper. > Not being able to tell segments apart from each other is just daft, if > we are trying to supply bug free software for the world to use. I can see I'm losing this argument. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 05/06/2014 02:59 PM, Robert Haas wrote: >> >Why would we call multiple segments the same thing?? > It's not clear to me how someone is going to intelligently name > multiple segments used by the same extension. Maybe they'll give them > all the same name. Maybe they'll name them all extension_name.pid. > More than likely, different extensions will use different conventions. > :-( That seems sensible to me. The best scheme will depend on how the segments are used. Best to leave it to the extension author. - Heikki
On 6 May 2014 13:06, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > The best scheme will depend on how the segments > are used. Best to leave it to the extension author. +1 -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hi, On 2014-05-06 13:56:41 +0200, Andres Freund wrote: > On 2014-05-05 23:20:43 -0400, Robert Haas wrote: > > On Mon, May 5, 2014 at 6:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > I'm not confident that it'll be useful either. But I am confident that > > > if we don't put it in now, and decide we want it later, there will be > > > complaints when we change the API. Better to have an ignored parameter > > > than no parameter. > > > > I'm generally skeptical of that philosophy. If we put in an ignored > > parameter, people may pass pointers to NULL or to garbage or to an > > overly-long string, and they won't know it's broken until we make it > > do something; at which point their code will begin to fail without > > warning. > > If it were a complex change, maybe. But I don't think that's likely > here. > Assert(name != NULL && strlen(name) > 0 && strlen(name) < NAMEDATALEN); > should perfectly do the trick. Attached are two patches: a) Patch addin a 'name' parameter to dsm_create(). I think we should apply this to 9.4. b) pg_dynamic_shmem_allocations and pg_static_shmem_allocations views. The previous version didn't include dsm support and didn't take the required lock. I am not so sure whether b) should be applied together with a) in 9.4, but I'd be happy enough to add docs if people agree with the naming. FWIW, I like dsm_create()'s internals more after this patch... postgres=# \d pg_dynamic_shmem_allocations View "pg_catalog.pg_dynamic_shmem_allocations" Column | Type | Modifiers --------+--------+----------- handle | bigint | name | text | size | bigint | refcnt | bigint | postgres=# \d pg_static_shmem_allocations View "pg_catalog.pg_static_shmem_allocations" Column | Type | Modifiers -----------+---------+----------- key | text | off | bigint | size | bigint | allocated | boolean | postgres=# SELECT * FROM pg_dynamic_shmem_allocations; handle | name | size | refcnt ------------+-------------+-------+-------- 1120921036 | test_shm_mq | 65656 | 1 (1 row) postgres=# SELECT * FROM pg_static_shmem_allocations ORDER BY key NULLS FIRST; key | off | size | allocated -------------------------------------+------------+------------+----------- | 2222605024 | 1727776 | f | | 34844752 | t Async Ctl | 2222539168 | 65856 | t Async Queue Control | 2222537784 | 1384 | t AutoVacuum Data | 2222533576 | 224 | t Backend Activity Buffer | 2217099552 | 114688 | t Backend Application Name Buffer | 2217085216 | 7168 | t Backend Client Host Name Buffer | 2217092384 | 7168 | t Backend Status Array | 2217061024 | 24192 | t Background Worker Data | 2217214256 | 1992 | t BTree Vacuum State | 2222535768 | 1356 | t Buffer Blocks | 51365312 | 2147483648 | t Buffer Descriptors | 34588096 | 16777216 | t Buffer Strategy Status | 2213546176 | 32 | t Checkpointer Data | 2217290656 | 5242920 | t CLOG Ctl | 33601152 | 525312 | t Control File | 16796384 | 240 | t Fast Path Strong Relation Lock Data | 2214767072 | 4100 | t FinishedSerializableTransactions | 2216841952 | 16 | t LOCK hash | 2213546208 | 2160 | t MultiXactMember Ctl | 34455488 | 131648 | t MultiXactOffset Ctl | 34389632 | 65856 | t OldSerXidControlData | 2216973632 | 16 | t OldSerXid SLRU Ctl | 2216841984 | 131648 | t PMSignalState | 2217285400 | 940 | t PREDICATELOCK hash | 2215182944 | 2160 | t PREDICATELOCKTARGET hash | 2214771176 | 2160 | t PredXactList | 2216348384 | 88 | t Prepared Transaction Table | 2217214240 | 16 | t Proc Array | 2217060536 | 488 | t Proc Header | 2216973648 | 88 | t PROCLOCK hash | 2214183264 | 2160 | t ProcSignalSlots | 2217286344 | 4284 | t RWConflictPool | 2216573120 | 24 | t SERIALIZABLEXID hash | 2216518720 | 2160 | t Shared Buffer Lookup Table | 2198848960 | 16496 | t Shared MultiXact State | 34587136 | 936 | t shmInvalBuffer | 2217216256 | 69144 | t SUBTRANS Ctl | 34126464 | 263168 | t Sync Scan Locations List | 2222537128 | 656 | t Wal Receiver Ctl | 2222534576 | 1192 | t Wal Sender Ctl | 2222533800 | 776 | t XLOG Ctl | 16796640 | 16804496 | t (43 rows) Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Andres Freund <andres@2ndquadrant.com> writes: > Attached are two patches: > a) Patch addin a 'name' parameter to dsm_create(). I think we should > apply this to 9.4. > b) pg_dynamic_shmem_allocations and pg_static_shmem_allocations > views. The previous version didn't include dsm support and didn't > take the required lock. > I am not so sure whether b) should be applied together with a) in 9.4, > but I'd be happy enough to add docs if people agree with the naming. FWIW, I vote for fixing (a) now but holding (b) for 9.5. regards, tom lane
On Tue, May 6, 2014 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andres Freund <andres@2ndquadrant.com> writes: >> Attached are two patches: >> a) Patch addin a 'name' parameter to dsm_create(). I think we should >> apply this to 9.4. >> b) pg_dynamic_shmem_allocations and pg_static_shmem_allocations >> views. The previous version didn't include dsm support and didn't >> take the required lock. > >> I am not so sure whether b) should be applied together with a) in 9.4, >> but I'd be happy enough to add docs if people agree with the naming. > > FWIW, I vote for fixing (a) now but holding (b) for 9.5. I guess I'll vote for applying both. I don't see a lot of risk, and I think doing one with out the other is somewhat pointless. Regarding patch 0002, I don't think we're using the term "static shmem" anywhere else, so I vote for dropping the word static, so that we have pg_get_shmem_allocations() and pg_get_dynamic_shmem_allocations(). Also, I think using the "allocated" column is pretty weird. There are always exactly two entries with allocated = false, one of which is for actual free memory and the other of which is for memory that actually IS allocated but without using ShmemIndex (maybe the latter was supposed to have allocated = true but still key = null?). I guess I'd vote for ditching the allocated column completely and outputting the memory allocated without ShmemIndex using some fixed tag (like "ShmemIndex" or "Bootstrap" or "Overhead" or something). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, May 6, 2014 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> FWIW, I vote for fixing (a) now but holding (b) for 9.5. > I guess I'll vote for applying both. I don't see a lot of risk, and I > think doing one with out the other is somewhat pointless. The difference is that there's not consensus about the details of the views ... as borne out by your next paragraph. Now admittedly, we could always redefine the views in 9.5, but I'd rather not be doing this sort of thing in haste. Something as user-visible as a system view really ought to have baked awhile before we ship it. Patch (a) is merely institutionalizing the expectation that DSM segments should have names, which is a much lower-risk bet. regards, tom lane
On 6 May 2014 20:44, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Tue, May 6, 2014 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> FWIW, I vote for fixing (a) now but holding (b) for 9.5. > >> I guess I'll vote for applying both. I don't see a lot of risk, and I >> think doing one with out the other is somewhat pointless. > > The difference is that there's not consensus about the details of the > views ... as borne out by your next paragraph. > > Now admittedly, we could always redefine the views in 9.5, but > I'd rather not be doing this sort of thing in haste. Something > as user-visible as a system view really ought to have baked awhile > before we ship it. Patch (a) is merely institutionalizing the > expectation that DSM segments should have names, which is a much > lower-risk bet. As long as all the functions are exposed to allow b) to run as an extension, I don't see we lose anything by waiting a while. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2014-05-06 15:37:04 -0400, Robert Haas wrote: > On Tue, May 6, 2014 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Andres Freund <andres@2ndquadrant.com> writes: > >> Attached are two patches: > >> a) Patch addin a 'name' parameter to dsm_create(). I think we should > >> apply this to 9.4. > >> b) pg_dynamic_shmem_allocations and pg_static_shmem_allocations > >> views. The previous version didn't include dsm support and didn't > >> take the required lock. > > > >> I am not so sure whether b) should be applied together with a) in 9.4, > >> but I'd be happy enough to add docs if people agree with the naming. > > > > FWIW, I vote for fixing (a) now but holding (b) for 9.5. > > I guess I'll vote for applying both. I don't see a lot of risk, and I > think doing one with out the other is somewhat pointless. Fine with me. I guess if we don't do b) for now we can just do the additional parameter and the Assert() from the patch. Without actually storing the name to shared memory. > Regarding patch 0002, I don't think we're using the term "static > shmem" anywhere else, so I vote for dropping the word static, so that > we have pg_get_shmem_allocations() and > pg_get_dynamic_shmem_allocations(). Fine #2. > Also, I think using the > "allocated" column is pretty weird. There are always exactly two > entries with allocated = false Hm. There shouldn't be. And at least in my installation there isn't and I don't see a anything in the code that'd allow that? The example from my last email has: > postgres=# SELECT * FROM pg_static_shmem_allocations ORDER BY key NULLS FIRST; > key | off | size | allocated > -------------------------------------+------------+------------+----------- > | 2222605024 | 1727776 | f > | | 34844752 | t > Async Ctl | 2222539168 | 65856 | t >, one of which is for actual free memory > and the other of which is for memory that actually IS allocated but > without using ShmemIndex (maybe the latter was supposed to have > allocated = true but still key = null?). Yes, that's how I'd imagined it. > I guess I'd vote for > ditching the allocated column completely and outputting the memory > allocated without ShmemIndex using some fixed tag (like "ShmemIndex" > or "Bootstrap" or "Overhead" or something). My way feels slightly cleaner, but I'd be ok with that as well. There's no possible conflicts with an actual segment... In your variant the unallocated/slop memory would continue to have a NULL key? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2014-05-06 22:04:04 +0100, Simon Riggs wrote: > On 6 May 2014 20:44, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Robert Haas <robertmhaas@gmail.com> writes: > >> On Tue, May 6, 2014 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >>> FWIW, I vote for fixing (a) now but holding (b) for 9.5. > > > >> I guess I'll vote for applying both. I don't see a lot of risk, and I > >> think doing one with out the other is somewhat pointless. > > > > The difference is that there's not consensus about the details of the > > views ... as borne out by your next paragraph. > > > > Now admittedly, we could always redefine the views in 9.5, but > > I'd rather not be doing this sort of thing in haste. Something > > as user-visible as a system view really ought to have baked awhile > > before we ship it. Patch (a) is merely institutionalizing the > > expectation that DSM segments should have names, which is a much > > lower-risk bet. > > As long as all the functions are exposed to allow b) to run as an > extension, I don't see we lose anything by waiting a while. They aren't exposed. It's touching implementation details in both shmem.c and dsm.c. I think that's actually fine. Imo it's not too bad if we don't get either in 9.4. It's not a critical feature.What I *would* like to avoid is a pointless API break between 9.4 and 9.5. Because I will push for the patch in 9.5 CF1... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, May 6, 2014 at 6:09 PM, Andres Freund <andres@2ndquadrant.com> wrote: >> I guess I'd vote for >> ditching the allocated column completely and outputting the memory >> allocated without ShmemIndex using some fixed tag (like "ShmemIndex" >> or "Bootstrap" or "Overhead" or something). > > My way feels slightly cleaner, but I'd be ok with that as well. There's > no possible conflicts with an actual segment... In your variant the > unallocated/slop memory would continue to have a NULL key? Yeah, that seems all right. One way to avoid conflict with an actual segment would be to add an after-the-fact entry into ShmemIndex representing the amount of memory that was used to bootstrap it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-05-07 17:48:15 -0400, Robert Haas wrote: > On Tue, May 6, 2014 at 6:09 PM, Andres Freund <andres@2ndquadrant.com> wrote: > >> I guess I'd vote for > >> ditching the allocated column completely and outputting the memory > >> allocated without ShmemIndex using some fixed tag (like "ShmemIndex" > >> or "Bootstrap" or "Overhead" or something). > > > > My way feels slightly cleaner, but I'd be ok with that as well. There's > > no possible conflicts with an actual segment... In your variant the > > unallocated/slop memory would continue to have a NULL key? > > Yeah, that seems all right. Hm. Not sure what you're ACKing here ;). > One way to avoid conflict with an actual segment would be to add an > after-the-fact entry into ShmemIndex representing the amount of memory > that was used to bootstrap it. There's lots of allocations from shmem that cannot be associated with any index entry though. Not just ShmemIndex's own entry. Most prominently most of the memory used for SharedBufHash isn't actually associated with the "Shared Buffer Lookup Table" entry - imo a dynahash.c defficiency. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, May 7, 2014 at 5:54 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-05-07 17:48:15 -0400, Robert Haas wrote: >> On Tue, May 6, 2014 at 6:09 PM, Andres Freund <andres@2ndquadrant.com> wrote: >> >> I guess I'd vote for >> >> ditching the allocated column completely and outputting the memory >> >> allocated without ShmemIndex using some fixed tag (like "ShmemIndex" >> >> or "Bootstrap" or "Overhead" or something). >> > >> > My way feels slightly cleaner, but I'd be ok with that as well. There's >> > no possible conflicts with an actual segment... In your variant the >> > unallocated/slop memory would continue to have a NULL key? >> >> Yeah, that seems all right. > > Hm. Not sure what you're ACKing here ;). The idea of giving the unallocated memory a NULL key. >> One way to avoid conflict with an actual segment would be to add an >> after-the-fact entry into ShmemIndex representing the amount of memory >> that was used to bootstrap it. > > There's lots of allocations from shmem that cannot be associated with > any index entry though. Not just ShmemIndex's own entry. Most > prominently most of the memory used for SharedBufHash isn't actually > associated with the "Shared Buffer Lookup Table" entry - imo a > dynahash.c defficiency. Hmm, I don't know what to do about that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-05-08 07:58:34 -0400, Robert Haas wrote: > On Wed, May 7, 2014 at 5:54 PM, Andres Freund <andres@2ndquadrant.com> wrote: > > Hm. Not sure what you're ACKing here ;). > > The idea of giving the unallocated memory a NULL key. Ok. A new version of the patches implementing that are attached. Including a couple of small fixups and docs. The latter aren't extensive, but that doesn't seem to be warranted anyway. > > There's lots of allocations from shmem that cannot be associated with > > any index entry though. Not just ShmemIndex's own entry. Most > > prominently most of the memory used for SharedBufHash isn't actually > > associated with the "Shared Buffer Lookup Table" entry - imo a > > dynahash.c defficiency. > Hmm, I don't know what to do about that. Well, we have to live with it for now :) Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
At 2014-05-08 15:28:22 +0200, andres@2ndquadrant.com wrote: > > > > Hm. Not sure what you're ACKing here ;). > > > > The idea of giving the unallocated memory a NULL key. > > Ok. A new version of the patches implementing that are attached. > Including a couple of small fixups and docs. The latter aren't > extensive, but that doesn't seem to be warranted anyway. I realise now that this email didn't actually have an attachment. Could you please repost the latest version of this patch? Thanks. -- Abhijit
On Mon, Jul 14, 2014 at 6:20 AM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote: > At 2014-05-08 15:28:22 +0200, andres@2ndquadrant.com wrote: >> > > Hm. Not sure what you're ACKing here ;). >> > >> > The idea of giving the unallocated memory a NULL key. >> >> Ok. A new version of the patches implementing that are attached. >> Including a couple of small fixups and docs. The latter aren't >> extensive, but that doesn't seem to be warranted anyway. > > I realise now that this email didn't actually have an attachment. Could > you please repost the latest version of this patch? That's odd. I received two attachments with that email. Of course, I was copied directly, but why would the archives have lost the attachments? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas wrote: > On Mon, Jul 14, 2014 at 6:20 AM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote: > > At 2014-05-08 15:28:22 +0200, andres@2ndquadrant.com wrote: > >> > > Hm. Not sure what you're ACKing here ;). > >> > > >> > The idea of giving the unallocated memory a NULL key. > >> > >> Ok. A new version of the patches implementing that are attached. > >> Including a couple of small fixups and docs. The latter aren't > >> extensive, but that doesn't seem to be warranted anyway. > > > > I realise now that this email didn't actually have an attachment. Could > > you please repost the latest version of this patch? > > That's odd. I received two attachments with that email. Of course, I > was copied directly, but why would the archives have lost the > attachments? The attachments are there on the archives, and also on my mbox -- and unlike Robert, I was not copied. I think this is a problem on Abhijit's end. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
At 2014-07-14 16:48:09 -0400, alvherre@2ndquadrant.com wrote: > > The attachments are there on the archives, and also on my mbox -- and > unlike Robert, I was not copied. I think this is a problem on > Abhijit's end. Yes, it is. I apologise for the noise. -- Abhijit
On Thu, May 8, 2014 at 10:28 PM, Andres Freund <andres@2ndquadrant.com> wrote: > Well, we have to live with it for now :) I just had a look at the first patch and got some comments: 1) Instead of using an assertion here, wouldn't it be better to error out if name is NULL, and truncate the name if it is longer than SHMEM_INDEX_KEYSIZE - 1 (including '\0')? scanstr in scansup.c? Assert(IsUnderPostmaster); + Assert(name != NULL && strlen(name) > 0 && + strlen(name) < SHMEM_INDEX_KEYSIZE - 1); 2) The addition of a field to track the size of a dsm should be explicitly mentioned, this is useful for the 2nd patch. 3) The refactoring done in dsm_create to find an unused slot should be done as a separate patch for clarity. 4) Using '\0' here would be more adapted: + item->name[SHMEM_INDEX_KEYSIZE - 1] = 0; Regards, -- Michael
And here are some comments about patch 2: - Patch applies with some hunks. - Some typos are present s#memory segments..#memory segments. (double dots) s#NULL#<literal>NULL</> (in the docs as this refers to a value) - Your thoughts about providing separate patches for each view? What this patch does is straight-forward, but pg_shmem_allocations does not actually depend on the first patch adding size and name to the dsm fields. So IMO it makes sense to separate each feature properly. - off should be renamed to offset for pg_get_shmem_allocations. - Is it really worth showing unused shared memory? I'd rather rip out the last portion of pg_get_shmem_allocations. - For refcnt in pg_get_dynamic_shmem_allocations, could you add a comment mentioning that refcnt = 1 means that the item is moribund and 0 is unused, and that reference count for active dsm segments only begins from 2? I would imagine that this is enough, instead of using some define's defining the ID from which a dsm item is considered as active. Regards, -- Michael
Hi On Thu, May 8, 2014 at 4:28 PM, Andres Freund <andres@2ndquadrant.com> wrote: > Ok. A new version of the patches implementing that are > attached. Including a couple of small fixups and docs. The latter aren't > extensive, but that doesn't seem to be warranted anyway. Is it really actually useful to expose the segment off(set) to users? Seems to me like unnecessary internal details leaking out. Regards, Marti
On 2014-08-15 11:12:11 +0300, Marti Raudsepp wrote: > Hi > > On Thu, May 8, 2014 at 4:28 PM, Andres Freund <andres@2ndquadrant.com> wrote: > > Ok. A new version of the patches implementing that are > > attached. Including a couple of small fixups and docs. The latter aren't > > extensive, but that doesn't seem to be warranted anyway. > > Is it really actually useful to expose the segment off(set) to users? > Seems to me like unnecessary internal details leaking out. Yes. This is clearly developer oriented and I'd more than once wished I could see where some stray pointer is pointing to... That's not really possible without something like this. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2014-08-14 22:16:31 -0700, Michael Paquier wrote: > And here are some comments about patch 2: > - Patch applies with some hunks. > - Some typos are present > s#memory segments..#memory segments. (double dots) > s#NULL#<literal>NULL</> (in the docs as this refers to a value) Will fix. > - Your thoughts about providing separate patches for each view? What > this patch does is straight-forward, but pg_shmem_allocations does not > actually depend on the first patch adding size and name to the dsm > fields. So IMO it makes sense to separate each feature properly. I don't know, seems a bit like busywork to me. Postgres doesn't really very granular commits... > - off should be renamed to offset for pg_get_shmem_allocations. ok. > - Is it really worth showing unused shared memory? I'd rather rip out > the last portion of pg_get_shmem_allocations. It's actually really helpful. There's a couple situations where you possibly can run out of that spare memory and into troubles. Which currently aren't diagnosable. Similarly we currently can't diagnose whether we're superflously allocate too much 'reserve' shared memory. > - For refcnt in pg_get_dynamic_shmem_allocations, could you add a > comment mentioning that refcnt = 1 means that the item is moribund and > 0 is unused, and that reference count for active dsm segments only > begins from 2? I would imagine that this is enough, instead of using > some define's defining the ID from which a dsm item is considered as > active. Ok. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Aug 15, 2014 at 4:20 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-08-15 11:12:11 +0300, Marti Raudsepp wrote: >> Hi >> On Thu, May 8, 2014 at 4:28 PM, Andres Freund <andres@2ndquadrant.com> wrote: >> > Ok. A new version of the patches implementing that are >> > attached. Including a couple of small fixups and docs. The latter aren't >> > extensive, but that doesn't seem to be warranted anyway. >> >> Is it really actually useful to expose the segment off(set) to users? >> Seems to me like unnecessary internal details leaking out. > > Yes. This is clearly developer oriented and I'd more than once wished I > could see where some stray pointer is pointing to... That's not really > possible without something like this. Unfortunately, that information also has some security implications. I'm sure someone trying to exploit any future stack-overrun vulnerability will be very happy to have more rather than less information about the layout of the process address space. I fully agree with the idea of exposing the amount of free memory in the shared memory segment (as discussed in other emails); that's critical information. But I think exposing address space layout information is of much less general utility and, really, far too risky. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-08-18 11:56:44 -0400, Robert Haas wrote: > On Fri, Aug 15, 2014 at 4:20 AM, Andres Freund <andres@2ndquadrant.com> wrote: > > On 2014-08-15 11:12:11 +0300, Marti Raudsepp wrote: > >> Hi > >> On Thu, May 8, 2014 at 4:28 PM, Andres Freund <andres@2ndquadrant.com> wrote: > >> > Ok. A new version of the patches implementing that are > >> > attached. Including a couple of small fixups and docs. The latter aren't > >> > extensive, but that doesn't seem to be warranted anyway. > >> > >> Is it really actually useful to expose the segment off(set) to users? > >> Seems to me like unnecessary internal details leaking out. > > > > Yes. This is clearly developer oriented and I'd more than once wished I > > could see where some stray pointer is pointing to... That's not really > > possible without something like this. > > Unfortunately, that information also has some security implications. > I'm sure someone trying to exploit any future stack-overrun > vulnerability will be very happy to have more rather than less > information about the layout of the process address space. > > I fully agree with the idea of exposing the amount of free memory in > the shared memory segment (as discussed in other emails); that's > critical information. But I think exposing address space layout > information is of much less general utility and, really, far too > risky. Meh. For one it's just the offsets, not the actual addresses. It's also something you can relatively easily compute at home by looking at a couple of settings everyone can see. For another, I'd be perfectly content making this superuser only. And if somebody can execute queries as superuser, address layout information really isn't needed anymore to execute arbitrary code. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-08-18 11:56:44 -0400, Robert Haas wrote: >> I fully agree with the idea of exposing the amount of free memory in >> the shared memory segment (as discussed in other emails); that's >> critical information. But I think exposing address space layout >> information is of much less general utility and, really, far too >> risky. > Meh. For one it's just the offsets, not the actual addresses. It's also > something you can relatively easily compute at home by looking at a > couple of settings everyone can see. For another, I'd be perfectly > content making this superuser only. And if somebody can execute queries > as superuser, address layout information really isn't needed anymore to > execute arbitrary code. I agree that this has to be superuser-only if it's there at all. Should we consider putting it into an extension rather than having it in the core system? That would offer some additional protection for production systems, which really shouldn't have much need for this IMO. regards, tom lane
On 2014-08-18 12:27:12 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-08-18 11:56:44 -0400, Robert Haas wrote: > >> I fully agree with the idea of exposing the amount of free memory in > >> the shared memory segment (as discussed in other emails); that's > >> critical information. But I think exposing address space layout > >> information is of much less general utility and, really, far too > >> risky. > > > Meh. For one it's just the offsets, not the actual addresses. It's also > > something you can relatively easily compute at home by looking at a > > couple of settings everyone can see. For another, I'd be perfectly > > content making this superuser only. And if somebody can execute queries > > as superuser, address layout information really isn't needed anymore to > > execute arbitrary code. > > I agree that this has to be superuser-only if it's there at all. > > Should we consider putting it into an extension rather than having > it in the core system? That would offer some additional protection > for production systems, which really shouldn't have much need for > this IMO. I'd considered that somewhere upthread and decided that it'd require exposing to much internals from shmem.c/dsm.c without a corresponding benefit. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-08-18 12:27:12 -0400, Tom Lane wrote: >> Should we consider putting it into an extension rather than having >> it in the core system? That would offer some additional protection >> for production systems, which really shouldn't have much need for >> this IMO. > I'd considered that somewhere upthread and decided that it'd require > exposing to much internals from shmem.c/dsm.c without a corresponding > benefit. Well, we could have the implementation code in those modules but not provide any SQL-level access to it without installing an extension. The only extra thing visible in the .h files would be a function or two. regards, tom lane
On 2014-08-18 12:33:44 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-08-18 12:27:12 -0400, Tom Lane wrote: > >> Should we consider putting it into an extension rather than having > >> it in the core system? That would offer some additional protection > >> for production systems, which really shouldn't have much need for > >> this IMO. > > > I'd considered that somewhere upthread and decided that it'd require > > exposing to much internals from shmem.c/dsm.c without a corresponding > > benefit. > > Well, we could have the implementation code in those modules but not > provide any SQL-level access to it without installing an extension. > The only extra thing visible in the .h files would be a function or two. That'd require wrapper functions in the extension afaics. Not that that is prohibitive, but a bit inconvenient. At least I don't see another way to create a sql function referring to a builtin C implementation. I don't think PG_FUNCTION_INFO_V1() can reliably made work. We could have the underlying function in pg_proc, but not create the view... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-08-18 12:33:44 -0400, Tom Lane wrote: >> Well, we could have the implementation code in those modules but not >> provide any SQL-level access to it without installing an extension. >> The only extra thing visible in the .h files would be a function or two. > That'd require wrapper functions in the extension afaics. Sure. I'd want to put as much of the logic as possible in the extension, anyway. regards, tom lane
On Mon, Aug 18, 2014 at 12:00 PM, Andres Freund <andres@2ndquadrant.com> wrote: >> Unfortunately, that information also has some security implications. >> I'm sure someone trying to exploit any future stack-overrun >> vulnerability will be very happy to have more rather than less >> information about the layout of the process address space. >> > > Meh. For one it's just the offsets, not the actual addresses. It's also > something you can relatively easily compute at home by looking at a > couple of settings everyone can see. For another, I'd be perfectly > content making this superuser only. And if somebody can execute queries > as superuser, address layout information really isn't needed anymore to > execute arbitrary code. I'm just not sure it should be in there at all. If we punt this off into an extension, it won't be available in a lot of situations where it's really needed. But although the basic functionality would have been quite useful to me on any number of occasions, I can't recall a single occasion upon which I would have cared about the offset at all. I wouldn't mind having a MemoryContextStats()-type function that could be used to print this information out by attaching to the backend with gdb, but the utility of exposing it at the SQL level seems very marginal to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-08-18 12:41:58 -0400, Robert Haas wrote: > On Mon, Aug 18, 2014 at 12:00 PM, Andres Freund <andres@2ndquadrant.com> wrote: > >> Unfortunately, that information also has some security implications. > >> I'm sure someone trying to exploit any future stack-overrun > >> vulnerability will be very happy to have more rather than less > >> information about the layout of the process address space. > >> > > > > Meh. For one it's just the offsets, not the actual addresses. It's also > > something you can relatively easily compute at home by looking at a > > couple of settings everyone can see. For another, I'd be perfectly > > content making this superuser only. And if somebody can execute queries > > as superuser, address layout information really isn't needed anymore to > > execute arbitrary code. > > I'm just not sure it should be in there at all. You realize that you can pretty much recompute the offsets from the sizes of the individual allocations anyway? Yes, you need to add some rounding, but that's about it. We could randomize the returned elements, but that'd be rather annoying because it'd loose information. > If we punt this off > into an extension, it won't be available in a lot of situations where > it's really needed. But although the basic functionality would have > been quite useful to me on any number of occasions, I can't recall a > single occasion upon which I would have cared about the offset at all. > I wouldn't mind having a MemoryContextStats()-type function that could > be used to print this information out by attaching to the backend with > gdb, but the utility of exposing it at the SQL level seems very > marginal to me. I'd use for it in the past when trying to figure out what some pointer pointed to. It's easy enough to figure out that it's in the main shared memory segment, but after that it get's rather hard. And even if you don't count that, it gives a sensible order to the returned rows from the SRF. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Aug 18, 2014 at 12:46 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-08-18 12:41:58 -0400, Robert Haas wrote: >> On Mon, Aug 18, 2014 at 12:00 PM, Andres Freund <andres@2ndquadrant.com> wrote: >> >> Unfortunately, that information also has some security implications. >> >> I'm sure someone trying to exploit any future stack-overrun >> >> vulnerability will be very happy to have more rather than less >> >> information about the layout of the process address space. >> >> >> > >> > Meh. For one it's just the offsets, not the actual addresses. It's also >> > something you can relatively easily compute at home by looking at a >> > couple of settings everyone can see. For another, I'd be perfectly >> > content making this superuser only. And if somebody can execute queries >> > as superuser, address layout information really isn't needed anymore to >> > execute arbitrary code. >> >> I'm just not sure it should be in there at all. > > You realize that you can pretty much recompute the offsets from the > sizes of the individual allocations anyway? Sure, if you know the segment base. Do you? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-08-18 12:50:27 -0400, Robert Haas wrote: > On Mon, Aug 18, 2014 at 12:46 PM, Andres Freund <andres@2ndquadrant.com> wrote: > > You realize that you can pretty much recompute the offsets from the > > sizes of the individual allocations anyway? > > Sure, if you know the segment base. Do you? Err? The offset doesn't give you the base either? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Robert Haas <robertmhaas@gmail.com> writes: > I wouldn't mind having a MemoryContextStats()-type function that could > be used to print this information out by attaching to the backend with > gdb, but the utility of exposing it at the SQL level seems very > marginal to me. +1 for doing it like that. regards, tom lane
On Mon, Aug 18, 2014 at 12:51 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-08-18 12:50:27 -0400, Robert Haas wrote: >> On Mon, Aug 18, 2014 at 12:46 PM, Andres Freund <andres@2ndquadrant.com> wrote: >> > You realize that you can pretty much recompute the offsets from the >> > sizes of the individual allocations anyway? >> >> Sure, if you know the segment base. Do you? > > Err? The offset doesn't give you the base either? Oh! I thought you were printing actual pointer addresses. If you're just printing offsets relative to wherever the segment happens to be mapped, I don't care about that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > I thought you were printing actual pointer addresses. If you're just > printing offsets relative to wherever the segment happens to be > mapped, I don't care about that. Well, that just means that it's not an *obvious* security risk. I still like the idea of providing something comparable to MemoryContextStats, rather than creating a SQL interface. The problem with a SQL interface is you can't interrogate it unless (1) you are not already inside a query and (2) the client is interactive and under your control. Something you can call easily from gdb is likely to be much more useful in practice. regards, tom lane
On 2014-08-18 13:27:07 -0400, Tom Lane wrote: > I still like the idea of providing something comparable to > MemoryContextStats, rather than creating a SQL interface. The problem > with a SQL interface is you can't interrogate it unless (1) you are not > already inside a query and (2) the client is interactive and under your > control. Something you can call easily from gdb is likely to be much > more useful in practice. That might be true in some cases, but in many cases interfaces that can only be used via gdb *SUCK*. A good reason to use the interface proposed here is to investigate which extensions are allocating how much shared memory. A pretty normal question to ask as a sysadmin/DBA. And DBA type of stuff should never have to involve gdb. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Aug 18, 2014 at 1:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I thought you were printing actual pointer addresses. If you're just >> printing offsets relative to wherever the segment happens to be >> mapped, I don't care about that. > > Well, that just means that it's not an *obvious* security risk. > > I still like the idea of providing something comparable to > MemoryContextStats, rather than creating a SQL interface. The problem > with a SQL interface is you can't interrogate it unless (1) you are not > already inside a query and (2) the client is interactive and under your > control. Something you can call easily from gdb is likely to be much > more useful in practice. Since the shared memory segment isn't changing at runtime, I don't see this as being a big problem. It could possibly be an issue for dynamic shared memory segments, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Aug 18, 2014 at 1:12 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Aug 18, 2014 at 1:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> I thought you were printing actual pointer addresses. If you're just >>> printing offsets relative to wherever the segment happens to be >>> mapped, I don't care about that. >> >> Well, that just means that it's not an *obvious* security risk. >> >> I still like the idea of providing something comparable to >> MemoryContextStats, rather than creating a SQL interface. The problem >> with a SQL interface is you can't interrogate it unless (1) you are not >> already inside a query and (2) the client is interactive and under your >> control. Something you can call easily from gdb is likely to be much >> more useful in practice. > > Since the shared memory segment isn't changing at runtime, I don't see > this as being a big problem. It could possibly be an issue for > dynamic shared memory segments, though. Patch has been reviewed some time ago, extra ideas as well as potential security risks discussed as well but no new version has been sent, hence marking it as returned with feedback. -- Michael
On 2014-09-19 23:07:07 -0500, Michael Paquier wrote: > On Mon, Aug 18, 2014 at 1:12 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > On Mon, Aug 18, 2014 at 1:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Robert Haas <robertmhaas@gmail.com> writes: > >>> I thought you were printing actual pointer addresses. If you're just > >>> printing offsets relative to wherever the segment happens to be > >>> mapped, I don't care about that. > >> > >> Well, that just means that it's not an *obvious* security risk. > >> > >> I still like the idea of providing something comparable to > >> MemoryContextStats, rather than creating a SQL interface. The problem > >> with a SQL interface is you can't interrogate it unless (1) you are not > >> already inside a query and (2) the client is interactive and under your > >> control. Something you can call easily from gdb is likely to be much > >> more useful in practice. > > > > Since the shared memory segment isn't changing at runtime, I don't see > > this as being a big problem. It could possibly be an issue for > > dynamic shared memory segments, though. > Patch has been reviewed some time ago, extra ideas as well as > potential security risks discussed as well but no new version has been > sent, hence marking it as returned with feedback. Here's a rebased version. I remember why I didn't call the column "offset" (as Michael complained about upthread), it's a keyword... Regards, Andres
Attachment
On Fri, May 6, 2016 at 8:01 AM, Andres Freund <andres@anarazel.de> wrote: > Here's a rebased version. I remember why I didn't call the column > "offset" (as Michael complained about upthread), it's a keyword... Oh, an old patch resurrected from the dead... Reading again the patch + * Increase number of initilized slots if we didn't reuse a previously + * used one. s/initilized/initialized + Number of backends attached to this segment (1 signals a moribund + entry, 2 signals one user, ...). moribund? Or target for removal. REVOKE ALL .. FROM PUBLIC; REVOKE EXECUTE .. ON FUNCTION FROM PUBLIC; Revoking he execution of those views and their underlying functions would be a good thing I think, this can give hints on the system activity, particularly for DSM segments. -- Michael
On Thu, May 5, 2016 at 7:01 PM Andres Freund <andres@anarazel.de> wrote: > Here's a rebased version. I remember why I didn't call the column > "offset" (as Michael complained about upthread), it's a keyword... This never got applied, and that annoyed me again today, so here's a new version that I've whacked around somewhat and propose to commit. I ripped out the stuff pertaining to dynamic shared memory segments, both because I think it might need some more thought and discussion, and because the part the pertains to the main shared memory segment is the part I keep wishing we had. We can add that other part later if we're all agreed on it, but let's go ahead and add this part now. Other things I changed: - Doc edits. - Added REVOKE statements as proposed by Michael (and I agree). - Can't patch pg_proc.h any more, gotta patch pg_proc.dat. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
Robert Haas <robertmhaas@gmail.com> writes: > This never got applied, and that annoyed me again today, so here's a > new version that I've whacked around somewhat and propose to commit. > ... > Other things I changed: > - Doc edits. > - Added REVOKE statements as proposed by Michael (and I agree). > - Can't patch pg_proc.h any more, gotta patch pg_proc.dat. If we're disallowing public access to the view (which I agree on), doesn't that need to be mentioned in the docs? I think there's standard boilerplate we use for other such views. Also, there's an introductory section in catalogs.sgml that should have an entry for this view. Also, likely the function should be volatile not stable. I'm not sure that it makes any difference in the view's usage, but in principle the answers could change intraquery. I didn't really read the patch in any detail, but those things hopped out at me. regards, tom lane
Hi, On 2019-11-15 14:43:09 -0500, Robert Haas wrote: > On Thu, May 5, 2016 at 7:01 PM Andres Freund <andres@anarazel.de> wrote: > > Here's a rebased version. I remember why I didn't call the column > > "offset" (as Michael complained about upthread), it's a keyword... > > This never got applied, and that annoyed me again today, so here's a > new version that I've whacked around somewhat and propose to commit. I > ripped out the stuff pertaining to dynamic shared memory segments, > both because I think it might need some more thought and discussion, > and because the part the pertains to the main shared memory segment is > the part I keep wishing we had. We can add that other part later if > we're all agreed on it, but let's go ahead and add this part now. Oh, nice! Makes sense to me to split off the dsm part. Mildly related: I really wish we had a postmaster option that'd parse the config file, and exit after computing the amount of shared memory that'll be required. Would be really useful to reliably compute the number of huge pages needed. Right now one basically needs to start pg and parse error messages, to do so. > + <sect1 id="view-pg-shmem-allocations"> > + <title><structname>pg_shmem_allocations</structname></title> > + > + <indexterm zone="view-pg-shmem-allocations"> > + <primary>pg_shmem_allocations</primary> > + </indexterm> > + > + <para> > + The <structname>pg_shmem_allocations</structname> view shows allocations > + made from the server's main shared memory segment. This includes both > + memory allocated by <productname>postgres</productname> itself and memory > + allocated by extensions using the mechanisms detailed in > + <xref linkend="xfunc-shared-addin" />. > + </para> > + > + <para> > + Note that this view does not include memory allocated using the dynamic > + shared memory infrastructure. > + </para> Perhaps add the equivalent of <para> By default, the <structname>pg_config</structname> view can be read only by superusers. </para> now that you've added the revoke (with which I agree). > + <tbody> > + <row> > + <entry><structfield>name</structfield></entry> > + <entry><type>text</type></entry> > + <entry>The name of the shared memory allocation. NULL for unused memory > + and <anonymous> for anonymous allocations.</entry> > + </row> > + <row> > + <entry><structfield>off</structfield></entry> > + <entry><type>bigint</type></entry> > + <entry>The offset at which the allocation starts. NULL for anonymous > + allocations.</entry> > + </row> > + > + <row> > + <entry><structfield>size</structfield></entry> > + <entry><type>bigint</type></entry> > + <entry>Size of the allocation</entry> > + </row> > + </tbody> > + </tgroup> > + </table> Should we note that there can be only one entry for unused and anonymous memory? And that off will be NULL for anonymous memory? > + /* output all allocated entries */ > + while ((ent = (ShmemIndexEnt *) hash_seq_search(&hstat)) != NULL) > + { > + Datum values[PG_GET_SHMEM_SIZES_COLS]; > + bool nulls[PG_GET_SHMEM_SIZES_COLS]; It's very mildly odd to have three of these values/nulls arrays... > +# shared memory usage > +{ oid => '8613', > + descr => 'allocations from the main shared memory segment', > + proname => 'pg_get_shmem_allocations', 'prorows' => 10, 'proretset' => 't', > + provolatile => 's', 'prorettype' => 'record', 'proargtypes' => '', > + proallargtypes => '{text,int8,int8}', proargmodes => '{o,o,o}', > + proargnames => '{name,off,size}', > + prosrc => 'pg_get_shmem_allocations' }, Hm. I think the function is actually volatile, rather than stable? Queries can trigger shmem allocations internally, right? Greetings, Andres Freund
On Fri, Nov 15, 2019 at 11:59:34AM -0800, Andres Freund wrote: > On 2019-11-15 14:43:09 -0500, Robert Haas wrote: >> This never got applied, and that annoyed me again today, so here's a >> new version that I've whacked around somewhat and propose to commit. I >> ripped out the stuff pertaining to dynamic shared memory segments, >> both because I think it might need some more thought and discussion, >> and because the part the pertains to the main shared memory segment is >> the part I keep wishing we had. We can add that other part later if >> we're all agreed on it, but let's go ahead and add this part now. > > Oh, nice! Makes sense to me to split off the dsm part. last Friday we had a conference in Tokyo, and this has been actually mentioned when we had an after-dinner with a couple of other hackers. Then a couple of hours later this thread rises from the ashes. +/* SQL SRF showing allocated shared memory */ +Datum +pg_get_shmem_allocations(PG_FUNCTION_ARGS) This could be more talkative. >> +# shared memory usage >> +{ oid => '8613', >> + descr => 'allocations from the main shared memory segment', >> + proname => 'pg_get_shmem_allocations', 'prorows' => 10, 'proretset' => 't', >> + provolatile => 's', 'prorettype' => 'record', 'proargtypes' => '', >> + proallargtypes => '{text,int8,int8}', proargmodes => '{o,o,o}', >> + proargnames => '{name,off,size}', >> + prosrc => 'pg_get_shmem_allocations' }, > > Hm. I think the function is actually volatile, rather than stable? > Queries can trigger shmem allocations internally, right? +1. -- Michael
Attachment
Hi, On 2019-11-18 21:49:55 +0900, Michael Paquier wrote: > +/* SQL SRF showing allocated shared memory */ > +Datum > +pg_get_shmem_allocations(PG_FUNCTION_ARGS) > This could be more talkative. I don't really see what it'd say, except restate the function name as a sentence? I think that kind of comment has negative, not positive value. - Andres
On Fri, Nov 15, 2019 at 2:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I didn't really read the patch in any detail, but those things > hopped out at me. Thanks for the reviews. Here is a new version. Changes: - doc: Add an entry to the table of system views, per Tom. - doc: Wrap reference to "<anonymous>" in <literal> tags, per self-review. - doc: Note that off is NULL for <anonymous> allocations, per Andres. - doc: Remove extra "or" at the end of a sentence, per self-review. - doc: Note that the view is by default super-user only, per both Tom and Andres. - code: Declare values/nulls arrays only once at function scope instead of 3x, and tighten up code, per Andres and self-review. - dat: Mark as volatile, per Tom and Andres and Michael. Non-changes: - I didn't add a comment saying that there would be only one entry for anonymous and free memory, as requested by Andres, because it seemed like that's what users would expect anyway and I couldn't think of a way of saying it that didn't sound dumb. - I didn't expand the comment for the C function pg_get_shmem_allocations, as requested by Michael, because I agree with Andres that there doesn't seem to be anything particularly helpful to say there. I'll go ahead and commit this if nobody has further comments. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
On 2019-Dec-18, Robert Haas wrote: > - code: Declare values/nulls arrays only once at function scope > instead of 3x, and tighten up code, per Andres and self-review. Really small nit: I'd rather have the "nulls" assignment in all cases even when the previous value is still valid. This tight coding seems weirdly asymmetrical. Can we please stop splitting this error message in two? + errmsg("materialize mode required, but it is not " \ + "allowed in this context"))); (What's with the newline escape there anyway?) Shmem structs are cacheline-aligned; the padding space is not AFAICS considered in ShmemIndexEnt->size. The reported size would be under-reported (or reported as "anonymous"?) I think we should include the alignment padding, for clarity. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Dec 18, 2019 at 10:59 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > On 2019-Dec-18, Robert Haas wrote: > > - code: Declare values/nulls arrays only once at function scope > > instead of 3x, and tighten up code, per Andres and self-review. > > Really small nit: I'd rather have the "nulls" assignment in all cases > even when the previous value is still valid. This tight coding seems > weirdly asymmetrical. I agree that the asymmetry of it is a bit bothersome, but I think having extra code setting things to null is worse. It makes the code significantly bigger, which to me is more problematic than the asymmetry. > Can we please stop splitting this error message in two? > > + errmsg("materialize mode required, but it is not " \ > + "allowed in this context"))); > > (What's with the newline escape there anyway?) That message is like that everywhere in the tree, including the escape, except for a couple of instances in contrib which deviate. If you want to go change them all, feel free, and I'll adjust this to match the then-prevailing style. > Shmem structs are cacheline-aligned; the padding space is not AFAICS > considered in ShmemIndexEnt->size. The reported size would be > under-reported (or reported as "anonymous"?) I think we should include > the alignment padding, for clarity. It seems to me that you could plausibly define this view to show either (a) the amount of space that the caller actually tried to allocate or (b) the amount of space that the allocator decided to allocate, after padding, and it's not obvious that (b) is a better definition than (a). That having been said, you're correct that the padding space is currently reported as <anonymous>, and that does seem wrong. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Dec 18, 2019 at 10:59 AM Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: >> Can we please stop splitting this error message in two? >> >> + errmsg("materialize mode required, but it is not " \ >> + "allowed in this context"))); >> >> (What's with the newline escape there anyway?) > That message is like that everywhere in the tree, including the > escape, except for a couple of instances in contrib which deviate. If > you want to go change them all, feel free, and I'll adjust this to > match the then-prevailing style. I agree with Alvaro that that is *not* project style, particularly not the newline escape. Like Robert, I'm not quite fussed enough to go change it, but +1 if Alvaro wants to. > It seems to me that you could plausibly define this view to show > either (a) the amount of space that the caller actually tried to > allocate or (b) the amount of space that the allocator decided to > allocate, after padding, and it's not obvious that (b) is a better > definition than (a). > That having been said, you're correct that the padding space is > currently reported as <anonymous>, and that does seem wrong. It seems like it'd be worth subdividing "<anonymous>" into the actual anonymous allocations and the allocator overhead (which is both padding and whatever the shmem allocator itself eats). Maybe call the latter "<overhead>". After which, I'd be tempted to call the free space "<free>" rather than giving it a null name. regards, tom lane
On Wed, Dec 18, 2019 at 12:06 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > It seems like it'd be worth subdividing "<anonymous>" into the actual > anonymous allocations and the allocator overhead (which is both > padding and whatever the shmem allocator itself eats). Maybe call > the latter "<overhead>". After which, I'd be tempted to call the > free space "<free>" rather than giving it a null name. Here's a new version that takes a slightly different approach: it adds an additional column to the view for "allocated_size," so that you can see both what was requested and what you actually got. With this approach, you can do interesting things like: select *, off - lag(off + allocated_size, 1) over () as hole_size from (select * from pg_shmem_allocations order by 2) x; ...which didn't work very well with the previous version. All of this makes me think that we might want to do some followup to (1) convert anonymous allocations to non-anonymous allocations, for inspectability, and (2) do some renaming to get better stylistic agreement between the names of various shared memory chunks. But I think that work is properly done after this patch is committed, not before. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
At Wed, 18 Dec 2019 12:30:51 -0500, Robert Haas <robertmhaas@gmail.com> wrote in > On Wed, Dec 18, 2019 at 12:06 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > It seems like it'd be worth subdividing "<anonymous>" into the actual > > anonymous allocations and the allocator overhead (which is both > > padding and whatever the shmem allocator itself eats). Maybe call > > the latter "<overhead>". After which, I'd be tempted to call the > > free space "<free>" rather than giving it a null name. > > Here's a new version that takes a slightly different approach: it adds > an additional column to the view for "allocated_size," so that you can > see both what was requested and what you actually got. With this The doc is saying that "size" is "Size of the allocation" and "allocated_size" is "size including padding". It seems somewhat confusing to me. I'm not sure what wording is best but I think people use net/gross wordings to describe like that. > approach, you can do interesting things like: > > select *, off - lag(off + allocated_size, 1) over () as hole_size from > (select * from pg_shmem_allocations order by 2) x; > > ...which didn't work very well with the previous version. > All of this makes me think that we might want to do some followup to > (1) convert anonymous allocations to non-anonymous allocations, for > inspectability, and (2) do some renaming to get better stylistic > agreement between the names of various shared memory chunks. But I > think that work is properly done after this patch is committed, not > before. I agree to (2), but regarding (1), most or perhaps all of the anonymous allocations seems belonging to one of the shared hashes but are recognized as holes shown by the above statement. So the current output of the view is wrong in that sense. I think (1) should be resolved before adding the view. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Wed, Dec 18, 2019 at 9:53 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > The doc is saying that "size" is "Size of the allocation" and > "allocated_size" is "size including padding". It seems somewhat > confusing to me. I'm not sure what wording is best but I think people > use net/gross wordings to describe like that. I don't think I'd find that particularly clear. It seems to me that if the second size includes padding, then the first one differs in not including padding, so I'm not really sure where the confusion is. I thought about writing, for the first one, that is the requested size of the allocation, but that seemed like it might confuse someone by suggesting that the actual size of the allocation might be less than what was requested. I also thought about describing the first one as the size excluding padding, but that seems redundant. Maybe it would be good to change the second one to say "Size of the allocation including padding added by the allocator itself." > > All of this makes me think that we might want to do some followup to > > (1) convert anonymous allocations to non-anonymous allocations, for > > inspectability, and (2) do some renaming to get better stylistic > > agreement between the names of various shared memory chunks. But I > > think that work is properly done after this patch is committed, not > > before. > > I agree to (2), but regarding (1), most or perhaps all of the > anonymous allocations seems belonging to one of the shared hashes but > are recognized as holes shown by the above statement. So the current > output of the view is wrong in that sense. I think (1) should be > resolved before adding the view. I guess I don't understand how this makes the output wrong. Either the allocations have a name, or they are anonymous. This dumps everything that has a name. What would you suggest? It seems to me that it's more appropriate for this patch to just tell us about what's in shared memory, not change what's in shared memory. If we want to do the latter, that's a job for another patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
At Mon, 30 Dec 2019 13:52:50 -0500, Robert Haas <robertmhaas@gmail.com> wrote in > On Wed, Dec 18, 2019 at 9:53 PM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: > > The doc is saying that "size" is "Size of the allocation" and > > "allocated_size" is "size including padding". It seems somewhat > > confusing to me. I'm not sure what wording is best but I think people > > use net/gross wordings to describe like that. > > I don't think I'd find that particularly clear. It seems to me that if > the second size includes padding, then the first one differs in not > including padding, so I'm not really sure where the confusion is. I > thought about writing, for the first one, that is the requested size > of the allocation, but that seemed like it might confuse someone by > suggesting that the actual size of the allocation might be less than > what was requested. I also thought about describing the first one as > the size excluding padding, but that seems redundant. Maybe it would > be good to change the second one to say "Size of the allocation > including padding added by the allocator itself." Ugh.. Thanks for the explanation and I'm convinced that the current wording is the best. > > > All of this makes me think that we might want to do some followup to > > > (1) convert anonymous allocations to non-anonymous allocations, for > > > inspectability, and (2) do some renaming to get better stylistic > > > agreement between the names of various shared memory chunks. But I > > > think that work is properly done after this patch is committed, not > > > before. > > > > I agree to (2), but regarding (1), most or perhaps all of the > > anonymous allocations seems belonging to one of the shared hashes but > > are recognized as holes shown by the above statement. So the current > > output of the view is wrong in that sense. I think (1) should be > > resolved before adding the view. > > I guess I don't understand how this makes the output wrong. Either the > allocations have a name, or they are anonymous. This dumps everything > that has a name. What would you suggest? It seems to me that it's more > appropriate for this patch to just tell us about what's in shared > memory, not change what's in shared memory. If we want to do the > latter, that's a job for another patch. Sorry for the strange sentense. "I agree to (2)" meant that "I agree that (2) is done in later patch". About (1), I perplexed by the word "hole", which seemed to me as a region that is not allocated to anything. But no columns with the name actually is not in the view, so the view is not wrong in the first place. I agree to the patch as-is. Thanks for the explanatin. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Wed, Jan 8, 2020 at 2:30 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > I agree to the patch as-is. Thanks for the explanatin. OK. Thanks for the review, and for your thoughtful consideration of my comments. I went ahead and committed this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi, On 2020-01-09 11:13:46 -0500, Robert Haas wrote: > On Wed, Jan 8, 2020 at 2:30 AM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: > > I agree to the patch as-is. Thanks for the explanatin. > > OK. Thanks for the review, and for your thoughtful consideration of my comments. > > I went ahead and committed this. Wee! Greetings, Andres Freund