Thread: Fix for pg_stat_activity putting client hostaddr into appname field

Fix for pg_stat_activity putting client hostaddr into appname field

From
Edmund Horner
Date:
I noticed when querying pg_stat_activity (in 10.1):

$ SELECT pid, application_name, client_hostname, backend_type FROM
pg_stat_activity;

  pid  |     application_name     |    client_hostname    |         backend_type
-------+--------------------------+-----------------------+------------------------------
 10207 |                          |                       | autovacuum launcher
 10209 |                          |                       | logical
replication launcher
 10211 | psql                     | localhost             | client backend
 10212 | DBeaver 4.3.4 - Main     | ms006593.met.co.nz    | client backend
 10213 | DBeaver 4.3.4 - Metadata | ms006593.met.co.nz    | client backend
 10219 | psql                     | at-ice-db01.met.co.nz | client backend
 10205 | ms006593.met.co.nz       |                       | background writer
 10204 | ms006593.met.co.nz       |                       | checkpointer
 10206 | at-ice-db01.met.co.nz    |                       | walwriter

I've tracked this down to bootstrap/pgstat.c.

We create shared memory segments BackendAppnameBuffer,
BackendClientHostnameBuffer, and BackendActivityBufferSize with enough
room for MaxBackends strings each.  In each PgBackendStatus item, we
point st_appname, st_clienthostname, and st_activity_raw to the
appropriate offset within this blocks.

But the stats array includes auxiliary processes, which means it has
NumBackendStatSlots items.  The pointers for the aux process strings
are out of range.  In the case of my query, the pointers for
st_appname in the aux processes happen to point into the
BackendClientHostnameBuffer segment.

This patch uses NumBackendStatSlots for the sizes of those segments,
rather than MaxBackends.

I considered whether aux processes really need those strings
(especially st_clienthostname), but decided it was more consistent
just to assume that they might.  (It's an extra 3 kB... if we want to
save that we can put a bunch of if statements in pgstat_bestart and
other places.)

The patch also allocates local memory for st_clienthostname in
pgstat_read_current_status.  These strings shouldn't change very
often, but it seems safer and more consistent to treat them as we
treat the other two string fields.  Without the string copy, I think a
client disconnect and be replaced after the stats have been copied,
resulting in the new hostname appearing alongside the copied stats
fields from the old connection.

Cheers,
Edmund

Attachment

Re: Fix for pg_stat_activity putting client hostaddr into appname field

From
Edmund Horner
Date:
I sent the original in haste, and now I need to make some corrections... sigh.

> Subject: Fix for pg_stat_activity putting client hostaddr into appname field

Actually, it's the hostname appears in the appname field.

> I noticed when querying pg_stat_activity (in 10.1):

10.1 was where I first noticed the bug, but it's still present in master.

> I've tracked this down to bootstrap/pgstat.c.

Should be postmaster/pgstat.c.

> In the case of my query, the pointers for st_appname in the aux processes happen to point into the
BackendClientHostnameBuffersegment.
 

To be clear, I think this is a memory error.  These rogue pointers
could do a lot more damage than merely pointing to the wrong strings.

> It's an extra 3 kB ...

A rough estimate, that was also wrong.  7 aux processes * (1024 bytes
activity + 64 for hostname + 64 for appname) = about 8 kB.

I do apologise for the confusion!

Edmund


Re: Fix for pg_stat_activity putting client hostaddr into appnamefield

From
Michael Paquier
Date:
On Tue, Mar 27, 2018 at 03:47:07PM +1300, Edmund Horner wrote:
> But the stats array includes auxiliary processes, which means it has
> NumBackendStatSlots items.  The pointers for the aux process strings
> are out of range.  In the case of my query, the pointers for
> st_appname in the aux processes happen to point into the
> BackendClientHostnameBuffer segment.
>
> This patch uses NumBackendStatSlots for the sizes of those segments,
> rather than MaxBackends.

I am adding in CC Robert and Kuntal who worked on that (issue added as
well to the older bug section in v11 open item list).

I have read through your patch and it seems to me that you are right
here.  The error comes from the original commit fc70a4b0, which has
added auxiliary processes to pg_stat_activity.  Even
BackendStatusShmemSize uses NumBackendStatSlots for the calculation of
the array's lengths, but CreateSharedBackendStatus still mixes it with
NumBackends.

> I considered whether aux processes really need those strings
> (especially st_clienthostname), but decided it was more consistent
> just to assume that they might.  (It's an extra 3 kB... if we want to
> save that we can put a bunch of if statements in pgstat_bestart and
> other places.)

BackendStatusShmemSize has been originally changed to use
NumBackendStatSlots, so the intention is visibly to be consistent in the
number of backends used, even if this means consuming a bit more memory,
so the idea was to focus on consistency and simplicity.

> The patch also allocates local memory for st_clienthostname in
> pgstat_read_current_status.  These strings shouldn't change very
> often, but it seems safer and more consistent to treat them as we
> treat the other two string fields.  Without the string copy, I think a
> client disconnect and be replaced after the stats have been copied,
> resulting in the new hostname appearing alongside the copied stats
> fields from the old connection.

Agreed on that as well.
--
Michael

Attachment

Re: Fix for pg_stat_activity putting client hostaddr into appname field

From
Edmund Horner
Date:
On 29 March 2018 at 20:46, Michael Paquier <michael@paquier.xyz> wrote:
> On Tue, Mar 27, 2018 at 03:47:07PM +1300, Edmund Horner wrote:
>> But the stats array includes auxiliary processes, which means it has
>> NumBackendStatSlots items.  The pointers for the aux process strings
>> are out of range.  In the case of my query, the pointers for
>> st_appname in the aux processes happen to point into the
>> BackendClientHostnameBuffer segment.
>>
>> This patch uses NumBackendStatSlots for the sizes of those segments,
>> rather than MaxBackends.
>
> I am adding in CC Robert and Kuntal who worked on that (issue added as
> well to the older bug section in v11 open item list).

Thanks for making a note of it, Michael.  I guess it should be
considered for the next patch release of v10 too?


Re: Fix for pg_stat_activity putting client hostaddr into appnamefield

From
Michael Paquier
Date:
On Fri, Apr 06, 2018 at 10:46:27AM +1200, Edmund Horner wrote:
> Thanks for making a note of it, Michael.  I guess it should be
> considered for the next patch release of v10 too?

The issue is tracked on the wiki, so at least we are sure that we won't
lose sight of it :)

After that, the timing a patch is applied will depend on the time other
folks involved with this code will be able to comment on the patch and
this thread, which may take some time as everybody is busy with closing
the last commit fest for the development cycle of v11.

As far as I can see, this patch should be applied to v10, so my review
may help in pushing forward with it more quickly.  There is also a
couple of weeks ahead until the next minor release which is on the 10th
of May, so there is some room ahead:
https://www.postgresql.org/developer/roadmap/
--
Michael

Attachment

Re: Fix for pg_stat_activity putting client hostaddr into appnamefield

From
Heikki Linnakangas
Date:
On 29/03/18 10:46, Michael Paquier wrote:
> On Tue, Mar 27, 2018 at 03:47:07PM +1300, Edmund Horner wrote:
>> I considered whether aux processes really need those strings
>> (especially st_clienthostname), but decided it was more consistent
>> just to assume that they might.  (It's an extra 3 kB... if we want to
>> save that we can put a bunch of if statements in pgstat_bestart and
>> other places.)
> 
> BackendStatusShmemSize has been originally changed to use
> NumBackendStatSlots, so the intention is visibly to be consistent in the
> number of backends used, even if this means consuming a bit more memory,
> so the idea was to focus on consistency and simplicity.

Yep, agreed.

I've committed this, and backpatched to v10.

>> The patch also allocates local memory for st_clienthostname in
>> pgstat_read_current_status.  These strings shouldn't change very
>> often, but it seems safer and more consistent to treat them as we
>> treat the other two string fields.  Without the string copy, I think a
>> client disconnect and be replaced after the stats have been copied,
>> resulting in the new hostname appearing alongside the copied stats
>> fields from the old connection.
> 
> Agreed on that as well.

Yeah, that's also a bug. I pushed your fix for that as a separate 
commit, and backpatched to all supported versions.

Thanks for the debugging and the patch, Edmund!

- Heikki


Re: Fix for pg_stat_activity putting client hostaddr into appnamefield

From
Michael Paquier
Date:
On Thu, Apr 12, 2018 at 12:01:29AM +0300, Heikki Linnakangas wrote:
> Thanks for the debugging and the patch, Edmund!

Thanks for the commit, Heikki.
--
Michael

Attachment