Re: BUG #10123: Weird entries in pg_stat_activity - Mailing list pgsql-bugs

From Maxim Boguk
Subject Re: BUG #10123: Weird entries in pg_stat_activity
Date
Msg-id CAK-MWwSVCAX9m4S4C948gZ-DfKFLpho=fEoq1AJ8F17tzCTvmg@mail.gmail.com
Whole thread Raw
In response to Re: BUG #10123: Weird entries in pg_stat_activity  (Maxim Boguk <maxim.boguk@gmail.com>)
List pgsql-bugs
On Fri, Apr 25, 2014 at 11:04 AM, Maxim Boguk <maxim.boguk@gmail.com> wrote:

>
>
>
> On Fri, Apr 25, 2014 at 4:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> maxim.boguk@gmail.com writes:
>> > During debug of the weird nagios database alerts about long
>> transactions I
>> > found that sometime pg_stat_activity contains very weird entries with
>> > xact_start NOT NULL and state='idle'...
>> > ...
>> > What's more: i tested that problem on another databases - and found they
>> > have same effect (so it doesn't look like a problem local to that single
>> > exact database).
>>
>> So can you provide a self-contained test case for this?  I tried to
>> reproduce it here without much success, and I don't see a code path
>> other than PREPARE TRANSACTION that could lead to this.
>>
>>                         regards, tom lane
>>
>
> I going to work on test case over weekend.
> Initial quick and dirty attempts to reproduce outside of complicated
> production environments provided no results yet.
>
>  What I almost sure now that reproducing of the bug requires a mix of very
> active connections to the database (with a lot of requests per second) with
> semi-idle connections.
> And that transient problem affects only these semi-idle connections.
>
> I will provide an update when I get more information.
>
> Kind Regards,
> Maksym
>


Unfortunately even if problem can be seen in at least 10 different project
and db-servers, creating reproducible test case had not been successful.
So I decided work with gdb on the real database.

At this moment I dig only to the pgstat.c :
pgstat_read_current_status(void) function.

gdb:

set print pretty on
b pgstat.c:2796 if ((localentry->st_procpid>0) &&
(localentry->st_xact_start_timestamp>0) &&
(localentry->st_state==STATE_IDLE))
cont

2796 line in 9.3.4 source is pgstat_read_current_status function just after
                memcpy(localentry, (char *) beentry,
sizeof(PgBackendStatus));

After short time running:
select pg_sleep(0.01); select now() as ts,now()-xact_start as tx_age,
now()-state_change as change_age,* from pg_stat_activity where state='idle'
and xact_start is not null;
in loop in process under gdb I got conditional breakpoint fired with the
next results:

(gdb) cont
Continuing.

Breakpoint 1, pgstat_read_current_status () at
/tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/postmaster/pgstat.c:2796
2796                                    strcpy(localappname, (char *)
beentry->st_appname);
(gdb) p *localentry
$1 = {
  st_changecount = 1222944,
  st_procpid = 16215,
  st_proc_start_timestamp = 454324143169490,
  st_xact_start_timestamp = 454324206165363,
  st_activity_start_timestamp = 454324206165363,
  st_state_start_timestamp = 454324206165650,
  st_databaseid = 16423,
  st_userid = 25078444,
  st_clientaddr = {
    addr = {
      ss_family = 2,
      __ss_align = 0,
      __ss_padding = '\000' <repeats 111 times>
    },
    salen = 16
  },
  st_clienthostname = 0x7fac4344fbe0 "",
  st_waiting = 0 '\000',
  st_state = STATE_IDLE,
  st_appname = 0x7fac624d8ff0 "",
  st_activity = 0x7fac624e4bb0 "SELECT
\"last_host_hour_accumulation_flag\".\"id\",
\"last_host_hour_accumulation_flag\".\"day\",
\"last_host_hour_accumulation_flag\".\"hour\",
\"last_host_hour_accumulation_flag\".\"type_id\", \"last_host_hour_accum"...
}

Please note that the:
  st_xact_start_timestamp = 454324206165363,
  st_activity_start_timestamp = 454324206165363,

and in the same time:
st_state = STATE_IDLE

On the next iteration after 0.01 second   st_proc_start_timestamp and
st_activity_start_timestamp returned to 0 again.

Now I have no good idea how to perform the future debug because
BackendStatusArray is shared memory structure with every backend writing
information into it (so I can't set a conditional breakpoint in specific
process)
Ideas welcome.

Kind Regards,
Maksym

pgsql-bugs by date:

Previous
From: "Paragon Corporation"
Date:
Subject: Re: BUG #10436: Can't compile 9.4 with this mingw64-w32 gcc 4.8.1, 4.8.2, 4.8.3
Next
From: Tom Lane
Date:
Subject: Re: BUG #10436: Can't compile 9.4 with this mingw64-w32 gcc 4.8.1, 4.8.2, 4.8.3