Thread: weird long time query

weird long time query

From
Kaijiang Chen
Date:
I'm using postgres 9.4.17 on centos 7.
I check the running queries with the following SQL:
SELECT
    procpid,  
    start,  
    now() - start AS lap,  
    current_query  
FROM
    (SELECT
        backendid,  
        pg_stat_get_backend_pid(S.backendid) AS procpid,  
        pg_stat_get_backend_activity_start(S.backendid) AS start,  
        pg_stat_get_backend_activity(S.backendid) AS current_query  
    FROM
        (SELECT pg_stat_get_backend_idset() AS backendid) AS S  
    ) AS S  
WHERE
    current_query <> '<IDLE>'
ORDER BY
    lap DESC;

Then, I found a SQL that has run for some days (and still running):
procpid       | 32638
start         | 2019-11-25 16:29:29.529318+08
lap           | 21 days 18:24:54.707369
current_query | DEALLOCATE pdo_stmt_00000388

I tried to kill it with: SELECT pg_cancel_backend(32638) but it takes no effects.

What's this query and what shall I do for it?

Best Wishes
Kaijiang

Re: weird long time query

From
Pavel Stehule
Date:


út 17. 12. 2019 v 11:45 odesílatel Kaijiang Chen <chenkaijiang@gmail.com> napsal:
I'm using postgres 9.4.17 on centos 7.
I check the running queries with the following SQL:
SELECT
    procpid,  
    start,  
    now() - start AS lap,  
    current_query  
FROM
    (SELECT
        backendid,  
        pg_stat_get_backend_pid(S.backendid) AS procpid,  
        pg_stat_get_backend_activity_start(S.backendid) AS start,  
        pg_stat_get_backend_activity(S.backendid) AS current_query  
    FROM
        (SELECT pg_stat_get_backend_idset() AS backendid) AS S  
    ) AS S  
WHERE
    current_query <> '<IDLE>'
ORDER BY
    lap DESC;

I think so this query is weird - probably this query was finished

you should to use constraint

WHERE state <> 'idle';

Regards

Pavel


Then, I found a SQL that has run for some days (and still running):
procpid       | 32638
start         | 2019-11-25 16:29:29.529318+08
lap           | 21 days 18:24:54.707369
current_query | DEALLOCATE pdo_stmt_00000388

I tried to kill it with: SELECT pg_cancel_backend(32638) but it takes no effects.

What's this query and what shall I do for it?

Best Wishes
Kaijiang

Re: weird long time query

From
Tom Lane
Date:
Kaijiang Chen <chenkaijiang@gmail.com> writes:
> I'm using postgres 9.4.17 on centos 7.
> I check the running queries with the following SQL:
> SELECT
>     procpid,
>     start,
>     now() - start AS lap,
>     current_query
> FROM
>     (SELECT
>         backendid,
>         pg_stat_get_backend_pid(S.backendid) AS procpid,
>         pg_stat_get_backend_activity_start(S.backendid) AS start,
>         pg_stat_get_backend_activity(S.backendid) AS current_query
>     FROM
>         (SELECT pg_stat_get_backend_idset() AS backendid) AS S
>     ) AS S
> WHERE
>     current_query <> '<IDLE>'
> ORDER BY
>     lap DESC;

Don't know where you got this query from, but it's wrong for any PG
version more recent than (I think) 9.1.  We don't use "<IDLE>" as an
indicator of idle sessions anymore; rather, those can be identified
by having state = 'idle'.  What's in the query column for such a session
is its last query.

> Then, I found a SQL that has run for some days (and still running):
> procpid       | 32638
> start         | 2019-11-25 16:29:29.529318+08
> lap           | 21 days 18:24:54.707369
> current_query | DEALLOCATE pdo_stmt_00000388

It's not running.  That was the last query it ran, back in November :-(
You could zap the session with pg_terminate_backend(), but
pg_cancel_backend() is not going to have any effect because there's
no active query.

            regards, tom lane



Re: weird long time query

From
Kaijiang Chen
Date:
I think I should also report it as a bug since logically, it couldn't exist.

On Wed, Dec 18, 2019 at 1:04 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Kaijiang Chen <chenkaijiang@gmail.com> writes:
> I'm using postgres 9.4.17 on centos 7.
> I check the running queries with the following SQL:
> SELECT
>     procpid,
>     start,
>     now() - start AS lap,
>     current_query
> FROM
>     (SELECT
>         backendid,
>         pg_stat_get_backend_pid(S.backendid) AS procpid,
>         pg_stat_get_backend_activity_start(S.backendid) AS start,
>         pg_stat_get_backend_activity(S.backendid) AS current_query
>     FROM
>         (SELECT pg_stat_get_backend_idset() AS backendid) AS S
>     ) AS S
> WHERE
>     current_query <> '<IDLE>'
> ORDER BY
>     lap DESC;

Don't know where you got this query from, but it's wrong for any PG
version more recent than (I think) 9.1.  We don't use "<IDLE>" as an
indicator of idle sessions anymore; rather, those can be identified
by having state = 'idle'.  What's in the query column for such a session
is its last query.

> Then, I found a SQL that has run for some days (and still running):
> procpid       | 32638
> start         | 2019-11-25 16:29:29.529318+08
> lap           | 21 days 18:24:54.707369
> current_query | DEALLOCATE pdo_stmt_00000388

It's not running.  That was the last query it ran, back in November :-(
You could zap the session with pg_terminate_backend(), but
pg_cancel_backend() is not going to have any effect because there's
no active query.

                        regards, tom lane

How to prevent POSTGRES killing linux system from accepting too much inserts?

From
"James(王旭)"
Date:
Hello,
I encountered into this kernel message, and I cannot login into the Linux system anymore:


Dec 17 23:01:50 hq-pg kernel: sh (6563): drop_caches: 1
Dec 17 23:02:30 hq-pg kernel: INFO: task sync:6573 blocked for more than 120 seconds.
Dec 17 23:02:30 hq-pg kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 17 23:02:30 hq-pg kernel: sync            D ffff965ebabd1040     0  6573   6572 0x00000080
Dec 17 23:02:30 hq-pg kernel: Call Trace:
Dec 17 23:02:30 hq-pg kernel: [<ffffffffa48760a0>] ? generic_write_sync+0x70/0x70

After some google I guess it's the problem that IO speed is low, while the insert requests are coming too much quickly.So PG put these into cache first then kernel called sync.
I know I can queue the requests, so that POSTGRES will not accept these requests which will result in an increase in system cache.
But is there any way I can tell POSTGRES, that you can only handle 20000 records per second, or 4M per second, please don't accept inserts more than that speed.
For me, POSTGRES just waiting is much better than current behavior.

Any help will be much appreciated.


Thanks,
James

How to prevent POSTGRES killing linux system from accepting too much inserts?

From
"James(王旭)"
Date:
Hello,
I encountered into this kernel message, and I cannot login into the Linux system anymore:


Dec 17 23:01:50 hq-pg kernel: sh (6563): drop_caches: 1
Dec 17 23:02:30 hq-pg kernel: INFO: task sync:6573 blocked for more than 120 seconds.
Dec 17 23:02:30 hq-pg kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 17 23:02:30 hq-pg kernel: sync            D ffff965ebabd1040     0  6573   6572 0x00000080
Dec 17 23:02:30 hq-pg kernel: Call Trace:
Dec 17 23:02:30 hq-pg kernel: [<ffffffffa48760a0>] ? generic_write_sync+0x70/0x70

After some google I guess it's the problem that IO speed is low, while the insert requests are coming too much quickly.So PG put these into cache first then kernel called sync.
I know I can queue the requests, so that POSTGRES will not accept these requests which will result in an increase in system cache.
But is there any way I can tell POSTGRES, that you can only handle 20000 records per second, or 4M per second, please don't accept inserts more than that speed.
For me, POSTGRES just waiting is much better than current behavior.

Any help will be much appreciated.


Thanks,
James

Re: How to prevent POSTGRES killing linux system from accepting toomuch inserts?

From
Merlin Moncure
Date:
On Wed, Dec 18, 2019 at 3:53 AM James(王旭) <wangxu@gu360.com> wrote:
>
> Hello,
>>
>> I encountered into this kernel message, and I cannot login into the Linux system anymore:
>>
>>
>>
>>> Dec 17 23:01:50 hq-pg kernel: sh (6563): drop_caches: 1
>>>
>>> Dec 17 23:02:30 hq-pg kernel: INFO: task sync:6573 blocked for more than 120 seconds.
>>>
>>> Dec 17 23:02:30 hq-pg kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>
>>> Dec 17 23:02:30 hq-pg kernel: sync            D ffff965ebabd1040     0  6573   6572 0x00000080
>>>
>>> Dec 17 23:02:30 hq-pg kernel: Call Trace:
>>>
>>> Dec 17 23:02:30 hq-pg kernel: [<ffffffffa48760a0>] ? generic_write_sync+0x70/0x70
>>
>>
>> After some google I guess it's the problem that IO speed is low, while the insert requests are coming too much
quickly.SoPG put these into cache first then kernel called sync. 
>>
>> I know I can queue the requests, so that POSTGRES will not accept these requests which will result in an increase in
systemcache. 
>>
>> But is there any way I can tell POSTGRES, that you can only handle 20000 records per second, or 4M per second,
pleasedon't accept inserts more than that speed. 
>>
>> For me, POSTGRES just waiting is much better than current behavior.
>>
>>
>> Any help will be much appreciated.

This is more a problem with the o/s than with postgres itself.

synchronous_commit is one influential parameter that can possibly help
mitigate the issue with some safety tradeoffs (read the docs).   For
linux, one possible place to look is tuning dirty_background_ratio and
related parameters.  The idea is you want the o/s to be more
aggressive about syncing to reduce the impact of i/o storm; basically
you are trading off some burst performance for consistency of
performance.  Another place to look is checkpoint behavior.   Do some
searches, there is tons of information about this on the net.

merlin



Re: How to prevent POSTGRES killing linux system from accepting toomuch inserts?

From
Merlin Moncure
Date:
On Wed, Dec 18, 2019 at 3:53 AM James(王旭) <wangxu@gu360.com> wrote:
>
> Hello,
>>
>> I encountered into this kernel message, and I cannot login into the Linux system anymore:
>>
>>
>>
>>> Dec 17 23:01:50 hq-pg kernel: sh (6563): drop_caches: 1
>>>
>>> Dec 17 23:02:30 hq-pg kernel: INFO: task sync:6573 blocked for more than 120 seconds.
>>>
>>> Dec 17 23:02:30 hq-pg kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>
>>> Dec 17 23:02:30 hq-pg kernel: sync            D ffff965ebabd1040     0  6573   6572 0x00000080
>>>
>>> Dec 17 23:02:30 hq-pg kernel: Call Trace:
>>>
>>> Dec 17 23:02:30 hq-pg kernel: [<ffffffffa48760a0>] ? generic_write_sync+0x70/0x70
>>
>>
>> After some google I guess it's the problem that IO speed is low, while the insert requests are coming too much
quickly.SoPG put these into cache first then kernel called sync. 
>>
>> I know I can queue the requests, so that POSTGRES will not accept these requests which will result in an increase in
systemcache. 
>>
>> But is there any way I can tell POSTGRES, that you can only handle 20000 records per second, or 4M per second,
pleasedon't accept inserts more than that speed. 
>>
>> For me, POSTGRES just waiting is much better than current behavior.
>>
>>
>> Any help will be much appreciated.

This is more a problem with the o/s than with postgres itself.

synchronous_commit is one influential parameter that can possibly help
mitigate the issue with some safety tradeoffs (read the docs).   For
linux, one possible place to look is tuning dirty_background_ratio and
related parameters.  The idea is you want the o/s to be more
aggressive about syncing to reduce the impact of i/o storm; basically
you are trading off some burst performance for consistency of
performance.  Another place to look is checkpoint behavior.   Do some
searches, there is tons of information about this on the net.

merlin



On Wed, Dec 18, 2019 at 4:53 AM James(王旭) <wangxu@gu360.com> wrote:
Hello,
I encountered into this kernel message, and I cannot login into the Linux system anymore:


Dec 17 23:01:50 hq-pg kernel: sh (6563): drop_caches: 1
Dec 17 23:02:30 hq-pg kernel: INFO: task sync:6573 blocked for more than 120 seconds.
Dec 17 23:02:30 hq-pg kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 17 23:02:30 hq-pg kernel: sync            D ffff965ebabd1040     0  6573   6572 0x00000080
Dec 17 23:02:30 hq-pg kernel: Call Trace:
Dec 17 23:02:30 hq-pg kernel: [<ffffffffa48760a0>] ? generic_write_sync+0x70/0x70

After some google I guess it's the problem that IO speed is low, while the insert requests are coming too much quickly.So PG put these into cache first then kernel called sync

Could you expand on what you found in the googling, with links?  I've never seen these in my kernel log, and I don't know what they mean other than the obvious that it is something to do with IO.  Also, what kernel and file system are you using?
 
.
I know I can queue the requests, so that POSTGRES will not accept these requests which will result in an increase in system cache.
But is there any way I can tell POSTGRES, that you can only handle 20000 records per second, or 4M per second, please don't accept inserts more than that speed.
For me, POSTGRES just waiting is much better than current behavior.

I don't believe there is a setting from within PostgreSQL to do this.

There was a proposal for a throttle on WAL generation back in February, but with no recent discussion or (visible) progress:


I think the real answer here to get a better IO system, or maybe a better kernel.  Otherwise, once you find a painful workaround for one symptom you will just smack into another one.

Cheers,

Jeff
On Wed, Dec 18, 2019 at 4:53 AM James(王旭) <wangxu@gu360.com> wrote:
Hello,
I encountered into this kernel message, and I cannot login into the Linux system anymore:


Dec 17 23:01:50 hq-pg kernel: sh (6563): drop_caches: 1
Dec 17 23:02:30 hq-pg kernel: INFO: task sync:6573 blocked for more than 120 seconds.
Dec 17 23:02:30 hq-pg kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 17 23:02:30 hq-pg kernel: sync            D ffff965ebabd1040     0  6573   6572 0x00000080
Dec 17 23:02:30 hq-pg kernel: Call Trace:
Dec 17 23:02:30 hq-pg kernel: [<ffffffffa48760a0>] ? generic_write_sync+0x70/0x70

After some google I guess it's the problem that IO speed is low, while the insert requests are coming too much quickly.So PG put these into cache first then kernel called sync

Could you expand on what you found in the googling, with links?  I've never seen these in my kernel log, and I don't know what they mean other than the obvious that it is something to do with IO.  Also, what kernel and file system are you using?
 
.
I know I can queue the requests, so that POSTGRES will not accept these requests which will result in an increase in system cache.
But is there any way I can tell POSTGRES, that you can only handle 20000 records per second, or 4M per second, please don't accept inserts more than that speed.
For me, POSTGRES just waiting is much better than current behavior.

I don't believe there is a setting from within PostgreSQL to do this.

There was a proposal for a throttle on WAL generation back in February, but with no recent discussion or (visible) progress:


I think the real answer here to get a better IO system, or maybe a better kernel.  Otherwise, once you find a painful workaround for one symptom you will just smack into another one.

Cheers,

Jeff

Re: How to prevent POSTGRES killing linux system from accepting toomuch inserts?

From
Osahon Oduware
Date:
THE TRUTH CANNOT BE HIDDEN
**Explosion in my car.....

On Wed, 18 Dec 2019, 10:54 James(王旭), <wangxu@gu360.com> wrote:
Hello,
I encountered into this kernel message, and I cannot login into the Linux system anymore:


Dec 17 23:01:50 hq-pg kernel: sh (6563): drop_caches: 1
Dec 17 23:02:30 hq-pg kernel: INFO: task sync:6573 blocked for more than 120 seconds.
Dec 17 23:02:30 hq-pg kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 17 23:02:30 hq-pg kernel: sync            D ffff965ebabd1040     0  6573   6572 0x00000080
Dec 17 23:02:30 hq-pg kernel: Call Trace:
Dec 17 23:02:30 hq-pg kernel: [<ffffffffa48760a0>] ? generic_write_sync+0x70/0x70

After some google I guess it's the problem that IO speed is low, while the insert requests are coming too much quickly.So PG put these into cache first then kernel called sync.
I know I can queue the requests, so that POSTGRES will not accept these requests which will result in an increase in system cache.
But is there any way I can tell POSTGRES, that you can only handle 20000 records per second, or 4M per second, please don't accept inserts more than that speed.
For me, POSTGRES just waiting is much better than current behavior.

Any help will be much appreciated.


Thanks,
James

Re: How to prevent POSTGRES killing linux system from accepting toomuch inserts?

From
Osahon Oduware
Date:
THE TRUTH CANNOT BE HIDDEN
**Explosion in my car.....

On Wed, 18 Dec 2019, 10:54 James(王旭), <wangxu@gu360.com> wrote:
Hello,
I encountered into this kernel message, and I cannot login into the Linux system anymore:


Dec 17 23:01:50 hq-pg kernel: sh (6563): drop_caches: 1
Dec 17 23:02:30 hq-pg kernel: INFO: task sync:6573 blocked for more than 120 seconds.
Dec 17 23:02:30 hq-pg kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 17 23:02:30 hq-pg kernel: sync            D ffff965ebabd1040     0  6573   6572 0x00000080
Dec 17 23:02:30 hq-pg kernel: Call Trace:
Dec 17 23:02:30 hq-pg kernel: [<ffffffffa48760a0>] ? generic_write_sync+0x70/0x70

After some google I guess it's the problem that IO speed is low, while the insert requests are coming too much quickly.So PG put these into cache first then kernel called sync.
I know I can queue the requests, so that POSTGRES will not accept these requests which will result in an increase in system cache.
But is there any way I can tell POSTGRES, that you can only handle 20000 records per second, or 4M per second, please don't accept inserts more than that speed.
For me, POSTGRES just waiting is much better than current behavior.

Any help will be much appreciated.


Thanks,
James