Thread: Selecting large tables gets killed
Hi All,
Here is a strange behaviour with master branch with head at
commit d3c4c471553265e7517be24bae64b81967f6df40
Author: Peter Eisentraut <peter_e@gmx.net>
Date: Mon Feb 10 21:47:19 2014 -0500
The OS is
[ashutosh@ubuntu repro]uname -a
Linux ubuntu 3.2.0-59-generic #90-Ubuntu SMP Tue Jan 7 22:43:51 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
This is a VM hosted on Mac OS X 10.7.5.
[ashutosh@ubuntu repro]cat big_select_killed.sql
drop table big_tab;
create table big_tab (val int, val2 int, str varchar);
insert into big_tab select x, x, lpad('string', 100, x::text)
from generate_series(1, 10000000) x;
select * from big_tab;
The last select causes the "Killed" message.
[ashutosh@ubuntu repro]psql -d postgres -f big_select_killed.sql
DROP TABLE
CREATE TABLE
INSERT 0 10000000
Killed
There is a message in the server log:
FATAL: connection to client lost
STATEMENT: select * from big_tab;
Any SELECT that fetches all the rows gets psql killed, but SELECT count(*) does not.
[ashutosh@ubuntu repro]psql -d postgres
psql (9.4devel)
Type "help" for help.
postgres=# select count(*) from big_tab;
count
----------
10000000
(1 row)
postgres=# select * from big_tab;
Killed
[ashutosh@ubuntu repro]psql -d postgres
psql (9.4devel)
Type "help" for help.
Below is the buffer cache size and the relation size (if anyone cares)
postgres=# show shared_buffers;
shared_buffers
----------------
128MB
(1 row)
postgres=# select pg_relation_size('big_tab'::regclass);
pg_relation_size
------------------
1412415488
(1 row)
postgres=# select pg_relation_size('big_tab'::regclass)/1024/1024; -- in MB, to keep it simple
?column?
----------
1346
(1 row)
There are no changes to the default configuration. I am using Unix sockets:
[ashutosh@ubuntu repro]ls /tmp/.s.PGSQL.5432*
/tmp/.s.PGSQL.5432 /tmp/.s.PGSQL.5432.lock
Looks like a bug in psql to me. Does anybody else see this behaviour?
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Hi,
I tried to reproduce this bug on CentOS 6.4 as well as on Ubuntu 13.04.
My base machine is Windows 7; CentOS and Ubuntu run in VMs.
CENTOS :
[amul@localhost postgresql]$ uname -a
Linux localhost.localdomain 2.6.32-358.6.1.el6.x86_64 #1 SMP Tue Apr 23 19:29:00 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
UBUNTU:
[amul@localhost postgresql]$ uname -a
Linux localhost.localdomain 2.6.32-358.6.1.el6.x86_64 #1 SMP Tue Apr 23 19:29:00 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
I didn't face any kill problem on either VM, even after making the table (big_tab) bigger.
My SELECT output was as follows:
postgres=# select * from big_tab;
val | val2 | str
----------+----------+------------------------------------------------------------------------------------------------------
1 | 1 | 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111string
2 | 2 | 2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222string
.
.
<skipped>
Other info:
amul@amul:~/work/postgresql$ psql postgres
I built from HEAD (ae5266f25910d6e084692a7cdbd02b9e52800046).
I failed to reproduce it; am I missing something?
Regards,
Amul Sul
I am sorry,
My Ubuntu info was wrong in the previous mail; the correct one is as follows:
>UBUNTU:
>[amul@localhost postgresql]$ uname -a
>Linux localhost.localdomain 2.6.32-358.6.1.el6.x86_64 #1 SMP Tue Apr 23 19:29:00 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
amul@amul:~/work/postgresql$ uname -a
Linux amul 3.11.0-12-generic #19-Ubuntu SMP Wed Oct 9 16:20:46 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
Regards,
Amul Sul
I found a very simple repro on my machine:
postgres=# select x, x, lpad('string', 100, x::text) from generate_series(1, 10000000) x;
Killed
So this is just about fetching a huge result set through psql.
If I reduce the number of rows by a factor of 10, it returns the result without getting killed.
[ashutosh@ubuntu repro]psql -d postgres
psql (9.4devel)
Type "help" for help.
postgres=# select x, x, lpad('string', 100, x::text) from generate_series(1, 1000000) x;
x | x | lpad
---------+---------+------------------------------------------------------------------------------------------------------
1 | 1 | 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111string
Maybe each setup has its own breaking point, so trying with a larger number might reproduce the issue.
I tried to debug it with gdb, but all it showed me was that psql received a SIGKILL signal. I am not sure why.
On Thu, Feb 20, 2014 at 2:13 PM, amul sul <sul_amul@yahoo.co.in> wrote:
> Hi,
> I tried to reproduce this bug on CentOS 6.4 as well as on Ubuntu 13.04.
> My base machine is Windows 7; CentOS and Ubuntu run in VMs.
(...)
> I didn't face any kill problem on either VM, even after making the table (big_tab) bigger.
(...)
> I failed to reproduce it; am I missing something?
> Regards,
> Amul Sul
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
2014-02-20 16:16 GMT+09:00 Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>:
> Hi All,
> Here is a strange behaviour with master branch with head at
(...)
> Looks like a bug in psql to me. Does anybody else see this behaviour?

It's not a bug, it's your VM's OS killing off a process which is using up too much memory. Check /var/log/messages to see what the kernel has to say about it.

Regards

Ian Barwick
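The exact log location varies by distro (/var/log/messages on CentOS, /var/log/kern.log or /var/log/syslog on Ubuntu); the kernel ring buffer works everywhere. Something along these lines should turn up the OOM killer's verdict:

dmesg | grep -i "killed process"
grep -i "out of memory" /var/log/kern.log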
On Thu, Feb 20, 2014 at 2:32 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
> Maybe each setup has its own breaking point, so trying with a larger number might reproduce the issue.
> I tried to debug it with gdb, but all it showed me was that psql received a SIGKILL signal. I am not sure why.
Is the psql process running out of memory? AFAIK the OOM killer sends SIGKILL.
Thanks,
Pavan
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
Ian, Pavan,
You are correct, the OS is killing the process:
Feb 20 14:30:14 ubuntu kernel: [23820.175868] Out of memory: Kill process 34080 (psql) score 756 or sacrifice child
Feb 20 14:30:14 ubuntu kernel: [23820.175871] Killed process 34080 (psql) total-vm:1644712kB, anon-rss:820336kB, file-rss:0kB
The psql documentation talks about a special variable FETCH_COUNT:
--
FETCH_COUNT
If this variable is set to an integer value > 0, the results of SELECT queries are fetched and displayed in groups of that many rows, rather than the default behavior of collecting the entire result set before display. Therefore only a limited amount of memory is used, regardless of the size of the result set. Settings of 100 to 1000 are commonly used when enabling this feature. Keep in mind that when using this feature, a query might fail after having already displayed some rows.
--
If I set some positive value for this variable, psql runs smoothly with any size of data; unset it, and psql gets killed. But it's nowhere written explicitly that psql can run out of memory while collecting the result set. Either the documentation or the behaviour should be modified.
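For the archives, the workaround looks like this (illustrative session; any positive value works, 100 to 1000 being the commonly suggested range):

postgres=# \set FETCH_COUNT 1000
postgres=# select * from big_tab;   -- now fetched in 1000-row batches, no OOM kill

Behind the scenes psql declares a cursor and FETCHes from it in batches, so only one batch is held in client memory at a time.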
On Thu, Feb 20, 2014 at 2:35 PM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> On Thu, Feb 20, 2014 at 2:32 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
>> Maybe each setup has its own breaking point, so trying with a larger number might reproduce the issue.
>> I tried to debug it with gdb, but all it showed me was that psql received a SIGKILL signal. I am not sure why.
>
> Is the psql process running out of memory? AFAIK the OOM killer sends SIGKILL.
>
> Thanks,
> Pavan
> Pavan Deolasee
> http://www.linkedin.com/in/pavandeolasee
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
--On 20. Februar 2014 14:49:28 +0530 Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:

> If I set some positive value for this variable, psql runs smoothly with
> any size of data; unset it, and psql gets killed. But it's nowhere
> written explicitly that psql can run out of memory while collecting the
> result set. Either the documentation or the behaviour should be modified.

Maybe somewhere in the future we should consider single row mode for psql, see
<http://www.postgresql.org/docs/9.3/static/libpq-single-row-mode.html>

However, I think nobody has tackled this yet, afair.

--
Thanks
Bernd
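For reference, the API flow described on that page looks roughly like the sketch below. This is not psql code, just a minimal illustration: "conn" is assumed to be an already-open PGconn, the helper name is made up, and the output formatting is reduced to a bare printf:

#include <stdio.h>
#include <libpq-fe.h>

static void
fetch_in_single_row_mode(PGconn *conn, const char *query)
{
    PGresult *res;

    if (!PQsendQuery(conn, query))      /* dispatch the query asynchronously */
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return;
    }

    /* must be requested right after PQsendQuery, before any result arrives */
    if (!PQsetSingleRowMode(conn))
        fprintf(stderr, "could not enter single-row mode\n");

    /* one PGresult per row, then a final zero-row PGRES_TUPLES_OK */
    while ((res = PQgetResult(conn)) != NULL)
    {
        ExecStatusType st = PQresultStatus(res);

        if (st == PGRES_SINGLE_TUPLE)
            printf("%s\n", PQgetvalue(res, 0, 0));   /* stand-in for real output */
        else if (st != PGRES_TUPLES_OK)
            fprintf(stderr, "%s", PQerrorMessage(conn));  /* may arrive mid-stream */

        PQclear(res);   /* each row's result is freed, so memory stays bounded */
    }
}

The point being: memory use is bounded by one row at a time, which is exactly what psql would need here.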
On Thu, Feb 20, 2014 at 3:26 PM, Bernd Helmle <mailings@oopsware.de> wrote:
> Maybe somewhere in the future we should consider single row mode for psql, see
> <http://www.postgresql.org/docs/9.3/static/libpq-single-row-mode.html>
> However, I think nobody has tackled this yet, afair.

That seems a good idea. We would get rid of FETCH_COUNT then, wouldn't we?
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
On Thu, Feb 20, 2014 at 12:07 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
> That seems a good idea. We would get rid of FETCH_COUNT then, wouldn't we?

No, I don't think we want to do that. FETCH_COUNT values greater than 1 are still useful to get reasonably tabulated output without hogging too much memory. For example:

db=# \set FETCH_COUNT 3
db=# select repeat('a', i) a, 'x'x from generate_series(1,9)i;
  a  | x
-----+---
 a   | x
 aa  | x
 aaa | x
 aaaa | x
 aaaaa | x
 aaaaaa | x
 aaaaaaa | x
 aaaaaaaa | x
 aaaaaaaaa | x

Regards,
Marti
Marti Raudsepp <marti@juffo.org> writes:
> On Thu, Feb 20, 2014 at 12:07 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
>> That seems a good idea. We would get rid of FETCH_COUNT then, wouldn't we?

> No, I don't think we want to do that. FETCH_COUNT values greater than
> 1 are still useful to get reasonably tabulated output without hogging
> too much memory.

Yeah.  The other reason that you can't just transparently change the
behavior is error handling: people are used to seeing either all or
none of the output of a query.  In single-row mode that guarantee
fails, since some rows might get output before the server detects
an error.

			regards, tom lane
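For what it's worth, that failure mode is already visible with FETCH_COUNT set; a session along these lines (the failing query is invented for illustration) shows rows arriving before the error:

postgres=# \set FETCH_COUNT 2
postgres=# select i, 1/(5 - i) from generate_series(1, 10) i;
 i | ?column?
---+----------
 1 |        0
 2 |        0
 3 |        0
 4 |        1
ERROR:  division by zero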
--On 20. Februar 2014 09:51:47 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Yeah. The other reason that you can't just transparently change the
> behavior is error handling: people are used to seeing either all or
> none of the output of a query. In single-row mode that guarantee
> fails, since some rows might get output before the server detects
> an error.

That's true. I'd never envisioned doing this transparently either, exactly
for this reason. However, I find that having single-row mode somewhere has
some attractiveness, be it only to have some code around that shows how to
do it right. But I fear we might complicate things in psql beyond what we
really want.

--
Thanks
Bernd
On Thu, Feb 20, 2014 at 9:00 PM, Bernd Helmle <mailings@oopsware.de> wrote:
> That's true. I'd never envisioned doing this transparently either,
> exactly for this reason. However, I find that having single-row mode
> somewhere has some attractiveness, be it only to have some code around
> that shows how to do it right. But I fear we might complicate things in
> psql beyond what we really want.
Yes. Fixing this doesn't seem to be worth the code complexity it would add, especially when a workaround exists.
Or, another option: when sufficiently large output is encountered (larger than some predefined value, MAX_ROWS or something), psql could behave as if FETCH_COUNT were set to MAX_ROWS. Documenting that behaviour wouldn't be a problem, I guess.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company