Thread: Why is this query running slowly?

Why is this query running slowly?

From
Yang Zhang
Date:
I have a simple query that's been running for a while, which is fine,
but it seems to be running very slowly, which is a problem:

mydb=# explain select user_id from den where user_id not in (select
duid from user_mappings) and timestamp between '2009-04-01' and
'2010-04-01';

           QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on den  (cost=711.58..66062724212.74 rows=22634720 width=4)
   Filter: (("timestamp" >= '2009-04-01 00:00:00'::timestamp without
time zone) AND ("timestamp" <= '2010-04-01 00:00:00'::timestamp
without time zone) AND (NOT (SubPlan 1)))
   SubPlan 1
     ->  Materialize  (cost=711.58..1223.38 rows=36780 width=4)
           ->  Seq Scan on user_mappings  (cost=0.00..530.80 rows=36780 width=4)

user_mappings is fairly small:

mydb=# select count(*) from user_mappings;
 count
-------
 36780
(1 row)

The client is on the same LAN, but only sees packets trickling in:

$ sudo tcpdump -i lo port 5434
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 96 bytes
00:44:05.475584 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
3757958746:3757966938, ack 3697417382, win 265, options [nop,nop,TS
val 194791416 ecr 194791044], length 8192
00:44:05.475684 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
8192, win 771, options [nop,nop,TS val 194791416 ecr 194791416],
length 0
00:44:08.867976 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
8192:15779, ack 1, win 265, options [nop,nop,TS val 194791756 ecr
194791416], length 7587
00:44:08.868019 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
15779, win 771, options [nop,nop,TS val 194791756 ecr 194791756],
length 0
00:44:08.870336 IP nuthouse.40349 > nuthouse.5434: Flags [P.], seq
1:19, ack 15779, win 771, options [nop,nop,TS val 194791756 ecr
194791756], length 18
00:44:08.870343 IP nuthouse.5434 > nuthouse.40349: Flags [.], ack 19,
win 265, options [nop,nop,TS val 194791756 ecr 194791756], length 0
00:44:12.578220 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
15779:23971, ack 19, win 265, options [nop,nop,TS val 194792127 ecr
194791756], length 8192
00:44:12.618116 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
23971, win 771, options [nop,nop,TS val 194792131 ecr 194792127],
length 0
00:44:16.664645 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
23971:32163, ack 19, win 265, options [nop,nop,TS val 194792535 ecr
194792131], length 8192
00:44:16.664755 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
32163, win 771, options [nop,nop,TS val 194792535 ecr 194792535],
length 0
00:44:20.465773 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
32163:40355, ack 19, win 265, options [nop,nop,TS val 194792915 ecr
194792535], length 8192
00:44:20.465878 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
40355, win 720, options [nop,nop,TS val 194792915 ecr 194792915],
length 0
00:44:24.115273 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
40355:48547, ack 19, win 265, options [nop,nop,TS val 194793280 ecr
194792915], length 8192
00:44:24.115380 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
48547, win 720, options [nop,nop,TS val 194793280 ecr 194793280],
length 0
00:44:27.749084 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
48547:56739, ack 19, win 265, options [nop,nop,TS val 194793644 ecr
194793280], length 8192
00:44:27.749192 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
56739, win 720, options [nop,nop,TS val 194793644 ecr 194793644],
length 0
00:44:31.618124 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
56739:64931, ack 19, win 265, options [nop,nop,TS val 194794031 ecr
194793644], length 8192
00:44:31.618166 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
64931, win 720, options [nop,nop,TS val 194794031 ecr 194794031],
length 0
00:44:35.349958 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
64931:73123, ack 19, win 265, options [nop,nop,TS val 194794404 ecr
194794031], length 8192
00:44:35.350054 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
73123, win 720, options [nop,nop,TS val 194794404 ecr 194794404],
length 0
00:44:39.311627 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
73123:81315, ack 19, win 265, options [nop,nop,TS val 194794800 ecr
194794404], length 8192
00:44:39.311734 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
81315, win 771, options [nop,nop,TS val 194794800 ecr 194794800],
length 0
00:44:43.178319 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
81315:89507, ack 19, win 265, options [nop,nop,TS val 194795187 ecr
194794800], length 8192
00:44:43.178429 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
89507, win 771, options [nop,nop,TS val 194795187 ecr 194795187],
length 0
00:44:46.798511 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
89507:97699, ack 19, win 265, options [nop,nop,TS val 194795549 ecr
194795187], length 8192
00:44:46.798622 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
97699, win 720, options [nop,nop,TS val 194795549 ecr 194795549],
length 0
00:44:50.430346 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
97699:105891, ack 19, win 265, options [nop,nop,TS val 194795912 ecr
194795549], length 8192
00:44:50.430455 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
105891, win 720, options [nop,nop,TS val 194795912 ecr 194795912],
length 0
00:44:54.211644 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
105891:114083, ack 19, win 265, options [nop,nop,TS val 194796290 ecr
194795912], length 8192
00:44:54.211792 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
114083, win 720, options [nop,nop,TS val 194796290 ecr 194796290],
length 0
00:44:57.948539 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
114083:122275, ack 19, win 265, options [nop,nop,TS val 194796664 ecr
194796290], length 8192
...

And I verified that the query is not blocked:

mydb=# select * from pg_stat_activity;
 datid | datname | procpid | usesysid | usename |
               current_query                                     |
waiting |          xact_start           |          query_start
 |         backend_start         | client_addr | client_port

-------+---------+---------+----------+---------+---------------------------------------------------------------------------------------+---------+-------------------------------+-------------------------------+-------------------------------+-------------+-------------
 55244 | mydb    |    5128 |    16384 | yang    |
                                                                 | f
    | 2011-09-14 15:44:46.586014-07 | 2011-09-15 00:46:22.878715-07 |
2011-09-14 15:44:46.549151-07 | ::1         |       56791
                                                :       select user_id
from den where user_id not in (select duid from user_mappings)
                                                :       and timestamp
between '2009-04-01'::date and '2010-04-01'::date
 55244 | mydb    |   28142 |    16384 | yang    | select * from
pg_stat_activity;
 | f       | 2011-09-15 00:46:40.100652-07 | 2011-09-15
00:46:40.100652-07 | 2011-09-15 00:46:35.913309-07 |             |
     -1
(2 rows)

(The ports don't match up since this is tunneled via ssh.  `pv <
/dev/zero | ssh nuthouse 'cat>/dev/null'` shows sustained bandwidth at
~30-40MB/s.)

I know that `den` is large but things seem to be running much slower
than I'd expect.  Nothing in the logs.  Is there any way to inspect
what's going on?  Hesitant to kill the query in case it's almost done,
though I doubt it (didn't have the foresight to expose this
information in the client process - wasn't expecting to run into
this).  Thanks in advance.

--
Yang Zhang
http://yz.mit.edu/

Re: Why is this query running slowly?

From
Yang Zhang
Date:
If it matters, the client is a JDBC app doing:

    con = DriverManager.getConnection("jdbc:postgresql://localhost:5434/mydb",
"yang", password)
    con.setAutoCommit(false)
    val st = con.prepareStatement("""
      select user_id from den where user_id not in (select duid from
user_mappings)
      and timestamp between '2009-04-01'::date and '2010-04-01'::date;
    """, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)
    st.setFetchSize(8192)
    rs = st.executeQuery()

On Thu, Sep 15, 2011 at 12:53 AM, Yang Zhang <yanghatespam@gmail.com> wrote:
> I have a simple query that's been running for a while, which is fine,
> but it seems to be running very slowly, which is a problem:
>
> mydb=# explain select user_id from den where user_id not in (select
> duid from user_mappings) and timestamp between '2009-04-01' and
> '2010-04-01';
>
>           QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Seq Scan on den  (cost=711.58..66062724212.74 rows=22634720 width=4)
>   Filter: (("timestamp" >= '2009-04-01 00:00:00'::timestamp without
> time zone) AND ("timestamp" <= '2010-04-01 00:00:00'::timestamp
> without time zone) AND (NOT (SubPlan 1)))
>   SubPlan 1
>     ->  Materialize  (cost=711.58..1223.38 rows=36780 width=4)
>           ->  Seq Scan on user_mappings  (cost=0.00..530.80 rows=36780 width=4)
>
> user_mappings is fairly small:
>
> mydb=# select count(*) from user_mappings;
>  count
> -------
>  36780
> (1 row)
>
> The client is on the same LAN, but only sees packets trickling in:
>
> $ sudo tcpdump -i lo port 5434
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on lo, link-type EN10MB (Ethernet), capture size 96 bytes
> 00:44:05.475584 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
> 3757958746:3757966938, ack 3697417382, win 265, options [nop,nop,TS
> val 194791416 ecr 194791044], length 8192
> 00:44:05.475684 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
> 8192, win 771, options [nop,nop,TS val 194791416 ecr 194791416],
> length 0
> 00:44:08.867976 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
> 8192:15779, ack 1, win 265, options [nop,nop,TS val 194791756 ecr
> 194791416], length 7587
> 00:44:08.868019 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
> 15779, win 771, options [nop,nop,TS val 194791756 ecr 194791756],
> length 0
> 00:44:08.870336 IP nuthouse.40349 > nuthouse.5434: Flags [P.], seq
> 1:19, ack 15779, win 771, options [nop,nop,TS val 194791756 ecr
> 194791756], length 18
> 00:44:08.870343 IP nuthouse.5434 > nuthouse.40349: Flags [.], ack 19,
> win 265, options [nop,nop,TS val 194791756 ecr 194791756], length 0
> 00:44:12.578220 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
> 15779:23971, ack 19, win 265, options [nop,nop,TS val 194792127 ecr
> 194791756], length 8192
> 00:44:12.618116 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
> 23971, win 771, options [nop,nop,TS val 194792131 ecr 194792127],
> length 0
> 00:44:16.664645 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
> 23971:32163, ack 19, win 265, options [nop,nop,TS val 194792535 ecr
> 194792131], length 8192
> 00:44:16.664755 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
> 32163, win 771, options [nop,nop,TS val 194792535 ecr 194792535],
> length 0
> 00:44:20.465773 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
> 32163:40355, ack 19, win 265, options [nop,nop,TS val 194792915 ecr
> 194792535], length 8192
> 00:44:20.465878 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
> 40355, win 720, options [nop,nop,TS val 194792915 ecr 194792915],
> length 0
> 00:44:24.115273 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
> 40355:48547, ack 19, win 265, options [nop,nop,TS val 194793280 ecr
> 194792915], length 8192
> 00:44:24.115380 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
> 48547, win 720, options [nop,nop,TS val 194793280 ecr 194793280],
> length 0
> 00:44:27.749084 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
> 48547:56739, ack 19, win 265, options [nop,nop,TS val 194793644 ecr
> 194793280], length 8192
> 00:44:27.749192 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
> 56739, win 720, options [nop,nop,TS val 194793644 ecr 194793644],
> length 0
> 00:44:31.618124 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
> 56739:64931, ack 19, win 265, options [nop,nop,TS val 194794031 ecr
> 194793644], length 8192
> 00:44:31.618166 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
> 64931, win 720, options [nop,nop,TS val 194794031 ecr 194794031],
> length 0
> 00:44:35.349958 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
> 64931:73123, ack 19, win 265, options [nop,nop,TS val 194794404 ecr
> 194794031], length 8192
> 00:44:35.350054 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
> 73123, win 720, options [nop,nop,TS val 194794404 ecr 194794404],
> length 0
> 00:44:39.311627 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
> 73123:81315, ack 19, win 265, options [nop,nop,TS val 194794800 ecr
> 194794404], length 8192
> 00:44:39.311734 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
> 81315, win 771, options [nop,nop,TS val 194794800 ecr 194794800],
> length 0
> 00:44:43.178319 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
> 81315:89507, ack 19, win 265, options [nop,nop,TS val 194795187 ecr
> 194794800], length 8192
> 00:44:43.178429 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
> 89507, win 771, options [nop,nop,TS val 194795187 ecr 194795187],
> length 0
> 00:44:46.798511 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
> 89507:97699, ack 19, win 265, options [nop,nop,TS val 194795549 ecr
> 194795187], length 8192
> 00:44:46.798622 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
> 97699, win 720, options [nop,nop,TS val 194795549 ecr 194795549],
> length 0
> 00:44:50.430346 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
> 97699:105891, ack 19, win 265, options [nop,nop,TS val 194795912 ecr
> 194795549], length 8192
> 00:44:50.430455 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
> 105891, win 720, options [nop,nop,TS val 194795912 ecr 194795912],
> length 0
> 00:44:54.211644 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
> 105891:114083, ack 19, win 265, options [nop,nop,TS val 194796290 ecr
> 194795912], length 8192
> 00:44:54.211792 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
> 114083, win 720, options [nop,nop,TS val 194796290 ecr 194796290],
> length 0
> 00:44:57.948539 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
> 114083:122275, ack 19, win 265, options [nop,nop,TS val 194796664 ecr
> 194796290], length 8192
> ...
>
> And I verified that the query is not blocked:
>
> mydb=# select * from pg_stat_activity;
>  datid | datname | procpid | usesysid | usename |
>               current_query                                     |
> waiting |          xact_start           |          query_start
>  |         backend_start         | client_addr | client_port
>
-------+---------+---------+----------+---------+---------------------------------------------------------------------------------------+---------+-------------------------------+-------------------------------+-------------------------------+-------------+-------------
>  55244 | mydb    |    5128 |    16384 | yang    |
>                                                                 | f
>    | 2011-09-14 15:44:46.586014-07 | 2011-09-15 00:46:22.878715-07 |
> 2011-09-14 15:44:46.549151-07 | ::1         |       56791
>                                                :       select user_id
> from den where user_id not in (select duid from user_mappings)
>                                                :       and timestamp
> between '2009-04-01'::date and '2010-04-01'::date
>  55244 | mydb    |   28142 |    16384 | yang    | select * from
> pg_stat_activity;
>  | f       | 2011-09-15 00:46:40.100652-07 | 2011-09-15
> 00:46:40.100652-07 | 2011-09-15 00:46:35.913309-07 |             |
>     -1
> (2 rows)
>
> (The ports don't match up since this is tunneled via ssh.  `pv <
> /dev/zero | ssh nuthouse 'cat>/dev/null'` shows sustained bandwidth at
> ~30-40MB/s.)
>
> I know that `den` is large but things seem to be running much slower
> than I'd expect.  Nothing in the logs.  Is there any way to inspect
> what's going on?  Hesitant to kill the query in case it's almost done,
> though I doubt it (didn't have the foresight to expose this
> information in the client process - wasn't expecting to run into
> this).  Thanks in advance.
>
> --
> Yang Zhang
> http://yz.mit.edu/
>



--
Yang Zhang
http://yz.mit.edu/

Re: Why is this query running slowly?

From
Yang Zhang
Date:
Should also add that while the client is under no load (CPU<1%, load
~0.1, mem ~20%), the server looks pretty busy (CPU 90-100% of 1 core,
load ~1.5, mem ~70-80%), but PG is the only thing using resources.

On Thu, Sep 15, 2011 at 12:58 AM, Yang Zhang <yanghatespam@gmail.com> wrote:
> If it matters, the client is a JDBC app doing:
>
>    con = DriverManager.getConnection("jdbc:postgresql://localhost:5434/mydb",
> "yang", password)
>    con.setAutoCommit(false)
>    val st = con.prepareStatement("""
>      select user_id from den where user_id not in (select duid from
> user_mappings)
>      and timestamp between '2009-04-01'::date and '2010-04-01'::date;
>    """, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)
>    st.setFetchSize(8192)
>    rs = st.executeQuery()
>
> On Thu, Sep 15, 2011 at 12:53 AM, Yang Zhang <yanghatespam@gmail.com> wrote:
>> I have a simple query that's been running for a while, which is fine,
>> but it seems to be running very slowly, which is a problem:
>>
>> mydb=# explain select user_id from den where user_id not in (select
>> duid from user_mappings) and timestamp between '2009-04-01' and
>> '2010-04-01';
>>
>>           QUERY PLAN
>>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>  Seq Scan on den  (cost=711.58..66062724212.74 rows=22634720 width=4)
>>   Filter: (("timestamp" >= '2009-04-01 00:00:00'::timestamp without
>> time zone) AND ("timestamp" <= '2010-04-01 00:00:00'::timestamp
>> without time zone) AND (NOT (SubPlan 1)))
>>   SubPlan 1
>>     ->  Materialize  (cost=711.58..1223.38 rows=36780 width=4)
>>           ->  Seq Scan on user_mappings  (cost=0.00..530.80 rows=36780 width=4)
>>
>> user_mappings is fairly small:
>>
>> mydb=# select count(*) from user_mappings;
>>  count
>> -------
>>  36780
>> (1 row)
>>
>> The client is on the same LAN, but only sees packets trickling in:
>>
>> $ sudo tcpdump -i lo port 5434
>> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
>> listening on lo, link-type EN10MB (Ethernet), capture size 96 bytes
>> 00:44:05.475584 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
>> 3757958746:3757966938, ack 3697417382, win 265, options [nop,nop,TS
>> val 194791416 ecr 194791044], length 8192
>> 00:44:05.475684 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
>> 8192, win 771, options [nop,nop,TS val 194791416 ecr 194791416],
>> length 0
>> 00:44:08.867976 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
>> 8192:15779, ack 1, win 265, options [nop,nop,TS val 194791756 ecr
>> 194791416], length 7587
>> 00:44:08.868019 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
>> 15779, win 771, options [nop,nop,TS val 194791756 ecr 194791756],
>> length 0
>> 00:44:08.870336 IP nuthouse.40349 > nuthouse.5434: Flags [P.], seq
>> 1:19, ack 15779, win 771, options [nop,nop,TS val 194791756 ecr
>> 194791756], length 18
>> 00:44:08.870343 IP nuthouse.5434 > nuthouse.40349: Flags [.], ack 19,
>> win 265, options [nop,nop,TS val 194791756 ecr 194791756], length 0
>> 00:44:12.578220 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
>> 15779:23971, ack 19, win 265, options [nop,nop,TS val 194792127 ecr
>> 194791756], length 8192
>> 00:44:12.618116 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
>> 23971, win 771, options [nop,nop,TS val 194792131 ecr 194792127],
>> length 0
>> 00:44:16.664645 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
>> 23971:32163, ack 19, win 265, options [nop,nop,TS val 194792535 ecr
>> 194792131], length 8192
>> 00:44:16.664755 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
>> 32163, win 771, options [nop,nop,TS val 194792535 ecr 194792535],
>> length 0
>> 00:44:20.465773 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
>> 32163:40355, ack 19, win 265, options [nop,nop,TS val 194792915 ecr
>> 194792535], length 8192
>> 00:44:20.465878 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
>> 40355, win 720, options [nop,nop,TS val 194792915 ecr 194792915],
>> length 0
>> 00:44:24.115273 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
>> 40355:48547, ack 19, win 265, options [nop,nop,TS val 194793280 ecr
>> 194792915], length 8192
>> 00:44:24.115380 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
>> 48547, win 720, options [nop,nop,TS val 194793280 ecr 194793280],
>> length 0
>> 00:44:27.749084 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
>> 48547:56739, ack 19, win 265, options [nop,nop,TS val 194793644 ecr
>> 194793280], length 8192
>> 00:44:27.749192 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
>> 56739, win 720, options [nop,nop,TS val 194793644 ecr 194793644],
>> length 0
>> 00:44:31.618124 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
>> 56739:64931, ack 19, win 265, options [nop,nop,TS val 194794031 ecr
>> 194793644], length 8192
>> 00:44:31.618166 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
>> 64931, win 720, options [nop,nop,TS val 194794031 ecr 194794031],
>> length 0
>> 00:44:35.349958 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
>> 64931:73123, ack 19, win 265, options [nop,nop,TS val 194794404 ecr
>> 194794031], length 8192
>> 00:44:35.350054 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
>> 73123, win 720, options [nop,nop,TS val 194794404 ecr 194794404],
>> length 0
>> 00:44:39.311627 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
>> 73123:81315, ack 19, win 265, options [nop,nop,TS val 194794800 ecr
>> 194794404], length 8192
>> 00:44:39.311734 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
>> 81315, win 771, options [nop,nop,TS val 194794800 ecr 194794800],
>> length 0
>> 00:44:43.178319 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
>> 81315:89507, ack 19, win 265, options [nop,nop,TS val 194795187 ecr
>> 194794800], length 8192
>> 00:44:43.178429 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
>> 89507, win 771, options [nop,nop,TS val 194795187 ecr 194795187],
>> length 0
>> 00:44:46.798511 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
>> 89507:97699, ack 19, win 265, options [nop,nop,TS val 194795549 ecr
>> 194795187], length 8192
>> 00:44:46.798622 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
>> 97699, win 720, options [nop,nop,TS val 194795549 ecr 194795549],
>> length 0
>> 00:44:50.430346 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
>> 97699:105891, ack 19, win 265, options [nop,nop,TS val 194795912 ecr
>> 194795549], length 8192
>> 00:44:50.430455 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
>> 105891, win 720, options [nop,nop,TS val 194795912 ecr 194795912],
>> length 0
>> 00:44:54.211644 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
>> 105891:114083, ack 19, win 265, options [nop,nop,TS val 194796290 ecr
>> 194795912], length 8192
>> 00:44:54.211792 IP nuthouse.40349 > nuthouse.5434: Flags [.], ack
>> 114083, win 720, options [nop,nop,TS val 194796290 ecr 194796290],
>> length 0
>> 00:44:57.948539 IP nuthouse.5434 > nuthouse.40349: Flags [P.], seq
>> 114083:122275, ack 19, win 265, options [nop,nop,TS val 194796664 ecr
>> 194796290], length 8192
>> ...
>>
>> And I verified that the query is not blocked:
>>
>> mydb=# select * from pg_stat_activity;
>>  datid | datname | procpid | usesysid | usename |
>>               current_query                                     |
>> waiting |          xact_start           |          query_start
>>  |         backend_start         | client_addr | client_port
>>
-------+---------+---------+----------+---------+---------------------------------------------------------------------------------------+---------+-------------------------------+-------------------------------+-------------------------------+-------------+-------------
>>  55244 | mydb    |    5128 |    16384 | yang    |
>>                                                                 | f
>>    | 2011-09-14 15:44:46.586014-07 | 2011-09-15 00:46:22.878715-07 |
>> 2011-09-14 15:44:46.549151-07 | ::1         |       56791
>>                                                :       select user_id
>> from den where user_id not in (select duid from user_mappings)
>>                                                :       and timestamp
>> between '2009-04-01'::date and '2010-04-01'::date
>>  55244 | mydb    |   28142 |    16384 | yang    | select * from
>> pg_stat_activity;
>>  | f       | 2011-09-15 00:46:40.100652-07 | 2011-09-15
>> 00:46:40.100652-07 | 2011-09-15 00:46:35.913309-07 |             |
>>     -1
>> (2 rows)
>>
>> (The ports don't match up since this is tunneled via ssh.  `pv <
>> /dev/zero | ssh nuthouse 'cat>/dev/null'` shows sustained bandwidth at
>> ~30-40MB/s.)
>>
>> I know that `den` is large but things seem to be running much slower
>> than I'd expect.  Nothing in the logs.  Is there any way to inspect
>> what's going on?  Hesitant to kill the query in case it's almost done,
>> though I doubt it (didn't have the foresight to expose this
>> information in the client process - wasn't expecting to run into
>> this).  Thanks in advance.
>>
>> --
>> Yang Zhang
>> http://yz.mit.edu/
>>
>
>
>
> --
> Yang Zhang
> http://yz.mit.edu/
>



--
Yang Zhang
http://yz.mit.edu/

Re: Why is this query running slowly?

From
John R Pierce
Date:
On 09/15/11 12:53 AM, Yang Zhang wrote:
> mydb=# explain select user_id from den where user_id not in (select
> duid from user_mappings) and timestamp between '2009-04-01' and
> '2010-04-01';
>
>             QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>   Seq Scan on den  (cost=711.58..66062724212.74 rows=22634720 width=4)
>     Filter: (("timestamp">= '2009-04-01 00:00:00'::timestamp without
> time zone) AND ("timestamp"<= '2010-04-01 00:00:00'::timestamp
> without time zone) AND (NOT (SubPlan 1)))
>     SubPlan 1
>       ->   Materialize  (cost=711.58..1223.38 rows=36780 width=4)
>             ->   Seq Scan on user_mappings  (cost=0.00..530.80 rows=36780 width=4)
>
> user_mappings is fairly small:

so you have to check if every one of the 22 million rows of 'den' in
that time range and is not one of the 36000 rows of user_mappings?

how many rows are in that time range?

are timestamp and duid indexed?  have the tables been analyzed any time
recently?


--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast


Re: Why is this query running slowly?

From
"Tomas Vondra"
Date:
On 15 Září 2011, 9:53, Yang Zhang wrote:
> I have a simple query that's been running for a while, which is fine,
> but it seems to be running very slowly, which is a problem:
>
> mydb=# explain select user_id from den where user_id not in (select
> duid from user_mappings) and timestamp between '2009-04-01' and
> '2010-04-01';
>
>            QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Seq Scan on den  (cost=711.58..66062724212.74 rows=22634720 width=4)
>    Filter: (("timestamp" >= '2009-04-01 00:00:00'::timestamp without
> time zone) AND ("timestamp" <= '2010-04-01 00:00:00'::timestamp
> without time zone) AND (NOT (SubPlan 1)))
>    SubPlan 1
>      ->  Materialize  (cost=711.58..1223.38 rows=36780 width=4)
>            ->  Seq Scan on user_mappings  (cost=0.00..530.80 rows=36780
> width=4)
>
> user_mappings is fairly small:

The problem is that for each of the 22634720 rows in "den" a separate
uncorrelated subquery (a seq scan on user_mappings) has to be executed.
Althogh the subquery is not very expensive, multiplied by the number of
rows in "den" the total cost is extreme.

The only solution is to get rid of the "not in" subquery - try to turn it
to a join like this:

SELECT user_id FROM den LEFT JOIN user_mappings ON (user_id = duid)
WHERE (timestamp BETWEEN '2009-04-01' AND '2010-04-01')
  AND (duid IS NULL)

That should give the same result I guess.

Tomas


Re: Why is this query running slowly?

From
Toby Corkindale
Date:
On 15/09/11 17:53, Yang Zhang wrote:
> I have a simple query that's been running for a while, which is fine,
> but it seems to be running very slowly, which is a problem:
>
> mydb=# explain select user_id from den where user_id not in (select
> duid from user_mappings) and timestamp between '2009-04-01' and
> '2010-04-01';

Do you have indexes on user_id and timestamp?

>
>             QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>   Seq Scan on den  (cost=711.58..66062724212.74 rows=22634720 width=4)
>     Filter: (("timestamp">= '2009-04-01 00:00:00'::timestamp without
> time zone) AND ("timestamp"<= '2010-04-01 00:00:00'::timestamp
> without time zone) AND (NOT (SubPlan 1)))
>     SubPlan 1
>       ->   Materialize  (cost=711.58..1223.38 rows=36780 width=4)
>             ->   Seq Scan on user_mappings  (cost=0.00..530.80 rows=36780 width=4)
>
> user_mappings is fairly small:
>
> mydb=# select count(*) from user_mappings;
>   count
> -------
>   36780
> (1 row)
>
>
> I know that `den` is large but things seem to be running much slower
> than I'd expect.  Nothing in the logs.  Is there any way to inspect
> what's going on?  Hesitant to kill the query in case it's almost done,
> though I doubt it (didn't have the foresight to expose this
> information in the client process - wasn't expecting to run into
> this).  Thanks in advance.
>


--
.signature

Re: Why is this query running slowly?

From
Yang Zhang
Date:
On Thu, Sep 15, 2011 at 1:22 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
> On 15 Září 2011, 9:53, Yang Zhang wrote:
>> I have a simple query that's been running for a while, which is fine,
>> but it seems to be running very slowly, which is a problem:
>>
>> mydb=# explain select user_id from den where user_id not in (select
>> duid from user_mappings) and timestamp between '2009-04-01' and
>> '2010-04-01';
>>
>>            QUERY PLAN
>>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>  Seq Scan on den  (cost=711.58..66062724212.74 rows=22634720 width=4)
>>    Filter: (("timestamp" >= '2009-04-01 00:00:00'::timestamp without
>> time zone) AND ("timestamp" <= '2010-04-01 00:00:00'::timestamp
>> without time zone) AND (NOT (SubPlan 1)))
>>    SubPlan 1
>>      ->  Materialize  (cost=711.58..1223.38 rows=36780 width=4)
>>            ->  Seq Scan on user_mappings  (cost=0.00..530.80 rows=36780
>> width=4)
>>
>> user_mappings is fairly small:
>
> The problem is that for each of the 22634720 rows in "den" a separate
> uncorrelated subquery (a seq scan on user_mappings) has to be executed.
> Althogh the subquery is not very expensive, multiplied by the number of
> rows in "den" the total cost is extreme.
>
> The only solution is to get rid of the "not in" subquery - try to turn it
> to a join like this:
>
> SELECT user_id FROM den LEFT JOIN user_mappings ON (user_id = duid)
> WHERE (timestamp BETWEEN '2009-04-01' AND '2010-04-01')
>  AND (duid IS NULL)
>
> That should give the same result I guess.

This worked great, thank you.  Too bad the planner isn't smart enough
to do this yet!

>
> Tomas
>
>



--
Yang Zhang
http://yz.mit.edu/

Re: Why is this query running slowly?

From
"Tomas Vondra"
Date:
On 15 Září 2011, 11:07, Yang Zhang wrote:
> On Thu, Sep 15, 2011 at 1:22 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
>> On 15 Září 2011, 9:53, Yang Zhang wrote:
>>> I have a simple query that's been running for a while, which is fine,
>>> but it seems to be running very slowly, which is a problem:
>>>
>>> mydb=# explain select user_id from den where user_id not in (select
>>> duid from user_mappings) and timestamp between '2009-04-01' and
>>> '2010-04-01';
>>>
>>>            QUERY PLAN
>>>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>  Seq Scan on den  (cost=711.58..66062724212.74 rows=22634720 width=4)
>>>    Filter: (("timestamp" >= '2009-04-01 00:00:00'::timestamp without
>>> time zone) AND ("timestamp" <= '2010-04-01 00:00:00'::timestamp
>>> without time zone) AND (NOT (SubPlan 1)))
>>>    SubPlan 1
>>>      ->  Materialize  (cost=711.58..1223.38 rows=36780 width=4)
>>>            ->  Seq Scan on user_mappings  (cost=0.00..530.80 rows=36780
>>> width=4)
>>>
>>> user_mappings is fairly small:
>>
>> The problem is that for each of the 22634720 rows in "den" a separate
>> uncorrelated subquery (a seq scan on user_mappings) has to be executed.
>> Althogh the subquery is not very expensive, multiplied by the number of
>> rows in "den" the total cost is extreme.
>>
>> The only solution is to get rid of the "not in" subquery - try to turn
>> it
>> to a join like this:
>>
>> SELECT user_id FROM den LEFT JOIN user_mappings ON (user_id = duid)
>> WHERE (timestamp BETWEEN '2009-04-01' AND '2010-04-01')
>>  AND (duid IS NULL)
>>
>> That should give the same result I guess.
>
> This worked great, thank you.  Too bad the planner isn't smart enough
> to do this yet!

It's probably a bit more complicated I guess - there are probably cases
when this would not work this great.

Anyway, you should consider Toby Corkindale's recommendation and check if
there's an index on that timestamp column - that might further improve the
performance. (I don't think an index on user_id might be useful in this
case).

Tomas


Re: Why is this query running slowly?

From
Tom Lane
Date:
Yang Zhang <yanghatespam@gmail.com> writes:
> On Thu, Sep 15, 2011 at 1:22 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
>> [ use a left join instead of NOT IN ]

> This worked great, thank you.  Too bad the planner isn't smart enough
> to do this yet!

It never will be, because they're not equivalent queries.  NOT IN has
different (and very hard-to-optimize) behavior in the face of NULLs in
the right-hand table.

The moral of the story is avoid NOT IN.  You could use NOT EXISTS,
which is less of a notational change than recasting as a left join.

            regards, tom lane

Re: Why is this query running slowly?

From
"Tomas Vondra"
Date:
On 15 Září 2011, 16:26, Tom Lane wrote:
> Yang Zhang <yanghatespam@gmail.com> writes:
>> On Thu, Sep 15, 2011 at 1:22 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
>>> [ use a left join instead of NOT IN ]
>
>> This worked great, thank you.  Too bad the planner isn't smart enough
>> to do this yet!
>
> It never will be, because they're not equivalent queries.  NOT IN has
> different (and very hard-to-optimize) behavior in the face of NULLs in
> the right-hand table.

True, but in this case the subquery selects a PK column (at least I
believe so, we haven't seen the structure), so there should be no NULLs.
So the result should be the same.

Could the planner do this reasoning, i.e. see if there's a NOT NULL
constraint on the column and change the plan accordingly?

Tomas


Re: Why is this query running slowly?

From
Tom Lane
Date:
"Tomas Vondra" <tv@fuzzy.cz> writes:
> Could the planner do this reasoning, i.e. see if there's a NOT NULL
> constraint on the column and change the plan accordingly?

Possibly.  I'm not sure how expensive it would be.  Right now, a
structural change of that sort would have to be done pretty early in
planning, and we don't have a lot of info collected about the tables
at that point.  I'm afraid it would involve a special-purpose catalog
probe, which would add a lot of overhead.

            regards, tom lane