Thread: Differences in identical queries

Differences in identical queries

From
Rob Schall
Date:
Question for anyone...

I have to queries. One runs in about 2 seconds. The other takes upwards
of 2 minutes. I have a temp table that is created with 2 columns. This
table is joined with the larger database of call detail records.
However, these 2 queries are handled very differently.

The queries:
First----

calldetail=> EXPLAIN SELECT current.* FROM current JOIN anitmp ON
current.destnum=anitmp.ani AND istf=true;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..2026113.09 rows=500908 width=108)
   ->  Seq Scan on anitmp  (cost=0.00..33.62 rows=945 width=8)
         Filter: (istf = true)
   ->  Index Scan using i_destnum on current  (cost=0.00..2137.36
rows=531 width=108)
         Index Cond: (current.destnum = "outer".ani)
(5 rows)

Second----
calldetail=> EXPLAIN SELECT current.* FROM current JOIN anitmp ON
current.orignum=anitmp.ani AND istf=false;
                                QUERY PLAN
---------------------------------------------------------------------------
 Hash Join  (cost=35.99..3402035.53 rows=5381529 width=108)
   Hash Cond: ("outer".orignum = "inner".ani)
   ->  Seq Scan on current  (cost=0.00..907191.05 rows=10170805 width=108)
   ->  Hash  (cost=33.62..33.62 rows=945 width=8)
         ->  Seq Scan on anitmp  (cost=0.00..33.62 rows=945 width=8)
               Filter: (istf = false)
(6 rows)


The tables:
               Table "public.current"
  Column  |            Type             | Modifiers
----------+-----------------------------+-----------
 datetime | timestamp without time zone |
 orignum  | bigint                      |
 destnum  | bigint                      |
 billto   | bigint                      |
 cost     | numeric(6,4)                |
 duration | numeric(8,1)                |
 origcity | character(12)               |
 destcity | character(12)               |
 file     | character varying(30)       |
 linenum  | integer                     |
 carrier  | character(1)                |
Indexes:
    "i_destnum" btree (destnum)
    "i_orignum" btree (orignum)


    Table "public.anitmp"
 Column |  Type   | Modifiers
--------+---------+-----------
 ani    | bigint  |
 istf   | boolean |


Anyone have any ideas for me? I have indexes on each of the necessary
columns.

Rob

Re: Differences in identical queries

From
Richard Huxton
Date:
Rob Schall wrote:
> Question for anyone...
>
> I have to queries. One runs in about 2 seconds. The other takes upwards
> of 2 minutes. I have a temp table that is created with 2 columns. This
> table is joined with the larger database of call detail records.
> However, these 2 queries are handled very differently.

1. They're different queries - the second is expecting 10 times as many
rows as the first.
2. Can't tell if that is accurate - you need to supply EXPLAIN ANALYSE
output instead of EXPLAIN, so we can see what actually happened.

> calldetail=> EXPLAIN SELECT current.* FROM current JOIN anitmp ON
> current.destnum=anitmp.ani AND istf=true;
>                                       QUERY PLAN
> --------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.00..2026113.09 rows=500908 width=108)

> calldetail=> EXPLAIN SELECT current.* FROM current JOIN anitmp ON
> current.orignum=anitmp.ani AND istf=false;
> ---------------------------------------------------------------------------
>  Hash Join  (cost=35.99..3402035.53 rows=5381529 width=108)


--
   Richard Huxton
   Archonet Ltd

Re: Differences in identical queries

From
Rob Schall
Date:
When I reanalyzed the anitmp table with just the 4 entries (2 with
istf=true and 2 with istf=false), both queries then ran the same way/time.
So it would appear, if you want to do a join or a subselect (IN), then
the number of items if will be comparing it to must be less than a
certain number. In my case, the full amount of false's that fail is
actually 40. So if you have a table of 2 million items (in current) and
want to join it to a table of 40 items, then it must do the longer
hashed join version and not the nested loop.

Below are the results with the anitmp table with 42 items. 40 set as
false, and 2 set as true. Is there a way to rewrite my query to have it
run as quick as the others?

Thanks



calldetail=> EXPLAIN ANALYZE SELECT current.* FROM current JOIN anitmp
ON current.orignum=anitmp.ani AND istf=false;
                                                            QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1.63..1031833.26 rows=256563 width=108) (actual
time=1889.469..155380.749 rows=653 loops=1)
   Hash Cond: ("outer".orignum = "inner".ani)
   ->  Seq Scan on current  (cost=0.00..920527.00 rows=10873900
width=108) (actual time=670.402..136192.991 rows=10681150 loops=1)
   ->  Hash  (cost=1.52..1.52 rows=41 width=8) (actual time=0.187..0.187
rows=0 loops=1)
         ->  Seq Scan on anitmp  (cost=0.00..1.52 rows=41 width=8)
(actual time=0.014..0.108 rows=40 loops=1)
               Filter: (istf = false)
 Total runtime: 155381.960 ms
(7 rows)

calldetail=> EXPLAIN ANALYZE SELECT current.* FROM current JOIN anitmp
ON current.destnum=anitmp.ani AND istf=true;
                                                             QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..5718.45 rows=1413 width=108) (actual
time=76.116..1545.856 rows=1382 loops=1)
   ->  Seq Scan on anitmp  (cost=0.00..1.52 rows=3 width=8) (actual
time=0.025..0.042 rows=2 loops=1)
         Filter: (istf = true)
   ->  Index Scan using i_destnum on current  (cost=0.00..1899.74
rows=472 width=108) (actual time=60.133..769.442 rows=691 loops=2)
         Index Cond: (current.destnum = "outer".ani)
 Total runtime: 1548.321 ms
(6 rows)

Re: Differences in identical queries

From
Richard Huxton
Date:
Rob Schall wrote:
> When I reanalyzed the anitmp table with just the 4 entries (2 with
> istf=true and 2 with istf=false), both queries then ran the same way/time.
> So it would appear, if you want to do a join or a subselect (IN), then
> the number of items if will be comparing it to must be less than a
> certain number.

I'm still not sure how you're concluding that.

 > In my case, the full amount of false's that fail is
> actually 40. So if you have a table of 2 million items (in current) and
> want to join it to a table of 40 items, then it must do the longer
> hashed join version and not the nested loop.
>
> Below are the results with the anitmp table with 42 items. 40 set as
> false, and 2 set as true. Is there a way to rewrite my query to have it
> run as quick as the others?

> calldetail=> EXPLAIN ANALYZE SELECT current.* FROM current JOIN anitmp
> ON current.orignum=anitmp.ani AND istf=false;
>                                                             QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------------
>  Hash Join  (cost=1.63..1031833.26 rows=256563 width=108) (actual
> time=1889.469..155380.749 rows=653 loops=1)

OK - so here's the root of the problem. The planner thinks it'll get
back 256,563 rows but actually gets 653. If you actually got more than
200,000 rows back then a seq-scan on current might well make sense.

>    Hash Cond: ("outer".orignum = "inner".ani)
>    ->  Seq Scan on current  (cost=0.00..920527.00 rows=10873900
> width=108) (actual time=670.402..136192.991 rows=10681150 loops=1)
>    ->  Hash  (cost=1.52..1.52 rows=41 width=8) (actual time=0.187..0.187
> rows=0 loops=1)
>          ->  Seq Scan on anitmp  (cost=0.00..1.52 rows=41 width=8)
> (actual time=0.014..0.108 rows=40 loops=1)
>                Filter: (istf = false)

Hmm - what sort of distribution of values do you have in "orignum" -
telephone numbers, so presumably they're quite distinct.

I'd be tempted to up the statistics on that column, reanalyse both
tables and see what happens.
   ALTER TABLE current ALTER COLUMN orignum SET STATISTICS=100;
You can set values up to 1000, start at 100 and step up. Not checked the
syntax on that last statement btw.

You can see the before and after effects by looking at:
SELECT * FROM pg_stats WHERE tablename='current' AND attname='orignum';

--
   Richard Huxton
   Archonet Ltd

Re: Differences in identical queries

From
Richard Huxton
Date:
Rob Schall wrote:
> I noticed the rows bit. I wasn't quite sure why it thought it would get
> so many back. Even after doing a analyze. I will alter the table stats
> and see if that helps.
>
> As for the values in orignum, they are all 10 digit numbers
> (nxx,npa,num). I probably could have stored them as a char, etc, but I
> didn't think that would help the cause at all. They are all over the
> board as far what the numbers are set to, but long story short, they are
> call detail numbers. So if they are our customer, then one of the
> numbers i tossed in anitmp (one of the customer's numbers) could have
> hundreds of entries. Not sure if this is what you were asking me.

Well, it must be thinking it'll get a lot of matches for your 40 rows,
so increasing the stats might well help.

--
   Richard Huxton
   Archonet Ltd

Re: Differences in identical queries

From
Rob Schall
Date:
The final answer wound up being....
  ALTER TABLE current ALTER COLUMN orignum SET STATISTICS 100;
and then an ANALYZE current;

Then the queries started running like a champ (split seconds, not minutes).

Thanks Richard!
Rob


Richard Huxton wrote:
> Rob Schall wrote:
>> When I reanalyzed the anitmp table with just the 4 entries (2 with
>> istf=true and 2 with istf=false), both queries then ran the same
>> way/time.
>> So it would appear, if you want to do a join or a subselect (IN), then
>> the number of items if will be comparing it to must be less than a
>> certain number.
>
> I'm still not sure how you're concluding that.
>
> > In my case, the full amount of false's that fail is
>> actually 40. So if you have a table of 2 million items (in current) and
>> want to join it to a table of 40 items, then it must do the longer
>> hashed join version and not the nested loop.
>>
>> Below are the results with the anitmp table with 42 items. 40 set as
>> false, and 2 set as true. Is there a way to rewrite my query to have it
>> run as quick as the others?
>
>> calldetail=> EXPLAIN ANALYZE SELECT current.* FROM current JOIN anitmp
>> ON current.orignum=anitmp.ani AND istf=false;
>>                                                             QUERY PLAN
>>
-----------------------------------------------------------------------------------------------------------------------------------
>>
>>  Hash Join  (cost=1.63..1031833.26 rows=256563 width=108) (actual
>> time=1889.469..155380.749 rows=653 loops=1)
>
> OK - so here's the root of the problem. The planner thinks it'll get
> back 256,563 rows but actually gets 653. If you actually got more than
> 200,000 rows back then a seq-scan on current might well make sense.
>
>>    Hash Cond: ("outer".orignum = "inner".ani)
>>    ->  Seq Scan on current  (cost=0.00..920527.00 rows=10873900
>> width=108) (actual time=670.402..136192.991 rows=10681150 loops=1)
>>    ->  Hash  (cost=1.52..1.52 rows=41 width=8) (actual time=0.187..0.187
>> rows=0 loops=1)
>>          ->  Seq Scan on anitmp  (cost=0.00..1.52 rows=41 width=8)
>> (actual time=0.014..0.108 rows=40 loops=1)
>>                Filter: (istf = false)
>
> Hmm - what sort of distribution of values do you have in "orignum" -
> telephone numbers, so presumably they're quite distinct.
>
> I'd be tempted to up the statistics on that column, reanalyse both
> tables and see what happens.
>   ALTER TABLE current ALTER COLUMN orignum SET STATISTICS=100;
> You can set values up to 1000, start at 100 and step up. Not checked
> the syntax on that last statement btw.
>
> You can see the before and after effects by looking at:
> SELECT * FROM pg_stats WHERE tablename='current' AND attname='orignum';
>