Home > mailing lists

Re: Query does not return rows unless a LIMIT statement is used. - Mailing list pgsql-novice

From	Tom Lane
Subject	Re: Query does not return rows unless a LIMIT statement is used.
Date	February 15, 2006 16:01:58
Msg-id	1940.1140022913@sss.pgh.pa.us Whole thread Raw
In response to	Re: Query does not return rows unless a LIMIT statement is used. (Richard Kut <rkut@intelerad.com>)
Responses	Re: Query does not return rows unless a LIMIT statement is used.
List	pgsql-novice

Tree view

Richard Kut <rkut@intelerad.com> writes:
> Here are the results with enable_seqscan = on:

Hmm, there's something pretty odd about your data, it seems like.
In the planner's preferred hash-join plan, it seems to be necessary
to scan through 654414 rows of pv1 to find the first 5324 rows that
join to pid:

>  Limit  (cost=27704.04..28790.59 rows=5324 width=691) (actual
> time=3865.677..17464.071 rows=5324 loops=1)
>    ->  Hash Join  (cost=27704.04..264408.14 rows=1159822 width=691) (actual
> time=3865.674..17453.938 rows=5324 loops=1)
>          Hash Cond: (("outer".patient_id_internal_id)::text =
> ("inner".patient_id_internal_id)::text)
>          ->  Seq Scan on pv1  (cost=0.00..80441.22 rows=1159822 width=407)
> (actual time=4.907..9361.791 rows=654414 loops=1)
                               ^^^^^^^^^^^
>          ->  Hash  (cost=14375.03..14375.03 rows=328403 width=284) (actual
> time=3853.090..3853.090 rows=328403 loops=1)
>                ->  Seq Scan on pid  (cost=0.00..14375.03 rows=328403
> width=284) (actual time=6.198..2658.306 rows=328403 loops=1)
>  Total runtime: 18120.833 ms

This is strange because all the other results look like there's about a
one-to-one correspondence.  In the case that is fast for you, the outer
scan appears to have matched every time:

>  Limit  (cost=0.00..33408.19 rows=5324 width=691) (actual time=0.054..106.972
> rows=5324 loops=1)
>    ->  Nested Loop  (cost=0.00..7277902.27 rows=1159822 width=691) (actual
> time=0.051..99.230 rows=5324 loops=1)
>          ->  Index Scan using patient_id_internal_id_idx on pv1
> (cost=0.00..290598.21 rows=1159822 width=407) (actual time=0.020..14.995
> rows=5324 loops=1)
  ^^^^^^^^^
>          ->  Index Scan using kdx_pid on pid  (cost=0.00..6.01 rows=1
> width=284) (actual time=0.009..0.010 rows=1 loops=5324)
>                Index Cond: ((pid.patient_id_internal_id)::text =
> ("outer".patient_id_internal_id)::text)
>  Total runtime: 111.137 ms

Do you have any idea why the hash case is so unlucky?

BTW, these examples sure look like they are not being done on exactly
the same tables... the names are different.

            regards, tom lane

pgsql-novice by date:

From: Richard Kut
Date: 15 February 2006, 15:35:01
Subject: Re: Query does not return rows unless a LIMIT statement is used.

From: Srinivas Iyyer
Date: 15 February 2006, 16:41:49
Subject: A function to count number of terms - something wrong please help

Re: Query does not return rows unless a LIMIT statement is used. - Mailing list pgsql-novice

Previous

Next