Re: Strange behavior of child table. - Mailing list pgsql-performance

From Jenish
Subject Re: Strange behavior of child table.
Date
Msg-id BANLkTi=QwiEyPHh8u14EA-tGbs=h6D2geA@mail.gmail.com
Whole thread Raw
In response to Strange behavior of child table.  (Jenish <jenishvyas@gmail.com>)
List pgsql-performance
Hi Jeff,

Thanks for the help.

This is the first post by me, and I did mistake unknowingly. I will
take care of it next time.

Again thanks a lot for the help.

--
Thanks & regards,
JENISH VYAS


On Thu, Jun 2, 2011 at 10:04 AM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> In the future, please remember to CC the list when replying unless you
> have a reason not to. This thread is already off-list by now.
>
> Also, I just noticed that this plan has a sort, and the slow query in
> the previous email did not. That looks like it might have been a mistake
> when running the regular EXPLAIN (without ANALYZE), because the slow
> plan does not look correct without a sort. Anyway...
>
> On Thu, 2011-06-02 at 09:23 +0300, Jenish wrote:
> > Hi Jeff,
> >
> > This table is growing rapidly. Now the parent table is taking much
> > more time for the same query. below is the complite details.
>
>
> > "                    ->  Bitmap Heap Scan on game_round_actions_old
> > game_round_actions  (cost=73355.48..7277769.30 rows=2630099 width=65)
> > (actual time=78319.248..302586.235 rows=2304337 loops=1)"
> > "                          Recheck Cond: (table_id = 1)"
> > "                          ->  Bitmap Index Scan on
> > "PK_game_round_actions"  (cost=0.00..72697.95 rows=2630099 width=0)
> > (actual time=78313.095..78313.095 rows=2304337 loops=1)"
> > "                                Index Cond: (table_id = 1)"
>
> That is the part of the plan that is taking time. Compare that to the
> other plan:
>
> > 2)      Child query
> >  explain analyse Select * from game_round_actions_old where table_id =
> > 1 order by table_id,round_id limit 100
> >  "Limit  (cost=0.00..335.97 rows=100 width=65) (actual
> > time=0.035..0.216 rows=100 loops=1)"
> >  "  ->  Index Scan using "PK_game_round_actions" on
> > game_round_actions_old  (cost=0.00..8836452.71 rows=2630099 width=65)
> > (actual time=0.033..0.110 rows=100 loops=1)"
> >  "        Index Cond: (table_id = 1)"
>
> Notice that it's actually using the same index, but the slow plan is
> using a bitmap index scan, and the fast plan is using a normal (ordered)
> index scan.
>
> What's happening is that the top-level query is asking to ORDER BY
> table_id, round_id LIMIT 100. Querying the child table can get that
> order directly from the index, so it scans the index in order, fetches
> only 100 tuples, and then it's done.
>
> But when querying the parent table, it's getting tuples from two tables,
> and so the tuples aren't automatically in the right order to satisfy the
> ORDER BY. So, it's collecting all of the matching tuples, which is about
> 2.6M, then sorting them, then returning the first 100 -- much slower!
>
> A smarter approach is to scan both tables in the correct order
> individually, and merge the results until you get 100 tuples. That would
> make both queries run fast. 9.1 is smart enough to do that, but it's
> still in beta right now.
>
> The only answer right now is to rewrite your slow query to be more like
> the fast one. I think if you manually push down the ORDER BY ... LIMIT,
> it will do the job. Something like:
>
>  select * from
>    (select * from game_round_actions_old
>     where table_id = 1
>     order by table_id,round_id limit 100
>     UNION ALL
>     select * from game_round_actions_new
>     where table_id = 1
>     order by table_id,round_id limit 100)
>  order by table_id,round_id limit 100;
>
> might work. I haven't actually tested that query though.
>
> Regards,
>        Jeff Davis
>
>

pgsql-performance by date:

Previous
From: "Reuven M. Lerner"
Date:
Subject: Re: Speeding up loops in pl/pgsql function
Next
From: "Kevin Grittner"
Date:
Subject: Re: Problem query