Re: Looks like merge join planning time is too big, 55 seconds - Mailing list pgsql-performance

From Sergey Burladyan
Subject Re: Looks like merge join planning time is too big, 55 seconds
Date
Msg-id 874nb8uif7.fsf@seb.koffice.internal
Whole thread Raw
In response to Re: Looks like merge join planning time is too big, 55 seconds  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: Looks like merge join planning time is too big, 55 seconds
List pgsql-performance
Jeff Janes <jeff.janes@gmail.com> writes:

> On Fri, Aug 2, 2013 at 2:58 AM, Sergey Burladyan <eshkinkot@gmail.com> wrote:
> >
> > PS: I think my main problem is here:
> > select min(user_id) from items;
> >  min
> > -----
> >    1
> > (1 row)
> >
> > Time: 504.520 ms
>
> That is a long time, but still 100 fold less than the planner is taking.
>
> What about max(user_id)?

max is good, only rows with user_id = 0 was updated:

select max(user_id) from items;
Time: 59.646 ms

> > also, i cannot reindex it concurrently now, because it run autovacuum: VACUUM ANALYZE public.items (to prevent
wraparound)
>
> That is going to take a long time if you have the cost settings at
> their defaults.

Yes, I have custom setting, more slow, it will last about a week.

> But why is it bloated in this way?

Don't known. It has been updated many items last week. ~ 10% of table.

> It must be visiting many thousands of dead/invisible rows before
> finding the first visible one.  But, Btree index have a mechanism to
> remove dead tuples from indexes, so it doesn't follow them over and
> over again (see "kill_prior_tuple").  So is that mechanism not
> working, or are the tuples not dead but just invisible (i.e. inserted
> by a still open transaction)?

It is deleted, but VACUUM still not completed.

BTW, it is standby server, and it query plan (block read) is very
different from master:

Hot standby:

explain (analyze,verbose,buffers) select min(user_id) from items;

'Result  (cost=0.12..0.13 rows=1 width=0) (actual time=56064.514..56064.514 rows=1 loops=1)'
'  Output: $0'
'  Buffers: shared hit=3694164 read=6591224 written=121652'
'  InitPlan 1 (returns $0)'
'    ->  Limit  (cost=0.00..0.12 rows=1 width=8) (actual time=56064.502..56064.503 rows=1 loops=1)'
'          Output: public.items.user_id'
'          Buffers: shared hit=3694164 read=6591224 written=121652'
'          ->  Index Only Scan using items_user_id_idx on public.items  (cost=0.00..24165743.48 rows=200673143 width=8)
(actualtime=56064.499..56064.499 rows=1 loops=1)' 
'                Output: public.items.user_id'
'                Index Cond: (public.items.user_id IS NOT NULL)'
'                Heap Fetches: 8256426'
'                Buffers: shared hit=3694164 read=6591224 written=121652'
'Total runtime: 56064.571 ms'

Master:

'Result  (cost=0.12..0.13 rows=1 width=0) (actual time=202.759..202.759 rows=1 loops=1)'
'  Output: $0'
'  Buffers: shared hit=153577 read=1'
'  InitPlan 1 (returns $0)'
'    ->  Limit  (cost=0.00..0.12 rows=1 width=8) (actual time=202.756..202.757 rows=1 loops=1)'
'          Output: public.items.user_id'
'          Buffers: shared hit=153577 read=1'
'          ->  Index Only Scan using items_user_id_idx on public.items  (cost=0.00..24166856.02 rows=200680528 width=8)
(actualtime=202.756..202.756 rows=1 loops=1)' 
'                Output: public.items.user_id'
'                Index Cond: (public.items.user_id IS NOT NULL)'
'                Heap Fetches: 0'
'                Buffers: shared hit=153577 read=1'
'Total runtime: 202.786 ms'

And from backup, before index|heap bloated :)

 Result  (cost=0.87..0.88 rows=1 width=0) (actual time=16.002..16.003 rows=1 loops=1)
   Output: $0
   Buffers: shared hit=3 read=4
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.00..0.87 rows=1 width=8) (actual time=15.993..15.995 rows=1 loops=1)
           Output: public.items.user_id
           Buffers: shared hit=3 read=4
           ->  Index Only Scan using items_user_id_idx on public.items  (cost=0.00..169143085.72 rows=193309210
width=8)(actual time=15.987..15.987 rows=1 loops=1) 
                 Output: public.items.user_id
                 Index Cond: (public.items.user_id IS NOT NULL)
                 Heap Fetches: 1
                 Buffers: shared hit=3 read=4
 Total runtime: 16.057 ms


pgsql-performance by date:

Previous
From: Scott Marlowe
Date:
Subject: Re: subselect requires offset 0 for good performance.
Next
From: Alvaro Herrera
Date:
Subject: Re: Looks like merge join planning time is too big, 55 seconds