Re: partitioning query planner almost always scans all tables - Mailing list pgsql-general

From Kyotaro HORIGUCHI
Subject Re: partitioning query planner almost always scans all tables
Date
Msg-id 20150122.200730.118328274.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: partitioning query planner almost always scans all tables  (Spiros Ioannou <sivann@inaccess.com>)
Responses Re: partitioning query planner almost always scans all tables  (Spiros Ioannou <sivann@inaccess.com>)
List pgsql-general
Hi,

> @Kyotaro HORIGUCHI
> thanks for your reply and time Kyotaro,

Not at all.

> Using the following query
> EXPLAIN ANALYZE  SELECT * FROM measurement_events WHERE
> measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' AND
> measurement_time >= '2015-01-01 00:00:00+0' LIMIT 1;
>
> produces this plan:
>
>  Limit  (cost=0.00..4.02 rows=1 width=67) (actual time=49.125..49.125
> rows=1 loops=1)
>    ->  Append  (cost=0.00..3644.05 rows=907 width=67) (actual
> time=49.122..49.122 rows=1 loops=1)
>          ->  Seq Scan on measurement_events  (cost=0.00..0.00 rows=1
> width=966) (actual time=0.003..0.003 rows=0 loops=1)
>                Filter: ((measurement_time >= '2015-01-01
> 02:00:00+02'::timestamp with time zone) AND (measurement_source_id =
> 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid))
>          ->  Bitmap Heap Scan on *measurement_events_p2015_01*
>  (cost=41.73..3546.10 rows=894 width=54) (actual time=49.119..49.119 rows=1
> loops=1)
>                Recheck Cond: ((measurement_source_id =
> 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >=
> '2015-01-01 02:00:00+02'::timestamp with time zone))
>                ->  Bitmap Index Scan on *measurement_events_p2015_01_pkey*
>  (cost=0.00..41.51 rows=894 width=0) (actual time=41.836..41.836 rows=997
> loops=1)
>                      Index Cond: ((measurement_source_id =
> 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >=
> '2015-01-01 02:00:00+02'::timestamp with time zone))
>          ->  Index Scan using *measurement_events_p2015_02_pkey* on
> *measurement_events_p2015_02*  (cost=0.14..8.16 rows=1 width=966) (never
> executed)
>                Index Cond: ((measurement_source_id =
> 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >=
> '2015-01-01 02:00:00+02'::timestamp with time zone))
>          ->  Index Scan using *measurement_events_p2015_03_pkey* on
> *measurement_events_p2015_03*  (cost=0.14..8.16 rows=1 width=966) (never
> executed)
>                Index Cond: ((measurement_source_id =
> 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >=
> '2015-01-01 02:00:00+02'::timestamp with time zone))
> ... (cut for brevity)
>
> 1) Do you know if this means that the query will stop on 1st find (since it
> is limit 1), or will it search all tables regardless results?

It saids that only the first table was scanned because 1 row had
been acquired. But it is unclear whether constraint exclusion
worked. All of the table shown above seems to have the data after
2015/1/1 and match the condition of your query. It's okay if
p_2014_12 and the earlier is not seen in the explain result and
the order in which the tables appear seems to me suggesting it is
okay. Please examine it on that ponit.

> 2) To improve on the above, do you (or anyone else) have any input on this:
>
> to get the latest value from all tables, we were using the following query
> (before partitioning):
>
> EXPLAIN ANALYZE  SELECT * FROM measurement_events WHERE
> measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' ORDER BY
> measurement_time DESC LIMIT 1;
>
> This seems to fail, scanning all tables. Do you think this can be improved
> at all ? The query plan of the above query is as follows:

The combination of sorting by measurement_source_id and limit
hinders constraint exclusion because the order of the column
across whole the inheritance is not known to planner. And the
below plan also dosn't show whether constraint exclusion worked
or not, by the same reason. But I suppose it worked.

Since constraint exclusion worked, it seems enough optmized. What
kind of optimizaition do you expect?


>
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=5.57..9.99 rows=1 width=921) (actual time=5.361..5.361 rows=1
> loops=1)
>    ->  Merge Append  (cost=5.57..451374.16 rows=102155 width=921) (actual
> time=5.359..5.359 rows=1 loops=1)
>          Sort Key: measurement_events.measurement_time
>          ->  Index Scan Backward using measurement_events_pkey on
> *measurement_events*  (cost=0.12..8.14 rows=1 width=966) (actual
> time=0.004..0.004 rows=0 loops=1)
>                Index Cond: (measurement_source_id =
> 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid)
>          ->  Index Scan Backward using *measurement_events_p2014_01_pkey*
> on *measurement_events_p2014_01*  (cost=0.14..8.16 rows=1 width=966)
> (actual time=0.002..0.002 rows=0 loops=1)
>                Index Cond: (measurement_source_id =
> 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid)
>          ->  Index Scan Backward using *measurement_events_p2014_02_pkey*
> on *measurement_events_p2014_02*  (cost=0.14..8.16 rows=1 width=966)
> (actual time=0.001..0.001 rows=0 loops=1)
>                Index Cond: (measurement_source_id =
> 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid)
>          ->  Index Scan Backward using *measurement_events_p2014_03_pkey*
> on *measurement_events_p2014_03*  (cost=0.14..8.16 rows=1 width=966)
> (actual time=0.001..0.001 rows=0 loops=1)
>                Index Cond: (measurement_source_id =
> 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid)
>
> ............. (cut for brevity)

--
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-general by date:

Previous
From: "robertlazarski ."
Date:
Subject: implicit cast works for insert, not for select
Next
From: Spiros Ioannou
Date:
Subject: Re: partitioning query planner almost always scans all tables