Home > mailing lists

Re: query planner not using index, instead using squential scan - Mailing list pgsql-performance

From	Vijaykumar Jain
Subject	Re: query planner not using index, instead using squential scan
Date	June 5, 2021 22:52:06
Msg-id	CAM+6J94s6ii8UW_9jNtzAUz=ZuGoTO+DMztvvgoA=H_Q_LWq_w@mail.gmail.com Whole thread Raw
In response to	Re: query planner not using index, instead using squential scan (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: query planner not using index, instead using squential scan (Ayub Khan <ayub.hp@gmail.com>)
List	pgsql-performance

Tree view

thanks Tom.

I was trying to simulate some scenarios to be able to explain how the plan would change with/without

Rows Removed by Filter: 73969 -- by using a different/correct index.

postgres=# \d t

Table "public.t"

Column | Type | Collation | Nullable | Default

------------+-----------------------------+-----------+----------+---------

id | integer | | not null |

created_on | timestamp without time zone | | |

col1 | text | | |

Indexes:

"t_pkey" PRIMARY KEY, btree (id)

"t_created_on_idx" btree (created_on) WHERE col1 = 'a'::text --- useless index as all rows have col1 = 'a', but to attempt lossy case

"t_created_on_idx1" btree (created_on)

Referenced by:

TABLE "t1" CONSTRAINT "t1_id_fkey" FOREIGN KEY (id) REFERENCES t(id)

postgres=# \d t1

Table "public.t1"

Column | Type | Collation | Nullable | Default

--------+---------+-----------+----------+---------

t1_id | integer | | not null |

id | integer | | |

col2 | text | | |

Indexes:

"t1_pkey" PRIMARY KEY, btree (t1_id)

Foreign-key constraints:

"t1_id_fkey" FOREIGN KEY (id) REFERENCES t(id)

postgres=# update t set col1 = 'a';

UPDATE 1000

postgres=# explain analyze select 1 from t join t1 on (t.id = t1.id) where created_on = '2021-06-01 12:48:45.141123';

QUERY PLAN

--------------------------------------------------------------------------------------------------------

Hash Join (cost=37.01..39.28 rows=1 width=4) (actual time=0.124..0.125 rows=0 loops=1)

Hash Cond: (t1.id = t.id)

-> Seq Scan on t1 (cost=0.00..2.00 rows=100 width=4) (actual time=0.004..0.008 rows=100 loops=1)

-> Hash (cost=37.00..37.00 rows=1 width=4) (actual time=0.109..0.109 rows=1 loops=1)

Buckets: 1024 Batches: 1 Memory Usage: 9kB

-> Seq Scan on t (cost=0.00..37.00 rows=1 width=4) (actual time=0.058..0.107 rows=1 loops=1)

Filter: (created_on = '2021-06-01 12:48:45.141123'::timestamp without time zone)

Rows Removed by Filter: 999 --- as no useful index, t_created_on_idx will fetch all pages and then remove rows from them, expensive

Planning Time: 0.111 ms

Execution Time: 0.162 ms

(10 rows)

postgres=# explain analyze select 1 from t join t1 on (t.id = t1.id) where created_on = '2021-06-01 12:48:45.141123';

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------

Hash Join (cost=8.32..33.47 rows=1 width=4) (actual time=0.025..0.026 rows=0 loops=1)

Hash Cond: (t1.id = t.id)

-> Seq Scan on t1 (cost=0.00..22.00 rows=1200 width=4) (actual time=0.009..0.009 rows=1 loops=1)

-> Hash (cost=8.31..8.31 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=1)

Buckets: 1024 Batches: 1 Memory Usage: 8kB

-> Index Scan using t_created_on_idx1 on t (cost=0.29..8.31 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=1)

Index Cond: (created_on = '2021-06-01 12:48:45.141123'::timestamp without time zone) -- exact match using btree index,

Planning Time: 0.255 ms

Execution Time: 0.071 ms

(9 rows)

but from Ayub's plan, the number of rows fetched are a lot, but is also removing rows post index scan.

if that can be improved with a btree index that does not filter unwanted rows, the run may be faster ?

but i guess if there are 156k rows, planner would a have found a win in seq scan.

Ayub,

just for the sake of understanding,

can you run the query using

postgres=# set enable_seqscan TO 0;

SET

postgres=# -- explain analyze <run the query>

postgres=# set enable_seqscan TO 1;

SET

On Sun, 6 Jun 2021 at 00:59, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ayub Khan <ayub.hp@gmail.com> writes:
> could someone clarify why the LEFT JOIN order_offer_map oom using
> (order_id) in the below query is using sequential scan instead of
> using index on order_id which is defined in order_offer_map table.

Probably because it estimates the hash join to restaurant_order is
faster than a nestloop join would be. I think it's likely right.
You'd need very optimistic assumptions about the cost of an
individual index probe into order_offer_map to conclude that 156K
of them would be faster than the 476ms that are being spent here
to read order_offer_map and join it to the result of the
indexscan on restaurant_order.

If, indeed, that *is* faster on your hardware, you might want
to dial down random_page_cost to get more-relevant estimates.

regards, tom lane

Thanks,

Vijay

Mumbai, India

pgsql-performance by date:

From: Tom Lane
Date: 05 June 2021, 22:29:30
Subject: Re: query planner not using index, instead using squential scan

From: Ayub Khan
Date: 06 June 2021, 00:14:18
Subject: Re: query planner not using index, instead using squential scan

Re: query planner not using index, instead using squential scan - Mailing list pgsql-performance

Previous

Next