Re: Query RE: Optimising UUID Lookups - Mailing list pgsql-performance

From Maxim Boguk
Subject Re: Query RE: Optimising UUID Lookups
Date
Msg-id CAK-MWwQgu5=W1_X1u0bsKd3hhucFAw46_AKTwmq-GpVif+qvxQ@mail.gmail.com
Whole thread Raw
In response to Re: Query RE: Optimising UUID Lookups  (Roland Dunn <roland.dunn@gmail.com>)
List pgsql-performance

(1) SELECT uuid FROM lookup WHERE state = 200 LIMIT 4000;

OUTPUT FROM EXPLAIN (ANALYZE, BUFFERS):
------------------------------------------------
 Limit  (cost=0.00..4661.02 rows=4000 width=16) (actual
time=0.009..1.036 rows=4000 loops=1)
   Buffers: shared hit=42
   ->  Seq Scan on lookup  (cost=0.00..1482857.00 rows=1272559
width=16) (actual time=0.008..0.777 rows=4000 loops=1)
         Filter: (state = 200)
         Rows Removed by Filter: 410
         Buffers: shared hit=42
 Total runtime: 1.196 ms
(7 rows)

Question: Why does this do a sequence scan and not an index scan when
there is a btree on state?

very likely that state=200 is very common value in the table
so seq scan of few pages (42 to be exact) is faster than performing index scan.​

 
(2) SELECT article_data.id, article_data.uuid, article_data.title,
article_data.text FROM article_data WHERE uuid = ANY
('{f0d5e665-4f21-4337-a54b-cf0b4757db65,..... 3999 more uuid's
....}'::uuid[]);


OUTPUT FROM EXPLAIN (ANALYZE, BUFFERS):
------------------------------------------------
 Index Scan using article_data_uuid_key on article_data
(cost=5.56..34277.00 rows=4000 width=581) (actual time=0.063..66029.031 rows=4000 loops=1)
   Index Cond: (uuid = ANY
(
​'...'
::uuid[]))
   Buffers: shared hit=16060
​​
read=4084 dirtied=292
 Total runtime: 66041.443 ms Question: 
​>>​
 Why is this so slow, even though it's reading from disk?


As I already suggested enable track_io_timing in the database and use explain (analyze, costs, buffer, timing)
to see how much exactly time had been spent during IO operations.

The time requred for single random IO operation for common HDD's are around 10ms, so reading ​read=4084 pages could easily took 60seconds especially if some other IO activity exist on the server.

--
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com
МойКруг: http://mboguk.moikrug.ru/

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."


pgsql-performance by date:

Previous
From: Roland Dunn
Date:
Subject: Re: Query RE: Optimising UUID Lookups
Next
From: Feike Steenbergen
Date:
Subject: Index only scan sometimes switches to sequential scan for small amount of rows