Thread: Surprised by index choice for count(*)
Should I be? I would have thought the pk would have been chosen v. function index? explain analyse select count(*) from bc.segment s; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- Finalize Aggregate (cost=4428009.24..4428009.25 rows=1 width=8) (actual time=14786.395..14786.395 rows=1 loops=1) -> Gather (cost=4428009.03..4428009.24 rows=2 width=8) (actual time=14786.358..14786.386 rows=3 loops=1) Workers Planned: 2 Workers Launched: 2 -> Partial Aggregate (cost=4427009.03..4427009.04 rows=1 width=8) (actual time=14782.167..14782.167 rows=1 loops=3) -> Parallel Index Only Scan using fpv on segment s (cost=0.57..4210177.14 rows=86732753 width=0) (actual time=0.061..11352.855 rows=69386204 loops=3) Heap Fetches: 1780 Planning time: 0.221 ms Execution time: 14815.939 ms (9 rows) \d bc.segment Table "bc.segment" Column | Type | Modifiers ----------------+---------+-------------------- id | uuid | not null chrom | integer | not null markerset_id | uuid | not null probandset_id | uuid | not null startbase | integer | not null endbase | integer | not null firstmarker | integer | not null lastmarker | integer | not null events_less | bigint | not null default 0 events_equal | bigint | not null default 0 events_greater | bigint | not null default 0 Indexes: "segment_pkey" PRIMARY KEY, btree (id) "useg" UNIQUE, btree (probandset_id, chrom, startbase, endbase) "fpv" btree (pv(events_less, events_equal, events_greater, 0)) "segment_markerset_id_probandset_id_idx" btree (markerset_id, probandset_id) create or replace function public.pv(l bigint, e bigint, g bigint, o int) returns float as $$ select 1.0*(g+e+o)/(l+e+g+o)::float; $$ language sql ;
Should I be? I would have thought the pk would have been chosen v. function index?
Indexes:
"segment_pkey" PRIMARY KEY, btree (id)
"useg" UNIQUE, btree (probandset_id, chrom, startbase, endbase)
"fpv" btree (pv(events_less, events_equal, events_greater, 0))
"segment_markerset_id_probandset_id_idx" btree (markerset_id, probandset_id)
Haven't looked at the code but among those 4 indexes the one that was chosen is the one that comes first alphabetically. Ideally it would avoid multi-column indexes since they are larger; and I believe that a float is smaller than a uuid so not only alphabetically but the fpv index wins on size too.
That the index is functional is immaterial here.
David J.
Rob Sargent <robjsargent@gmail.com> writes: > Should I be? I would have thought the pk would have been chosen v. > function index? If I'm reading this correctly, the PK index contains uuids while the fpv index contains float4s, meaning the latter is probably half the size. So scanning it is a lot cheaper, at least according to the planner's cost model. regards, tom lane
Thank you both. Simple, as expected. And I’m easily surprised. Version 10 (perhaps obviously) for those scoring at home. > On May 1, 2018, at 10:11 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Rob Sargent <robjsargent@gmail.com> writes: >> Should I be? I would have thought the pk would have been chosen v. >> function index? > > If I'm reading this correctly, the PK index contains uuids while > the fpv index contains float4s, meaning the latter is probably half > the size. So scanning it is a lot cheaper, at least according to > the planner's cost model. > > regards, tom lane