Re: Index scan cost calculation - Mailing list pgsql-performance

From Glyn Astill
Subject Re: Index scan cost calculation
Date
Msg-id 1037814983.18183568.1448555675086.JavaMail.yahoo@mail.yahoo.com
Whole thread Raw
In response to Index scan cost calculation  (Glyn Astill <glynastill@yahoo.co.uk>)
List pgsql-performance
----- Original Message -----

> From: Glyn Astill <glynastill@yahoo.co.uk>
> To: Pgsql-performance <pgsql-performance@postgresql.org>
> Sent: Thursday, 26 November 2015, 16:11
> Subject: [PERFORM] Index scan cost calculation
>
> Hi All,
>
> Using pg 9.4.5 I'm looking at a table set up by a 3rd party application and
> trying to figure out why a particular index is being chosen over another for
> updates/deletes.
>
> From what I can see the reason is that plans using either index have the same
> exactly the same cost.  So rather I'm asking if there's something
> glaringly obvious I'm missing, or is there anything I can to to get better
> estimates.
>
> The table is as follows and has  ~ 50M rows, ~ 4.5GB in size:
>
> CREATE TABLE tickets.seats
> (
>   recnum serial NOT NULL,
>   show numeric(8,0) NOT NULL,
>   type numeric(4,0) NOT NULL,
>   block character varying(8) NOT NULL,
>   "row" numeric(14,0) NOT NULL,
>   seat numeric(8,0) NOT NULL,
>   flag character varying(15) NOT NULL,
>   transno numeric(8,0) NOT NULL,
>   best numeric(4,0) NOT NULL,
>   "user" character varying(15) NOT NULL,
>   "time" numeric(10,0) NOT NULL,
>   date date NOT NULL,
>   date_reserved timestamp NOT NULL
> );
>
> Indexes:
>   "seats_index01" PRIMARY KEY, btree (show, type, best, block,
> "row", seat)              // (1094 MB)
>   "seats_index00" UNIQUE, btree (recnum)
>                     // (2423 MB)
>   "seats_index02" UNIQUE, btree (show, type, best, block, flag,
> "row", seat, recnum)      // (2908 MB)

>


^^ If those first two sizes look wrong, it's because they are; they should be the other way around.

> default_statistics target is 100, and the following columns are non-default:
>
> attname | attstattarget
> --------+---------------
> show      |          1000
> type       |          1000
> block    |          2000
> row        |          1000
> seat       |          1000
> flag       |          1000
> best       |          1000
>
> Increasing these further appears to make no noticeable difference. (pg_stats
> here for these columns here: http://pastebin.com/2WQQec7N)
>
> An example query below shows that in some cases the seats_index02 index is being
> chosen:
>
> # analyze verbose seats;
> INFO:  analyzing "tickets.seats"
> INFO:  "seats": scanned 593409 of 593409 pages, containing 50926456
> live rows and 349030 dead rows; 600000 rows in sample, 50926456 estimated total
> rows
>
> # begin;
> BEGIN
> # explain analyze delete from seats where ("show" = 58919 AND
> "type" = 1 AND "best" = 10 AND "block" =
> 'GMA' AND "row" =26 AND "seat" = 15);
> QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Delete on seats  (cost=0.56..4.59 rows=1 width=6) (actual time=0.480..0.480
> rows=0 loops=1)
> ->  Index Scan using seats_index02 on seats  (cost=0.56..4.59 rows=1 width=6)
> (actual time=0.452..0.453 rows=1 loops=1)
> Index Cond: ((show = 58919::numeric) AND (type = 1::numeric) AND (best =
> 10::numeric) AND ((block)::text = 'GMA'::text) AND ("row" =
> 26::numeric) AND (seat = 15::numeric))
> Planning time: 2.172 ms
> Execution time: 0.531 ms
> (5 rows)
>
> But from my naive standpoint, seats_index01 is a better candidate:
>
> # abort; begin;
> ROLLBACK
> BEGIN
>
> # update pg_index set indisvalid = false where indexrelid =
> 'seats_index02'::regclass;
> # explain analyze delete from seats where ("show" = 58919 AND
> "type" = 1 AND "best" = 10 AND "block" =
> 'GMA' AND "row" =26 AND "seat" = 15);
> QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Delete on seats  (cost=0.56..4.59 rows=1 width=6) (actual time=0.103..0.103
> rows=0 loops=1)
> ->  Index Scan using seats_index01 on seats  (cost=0.56..4.59 rows=1 width=6)
> (actual time=0.078..0.080 rows=1 loops=1)
> Index Cond: ((show = 58919::numeric) AND (type = 1::numeric) AND (best =
> 10::numeric) AND ((block)::text = 'GMA'::text) AND ("row" =
> 26::numeric) AND (seat = 15::numeric))
> Planning time: 0.535 ms
> Execution time: 0.146 ms
> (5 rows)
>
>
> In this instance, the time difference is not huge, however in some seemingly
> random cases where there are a lot of rows with only the "seat" column
> differing the choice of seats_index02 is much larger ~ 70ms vs 0.something ms
> with seats_index01
>
> I suspect some of the seemingly random cases could be where there's been an
> update, followed by a delete since the last analyze, despite auto analyze
> running fairly frequently.
>
> Any suggestions appreciated.
>
> Thanks
> Glyn
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


pgsql-performance by date:

Previous
From: Glyn Astill
Date:
Subject: Index scan cost calculation
Next
From: Tom Lane
Date:
Subject: Re: Index scan cost calculation