Thread: Understanding EXPLAIN
I am trying to fully understand, how costs for queries are computed. Taking the following example: CREATE TABLE test (name varchar(250) primary key) ; INSERT INTO test (name) VALUES(generate_series(1, 1000)::text) ; ANALYZE test ; EXPLAIN SELECT * FROM test WHERE name = '4' ; I am getting the output: Index Scan using test_pkey on test (cost=0.00..8.27 rows=1 width=3) Index Cond: ((name)::text = '4'::text) The server has default cost parameters The value I want to understand is 8.27. From reading the book "PostgreSQL 9.0 High Performance" I know, that we have one index page read (random page read, cost=4.0) and one database row read (random page read, cost=4.0) which comes up to a total of 8.0. But where are the missing 0.27 from? If I modify the example to insert 10,000 rows, the cost stays the same. Only if I go for 100,000 rows will the computed cost increase to 8.29. Can anybody enlighten me, please ;-).
Robert Lichtenberger <r.lichtenberger@synedra.com> writes: > I am trying to fully understand, how costs for queries are computed. > Taking the following example: ... > Index Scan using test_pkey on test (cost=0.00..8.27 rows=1 width=3) > Index Cond: ((name)::text = '4'::text) > The value I want to understand is 8.27. From reading the book > "PostgreSQL 9.0 High Performance" I know, that we have one index page > read (random page read, cost=4.0) and one database row read (random page > read, cost=4.0) which comes up to a total of 8.0. But where are the > missing 0.27 from? I think you're neglecting CPU costs. We're going to charge at least one cpu_operator_cost, one cpu_index_tuple_cost, and one cpu_tuple_cost on top of the I/O costs. Now that still only adds 0.0175 at the default settings, but there are various other second-order contributions. For the whole story, there's no substitute for taking a look at the source code; see cost_index here: http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/optimizer/path/costsize.c;h=885d8558c319fd283df351c2c8e062a449b72d3c;hb=HEAD#l208 which largely depends on btcostestimate and genericcostestimate here: http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/adt/selfuncs.c;h=6d78068476e520f7dd2da6c0c8d48d93e0649768;hb=HEAD#l6003 In a quick look through that, I think the largest second-order component is this charge in genericcostestimate: 6214 * We also add a CPU-cost component to represent the general costs of 6215 * starting an indexscan, such as analysis of btree index keys and initial 6216 * tree descent. This is estimated at 100x cpu_operator_cost, which is a 6217 * bit arbitrary but seems the right order of magnitude. (As noted above, 6218 * we don't charge any I/O for touching upper tree levels, but charging 6219 * nothing at all has been found too optimistic.) ... 6226 *indexTotalCost += num_sa_scans * 100.0 * cpu_operator_cost; which accounts for 0.25 cost units at the default cpu_operator_cost setting. regards, tom lane
Robert Lichtenberger <r.lichtenberger@synedra.com> wrote: > I am trying to fully understand, how costs for queries are computed. > Taking the following example: > > CREATE TABLE test (name varchar(250) primary key) ; > INSERT INTO test (name) VALUES(generate_series(1, 1000)::text) ; > ANALYZE test ; > EXPLAIN SELECT * FROM test WHERE name = '4' ; > > I am getting the output: > Index Scan using test_pkey on test (cost=0.00..8.27 rows=1 width=3) > Index Cond: ((name)::text = '4'::text) > > The server has default cost parameters > > The value I want to understand is 8.27. From reading the book > "PostgreSQL 9.0 High Performance" I know, that we have one index page > read (random page read, cost=4.0) and one database row read (random page > read, cost=4.0) which comes up to a total of 8.0. But where are the > missing 0.27 from? > > If I modify the example to insert 10,000 rows, the cost stays the same. > Only if I go for 100,000 rows will the computed cost increase to 8.29. > > Can anybody enlighten me, please ;-). There are some other costs, in your case cpu_tuple_cost and cpu_index_tuple_cost. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°