PostgreSQL planner - Mailing list pgsql-performance

From Misa Simic
Subject PostgreSQL planner
Date
Msg-id CAH3i69=mvgnGDuU_U0dCemWibpPc0uXf799F0ryvkbW6vTWqEg@mail.gmail.com
Whole thread Raw
Responses Re: PostgreSQL planner
List pgsql-performance
HI,

I have a wierd problem with PostgreSQL planner...

Problem showed up in Production on PG9.1 (Ubuntu)

But I have succeeded to get the same behavior on my PG 9.2 on Windows...

it is about 3 tables & onad one view - but view have volatile function:


CREATE TABLE t1
(
  calc_id serial NOT NULL,
  thing_id integer,
  CONSTRAINT t1_pk PRIMARY KEY (calc_id)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE t1
  OWNER TO postgres;

-- Index: t1_thing_id_idx

-- DROP INDEX t1_thing_id_idx;

CREATE INDEX t1_thing_id_idx
  ON t1
  USING btree
  (thing_id);


other columns from this real table are discarted - and not important, what is important is that in the moment I want to run the query... I know calc_id (pk of this table - but don't know thing_id)...

to simplify test I filled t1 with 100 rows with same values in calc_id and thing_id...

Second table are transactions about things:

CREATE TABLE t2
(
  trans_id serial NOT NULL,
  thing_id integer,
  no_index integer,
  CONSTRAINT t2_pk PRIMARY KEY (trans_id)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE t2
  OWNER TO postgres;

-- Index: t5_c2_idx

-- DROP INDEX t5_c2_idx;

CREATE INDEX t5_c2_idx
  ON t2
  USING btree
  (thing_id);


this table I have filled with 1m rows with rundom number in thing_id between 1 and 100

when we enter transaction about thing to t2, in some moment we could have additional info about the thing, in some moment not... so if we have additional info in the same time row is inserted in t2 and t3 with the same trans_id...

CREATE TABLE t3
(
  trans_id integer NOT NULL,
  c2_text text,
  CONSTRAINT t3_pk PRIMARY KEY (trans_id)
)
WITH (
  OIDS=FALSE
);

no additional indexes on t3...

now we have made a view:

CREATE OR REPLACE VIEW t2_left_t3_volatile AS 
 SELECT t2.trans_id, t2.thing_id, t2.no_index, t3.c2_text, random() AS random
   FROM t2
   LEFT JOIN t3 USING (trans_id);

And here we go:

we want see all transactions about the thing_id

EXPLAIN ANALYZE
SELECT * FROM t2_left_t3_volatile
WHERE thing_id = 20

everything is fine:

"Hash Left Join  (cost=452.46..13067.16 rows=12474 width=45) (actual time=6.537..62.633 rows=12038 loops=1)"
"  Hash Cond: (t2.trans_id = t3.trans_id)"
"  ->  Bitmap Heap Scan on t2  (cost=448.30..12985.03 rows=12474 width=12) (actual time=6.418..57.498 rows=12038 loops=1)"
"        Recheck Cond: (thing_id = 20)"
"        ->  Bitmap Index Scan on t5_c2_idx  (cost=0.00..445.18 rows=12474 width=0) (actual time=4.429..4.429 rows=12038 loops=1)"
"              Index Cond: (thing_id = 20)"
"  ->  Hash  (cost=2.96..2.96 rows=96 width=37) (actual time=0.086..0.086 rows=96 loops=1)"
"        Buckets: 1024  Batches: 1  Memory Usage: 7kB"
"        ->  Seq Scan on t3  (cost=0.00..2.96 rows=96 width=37) (actual time=0.016..0.045 rows=96 loops=1)"
"Total runtime: 63.217 ms"

but problem is - we don't know the thing id - we know calc_id:

EXPLAIN ANALYZE
SELECT * FROM t2_left_t3_volatile v INNER JOIN t1 USING (thing_id)
WHERE calc_id = 20

and planner picks:

"Hash Join  (cost=6.42..48367.52 rows=12111 width=4) (actual time=0.261..471.042 rows=12038 loops=1)"
"  Hash Cond: (t2.thing_id = t1.thing_id)"
"  ->  Hash Left Join  (cost=4.16..31591.51 rows=1211101 width=45) (actual time=0.161..394.076 rows=1211101 loops=1)"
"        Hash Cond: (t2.trans_id = t3.trans_id)"
"        ->  Seq Scan on t2  (cost=0.00..24017.01 rows=1211101 width=12) (actual time=0.075..140.937 rows=1211101 loops=1)"
"        ->  Hash  (cost=2.96..2.96 rows=96 width=37) (actual time=0.069..0.069 rows=96 loops=1)"
"              Buckets: 1024  Batches: 1  Memory Usage: 7kB"
"              ->  Seq Scan on t3  (cost=0.00..2.96 rows=96 width=37) (actual time=0.008..0.035 rows=96 loops=1)"
"  ->  Hash  (cost=2.25..2.25 rows=1 width=4) (actual time=0.035..0.035 rows=1 loops=1)"
"        Buckets: 1024  Batches: 1  Memory Usage: 1kB"
"        ->  Seq Scan on t1  (cost=0.00..2.25 rows=1 width=4) (actual time=0.017..0.030 rows=1 loops=1)"
"              Filter: (calc_id = 20)"
"              Rows Removed by Filter: 99"
"Total runtime: 471.505 ms"

Seq scan on all tables...

First thought was - maybe because of volatile function...

but on:
SELECT v.no_index FROM t2_left_t3_volatile v INNER JOIN t1 USING (thing_id)
WHERE calc_id = 20

planner picks the same scenario... even function column is not in the query...

however, situation is fine, if we have a view without the volatile function:

CREATE OR REPLACE VIEW t2_left_t3 AS 
 SELECT t2.trans_id, t2.thing_id, t2.no_index, t3.c2_text
   FROM t2
   LEFT JOIN t3 USING (trans_id);

EXPLAIN ANALYZE
SELECT v.no_index FROM t2_left_t3 v INNER JOIN t1 USING (thing_id)
WHERE calc_id = 20


"Nested Loop  (cost=437.49..13047.74 rows=12111 width=4) (actual time=6.360..71.818 rows=12038 loops=1)"
"  ->  Seq Scan on t1  (cost=0.00..2.25 rows=1 width=4) (actual time=0.016..0.024 rows=1 loops=1)"
"        Filter: (calc_id = 20)"
"        Rows Removed by Filter: 99"
"  ->  Bitmap Heap Scan on t2  (cost=437.49..12924.38 rows=12111 width=12) (actual time=6.330..69.063 rows=12038 loops=1)"
"        Recheck Cond: (thing_id = t1.thing_id)"
"        ->  Bitmap Index Scan on t5_c2_idx  (cost=0.00..434.46 rows=12111 width=0) (actual time=4.372..4.372 rows=12038 loops=1)"
"              Index Cond: (thing_id = t1.thing_id)"
"Total runtime: 72.461 ms"

Any idea why planner picks bad plan if there is VOLATILE function?

there are no difference in result between:

SELECT v.no_index, random FROM t2_left_t3_volatile v INNER JOIN t1 USING (thing_id)
WHERE calc_id = 20

And

SELECT v.no_index, random() FROM t2_left_t3 v INNER JOIN t1 USING (thing_id)
WHERE calc_id = 20

but huge difference in plan...

And logically there is no diff to (our solution)


EXPLAIN ANALYZE
SELECT * FROM t2_left_t3_volatile
WHERE thing_id = (SELECT thing_id FROM t1 WHERE calc_id = 20)


though real scenario is a lot more complex... i.e. t1 has start_date and end_date...
t3 has date colummn as well

so on simple question:

SELECT * FROM t2_left_t3_volatile v INNER JOIN t1 USING (thing_id)
WHERE calc_id = 20 AND v.date BETWEEN t1.start_date AND t2.end_date

We would need to write 3 subqueries on the same table to dont use join...

but to dont use 3 times subquery... we use CTE

WITH calc AS
(
SELECT thing_id FROM t1 WHERE calc_id = 20
)
SELECT * FROM t2_left_t3_volatile v
WHERE v.thing_id=calc.thing_id AND v.date BETWEEN calc.start_date AND calc.end_date

And result is acceptable...

But solution is not good enough - it means, whenever we meet problem with perfomance (in production - unfortunatelly) - we will need to spend time to redefine - simple queries! :(

Now I am not sure - is this for perform or hackers list...

Any suggestion what we can do to improve things?

Or Any insights that things with planner inside Postgres will be improved in "reasonable time" - whatever it means :) :)

Thanks in advance,

Misa


pgsql-performance by date:

Previous
From: Misa Simic
Date:
Subject: Re: Performance of query
Next
From: Roman Konoval
Date:
Subject: Re: Performance of query