Re: Slow query (planner insisting on using 'external merge' sort type) - Mailing list pgsql-performance

From Igor Neyman
Subject Re: Slow query (planner insisting on using 'external merge' sort type)
Date
Msg-id A76B25F2823E954C9E45E32FA49D70ECCD489D78@mail.corp.perceptron.com
Whole thread Raw
In response to Slow query (planner insisting on using 'external merge' sort type)  (Ian Pushee <ian@intuvisiontech.com>)
Responses Re: Slow query (planner insisting on using 'external merge' sort type)
List pgsql-performance

-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Ian Pushee
Sent: Friday, June 19, 2015 10:34 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Slow query (planner insisting on using 'external merge' sort type)

Hi Folks,

This is my first time posting here, so hopefully I manage to convey all the information needed.
We have a simple query that just started giving us problems in production when the number of rows gets too large
(>100k).
The issue seems to be that the planner wants to sort the rows using a sequential scan, rather than the index provided
specificallyfor this query. This isn't a problem with low numbers of rows, but eventually the query outgrows work_mem
anduses the disk, slowing does the query greatly. I know the common answer is to increase work_mem... but since this
tablesgrowth is unpredictable, that isn't a viable strategy.
 
I've tried increasing shared_buffers and effective_cache_size, but that doesn't appear to effect the plan chosen here.
Setting
random_page_cost=1.0 works, but I'm hoping for a more general solution that doesn't require setting that locally each
timeI run the query. I guess my real question is wether or not there is any way to get the planner to take into account
thefact that it's going to need to do an 'external merge', and that it is going to take a LONG time?
 

Table and Index Schemas:
CREATE TABLE events
(
   id serial NOT NULL,
   name character varying(64),
   eventspy_id integer NOT NULL,
   camera_id integer NOT NULL,
   start_time timestamp without time zone NOT NULL,
   millisecond smallint NOT NULL,
   uid smallint NOT NULL,
   update_time timestamp without time zone NOT NULL DEFAULT now(),
   length integer NOT NULL,
   objects text NOT NULL,
   priority smallint NOT NULL,
   type character varying(45) NOT NULL DEFAULT 'alarm'::character varying,
   status event_status NOT NULL DEFAULT 'new'::event_status,
   confidence smallint NOT NULL DEFAULT 100::smallint,
   CONSTRAINT events_pkey PRIMARY KEY (id)
)
WITH (
   OIDS=FALSE
);

CREATE INDEX events_confidnce
   ON events
   USING btree
   (confidence);

CREATE INDEX events_summary
   ON events
   USING btree
   (name COLLATE pg_catalog."default", eventspy_id, camera_id, type COLLATE pg_catalog."default", status);

Query:
SELECT name, type, eventspy_id, camera_id, status, COUNT(id), MAX(update_time), MIN(start_time), MAX(start_time) FROM
eventsWHERE 
 
confidence>=0 GROUP BY name, eventspy_id, camera_id, type, status;

Explain Analyze outputs (links as requested):
Default plan: http://explain.depesz.com/s/ib3k
Forced index (random_page_cost=1.0): http://explain.depesz.com/s/lYaP

Software/Hardware: PGSql 9.2.1, Windows 8.1, 8GB RAM
All pgsql settings are at their defaults.

Thanks for any help you can provide,
-Ian Pushee

---

Probably events_confidnce index is not very selective, that's why optimizer prefers seq scan.
I'd try to create an index on (name, eventspy_id, camera_id, type, status).

Also, the recent 9.2 is 9.2.13, you should upgrade.

Regards,
Igor Neyman


pgsql-performance by date:

Previous
From: Ian Pushee
Date:
Subject: Slow query (planner insisting on using 'external merge' sort type)
Next
From: Andreas Kretschmer
Date:
Subject: Re: Slow query (planner insisting on using 'external merge' sort type)