Home > mailing lists

Plan uses wrong index, preferring to scan pkey index instead - Mailing list pgsql-performance

From	Yuri Kunde Schlesner
Subject	Plan uses wrong index, preferring to scan pkey index instead
Date	November 16, 2014 02:16:06
Msg-id	1416093360.655916.191452721.03160196@webmail.messagingengine.com Whole thread Raw
Responses	Re: Plan uses wrong index, preferring to scan pkey index instead (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-performance

Tree view

Hi all,

Excuse me if I made any mistakes, as this is my first time posting to a mailing list.

I'm a user of Quassel, a IRC client that uses postgres a backing store for IRC logs and am running into heavy intermittent performance problems. I've tracked it down to a query that takes a very long time (around 4 minutes) to complete when its data isn't cached.

This is the layout of the table being queried and EXPLAIN ANALYZE result for the problematic query:

quassel=> \d backlog

Table "public.backlog"

Column | Type | Modifiers

-----------+-----------------------------+-------------------------------------------------------------

messageid | integer | not null default nextval('backlog_messageid_seq'::regclass)

time | timestamp without time zone | not null

bufferid | integer | not null

type | integer | not null

flags | integer | not null

senderid | integer | not null

message | text |

Indexes:

"backlog_pkey" PRIMARY KEY, btree (messageid)

"backlog_bufferid_idx" btree (bufferid, messageid DESC)

Foreign-key constraints:

"backlog_bufferid_fkey" FOREIGN KEY (bufferid) REFERENCES buffer(bufferid) ON DELETE CASCADE

"backlog_senderid_fkey" FOREIGN KEY (senderid) REFERENCES sender(senderid) ON DELETE SET NULL

quassel=> explain (analyze, buffers) SELECT messageid, time, type, flags, sender, message

FROM backlog

LEFT JOIN sender ON backlog.senderid = sender.senderid

WHERE bufferid = 39

ORDER BY messageid DESC LIMIT 10;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=0.72..37.78 rows=10 width=102) (actual time=154410.353..154410.424 rows=10 loops=1)

Buffers: shared hit=13952 read=19244

-> Nested Loop Left Join (cost=0.72..145800.61 rows=39345 width=102) (actual time=154410.350..154410.414 rows=10 loops=1)

Buffers: shared hit=13952 read=19244

-> Index Scan Backward using backlog_pkey on backlog (cost=0.43..63830.21 rows=39345 width=62) (actual time=154410.327..154410.341 rows=10 loops=1)

Filter: (bufferid = 39)

Rows Removed by Filter: 1248320

Buffers: shared hit=13921 read=19244

-> Index Scan using sender_pkey on sender (cost=0.29..2.07 rows=1 width=48) (actual time=0.005..0.005 rows=1 loops=10)

Index Cond: (backlog.senderid = senderid)

Buffers: shared hit=31

Total runtime: 154410.477 ms

(12 rows)

This plan is consistently chosen, even after ANALYZEing and REINDEXing the table. It looks like Postgres is opting to do a sequential scan of the backlog_pkey index, filtering rows by bufferid, instead of directly using the backlog_bufferid_idx index that directly maps to the operation being made by the query. I was advised on IRC to try dropping the backlog_pkey index to force Postgres to use the correct one, and that uses a better plan:

quassel=> begin;

BEGIN

quassel=> alter table backlog drop constraint backlog_pkey;

ALTER TABLE

quassel=> explain analyze SELECT messageid, time, type, flags, sender, message

FROM backlog

LEFT JOIN sender ON backlog.senderid = sender.senderid

WHERE bufferid = 39

ORDER BY messageid DESC LIMIT 10;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=0.72..40.50 rows=10 width=102) (actual time=63.826..162.134 rows=10 loops=1)

-> Nested Loop Left Join (cost=0.72..156518.91 rows=39345 width=102) (actual time=63.823..162.126 rows=10 loops=1)

-> Index Scan using backlog_bufferid_idx on backlog (cost=0.43..74548.51 rows=39345 width=62) (actual time=63.798..63.814 rows=10 loops=1)

Index Cond: (bufferid = 39)

-> Index Scan using sender_pkey on sender (cost=0.29..2.07 rows=1 width=48) (actual time=8.532..9.825 rows=1 loops=10)

Index Cond: (backlog.senderid = senderid)

Total runtime: 162.377 ms

(7 rows)

quassel=> rollback;

ROLLBACK

(This query was also run with empty caches.) bufferid=39 in particular has this issue because it hasn't had any messages posted to for a long time, so scanning backlog upwards will take a long time to gather 10 messages from it. In contrast, most other bufferid's have their messages interleaved on the last entries of backlog. I believe this might be throwing Postgres' estimates off.

Does anyone know if there's any tweaking I can do in Postgres so that it uses the appropriate plan?

Info about my setup:

PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.9.1, 64-bit

Arch Linux, PostgreSQL installed from the official repositories, running inside a Xen HVM VPS.

Connecting to PostgreSQL using psql via UNIX socket.

Changed options: (locale-related ones omitted)

listen_addresses =

max_stack_depth = 2MB

shared_buffers = 256MB (Issue is also present with default value)

Total RAM: 1GB

Thanks,

--yuriks

pgsql-performance by date:

From: Mark Kirkwood
Date: 15 November 2014, 06:29:48
Subject: Re: pgtune + configurations with 9.3

From: Yuri Kunde Schlesner
Date: 16 November 2014, 02:24:09
Subject: Re: Plan uses wrong index, preferring to scan pkey index instead

Plan uses wrong index, preferring to scan pkey index instead - Mailing list pgsql-performance

Previous

Next