On Wed, Aug 7, 2013 at 12:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alexis Lê-Quôc <alq@datadoghq.com> writes:
> The query itself is very simple: a primary key lookup on a 1.5x10^7-row table.
> The issue is that we are looking up over 11,000 primary keys at once,
> causing the db to consume a lot of CPU.
It looks like most of the runtime is probably going into checking the c.key = ANY (ARRAY[...]) construct. PG isn't especially smart about that when it fails to optimize the construct into an index operation --- I think it just searches the array linearly for each row meeting the other restrictions on c.
You could try writing the test like this: c.key = ANY (VALUES (1), (17), (42), ...) to see if the sub-select code path gives better results than the array code path. In a quick check it looked like this might produce a hash join, which seemed promising anyway.
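For concreteness, the rewrite Tom describes looks something like this; the table name and key values below are placeholders (the original query only shows the alias c and the column c.key):

```sql
-- Original form: if the planner cannot turn this into an index probe,
-- it may scan the array linearly for each candidate row.
SELECT *
FROM some_table c
WHERE c.key = ANY (ARRAY[1, 17, 42]);

-- Rewritten form: the VALUES list goes through the sub-select code path,
-- which the planner can implement as a hash join against the key list.
SELECT *
FROM some_table c
WHERE c.key = ANY (VALUES (1), (17), (42));
```

With thousands of keys, hashing the VALUES list once and probing it per row avoids the repeated linear array search.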
regards, tom lane
Thank you very much, Tom; your suggestion was spot on. Runtime decreased 100-fold, from 20s to 200ms, with a simple search-and-replace.