Performance and IN clauses - Mailing list pgsql-performance

From	Kynn Jones
Subject	Performance and IN clauses
Date	November 18, 2008 11:53:29
Msg-id	c2350ba40811180753m2c6b698csdf180047e0fa621f@mail.gmail.com
Responses	Re: Performance and IN clauses
List	pgsql-performance

Hi. I have a Perl script whose main loop generates thousands of SQL updates of the form

UPDATE edge SET keep = true WHERE node1 IN ( $node_list ) AND node2 = $node_id;

...where $node_list stands for a comma-separated list of integers, and $node_id stands for a single integer.

The list represented by $node_list can be fairly long (around 900 entries on average, and as many as 30K), and I'm concerned about the performance cost of testing for inclusion in such a long list. Is this done by a sequential search? If so, is there a better way to write this query? (FWIW, the edge table has two separate indexes, btree( node1 ) and btree( node2 ).)
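
To make the comparison concrete, here is a minimal sketch of the generated form next to one possible alternative, a join against a VALUES list. The literal values (101, 102, 103, 42) and the alias "wanted" are made up for illustration, and nothing here claims that either form is the faster one; EXPLAIN only prints the plan the planner would choose, without running the update.

```sql
-- Form 1: the literal IN list, as the script generates it
-- (illustrative values; the real list would be much longer).
EXPLAIN
UPDATE edge SET keep = true
WHERE node1 IN (101, 102, 103) AND node2 = 42;

-- Form 2 (one possible alternative, not necessarily better):
-- join the table against the same values supplied as a VALUES list.
-- "wanted" is a hypothetical alias chosen for this sketch.
EXPLAIN
UPDATE edge SET keep = true
FROM (VALUES (101), (102), (103)) AS wanted(node1)
WHERE edge.node1 = wanted.node1 AND edge.node2 = 42;
```

Comparing the two plans should show whether the list is being checked row by row as a filter or whether one of the btree indexes is being used to look up the matching rows.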

Also, assuming that the optimal way to write the query depends on the length of $node_list, how can I estimate the "critical length" at which I should switch from one form of the query to the other?
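
One way to look for such a critical length empirically, sketched under the assumption that a test copy of the table is available: time each form with EXPLAIN ANALYZE for lists of increasing length and note where the reported times cross over. Unlike plain EXPLAIN, EXPLAIN ANALYZE actually executes the statement, so the sketch wraps it in a transaction that is rolled back. The values shown are again illustrative.

```sql
BEGIN;

-- Runs the update for real and reports actual timings;
-- repeat with lists of, say, 10, 100, 1000 and 30000 entries,
-- and again for any alternative form being considered.
EXPLAIN ANALYZE
UPDATE edge SET keep = true
WHERE node1 IN (101, 102, 103) AND node2 = 42;

-- Discard the changes so each timing run starts from the same data.
ROLLBACK;
```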

TIA!

Kynn
