Re: optimizing large query with IN (...) - Mailing list pgsql-performance

From Steve Atkins
Subject Re: optimizing large query with IN (...)
Date
Msg-id 20040310144253.GA31063@gp.word-to-the-wise.com
Whole thread Raw
In response to optimizing large query with IN (...)  ("Marcus Andree S. Magalhaes" <marcus.magalhaes@vlinfo.com.br>)
List pgsql-performance
On Wed, Mar 10, 2004 at 12:35:15AM -0300, Marcus Andree S. Magalhaes wrote:
> Guys,
>
> I got a Java program to tune. It connects to a 7.4.1 postgresql server
> running Linux using JDBC.
>
> The program needs to update a counter on a somewhat large number of
> rows, about 1200 on a ~130k rows table. The query is something like
> the following:
>
> UPDATE table SET table.par = table.par + 1
> WHERE table.key IN ('value1', 'value2', ... , 'value1200' )
>
> This query runs on a  transaction (by issuing  a call to
> setAutoCommit(false)) and a commit() right after the query
> is sent to the backend.
>
> The process of committing and updating the values is painfully slow
> (no surprises here). Any ideas?

I posted an analysis of use of IN () like this a few weeks ago on
pgsql-general.

The approach you're using is optimal for < 3 values.

For any more than that, insert value1 ... value1200 into a temporary
table, then do

 UPDATE table SET table.par = table.par + 1
 WHERE table.key IN (SELECT value from temp_table);

Indexing the temporary table marginally increases the speed, but not
significantly.

Cheers,
  Steve



pgsql-performance by date:

Previous
From: "Shea,Dan [CIS]"
Date:
Subject: Cluster failure due to space
Next
From: Tom Lane
Date:
Subject: Re: Cluster failure due to space