Re: optimizing large query with IN (...) - Mailing list pgsql-performance

From	Steve Atkins
Subject	Re: optimizing large query with IN (...)
Date	March 10, 2004 10:50:15
Msg-id	20040310144253.GA31063@gp.word-to-the-wise.com
In response to	optimizing large query with IN (...) ("Marcus Andree S. Magalhaes" <marcus.magalhaes@vlinfo.com.br>)
List	pgsql-performance

On Wed, Mar 10, 2004 at 12:35:15AM -0300, Marcus Andree S. Magalhaes wrote:
> Guys,
>
> I got a Java program to tune. It connects to a 7.4.1 postgresql server
> running Linux using JDBC.
>
> The program needs to update a counter on a somewhat large number of
> rows, about 1200 on a ~130k rows table. The query is something like
> the following:
>
> UPDATE table SET table.par = table.par + 1
> WHERE table.key IN ('value1', 'value2', ... , 'value1200' )
>
> This query runs in a transaction (by issuing a call to
> setAutoCommit(false)), and a commit() is issued right after the
> query is sent to the backend.
>
> The process of committing and updating the values is painfully slow
> (no surprises here). Any ideas?

I posted an analysis of this kind of IN () usage a few weeks ago on
pgsql-general.

The approach you're using is optimal for < 3 values.

For any more than that, insert value1 ... value1200 into a temporary
table, then do

 UPDATE table SET par = table.par + 1
 WHERE table.key IN (SELECT value FROM temp_table);

Indexing the temporary table speeds this up only marginally.
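
The whole sequence (create the temporary table, load the keys, run one
UPDATE against it) can be sketched as below. This is only an illustration
under stated assumptions: it uses Python with an in-memory SQLite database
as a stand-in so that it runs on its own, and the names counters,
temp_keys and bump_counters are hypothetical. Against PostgreSQL over JDBC
the same three statements would be issued inside the transaction, with the
inserts sent as a batch.

```python
import sqlite3


def bump_counters(conn, keys):
    """Increment par for every row whose key is in keys, via a temp table."""
    cur = conn.cursor()
    # 1. Temporary table holding the keys to update (hypothetical name).
    cur.execute("CREATE TEMPORARY TABLE temp_keys (value TEXT)")
    # 2. Load all keys in one batched call instead of building a long IN list.
    cur.executemany(
        "INSERT INTO temp_keys (value) VALUES (?)",
        [(k,) for k in keys],
    )
    # 3. A single UPDATE driven by a subquery on the temporary table.
    cur.execute(
        "UPDATE counters SET par = par + 1 "
        "WHERE key IN (SELECT value FROM temp_keys)"
    )
    updated = cur.rowcount
    # Drop the temporary table so the function can be called again.
    cur.execute("DROP TABLE temp_keys")
    conn.commit()
    return updated


# Stand-in data set: ~130k rows, of which 1200 get their counter bumped.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE counters (key TEXT PRIMARY KEY, par INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO counters (key) VALUES (?)",
    [(f"value{i}",) for i in range(1, 130001)],
)
conn.commit()

keys = [f"value{i}" for i in range(1, 1201)]
updated = bump_counters(conn, keys)
```

The sketch shows only the shape of the statements; it says nothing about
timings, which depend on the server and its version.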

Cheers,
  Steve
